Skip to main content

Foundations

Basic Logic and Proof Techniques

The proof vocabulary behind serious mathematics and ML theory: implication, quantifiers, contradiction, contrapositive, induction, construction, cases, and counterexamples.

CoreTier 2StableSupporting~50 min

Why This Matters

Every theorem on this site is built from a small set of proof moves. If those moves feel mysterious, learning theory, optimization, probability, algorithms, and statistics all become harder than they need to be.

The goal is not to memorize proof names. The goal is to look at a statement and ask:

  • What are the assumptions?
  • What must be shown?
  • Which quantifier or implication is the bottleneck?
  • Would a direct proof, contrapositive, contradiction, induction, construction, or counterexample make the structure simpler?

Propositions And Implications

Definition

Proposition

A proposition is a statement that is either true or false. "The model overfits" is not precise enough by itself. "The training error is lower than the validation error by at least 0.20.2" is closer to a mathematical proposition.

Definition

Implication

The implication PQP\Rightarrow Q means: whenever PP is true, QQ must be true. It is false only in the case where PP is true and QQ is false.

PPQQPQP\Rightarrow Q
truetruetrue
truefalsefalse
falsetruetrue
falsefalsetrue

The last two rows are called vacuous truth. If the premise is false, the implication has not been violated.

Definition

Contrapositive

The contrapositive of PQP\Rightarrow Q is ¬Q¬P\neg Q\Rightarrow \neg P. The original implication and its contrapositive are logically equivalent.

Definition

Converse and Inverse

The converse of PQP\Rightarrow Q is QPQ\Rightarrow P. The inverse is ¬P¬Q\neg P\Rightarrow \neg Q. Neither is equivalent to the original implication in general.

Quantifiers

Definition

Universal Quantifier

xS, P(x)\forall x\in S,\ P(x) means P(x)P(x) holds for every element of SS.

Definition

Existential Quantifier

xS, P(x)\exists x\in S,\ P(x) means at least one element of SS satisfies P(x)P(x).

Negating quantifiers is one of the highest-yield proof skills:

¬(xS, P(x))xS such that ¬P(x),\neg(\forall x\in S,\ P(x)) \equiv \exists x\in S\ \text{such that}\ \neg P(x), ¬(xS, P(x))xS, ¬P(x).\neg(\exists x\in S,\ P(x)) \equiv \forall x\in S,\ \neg P(x).

For nested quantifiers, negate from the outside inward. For example, the negation of

ϵ>0, δ>0, x, xa<δf(x)f(a)<ϵ\forall \epsilon>0,\ \exists \delta>0,\ \forall x,\ |x-a|<\delta\Rightarrow |f(x)-f(a)|<\epsilon

is

ϵ>0, δ>0, xsuch thatxa<δ and f(x)f(a)ϵ.\exists \epsilon>0,\ \forall \delta>0,\ \exists x \quad\text{such that}\quad |x-a|<\delta\ \text{and}\ |f(x)-f(a)|\geq \epsilon.

That is the logical shape of discontinuity.

Choosing A Proof Technique

Statement shapeFirst proof move to tryWhy
PQP\Rightarrow Qdirect proofassume PP, derive QQ
PQP\Rightarrow Q where ¬Q\neg Q is concretecontrapositivederive ¬P\neg P from ¬Q\neg Q
"No object has property X"contradictionassume a witness exists and break something
"There exists..."constructionbuild the witness
"For all natural numbers nn..."inductionprove base case and step
"For all objects..." and claim seems falsecounterexample searchone counterexample disproves it
multiple regimescasescover exhaustive alternatives

Direct Proof

Assume the hypotheses and derive the conclusion.

Example

The sum of two even integers is even

Let a=2ma=2m and b=2nb=2n for integers m,nm,n. Then a+b=2(m+n)a+b=2(m+n), so a+ba+b is even.

Direct proof is usually best when definitions immediately unpack into useful algebra.

Contrapositive

To prove PQP\Rightarrow Q, prove ¬Q¬P\neg Q\Rightarrow \neg P.

Example

If n squared is even, then n is even

Prove the contrapositive: if nn is odd, then n2n^2 is odd. Write n=2k+1n=2k+1. Then

n2=4k2+4k+1=2(2k2+2k)+1,n^2=4k^2+4k+1=2(2k^2+2k)+1,

which is odd.

Contrapositive is especially useful when the conclusion has a simple negation, such as "not injective," "not continuous," or "not linearly independent."

Contradiction

To prove a statement PP, assume ¬P\neg P and derive an impossibility.

Example

The square root of 2 is irrational

Assume 2=a/b\sqrt{2}=a/b where a,ba,b are integers with no common factor. Then 2b2=a22b^2=a^2, so aa is even. Write a=2ka=2k. Then 2b2=4k22b^2=4k^2, so b2=2k2b^2=2k^2, hence bb is even. This contradicts the assumption that aa and bb had no common factor.

Contradiction is powerful, but it can hide structure. If a contrapositive proof works cleanly, it is often more informative.

Induction

Definition

Weak Induction

To prove P(n)P(n) for all nn0n\geq n_0, prove the base case P(n0)P(n_0) and prove P(k)P(k+1)P(k)\Rightarrow P(k+1) for every kn0k\geq n_0.

Definition

Strong Induction

To prove P(n)P(n) for all nn0n\geq n_0, prove enough base cases and prove that P(n0),P(n0+1),,P(k)P(n_0),P(n_0+1),\ldots,P(k) together imply P(k+1)P(k+1).

Theorem

Principle Of Mathematical Induction

Statement

If P(n0)P(n_0) is true and P(k)P(k+1)P(k)\Rightarrow P(k+1) for all kn0k\geq n_0, then P(n)P(n) is true for every nn0n\geq n_0.

Intuition

The base case starts the chain. The inductive step says every true case forces the next case. Together they cover all later natural numbers.

Proof Sketch

Suppose some nn0n\geq n_0 fails. By the well-ordering principle, the set of failing nn has a smallest element mm. Since the base case holds, m>n0m>n_0. Then P(m1)P(m-1) is true by minimality of mm, so the inductive step gives P(m)P(m), a contradiction.

Why It Matters

Induction proves formulas for sums, loop invariants, recursion correctness, convergence bounds after TT steps, and dynamic programming recurrences.

Failure Mode

The most common invalid induction proof assumes what it needs to prove. The inductive hypothesis gives P(k)P(k) only for the previous case, not for every future case.

Construction, Cases, And Counterexamples

Definition

Constructive Proof

To prove xS\exists x\in S with property P(x)P(x), explicitly define such an xx and verify the property.

Definition

Proof By Cases

Split the situation into exhaustive cases and prove the claim in each case. The cases must cover everything and should not silently overlap in a way that loses logic.

Definition

Counterexample

To disprove a universal claim x, P(x)\forall x,\ P(x), find one xx such that ¬P(x)\neg P(x).

Counterexamples are not "negative examples." They are surgical logic tools. A single valid counterexample defeats a universal statement.

ML And Math Examples

AreaProof move you will see
generalization boundsunion bound, concentration, contradiction, quantifier control
convex optimizationdirect proof from definitions, separating hyperplanes, KKT implications
linear algebraconstruction of bases, contradiction for independence, induction on dimension
probabilitylaw-of-total-probability decompositions, counterexamples to independence claims
algorithmsinduction on iterations, loop invariants, cases by input branch
learning theorycontrapositive and reductions for lower bounds

Common Confusions

Watch Out

Contrapositive is not the converse

The contrapositive of PQP\Rightarrow Q is ¬Q¬P\neg Q\Rightarrow \neg P and is equivalent to the original. The converse QPQ\Rightarrow P is a different claim.

Watch Out

A proof by examples is not a proof of a universal statement

Checking many examples can build intuition, but it does not prove x, P(x)\forall x,\ P(x). A universal proof must cover every allowed object.

Watch Out

Induction needs a base case

The step P(k)P(k+1)P(k)\Rightarrow P(k+1) can be true even if no case ever starts. The base case is what activates the chain.

Watch Out

Negating an implication changes it into an and statement

The negation of PQP\Rightarrow Q is P¬QP\wedge \neg Q, not ¬P¬Q\neg P\Rightarrow \neg Q.

Quick Drills

Negate: "For every model, there exists a dataset on which it performs well."

Answer: There exists a model such that for every dataset, it does not perform well.

Identify the proof move: "Assume the algorithm does not terminate; then construct an infinite strictly decreasing sequence of nonnegative integers."

Answer: contradiction, usually with well-ordering or a descent measure.

Find the flaw: "The theorem holds for n=1,2,3n=1,2,3, so it holds for all nn."

Answer: examples are not induction. You still need a step proving P(k)P(k+1)P(k)\Rightarrow P(k+1).

Contrapositive: "If a matrix has full column rank, then its columns are linearly independent."

Contrapositive: If the columns are linearly dependent, then the matrix does not have full column rank.

What To Remember

  • An implication fails only when the premise is true and the conclusion is false.
  • Contrapositive is equivalent to the original; converse is not.
  • Negating quantifiers swaps \forall and \exists.
  • Induction needs both a base case and a valid step.
  • To disprove "for all," find one counterexample.
  • Good proof technique follows statement structure, not vibes.

Exercises

ExerciseCore

Problem

Write the negation of: "For every ϵ>0\epsilon>0, there exists δ>0\delta>0 such that xa<δ|x-a|<\delta implies f(x)f(a)<ϵ|f(x)-f(a)|<\epsilon."

ExerciseCore

Problem

Prove by induction that i=1ni=n(n+1)/2\sum_{i=1}^n i=n(n+1)/2 for all n1n\geq 1.

ExerciseAdvanced

Problem

Disprove: "If ab=0ab=0, then a=0a=0 and b=0b=0."

ExerciseAdvanced

Problem

Give a proof template for a PAC-style statement of the form: with probability at least 1δ1-\delta, every hypothesis in a finite class satisfies a deviation bound.

References

Canonical:

  • Velleman, How to Prove It, 3rd ed., Chapters 1-6.
  • Hammack, Book of Proof, 3rd ed., Chapters 1-10.
  • Bloch, Proofs and Fundamentals (2011), Chapters 1-3.

Further:

  • Rosen, Discrete Mathematics and Its Applications, logic and proof chapters.
  • Sipser, Introduction to the Theory of Computation, proof examples in automata and computability.

Next Topics

Last reviewed: April 25, 2026

Canonical graph

Required before and derived from this topic

These links come from prerequisite edges in the curriculum graph. Editorial suggestions are shown here only when the target page also cites this page as a prerequisite.

Required prerequisites

0

No direct prerequisites are declared; this is treated as an entry point.

Derived topics

16

+11 more on the derived-topics page.