Foundations
Vectors, Matrices, and Linear Maps
Vector spaces, linear maps, matrix representation, rank, nullity, and the rank-nullity theorem. The algebraic backbone of ML.
Prerequisites
Why This Matters
A linear map sends the unit square to a parallelogram; columns of A are the images of basis vectors
Neural networks are compositions of linear maps and pointwise nonlinearities. PCA is an eigenvalue problem on a matrix. Gradient descent operates in vector spaces. Linear algebra is the computational substrate of modern ML.
Three pieces of notation recur throughout: V for the ambient vector space, a₁v₁ + ⋯ + aₙvₙ for forming linear combinations, and the idea of a basis for picking coordinates.
Core Definitions
Vector Space
A vector space V over a field F (typically F = ℝ) is a set with operations + : V × V → V and · : F × V → V satisfying: commutativity and associativity of addition, existence of an additive identity and additive inverses, distributivity, and 1v = v. The elements of V are vectors.
Linear Independence and Basis
Vectors v₁, …, vₙ are linearly independent if and only if a₁v₁ + ⋯ + aₙvₙ = 0 implies all aᵢ = 0. A basis is a maximal linearly independent set, equivalently a linearly independent set that spans V. The dimension dim V is the number of elements in any basis.
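As a concrete check (a minimal NumPy sketch; the helper name `linearly_independent` is ours, and `np.linalg.matrix_rank` estimates rank via an SVD tolerance, so this is a numerical test, not a symbolic proof):

```python
import numpy as np

def linearly_independent(vectors):
    # Stack the vectors as columns; they are independent iff the
    # rank of the resulting matrix equals the number of vectors.
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
print(linearly_independent([v1, v2]))      # True: neither is a multiple of the other
print(linearly_independent([v1, 2 * v1]))  # False: second vector is a scalar multiple
```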
Linear Map
A function T : V → W between vector spaces is linear if and only if T(av + bw) = aT(v) + bT(w) for all scalars a, b and vectors v, w. The kernel (null space) is ker T = {v ∈ V : T(v) = 0}. The image (range) is im T = {T(v) : v ∈ V}.
Matrix Representation
Given bases {v₁, …, vₙ} for V and {w₁, …, wₘ} for W, a linear map T : V → W is represented by the m × n matrix A whose column j contains the coordinates of T(vⱼ) in the basis of W. Matrix multiplication corresponds to composition of linear maps.
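The correspondence between composition and multiplication can be checked numerically (a sketch with arbitrary random matrices; shapes chosen here for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # represents a map R^3 -> R^4
B = rng.standard_normal((3, 5))  # represents a map R^5 -> R^3
x = rng.standard_normal(5)

# Applying B then A equals applying the single matrix AB.
composed = A @ (B @ x)
product = (A @ B) @ x
print(np.allclose(composed, product))  # True
```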
Rank and Nullity
The rank of a matrix A is dim(im A), equivalently the number of linearly independent columns (or rows). The nullity is dim(ker A).
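Both quantities are easy to compute in practice (a NumPy sketch; the nullity here is derived from the rank and the column count, anticipating the rank-nullity theorem below):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],   # row 2 = 2 * row 1, so rank drops
              [0.0, 1.0, 1.0]])

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank      # number of columns minus rank
print(rank, nullity)             # 2 1
```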
Change of Basis
If P is the matrix whose columns are the new basis vectors expressed in the old basis, then the coordinates transform as x_new = P⁻¹ x_old. For a linear map between different spaces, with change-of-basis matrices P on V and Q on W, the new matrix representation is A' = Q⁻¹AP. The familiar similarity formula A' = P⁻¹AP is the special case of an endomorphism (T : V → V) using the same basis on source and target (Q = P); applying it to a general linear map between distinct spaces (or even to the same space with two different bases) is a category error.
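A numerical sanity check of the general formula (a sketch: random Gaussian matrices are used as basis matrices, which are invertible with probability 1):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3))  # map R^3 -> R^2 in the old bases
P = rng.standard_normal((3, 3))  # columns: new basis for the source
Q = rng.standard_normal((2, 2))  # columns: new basis for the target

A_new = np.linalg.inv(Q) @ A @ P  # representation in the new bases

# If x_new are coordinates in the new source basis, x_old = P @ x_new.
# The two representations must describe the same map:
x_new = rng.standard_normal(3)
y_old = A @ (P @ x_new)           # apply the map in old coordinates
y_new = A_new @ x_new             # apply the map in new coordinates
print(np.allclose(Q @ y_new, y_old))  # True: Q converts new target coords to old
```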
Main Theorems
Rank-Nullity Theorem
Statement
For any linear map T : V → W with V finite-dimensional: dim V = dim(im T) + dim(ker T) = rank T + nullity T.
Equivalently, for an m × n matrix A: rank(A) + nullity(A) = n.
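The identity can be verified numerically (a sketch: the matrix is built with rank 2 by construction, and the nullity is measured independently by counting near-zero singular values):

```python
import numpy as np

rng = np.random.default_rng(2)
# A 5x4 matrix of rank 2, built as a product of thin factors.
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

rank = np.linalg.matrix_rank(A)
s = np.linalg.svd(A, compute_uv=False)   # 4 singular values (A is tall)
nullity = int(np.sum(s < 1e-10 * s[0]))  # kernel dimension, measured directly

print(rank + nullity == A.shape[1])      # True: rank + nullity = n
```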
Intuition
The domain splits into two complementary parts: the kernel (what T kills) and a complement (what T maps faithfully onto the image). Their dimensions must add up to dim V.
Proof Sketch
Let {u₁, …, uₖ} be a basis for ker T. Extend it to a basis {u₁, …, uₖ, v₁, …, vᵣ} for V. Show that {T(v₁), …, T(vᵣ)} is a basis for im T: it spans because any T(x) can be written in terms of these (the uᵢ contribute nothing, since T(uᵢ) = 0), and it is linearly independent because c₁T(v₁) + ⋯ + cᵣT(vᵣ) = 0 puts c₁v₁ + ⋯ + cᵣvᵣ in ker T, so it is a combination of the uⱼ; independence of the full basis then forces all cᵢ = 0.
Why It Matters
The rank-nullity theorem constrains the geometry of linear systems. A system Ax = b has a solution iff rank([A | b]) = rank(A), and the solution is unique iff nullity(A) = 0. In ML: the rank of a data matrix determines how many independent features exist. The singular value decomposition provides a canonical way to compute rank numerically.
Failure Mode
Requires V finite-dimensional. In infinite-dimensional spaces (e.g., function spaces in kernel methods), the statement needs modification. The index of a Fredholm operator generalizes rank-nullity but involves subtleties about closedness of the range.
Common Confusions
Matrix multiplication is not commutative
AB ≠ BA in general, even when both products are defined. This reflects the fact that composition of linear maps is not commutative. The order matters: (AB)x means "apply B first, then A."
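A minimal counterexample (one direction of the product is even the zero matrix):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[1.0, 0.0],
              [0.0, 0.0]])

print(A @ B)  # [[0. 0.], [0. 0.]]  -- the zero matrix
print(B @ A)  # [[0. 1.], [0. 0.]]  -- not zero: AB != BA
```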
Rank equals column rank equals row rank
Column rank (the dimension of the column space) always equals row rank (the dimension of the row space). This is not obvious and requires proof. It means rank(A) = rank(Aᵀ).
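A numerical spot check, not a proof (a sketch: the matrix is built with rank at most 3 from thin random factors):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 4))  # rank <= 3

# Column rank of A equals column rank of A^T (i.e., row rank of A).
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))  # True
```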
Canonical Examples
Projection matrix
Let A = [1 0; 0 0] acting on ℝ². Then im(A) is the x-axis, ker(A) is the y-axis. Rank is 1, nullity is 1, and 1 + 1 = 2 = dim ℝ².
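The same example in code (a short NumPy sketch):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])  # projection onto the x-axis

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank
print(rank, nullity)                 # 1 1  -> 1 + 1 = 2 = dim R^2

print(A @ np.array([0.0, 5.0]))      # [0. 0.]  the y-axis is killed
print(A @ np.array([3.0, 0.0]))      # [3. 0.]  the x-axis is fixed
```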
Exercises
Problem
Let A be an m × n matrix with rank(A) = r. What is the dimension of ker(A)? Can Ax = b have a unique solution?
Problem
Prove that rank(AB) ≤ min(rank(A), rank(B)).
References
Canonical:
- Axler, Linear Algebra Done Right (2024), Chapters 1-3
- Strang, Introduction to Linear Algebra (2016), Chapters 1-4
- Halmos, Finite-Dimensional Vector Spaces (1958), Chapters 1-3 (vector spaces and linear maps)
For ML context:
- Deisenroth, Faisal, Ong, Mathematics for Machine Learning (2020), Chapter 2
- Horn & Johnson, Matrix Analysis (2013), Chapter 0 (review of linear algebra fundamentals)
- Boyd & Vandenberghe, Introduction to Applied Linear Algebra (2018), Chapters 1-6 (vectors, matrices, and linear independence)
Last reviewed: April 26, 2026
Canonical graph
Required before and derived from this topic
These links come from prerequisite edges in the curriculum graph. Editorial suggestions are shown here only when the target page also cites this page as a prerequisite.
Required prerequisites
- Sets, Functions, and Relations (layer 0A · tier 1)
Derived topics
- Differentiation in Rⁿ (layer 0A · tier 1)
- Eigenvalues and Eigenvectors (layer 0A · tier 1)
- Inner Product Spaces and Orthogonality (layer 0A · tier 1)
- Linear Independence (layer 0A · tier 1)
- Matrix Norms (layer 0A · tier 1)
+6 more on the derived-topics page.