Foundations
Vectors, Matrices, and Linear Maps
Vector spaces, linear maps, matrix representation, rank, nullity, and the rank-nullity theorem. The algebraic backbone of ML.
Prerequisites
Why This Matters
A linear map sends the unit square to a parallelogram; columns of A are the images of basis vectors
Neural networks are compositions of linear maps and pointwise nonlinearities. PCA is an eigenvalue problem on a matrix. Gradient descent operates in vector spaces. Linear algebra is the computational substrate of modern ML.
Three pieces of notation recur throughout: V for the ambient vector space, a₁v₁ + ⋯ + aₙvₙ for forming linear combinations, and the idea of a basis for picking coordinates.
Core Definitions
Vector Space
A vector space V over a field F (typically F = ℝ) is a set with operations + : V × V → V and · : F × V → V satisfying: commutativity and associativity of addition, existence of an additive identity and additive inverses, distributivity, and 1v = v. The elements of V are vectors.
Linear Independence and Basis
Vectors v₁, …, vₙ are linearly independent if and only if a₁v₁ + ⋯ + aₙvₙ = 0 implies all aᵢ = 0. A basis is a maximal linearly independent set, equivalently a linearly independent set that spans V. The dimension dim V is the number of elements in any basis.
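As a concrete check (a minimal NumPy sketch; the helper name `linearly_independent` is ours, and `np.linalg.matrix_rank` estimates rank via an SVD tolerance, so this is a numerical test, not a symbolic proof):

```python
import numpy as np

def linearly_independent(vectors):
    # Stack the vectors as columns; they are independent iff the
    # rank of the resulting matrix equals the number of vectors.
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
print(linearly_independent([v1, v2]))      # True: neither is a multiple of the other
print(linearly_independent([v1, 2 * v1]))  # False: second vector is a scalar multiple
```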
Linear Map
A function T : V → W between vector spaces is linear if and only if T(av + bw) = aT(v) + bT(w) for all scalars a, b and vectors v, w. The kernel (null space) is ker T = {v ∈ V : T(v) = 0}. The image (range) is im T = {T(v) : v ∈ V}.
Matrix Representation
Given bases {v₁, …, vₙ} for V and {w₁, …, wₘ} for W, a linear map T : V → W is represented by the m × n matrix A whose column j contains the coordinates of T(vⱼ) in the basis of W. Matrix multiplication corresponds to composition of linear maps.
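The correspondence between composition and multiplication can be checked numerically (a sketch with arbitrary random matrices; shapes chosen here for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # represents a map R^3 -> R^4
B = rng.standard_normal((3, 5))  # represents a map R^5 -> R^3
x = rng.standard_normal(5)

# Applying B then A equals applying the single matrix AB.
composed = A @ (B @ x)
product = (A @ B) @ x
print(np.allclose(composed, product))  # True
```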
Rank and Nullity
The rank of a matrix A is dim(im A), equivalently the number of linearly independent columns (or rows). The nullity is dim(ker A).
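Both quantities are easy to compute in practice (a NumPy sketch; the nullity here is derived from the rank and the column count, anticipating the rank-nullity theorem below):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],   # row 2 = 2 * row 1, so rank drops
              [0.0, 1.0, 1.0]])

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank      # number of columns minus rank
print(rank, nullity)             # 2 1
```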
Change of Basis
If P is the matrix whose columns are the new basis vectors expressed in the old basis, then the coordinates transform as x_new = P⁻¹ x_old. For a linear map between different spaces, with change-of-basis matrices P on V and Q on W, the new matrix representation is A' = Q⁻¹AP. The familiar similarity formula A' = P⁻¹AP is the special case of an endomorphism (T : V → V) using the same basis on source and target (Q = P); applying it to a general linear map between distinct spaces (or even to the same space with two different bases) is a category error.
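A numerical sanity check of the general formula (a sketch: random Gaussian matrices are used as basis matrices, which are invertible with probability 1):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3))  # map R^3 -> R^2 in the old bases
P = rng.standard_normal((3, 3))  # columns: new basis for the source
Q = rng.standard_normal((2, 2))  # columns: new basis for the target

A_new = np.linalg.inv(Q) @ A @ P  # representation in the new bases

# If x_new are coordinates in the new source basis, x_old = P @ x_new.
# The two representations must describe the same map:
x_new = rng.standard_normal(3)
y_old = A @ (P @ x_new)           # apply the map in old coordinates
y_new = A_new @ x_new             # apply the map in new coordinates
print(np.allclose(Q @ y_new, y_old))  # True: Q converts new target coords to old
```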
Main Theorems
Rank-Nullity Theorem
Statement
For any linear map T : V → W with V finite-dimensional: dim V = dim(im T) + dim(ker T) = rank T + nullity T.
Equivalently, for an m × n matrix A: rank(A) + nullity(A) = n.
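The identity can be verified numerically (a sketch: the matrix is built with rank 2 by construction, and the nullity is measured independently by counting near-zero singular values):

```python
import numpy as np

rng = np.random.default_rng(2)
# A 5x4 matrix of rank 2, built as a product of thin factors.
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

rank = np.linalg.matrix_rank(A)
s = np.linalg.svd(A, compute_uv=False)   # 4 singular values (A is tall)
nullity = int(np.sum(s < 1e-10 * s[0]))  # kernel dimension, measured directly

print(rank + nullity == A.shape[1])      # True: rank + nullity = n
```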
Intuition
The domain splits into two complementary parts: the kernel (what T kills) and a complement (what T maps faithfully onto the image). Their dimensions must add up to dim V.
Proof Sketch
Let {u₁, …, uₖ} be a basis for ker T. Extend it to a basis {u₁, …, uₖ, v₁, …, vᵣ} for V. Show that {T(v₁), …, T(vᵣ)} is a basis for im T: it spans because any T(x) can be written in terms of these (the uᵢ contribute nothing, since T(uᵢ) = 0), and it is linearly independent because c₁T(v₁) + ⋯ + cᵣT(vᵣ) = 0 puts c₁v₁ + ⋯ + cᵣvᵣ in ker T, so it is a combination of the uⱼ; independence of the full basis then forces all cᵢ = 0.
Why It Matters
The rank-nullity theorem constrains the geometry of linear systems. A system Ax = b has a solution iff rank([A | b]) = rank(A), and the solution is unique iff nullity(A) = 0. In ML: the rank of a data matrix determines how many independent features exist. The singular value decomposition provides a canonical way to compute rank numerically.
Failure Mode
Requires V finite-dimensional. In infinite-dimensional spaces (e.g., function spaces in kernel methods), the statement needs modification. The index of a Fredholm operator generalizes rank-nullity but involves subtleties about closedness of the range.
Common Confusions
Matrix multiplication is not commutative
AB ≠ BA in general, even when both products are defined. This reflects the fact that composition of linear maps is not commutative. The order matters: (AB)x means "apply B first, then A."
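A minimal counterexample (one direction of the product is even the zero matrix):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[1.0, 0.0],
              [0.0, 0.0]])

print(A @ B)  # [[0. 0.], [0. 0.]]  -- the zero matrix
print(B @ A)  # [[0. 1.], [0. 0.]]  -- not zero: AB != BA
```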
Rank equals column rank equals row rank
Column rank (the dimension of the column space) always equals row rank (the dimension of the row space). This is not obvious and requires proof. It means rank(A) = rank(Aᵀ).
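A numerical spot check, not a proof (a sketch: the matrix is built with rank at most 3 from thin random factors):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 4))  # rank <= 3

# Column rank of A equals column rank of A^T (i.e., row rank of A).
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))  # True
```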
Canonical Examples
Projection matrix
Let A = [1 0; 0 0] acting on ℝ². Then im(A) is the x-axis, ker(A) is the y-axis. Rank is 1, nullity is 1, and 1 + 1 = 2 = dim ℝ².
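The same example in code (a short NumPy sketch):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])  # projection onto the x-axis

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank
print(rank, nullity)                 # 1 1  -> 1 + 1 = 2 = dim R^2

print(A @ np.array([0.0, 5.0]))      # [0. 0.]  the y-axis is killed
print(A @ np.array([3.0, 0.0]))      # [3. 0.]  the x-axis is fixed
```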
Exercises
Problem
Let A be an m × n matrix with rank(A) = r. What is the dimension of ker(A)? Can Ax = b have a unique solution?
Problem
Prove that rank(AB) ≤ min(rank(A), rank(B)).
References
Canonical:
- Axler, Linear Algebra Done Right (2024), Chapters 1-3
- Strang, Introduction to Linear Algebra (2016), Chapters 1-4
- Halmos, Finite-Dimensional Vector Spaces (1958), Chapters 1-3 (vector spaces and linear maps)
For ML context:
- Deisenroth, Faisal, Ong, Mathematics for Machine Learning (2020), Chapter 2
- Horn & Johnson, Matrix Analysis (2013), Chapter 0 (review of linear algebra fundamentals)
- Boyd & Vandenberghe, Introduction to Applied Linear Algebra (2018), Chapters 1-6 (vectors, matrices, and linear independence)
Last reviewed: April 26, 2026
Canonical graph
Required before and derived from this topic
These links come from prerequisite edges in the curriculum graph. Editorial suggestions are shown here only when the target page also cites this page as a prerequisite.
Required prerequisites
- Sets, Functions, and Relations (layer 0A · tier 1)
Derived topics
- Differentiation in Rⁿ (layer 0A · tier 1)
- Eigenvalues and Eigenvectors (layer 0A · tier 1)
- Inner Product Spaces and Orthogonality (layer 0A · tier 1)
- Linear Independence (layer 0A · tier 1)
- Matrix Norms (layer 0A · tier 1)
+6 more on the derived-topics page.