
Linear Algebra

Foreword

Started 06/06/2024

In hopes of understanding linear algebra, I rewrite what I learn in the ways that got me to an understanding of what's happening.

Prerequisites

Set Notation

Let's say we have the set

x = \{ 1, 2, 3 \}

We can say an element belongs to this set with the membership notation \in

2 \in x

We can ask "what is the subset of x containing the even integers?" which would be

y = \{ 2 \}

Or, we can say y is a subset of x with the notation \subset

y \subset x

And for anything not in the set we use the notation \notin

4 \notin x
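As a quick aside, these set operations map directly onto Python sets; a minimal sketch (the names x and y just mirror the example above):

```python
x = {1, 2, 3}

# the subset of x that contains only the even integers
y = {n for n in x if n % 2 == 0}

print(y)              # {2}
print(y.issubset(x))  # True, i.e. y is a subset of x
print(4 not in x)     # True, i.e. 4 is not in x
```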

Implication & Equivalence

We can say a burger implies it contains pork

\text{ burger } \implies \text{ contains pork }

But it doesn't mean that if it has pork it's a burger; at best,

\text{ contains pork } \implies \text{ food dish }

If we had something like

\text{ degatchi's super hot } \iff \text{ degatchi is never not hot }

Then we know it's always true in both directions, and we call this equivalence


Matrices

Multiplication

To get the product of 2 matrices, the number of columns in the left factor must match the number of rows in the right factor, otherwise we end up not having corresponding variables to multiply with.

So if we have

mn \cdot np

The inner variables, n, must be equal because the way we multiply matrices is

\text{row } \cdot \text{ column}

and if there isn't a column to multiply our row by, then we cannot create the system.

The dimensions of the new matrix are

mp

As an example,

X is a matrix of size 2x3

X = \begin{Bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ \end{Bmatrix}

Whereas Y is also 2x3

Y = \begin{Bmatrix} y_{11} & y_{12} & y_{13} \\ y_{21} & y_{22} & y_{23} \\ \end{Bmatrix}

And so multiplying them (2x3) * (2x3) wouldn't work bc 3 and 2 are not equal: each row of X has 3 entries but each column of Y only has 2, so there is no third entry to pair with!

\begin{split} X \cdot Y &= \begin{Bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \end{Bmatrix} \cdot \begin{Bmatrix} y_{11} & y_{12} & y_{13} \\ y_{21} & y_{22} & y_{23} \\ \end{Bmatrix} \\ \end{split}

What we expect is the result

\begin{split} &= \begin{Bmatrix} x_{11}y_{11} & x_{12}y_{21} & x_{13}? \\ x_{21}y_{11} & x_{22}y_{21} & x_{23}? \end{Bmatrix} \end{split}

Since we multiply row * column we never touch y_{13} and y_{23}, and so our multiplication is incorrect.

But there is a solution!

We can represent our Y matrix as its transpose to make the ns match, using the special transpose operator T

and from here we can perform the multiplication of row * column :D

I'm going to use actual numbers so you can see the effect of T

\begin{split} X \cdot Y^T &= \begin{Bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \end{Bmatrix} \cdot \begin{Bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{Bmatrix}^T \\ &= \begin{Bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \end{Bmatrix} \cdot \begin{Bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{Bmatrix} \\ &= \begin{Bmatrix} x_{11} \cdot 1 + x_{12} \cdot 3 + x_{13} \cdot 5 & x_{11} \cdot 2 + x_{12} \cdot 4 + x_{13} \cdot 6 \\ x_{21} \cdot 1 + x_{22} \cdot 3 + x_{23} \cdot 5 & x_{21} \cdot 2 + x_{22} \cdot 4 + x_{23} \cdot 6 \end{Bmatrix} \end{split}

Yay!

Our new matrix is 2x2, as predicted from (2x3) * (3x2).
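A small numpy sketch of the same shape rule (assuming numpy; the arrays mirror X and Y above): multiplying two 2x3 arrays fails, but transposing the right factor makes the inner dimensions match.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2x3
Y = np.array([[1, 3, 5],
              [2, 4, 6]])   # 2x3

# (2x3)(2x3) raises an error: the inner dimensions (3 and 2) don't match
try:
    X @ Y
except ValueError as e:
    print(e)

# (2x3)(3x2) works once Y is transposed, and the result is 2x2
print((X @ Y.T).shape)   # (2, 2)
```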

Symmetric Matrices

These matrices are square matrices that are symmetric around their main diagonals.

For our example the diagonal would be \{ 1, 2, 3 \}

\begin{Bmatrix} 1 & 5 & 7 \\ 5 & 2 & 8 \\ 7 & 8 & 3 \\ \end{Bmatrix}

bc of the symmetry the matrix is always equal to its transpose!

Triangular Matrices

Upper + Lower triangular matrices are square matrices where the elements either above the main diagonal or below it are all equal to zero, e.g.

\begin{Bmatrix} 1 & 5 & 7 \\ 0 & 2 & 8 \\ 0 & 0 & 3 \\ \end{Bmatrix}

or

\begin{Bmatrix} 1 & 0 & 0 \\ 5 & 2 & 0 \\ 7 & 8 & 3 \\ \end{Bmatrix}

Diagonal Matrices

These are square matrices where the only non-zero elements are on the main diagonal!

\begin{Bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \\ \end{Bmatrix}

Identity Matrices

These are diagonal matrices with every diagonal element equal to 1

\begin{Bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{Bmatrix}
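A hedged numpy sketch of these special shapes, using the example matrices above (np.triu, np.tril, np.diag, and np.eye are the standard constructors):

```python
import numpy as np

S = np.array([[1, 5, 7],
              [5, 2, 8],
              [7, 8, 3]])

print(np.array_equal(S, S.T))   # True: a symmetric matrix equals its transpose

U = np.triu(S)          # upper triangular: zeros below the main diagonal
L = np.tril(S)          # lower triangular: zeros above the main diagonal
D = np.diag([1, 2, 3])  # diagonal matrix
I = np.eye(3)           # identity matrix

print(np.array_equal(S @ I, S))  # True: multiplying by the identity changes nothing
```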

Gaussian Elimination

Gaussian elimination is about trying to get the identity matrix, not about solving for variables.

Let's say we have

\begin{Bmatrix} 3 & 1 & 1 \\ 1 & 2 & 0 \\ \end{Bmatrix}

We want to isolate each variable. So for our first row we want to isolate the 3, first by eliminating the 1 in row 1, column 2.

We start by multiplying the top row by 2 since we want to eliminate the 2nd column of row 1.

\begin{Bmatrix} 6 & 2 & 2 \\ 1 & 2 & 0 \\ \end{Bmatrix}

Then we subtract the 2nd row from the first

\begin{Bmatrix} 5 & 0 & 2 \\ 1 & 2 & 0 \\ \end{Bmatrix}

Now that we have isolated row 1 column 1 we want to isolate row 2 column 2.

We apply the same technique, first by multiplying the 2nd row by the number we want to isolate, 5.

\begin{Bmatrix} 5 & 0 & 2 \\ 5 & 10 & 0 \\ \end{Bmatrix}

Then we subtract the first row from the second

\begin{Bmatrix} 5 & 0 & 2 \\ 0 & 10 & -2 \\ \end{Bmatrix}

Then we divide each row by their isolated element

\begin{Bmatrix} 1 & 0 & \dfrac{2}{5} \\ \\ 0 & 1 & \dfrac{-2}{10} \\ \end{Bmatrix}

and we can see our identity matrix and the solution values

\begin{Bmatrix} x_{11} \\ x_{22} \\ \end{Bmatrix} = \begin{Bmatrix} \dfrac{2}{5} \\ \\ \dfrac{-2}{10} \\ \end{Bmatrix}

But we really only care about the identity matrix.

When one of the factors is the identity, the order of the factors doesn't matter: the product is always the other factor.

We can verify this via

\begin{Bmatrix} 1 & 0 \\ 0 & 1 \\ \end{Bmatrix} \begin{Bmatrix} \dfrac{2}{5} & 0 \\ \\ 0 & \dfrac{-2}{10} \\ \end{Bmatrix} = \begin{Bmatrix} \dfrac{2}{5} & 0 \\ \\ 0 & \dfrac{-2}{10} \\ \end{Bmatrix} \begin{Bmatrix} 1 & 0 \\ 0 & 1 \\ \end{Bmatrix}
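A minimal sketch of the same elimination in plain Python, using exact fractions so the values come out cleanly; the three row operations mirror the steps above:

```python
from fractions import Fraction

# augmented matrix [3 1 | 1; 1 2 | 0]
M = [[Fraction(3), Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2), Fraction(0)]]

# 2*R1 - R2 eliminates the 1 in row 1, column 2
M[0] = [2 * a - b for a, b in zip(M[0], M[1])]
# 5*R2 - R1 eliminates the 1 in row 2, column 1
M[1] = [5 * b - a for a, b in zip(M[0], M[1])]
# divide each row by its pivot to reach the identity
M = [[entry / row[i] for entry in row] for i, row in enumerate(M)]

print(M)  # [[1, 0, 2/5], [0, 1, -1/5]]  (-1/5 is -2/10 reduced)
```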

Determinants

A determinant is essentially the area or volume spanned by the matrix. If we have a 2x2 matrix we can think of it as a square (or parallelogram). The determinant tells us its scale and how much we can fit inside of it.

The inverse of a matrix exists as long as its determinant isn’t zero.

det \begin{Bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{Bmatrix} = a_{11}a_{22} - a_{12}a_{21}

Think of the main diagonal as positive and the anti-diagonal as negative.

so for example,

det \begin{Bmatrix} 3 & 0 \\ 0 & 2 \\ \end{Bmatrix} = 3 \cdot 2 - 0 \cdot 0 = 6

Since its determinant isn't 0 it has an inverse!
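A two-line check of the same computation (np.linalg.det works in floating point, so expect 6.0 up to rounding):

```python
import numpy as np

A = np.array([[3, 0],
              [0, 2]])

det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]   # a11*a22 - a12*a21
print(det)                # 6
print(np.linalg.det(A))   # 6.0, up to floating-point error
print(det != 0)           # True: A has an inverse
```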

Sarrus’ Rule

But what about for a 3x3 matrix? How do we get the determinant of that?

We essentially multiply the right-leaning diagonals as positive terms and then subtract the left-leaning diagonals.

det \begin{Bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ \end{Bmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}

Dimensions > 3

There are 3 rules for dimensions higher than 3.

Firstly, notice how the first (row) index in each term always runs from 1 up to the number of dimensions.

Secondly, the right (column) indexes run over all the permutations:

  • a_{11}a_{22}a_{33} = \{ 1, 2, 3 \}
  • a_{12}a_{23}a_{31} = \{ 2, 3, 1 \}
  • a_{13}a_{21}a_{32} = \{ 3, 1, 2 \}
  • a_{13}a_{22}a_{31} = \{ 3, 2, 1 \}
  • a_{12}a_{21}a_{33} = \{ 2, 1, 3 \}
  • a_{11}a_{23}a_{32} = \{ 1, 3, 2 \}

Lastly, we need to keep track of all the column indexes and count the switches needed to put them in increasing order, e.g. for [3, 1, 2] we would say:

  1. 3 and 1 must switch, giving [1, 3, 2]
  2. 3 and 2 must switch, giving [1, 2, 3]

And if the total number of switches is even, we write the term as positive. If it is odd we write it as negative.
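These three rules are exactly the permutation (Leibniz) formula for the determinant; a hedged sketch that counts the switches of each column ordering to decide the sign:

```python
from itertools import permutations

def det(A):
    """Determinant via the permutation rule: one term per column ordering,
    positive for an even number of switches, negative for odd."""
    n = len(A)
    total = 0
    for cols in permutations(range(n)):
        # count the switches needed to put `cols` in increasing order
        switches = sum(1 for i in range(n) for j in range(i + 1, n)
                       if cols[i] > cols[j])
        sign = 1 if switches % 2 == 0 else -1
        term = 1
        for row, col in enumerate(cols):
            term *= A[row][col]   # the first index walks the rows 1..n
        total += sign * term
    return total

print(det([[3, 0], [0, 2]]))  # 6, matching the 2x2 formula above
```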


Vectors

Vectors can be thought of as points from the origin, e.g. (3, 5) is an arrow from the origin, (0, 0). Then we can move this arrow, aka a vector, and it will still have the same length and direction.

We can also represent (3, 5) as \begin{Bmatrix} 3 \\ 5 \end{Bmatrix}, which we call a column vector. A row vector would be \begin{Bmatrix} 3 & 5 \end{Bmatrix}.

We also call the set of all n x 1 matrices \R^n, e.g.

\R^2 = \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix}, \quad \R^3 = \begin{Bmatrix} a_1 \\ a_2 \\ a_3 \end{Bmatrix}, \quad \R^n = \begin{Bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{Bmatrix}

When we think of \R^2 this means anything in the set of 2 dimensions, e.g. \{ (1, 0), (9, 27), (69, 420) \}, bc they're all in the 2-dimensional plane.

Independence & Dependence

Linear independence is when there is only one solution: no matter what the vectors do, they can never combine back to the origin, except by multiplying each vector by 0 (the trivial solution). Sometimes this is called one-dimensional independence.

For example,

c_1 \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix} + c_2 \begin{Bmatrix} b_1 \\ b_2 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} \implies c_1 = c_2 = 0

Linear dependence is the opposite. The vectors are able to work together to get back to the origin by manipulating the directions.
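A hedged numpy check: stack the vectors as columns and compare the matrix rank to the number of vectors; full rank means the only combination reaching the origin is the all-zero one.

```python
import numpy as np

def independent(*vectors):
    """True when only the trivial (all-zero) combination gives the zero vector."""
    M = np.column_stack(vectors)
    return np.linalg.matrix_rank(M) == len(vectors)

print(independent([1, 0], [0, 1]))  # True: linearly independent
print(independent([1, 2], [2, 4]))  # False: second = 2 * first, dependent
```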

Basis

In ML, principal component analysis (PCA) is used to transform data to a new basis. This is why this is so important to understand.

A basis for linearly independent vectors is the set of vectors without the coefficients being multiplied (if there are any).

E.g.,

\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = c_1 \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix} + c_2 \begin{Bmatrix} b_1 \\ b_2 \end{Bmatrix}

Our basis would be of rank 2

\text{ basis of } \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \left\{ \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix}, \begin{Bmatrix} b_1 \\ b_2 \end{Bmatrix} \right\}

If we had

\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = c_1 \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix} + c_2 \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix} = (c_1 + c_2) \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix}

Then our rank would be 1

The rank is the number of linearly independent vectors among the columns of the matrix M.


If there are multiple solutions to the matrix then there is no basis — think of it as no solid foundation to build upon bc the numbers aren’t set in stone, like concrete, so the basis, aka our foundation, falls apart.

Tldr; the basis is a set of vectors that can create any vector within the realm of \R^m. All of them are linearly independent. The way you create any vector within that realm is by multiplying the set of vectors with coefficients.

How do we turn the following basis into \begin{Bmatrix} 9 \\ 27 \end{Bmatrix}?

\text{ basis of } = \left\{ \begin{Bmatrix} 3 \\ 0 \end{Bmatrix}, \begin{Bmatrix} 0 \\ 3 \end{Bmatrix} \right\}

We have to use scalars as the coefficients of the basis vectors and solve for \begin{Bmatrix} 9 \\ 27 \end{Bmatrix}

This will look like

a \begin{Bmatrix} 3 \\ 0 \end{Bmatrix} + b \begin{Bmatrix} 0 \\ 3 \end{Bmatrix} = \begin{Bmatrix} 9 \\ 27 \end{Bmatrix}

Where a = 3 and b = 9 to give us

\begin{split} \begin{Bmatrix} 9 \\ 27 \end{Bmatrix} &= 3 \begin{Bmatrix} 3 \\ 0 \end{Bmatrix} + 9 \begin{Bmatrix} 0 \\ 3 \end{Bmatrix} \\ &= \begin{Bmatrix} 9 \\ 0 \end{Bmatrix} + \begin{Bmatrix} 0 \\ 27 \end{Bmatrix} \\ &= \begin{Bmatrix} 9 \\ 27 \end{Bmatrix} \end{split}
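The coefficients a and b can also be found by putting the basis vectors as the columns of a matrix and solving the linear system (a minimal numpy sketch):

```python
import numpy as np

B = np.array([[3, 0],
              [0, 3]])          # basis vectors as columns
target = np.array([9, 27])

coeffs = np.linalg.solve(B, target)
print(coeffs)   # [3. 9.], i.e. a = 3 and b = 9
```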

If there are multiple ways to create an arbitrary vector within \R^n then there is no basis bc a basis is linearly independent, not linearly dependent.


To solidify this, let's look at the set

\text{ our proposed basis } = \left\{ \begin{Bmatrix} 1 \\ 0 \\ 0 \end{Bmatrix}, \begin{Bmatrix} 0 \\ 1 \\ 0 \end{Bmatrix} \right\}

This set could never create

A = \begin{Bmatrix} 0 \\ 0 \\ 1 \end{Bmatrix}

Bc no scalar could multiply the z dimension (3rd row) in either of the set's vectors to create the 1 in A.

Just bc a set of vectors is linearly independent doesn't mean it forms a basis. However, a basis implies that the vectors are linearly independent:

\text{basis } \implies \text{ linearly independent}

Basis tldr; can you make an arbitrary vector within \R^m with the set of vertical vectors and their coefficient scalars? If you can make it in multiple ways, it's not a basis; if you can make it in exactly one way, the set is linearly independent and a basis!

Linear independence is about finding a path back to the origin. Bases are about finding a path to a vector in \R^m

Subspaces

A subspace is a subset of vectors within a set of vectors, for instance within R3\R^3

So our format is this,

\left\{ \begin{Bmatrix} x \\ x^2 \\ 0 \end{Bmatrix} \right\}

For a vector to be in the subspace it must conform to this standard. So if we had, for example, c = 2,

\begin{split} v &= c \begin{Bmatrix} 1 \\ 1^2 \\ 0 \end{Bmatrix} \\ &= 2 \begin{Bmatrix} 1 \\ 1 \\ 0 \end{Bmatrix} \\ &= \begin{Bmatrix} 2 \\ 2 \\ 0 \end{Bmatrix} \end{split}

v doesn't adhere to the standard in row 2, x^2, since 2 \not = 2^2

v \notin \left\{ \begin{Bmatrix} x \\ x^2 \\ 0 \end{Bmatrix} \right\}

Whereas if our subspace standard were

\left\{ \begin{Bmatrix} x \\ x \\ 0 \end{Bmatrix} \right\}

Then v would adhere to the standard and thus be a part of the subspace.
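A small sketch of this membership test; `in_parabola_set` and `in_line_set` are hypothetical helpers for the two standards above:

```python
def in_parabola_set(v):
    """Is v of the form (x, x^2, 0)?"""
    x, y, z = v
    return y == x ** 2 and z == 0

def in_line_set(v):
    """Is v of the form (x, x, 0)?"""
    x, y, z = v
    return y == x and z == 0

v = (1, 1, 0)
scaled = tuple(2 * c for c in v)   # (2, 2, 0), i.e. c = 2

print(in_parabola_set(v), in_parabola_set(scaled))  # True False: 2 != 2^2
print(in_line_set(v), in_line_set(scaled))          # True True: still in the set
```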

Linear Span

A linear span is the set of all possible linear combinations of the vectors within \R^n using scalar coefficients from the field

?????


Linear Transformation

Linear transformations take vectors from V and turn them into vectors in W. If we transform a group of vectors from V we start to "map out" some vectors in W.

If S is a subspace of V then L(S) is the image of S. Think of it as shining a light through the vector space V and then seeing how much of the vector space behind it, W, is lit up — like a shadow. This is also called the range of L.

Let's say we want to transform 3 dimensions to 2

L : \R^3 \to \R^2

One such transformation from \R^3 to \R^2 is

L(v) = \begin{Bmatrix} v_1 \\ v_2 - v_3 \\ \end{Bmatrix}

Where V = \begin{Bmatrix} v_1 \\ v_2 \\ v_3 \end{Bmatrix} and S = \begin{Bmatrix} c \\ 2c \\ 0 \\ \end{Bmatrix}

Then

L\begin{Bmatrix} c \\ 2c \\ 0 \\ \end{Bmatrix} = \begin{Bmatrix} c \\ 2c - 0 \\ \end{Bmatrix} = \begin{Bmatrix} c \\ 2c \\ \end{Bmatrix}

Where

\begin{Bmatrix} c \\ 2c \\ \end{Bmatrix}

Is the image of the subspace S


The kernel, \ker(L), is the set of vectors in V that give the zero vector in W, \overrightharpoon{0}_w

So, for

L : \R^3 \to \R^2

We want to find which vectors in \R^3 give us the zero vector \overrightharpoon{0}_w

L\begin{Bmatrix} v_1 \\ v_2 \\ v_3 \\ \end{Bmatrix} = \begin{Bmatrix} v_1 \\ v_2 - v_3 \\ \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \\ \end{Bmatrix}

So v_1 = 0 and v_2 = v_3
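A hedged sketch of this transformation and its kernel condition:

```python
import numpy as np

def L(v):
    """L : R^3 -> R^2, sending (v1, v2, v3) to (v1, v2 - v3)."""
    return np.array([v[0], v[1] - v[2]])

print(L(np.array([5, 10, 0])))  # [ 5 10]: the image of (c, 2c, 0) with c = 5
print(L(np.array([0, 7, 7])))   # [0 0]: v1 = 0 and v2 = v3, so it's in ker(L)
```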


If we multiply a 2x3 matrix by a 3x1 matrix, remembering mn * np, we end up with an mp, a 2x1 matrix, dropping a dimension!

Computer graphics stores transformations as small square matrices (often 3x3, or 4x4 once translations are included).

Images

For example, let's find the image of

c_1 \begin{Bmatrix} 3 \\ 1 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 1 \\ 2 \\ \end{Bmatrix}

by using the 2x2 linear transformation matrix

\begin{Bmatrix} 8 & -3 \\ 2 & 1 \\ \end{Bmatrix}

We can solve it like this

\begin{split} f\left(c_1 \begin{Bmatrix} 3 \\ 1 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 1 \\ 2 \\ \end{Bmatrix} \right) &= \begin{Bmatrix} 8 & -3 \\ 2 & 1 \\ \end{Bmatrix} \left\lbrack c_1 \begin{Bmatrix} 3 \\ 1 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 1 \\ 2 \\ \end{Bmatrix} \right\rbrack \\ &= c_1 \begin{Bmatrix} 8 & -3 \\ 2 & 1 \\ \end{Bmatrix} \begin{Bmatrix} 3 \\ 1 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 8 & -3 \\ 2 & 1 \\ \end{Bmatrix} \begin{Bmatrix} 1 \\ 2 \\ \end{Bmatrix} \\ &= c_1 \begin{Bmatrix} 8 \cdot 3 + (-3) \cdot 1 \\ 2 \cdot 3 + 1 \cdot 1 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 8 \cdot 1 + (-3) \cdot 2 \\ 2 \cdot 1 + 1 \cdot 2 \\ \end{Bmatrix} \\ &= c_1 \begin{Bmatrix} 21 \\ 7 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 2 \\ 4 \\ \end{Bmatrix} \\ &= c_1 \left\lbrack 7 \begin{Bmatrix} 3 \\ 1 \\ \end{Bmatrix} \right\rbrack + c_2 \left\lbrack 2 \begin{Bmatrix} 1 \\ 2 \\ \end{Bmatrix} \right\rbrack \\ \end{split}

And so we can see the answer can be expressed using multiples of the original two vectors — transforming all points of the x_1x_2 plane.
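A quick numpy check that the transformation really just rescales the two original vectors by 7 and 2:

```python
import numpy as np

A = np.array([[8, -3],
              [2,  1]])

print(A @ np.array([3, 1]))  # [21  7], which is 7 * (3, 1)
print(A @ np.array([1, 2]))  # [2 4], which is 2 * (1, 2)
```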

Kernel


Rank

We can see in this matrix

V = \begin{Bmatrix} 1 & 4 & 4 \\ 2 & 5 & 8 \\ 3 & 6 & 12 \\ \end{Bmatrix}

that the third column = first column * 4

\begin{Bmatrix} 4 \\ 8 \\ 12 \end{Bmatrix} = 4 \begin{Bmatrix} 1 \\ 2 \\ 3 \end{Bmatrix}

which means that V has a rank of 2: two linearly independent vectors.
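numpy agrees, as a one-line check:

```python
import numpy as np

V = np.array([[1, 4,  4],
              [2, 5,  8],
              [3, 6, 12]])

print(np.linalg.matrix_rank(V))  # 2: only two linearly independent columns
```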


Eigenvalues & Eigenvectors

These come in handy when doing physics and statistics.

An eigenvalue, \lambda, is a scalar by which a matrix scales its eigenvector, e.g. A\overrightharpoon{x} = \lambda\overrightharpoon{x}, \lambda \in \R
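As a hedged pointer back to the Images section: np.linalg.eig recovers exactly the scalars 7 and 2 we found there, where A(3,1) = 7(3,1) and A(1,2) = 2(1,2):

```python
import numpy as np

A = np.array([[8, -3],
              [2,  1]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)  # [7. 2.] (order may vary): each scales its own eigenvector
```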



Good Explanations

Multiply Matrices

One way is to take the dot product, e.g. u \cdot v = u_1v_1 + u_2v_2 + ... + u_nv_n

Linear combinations combine scalars and vectors in their own matrices. They are the same as the dot product except u is replaced with scalars in a matrix k and v with vectors in a matrix of \R^n.

Extra

  • Vectors express magnitude + direction
  • Scalars express only magnitude. They are numbers that “scale” up or down.

Fractions

\begin{split} -1 + \dfrac{3}{2} &= -\dfrac{2}{2} + \dfrac{3}{2} \\ \\ &= \dfrac{-2 + 3}{2} \\ \\ &= \dfrac{1}{2} \\ \end{split}

Another one is

\dfrac{1}{3} \cdot \dfrac{1}{3} = \dfrac{1}{9}

This one tripped me up — what does this equal?

20\left(\dfrac{5}{4}\right)

Really means 1 of itself plus \dfrac{1}{4} of it, since

\begin{cases} 20\left(\dfrac{4}{4}\right) = 20 \\ \\ 20\left(\dfrac{1}{4}\right) = 5 \end{cases}

Then add them together and you get 25.


If you had -\sqrt{2} and you wanted to turn it into 1 then you would need to

\begin{split} \dfrac{-1}{\sqrt{2}} \cdot -\sqrt{2} &= \dfrac{-1 \cdot -\sqrt{2}}{\sqrt{2}} \\ \\ &= \dfrac{\sqrt{2}}{\sqrt{2}} \\ \\ &= 1 \end{split}

bc the -\sqrt{2} moves into the numerator


If we had

\begin{split} \dfrac{-1}{3} \cdot 3 &= \dfrac{-1}{3} \cdot \dfrac{3}{1} \\ \\ &= \dfrac{-1 \cdot 3}{3 \cdot 1} \\ \\ &= \dfrac{-3}{3} \\ \\ &= \dfrac{-1}{1} \\ \\ &= -1 \end{split}

bc multiplying and dividing by the same number cancels the denominator out to 1.


Matrix Multiplication

Matrix multiplication

\begin{split} \begin{Bmatrix} a & b \\ c & d \\ \end{Bmatrix} \cdot \begin{Bmatrix} e & f \\ g & h \\ \end{Bmatrix} \end{split}

is really saying

\begin{split} \begin{Bmatrix} a & b \\ \end{Bmatrix} \cdot \begin{Bmatrix} e \\ g \\ \end{Bmatrix} \end{split}

And when we think of mn * np, this multiplication is (1x2) * (2x1), which means the resulting matrix will be mp, or 1x1.

So we think of row * column.

Unintuitively, we need to think of the multiplication as the dot product instead of the intuitive "a * e is one element, then b * g is the other". This is why we sum everything up for the new element.

The dot product is the sum of all the pairwise multiplications, aka "dots", since the multiplication sign is written as a dot (\cdot).

As a side note

\begin{split} A = \begin{Bmatrix} a & b \\ c & d \\ \end{Bmatrix} \end{split}

Can be represented as A = [a b; c d].

  • An identity matrix (always a square) is what x \cdot 1 = x is for matrices, AI = A, i.e.,

\begin{split} \begin{Bmatrix} a & b \\ c & d \\ \end{Bmatrix} \cdot \begin{Bmatrix} 1 & 0 \\ 0 & 1 \\ \end{Bmatrix} &= \begin{Bmatrix} a & b \\ c & d \\ \end{Bmatrix} \end{split}

Even for different sized matrices the identity needs to be square

\begin{split} \begin{Bmatrix} a & b & c \\ d & e & f \\ \end{Bmatrix} \cdot \begin{Bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{Bmatrix} &= \begin{Bmatrix} a & b & c \\ d & e & f \\ \end{Bmatrix} \end{split}

Matrix multiplication isn't commutative bc the order we perform multiplication in matters, AB != BA: if we have a 2x3 * 3x3 we can multiply that, but if it's reversed to 3x3 * 2x3 then mn * np doesn't work bc n doesn't match.

Similarly, we cannot define A^n for non-square matrices bc if we have a 2x3 and we multiply 2x3 * 2x3 * 2x3, we know from mn * np that 2x3 * 2x3 won't work because the ns don't match (2, 3).

However, by convention,

A^0 = I \text{ (Identity matrix)}

We cannot divide matrices

Factor Out

For

\begin{split} \begin{Bmatrix} \dfrac{1}{3} & \dfrac{1}{3} & \dfrac{1}{3} \\ \\ \dfrac{1}{3} & \dfrac{1}{3} & \dfrac{1}{3} \\ \\ \dfrac{1}{3} & \dfrac{1}{3} & \dfrac{1}{3} \end{Bmatrix} = \dfrac{1}{3} \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{Bmatrix} \\ \end{split}

If we had A^2 then, knowing what we have above,

\begin{split} \dfrac{1}{3} \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} \cdot \dfrac{1}{3} \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} &= \left( \dfrac{1}{3} \cdot \dfrac{1}{3} \right) \cdot \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} \\ &= \dfrac{1}{9} \cdot \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} \end{split}

Sigma

The sigma notation \displaystyle\sum_{k=1}^{10} 2^k = 2^1 + 2^2 + ... + 2^{10} is a for loop in programming.

Whatever is to the right of sigma means "perform this operation per loop". It means summing the term a_{ik} b_{kj} from k = 1 to k = n. Think of k as the current number in the iteration and n as the max number. For some reason math fellas use k instead of the engineering notation of i.
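The for-loop reading, as a short sketch:

```python
total = 0
for k in range(1, 11):  # k runs from 1 up to n = 10
    total += 2 ** k     # the term to the right of the sigma
print(total)            # 2046 = 2^1 + 2^2 + ... + 2^10
```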

Inverse Matrix

Some matrices have a corresponding inverse matrix.

We know scalars have an inverse, for example 2 \cdot \dfrac{1}{2} = \dfrac{2}{1} \cdot \dfrac{1}{2} = 1 or \dfrac{1}{6} \cdot 6 = 1

But what is the matrix equivalent of 1?

The identity matrix, a matrix with 1s across its diagonal and 0s everywhere else.

So, A \cdot A^{-1} = A^{-1} \cdot A = I

But we can also go back

(A^{-1})^{-1} = A

There is another interesting property

(AB)^{-1} = B^{-1}A^{-1}

But why are they swapped?

If we had f(\vec{x}) = AB\vec{x} then we would be multiplying B\vec{x} first

Since the last thing the original function did was multiply by A, the first thing the inverse must do is undo A, i.e. multiply by A^{-1} first:

f^{-1}(\vec{x}) = B^{-1}A^{-1}\vec{x}

Calculating Inverse Matrices 2x2

If we had a matrix then we find the inverse by

\begin{pmatrix} a & b \\ c & d \\ \end{pmatrix}^{-1} = \dfrac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \\ \end{pmatrix}

Then we can get the identity matrix via

\begin{pmatrix} a & b \\ c & d \\ \end{pmatrix} \begin{pmatrix} a & b \\ c & d \\ \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ \end{pmatrix}

bc this is the same as, remember, the original value multiplied by its inverse, \dfrac{2}{1} \cdot \dfrac{1}{2}.
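A hedged sketch of the 2x2 formula, checked by multiplying back to the identity:

```python
import numpy as np

def inverse_2x2(A):
    """1/(ad - bc) * [[d, -b], [-c, a]]; only valid when ad - bc != 0."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return (1 / det) * np.array([[d, -b],
                                 [-c, a]])

A = np.array([[3.0, 0.0],
              [0.0, 2.0]])
print(inverse_2x2(A))      # [[1/3, 0], [0, 1/2]]
print(A @ inverse_2x2(A))  # the identity matrix, up to floating point
```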

3x3

What if we had a larger matrix? Then what?

A = \begin{pmatrix} 4 & 3 & 0 \\ 1 & 2 & 0 \\ 0 & 0 & 1 \\ \end{pmatrix}

We put it in an augmented matrix

[A | I] \to [I | A^{-1}]

Let's explain this. When we row reduce the left side to match the identity, the right side of the augmented matrix becomes our inverse matrix.

= \begin{pmatrix} 4 & 3 & 0 & | & 1 & 0 & 0 \\ 1 & 2 & 0 & | & 0 & 1 & 0 \\ 0 & 0 & 1 & | & 0 & 0 & 1 \\ \end{pmatrix}

We first swap R_2 and R_1 to put a 1 at x_{11}, which is easier to work with

= \begin{pmatrix} 1 & 2 & 0 & | & 0 & 1 & 0 \\ 4 & 3 & 0 & | & 1 & 0 & 0 \\ 0 & 0 & 1 & | & 0 & 0 & 1 \\ \end{pmatrix}

To turn the 4 in row 2 into 0 we need R_2 - 4R_1

= \begin{pmatrix} 1 & 2 & 0 & | & 0 & 1 & 0 \\ 0 & -5 & 0 & | & 1 & -4 & 0 \\ 0 & 0 & 1 & | & 0 & 0 & 1 \\ \end{pmatrix}

Then we multiply R_2 by -\dfrac{1}{5} to turn the -5 into 1

= \begin{pmatrix} 1 & 2 & 0 & | & 0 & 1 & 0 \\ \\ 0 & 1 & 0 & | & -\dfrac{1}{5} & \dfrac{4}{5} & 0 \\ \\ 0 & 0 & 1 & | & 0 & 0 & 1 \\ \end{pmatrix}

Then to remove the 2 in the first row we need R_1 - 2R_2

And now we get the identity matrix on the left and the inverse of our original matrix, A^{-1}, on the right, so AA^{-1} = I

= \begin{pmatrix} 1 & 0 & 0 & | & \dfrac{2}{5} & -\dfrac{3}{5} & 0 \\ \\ 0 & 1 & 0 & | & -\dfrac{1}{5} & \dfrac{4}{5} & 0 \\ \\ 0 & 0 & 1 & | & 0 & 0 & 1 \\ \end{pmatrix}

If the reduced row echelon form (rref) of the matrix has a row of zeros then the given matrix is non-invertible.
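numpy confirms the hand computation (in floating point: 0.4 = 2/5, -0.6 = -3/5, and so on):

```python
import numpy as np

A = np.array([[4, 3, 0],
              [1, 2, 0],
              [0, 0, 1]])

print(np.linalg.inv(A))
# [[ 0.4 -0.6  0. ]
#  [-0.2  0.8  0. ]
#  [ 0.   0.   1. ]]
```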


Gaussian Elimination

\begin{split} \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ -1 & 3 & -5 & | & 7 \\ 2 & -2 & 7 & | & 1 \\ \end{bmatrix} \end{split}

To get to a row echelon form we would start by trying to make x_{21} = -1 into x_{21} = 0

We start by R_2 + R_1 bc -1 + 1 = 0 so

\begin{split} &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ (-1 + 1) & (3 + 1) & (-5 + 2) & | & (7 + 3) \\ 2 & -2 & 7 & | & 1 \\ \end{bmatrix} \\ &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ 0 & 4 & -3 & | & 10 \\ 2 & -2 & 7 & | & 1 \end{bmatrix} \end{split}

Then we want x_{31} = 2 to be 0. So, R_3 - 2R_1 bc 2 - (2*1) = 0 so

\begin{split} &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ 0 & 4 & -3 & | & 10 \\ (2 - 2(1)) & (-2 - 2(1)) & (7 - 2(2)) & | & (1 - 2(3)) \\ \end{bmatrix} \\ &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ 0 & 4 & -3 & | & 10 \\ 0 & -4 & 3 & | & -5 \end{bmatrix} \end{split}

Then we want x_{32} = -4 to be 0.

Since we don’t want to interfere with our progress in the first column we have to use the 2nd row bc it also has a 0 in the first column.

So,

R_3 + R_2

bc -4 + 4 = 0

\begin{split} &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ 0 & 4 & -3 & | & 10 \\ 0 & (-4 + 4) & (3 + (-3)) & | & (-5 + 10) \end{bmatrix} \\ &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ 0 & 4 & -3 & | & 10 \\ 0 & 0 & 0 & | & 5 \end{bmatrix} \end{split}

And bang! That's Gaussian elimination to transform the matrix into row echelon form.

But wait! We see the final row is all zeros on the left yet has a non-zero constant on the right, 0 = 5, so we call this linear system inconsistent, which really means that the planes (equations) do not meet — there is no point common to all 3, meaning there is no solution!

Consistent systems are those where there is a common point between the equations. If instead we had R_{33} = 1 we would know z directly from that row, or if the entire final row were zeros (constant included) that would have worked too, bc z could then be any number, giving infinite solutions
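A hedged programmatic check: a system is consistent exactly when the coefficient matrix and the augmented matrix have the same rank (the Rouché-Capelli test), and here they don't:

```python
import numpy as np

A = np.array([[ 1,  1,  2],
              [-1,  3, -5],
              [ 2, -2,  7]])
b = np.array([3, 7, 1])

rank_A  = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
print(rank_A, rank_Ab)  # 2 3: the ranks differ, so the system is inconsistent
```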

Modifying A Single Matrix Row

If we had

\begin{split} &= \begin{bmatrix} -1 & 2 & 3 & | & 0 \\ 0 & -2 & -10 & | & 0 \\ 0 & 0 & 0 & | & 0 \end{bmatrix} \\ \end{split}

We want to turn the -2 into 1. We can do this by -\dfrac{R_2}{2}, meaning we flip the row's sign and halve it, i.e. divide the row by -2.

\begin{split} &= \begin{bmatrix} -1 & 2 & 3 & | & 0 \\ 0 & 1 & 5 & | & 0 \\ 0 & 0 & 0 & | & 0 \end{bmatrix} \\ \end{split}

Then we can multiply the top row by -1 to turn the -1 into 1

\begin{split} &= \begin{bmatrix} 1 & -2 & -3 & | & 0 \\ 0 & 1 & 5 & | & 0 \\ 0 & 0 & 0 & | & 0 \end{bmatrix} \\ \end{split}

This tells us y + 5z = 0, which gives y = -5z, which means we can provide any number for z, thus resulting in infinite solutions.

We can say this matrix is (A | O), meaning the | splits the right and left sides. The left side is the non-zeros A and the O is the zero matrix.

Since all the constants on the far right are zero we can say this linear system is homogeneous, meaning the constant terms are all zero. In non-square matrices there will be a row with no leading 1, so there are infinite solutions: reduced row echelon form has a diagonal of 1s but rectangles don't, so there will be an open 0 row. We can think of the unknowns (x, y, z) relative to the 1s; if they don't match then there are an infinite number of solutions.

Pivots

Before we get into vector form.

In each row, the first 1 from the left of the row is called the pivot.

\begin{bmatrix} x & y & z & w & | & b \\ \hline 1 & 1 & 0 & -10 & | & -9 \\ 0 & 0 & 1 & -7 & | & -7 \\ 0 & 0 & 0 & 0 & | & 0 \end{bmatrix}

See how we have a 1 at R_{12} behind R_{11}? This isn't a pivot because it's after the 1 at R_{11}, and so that column is actually a free variable.

Each pivot indicates the column variable you're solving for. So, the first row's pivot R_{11} is in x's column, meaning we're solving for x.

So this equation would be x = -y + 10w - 9 bc we move the y and -10w to the right-hand (augmented) side of the matrix; this is why the -9 remains the same.

And since the y column has no pivot we actually say y can be any number, or y = y.

The next pivot is R_{23}

Matrix To Vector Form

This augmented matrix in reduced row echelon form has 2 pivots: in column x_1 with R_{11} and column x_2 with R_{22}

Columns x_3 and x_4 have what we call free variables bc they have no pivots

\begin{split} \begin{bmatrix} x_1 & x_2 & x_3 & x_4 & | & b \\ \hline 1 & 0 & 3 & -1 & | & -3 \\ 0 & 1 & 2 & -1 & | & -2 \\ 0 & 0 & 0 & 0 & | & 0 \end{bmatrix} \\ \end{split}

And so we want to turn this augmented matrix back into the written equations by grabbing each row and isolating the pivot of the column it's in, x_n. We isolate it by moving all the other vars into the b column with just normal algebra. Let's show what the equations would look like pre-movement to the b column:

\begin{cases} x_1 + 3x_3 - x_4 = -3 \\ x_2 + 2x_3 - x_4 = -2 \end{cases}

Then let's move them over to the b column where the pivot var will be isolated.

The non-pivot columns have free variables, meaning they can be whatever they want. So x_3 and x_4 can be 1 to make it easier for ourselves :)

\begin{cases} x_1 = -3x_3 + x_4 - 3 \\ x_2 = -2x_3 + x_4 - 2 \\ x_3 = x_3 \text{ or } s \\ x_4 = x_4 \text{ or } t \end{cases}

And now we need to write them as one equation as a vector.

\begin{split} \vec{x} &= \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \\ \\ &= \begin{bmatrix} -3x_3 + x_4 - 3 \\ -2x_3 + x_4 - 2 \\ s \\ t \end{bmatrix} \\ \\ \end{split}

Then we turn them into their own vectors of x_3, x_4 and constants (since that's all we have).

\begin{split} &= \begin{bmatrix} -3x_3 \\ -2x_3 \\ x_3 \\ 0 \end{bmatrix} + \begin{bmatrix} x_4 \\ x_4 \\ 0 \\ x_4 \end{bmatrix} + \begin{bmatrix} -3 \\ -2 \\ 0 \\ 0 \end{bmatrix} \end{split}

And then we can find our solution by factoring out the like terms

\begin{split} &= x_3 \begin{bmatrix} -3 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} -3 \\ -2 \\ 0 \\ 0 \end{bmatrix} \end{split}

x_3 and x_4 can be 1 or 0, remember, since they're free variables! :)
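A hedged numpy check that the vector form really solves the original system no matter which free values we pick:

```python
import numpy as np

A = np.array([[1, 0, 3, -1],
              [0, 1, 2, -1]])
b = np.array([-3, -2])

s, t = 1.0, 1.0   # the free variables x3 and x4; any values work
x = (s * np.array([-3, -2, 1, 0])
     + t * np.array([1, 1, 0, 1])
     + np.array([-3, -2, 0, 0]))

print(A @ x)  # [-3. -2.]: matches b for every choice of s and t
```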

Dot / Inner Product

\begin{split} u = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix}, v = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \end{split}

The dot product of u and v,

u \cdot v

Can be viewed as the transpose matrix multiplication

= u^{T}v = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix}^T \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = u_1 v_1 + \cdots + u_n v_n

In mn * np terms, this is (1xn) * (nx1) = 1x1

It's really just a short-hand way of saying "do matrix multiplication", BUT it results in a real number scalar, not a vector!

If the dot product is zero then the vectors are perpendicular (right angle in 2D and 3D) or orthogonal (right angle in n > 3 dimensions)!
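A minimal numpy sketch of the dot product and the perpendicularity test:

```python
import numpy as np

u = np.array([1, 2])
v = np.array([2, -1])

print(u @ v)         # 0: the pairwise products summed, 1*2 + 2*(-1)
print(np.dot(u, v))  # same thing
print(u @ v == 0)    # True: u and v are perpendicular
```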

Vector Norm

The norm or length of a vector is written with the absolute-value sign doubled up.

Let's say we want to know the norm of vector v = \begin{bmatrix} x \\ y \\ \end{bmatrix}

||v|| = \sqrt{x^2 + y^2}

What about a vector of n dimensions, such as

v = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}

We would say

||v|| = \sqrt{(v_1)^2 + \cdots + (v_n)^2}

But we can also write it as the sqrt of the dot product, because if the vector multiplies itself then the values within it are all squared

||v|| = \sqrt{v \cdot v}

This is called the Euclidean norm.

But why wouldn't squaring v work?

||v|| \not = \sqrt{v^2}

bc we don't output a scalar! We would merely have squared all the values in the matrix, so it still remains a vector, not our desired scalar used for our distance. We can't have a matrix distance :P


If we wanted to find the distance between two vectors we would say

||u - v|| = d(u, v)

This is what GPS systems and satellites would use to detect their relative distances.

If we had

||u||^2

Then to evaluate it we know

||u|| = \sqrt{u \cdot u}

Therefore, we remove the sqrt of the dot product which means the square is reversed

||u||^2 = u \cdot u
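A short check of both identities, ||v|| = sqrt(v . v) and ||u||^2 = u . u (floating point, so expect exact-looking values only for nice inputs):

```python
import numpy as np

v = np.array([3.0, 4.0])

print(np.sqrt(v @ v))               # 5.0: the sqrt of the dot product with itself
print(np.linalg.norm(v))            # 5.0: numpy's built-in Euclidean norm
print(np.linalg.norm(v)**2, v @ v)  # 25.0 25.0: ||v||^2 equals v . v
```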

Norm Difference

||u - v|| = ||(7, -2) - (-5, 3)|| = ||(12, -5)|| = \sqrt{12^2 + (-5)^2}

Mental model for the norm.

Let's say u = (2, -7)^T

How would we think about this?

\left|\left| \dfrac{1}{||u||} u \right|\right|

First we would want to solve for ||u||

||u|| = \sqrt{u \cdot u} = \sqrt{2^2 + (-7)^2} = \sqrt{53}

And so we have

\dfrac{1}{\sqrt{53}}\begin{pmatrix} 2 \\ -7 \end{pmatrix}

And so we simply want to find the norm of

\left|\left| \dfrac{1}{\sqrt{53}}\begin{pmatrix} 2 \\ -7 \end{pmatrix} \right|\right|

Which when thinking of the dot product is

= \sqrt{\dfrac{1}{\sqrt{53}}\begin{pmatrix} 2 \\ -7 \end{pmatrix} \cdot \dfrac{1}{\sqrt{53}}\begin{pmatrix} 2 \\ -7 \end{pmatrix}}

And using the commutative property

= \sqrt{\dfrac{1}{\sqrt{53}} \cdot \dfrac{1}{\sqrt{53}} \left( \begin{pmatrix} 2 \\ -7 \end{pmatrix} \cdot \begin{pmatrix} 2 \\ -7 \end{pmatrix} \right)}

\dfrac{1}{\sqrt{53}} \cdot \dfrac{1}{\sqrt{53}} = \dfrac{1}{53} bc \sqrt{53} \cdot \sqrt{53} = 53, and we use the dot product for our vectors

= \sqrt{\dfrac{1}{53}(2^2 + (-7)^2)}

Simplify

= \sqrt{\dfrac{1}{53}(53)} = \sqrt{1} = 1

Scalar Modulus

For scalar k we define the modulus of k, denoted |k|, as

|k| = \sqrt{k^2}

So then we can

\begin{split} ||ku|| &= \sqrt{ku \cdot ku} \\ &= \sqrt{k^2 (u \cdot u)} \\ &= \sqrt{k^2} \sqrt{u \cdot u} \\ &= |k| \, ||u|| \end{split}

Normalising Vectors

The process of finding a unit vector in the direction of the given vector u is called normalising.

The unit vector in the direction of the vector u is normally denoted by \hat{u} (pronounced 'u hat'), meaning it is a vector of length 1, that is

\hat{u} = \dfrac{1}{||u||}u
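A hedged sketch normalising the u = (2, -7) from the norm example and confirming the unit length:

```python
import numpy as np

u = np.array([2.0, -7.0])
u_hat = u / np.linalg.norm(u)   # (1 / ||u||) * u

print(u_hat)                    # u scaled down to length 1
print(np.linalg.norm(u_hat))    # 1.0, within floating-point error
```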