
Linear Algebra

Foreword

Started 06/06/2024

In hopes of understanding linear algebra, I rewrite what I learn in the ways that got me to an understanding of what's happening.

Prerequisites

Set Notation

Let's say we have the set

x = \{ 1, 2, 3 \}

We can say an element belongs to this set with the membership notation \in

2 \in x

We can ask "what is the subset of x containing the even integers?" which would be

y = \{ 2 \}

Or, we can say y is a subset of x with the notation \subset

y \subset x

And for anything not in the set we use the notation \notin

4 \notin x
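As a quick aside, these set operations map directly onto Python sets; a minimal sketch (the names x and y just mirror the example above):

```python
x = {1, 2, 3}

# the subset of x that contains only the even integers
y = {n for n in x if n % 2 == 0}

print(y)              # {2}
print(y.issubset(x))  # True, i.e. y is a subset of x
print(4 not in x)     # True, i.e. 4 is not in x
```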

Implication & Equivalence

We can say a burger implies it contains pork

\text{ burger } \implies \text{ contains pork }

But it doesn't mean that if it has pork it's a burger; at best,

\text{ contains pork } \implies \text{ food dish }

If we had something like

\text{ degatchi's super hot } \iff \text{ degatchi is never not hot }

Then we know it's always true in both directions, and we call this equivalence


Matrices

Multiplication

To get the product of 2 matrices, the number of columns in the left factor must match the number of rows in the right factor, otherwise we end up not having corresponding variables to multiply with.

So if we have

mn \cdot np

The inner variables, n, must be equal because the way we multiply matrices is

\text{row } \cdot \text{ column}

and if there isn't a column to multiply our row by, then we cannot create the system.

The dimensions of the new matrix are

mp

As an example,

X is a matrix of size 2x3

X = \begin{Bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ \end{Bmatrix}

Whereas Y is also 2x3

Y = \begin{Bmatrix} y_{11} & y_{12} & y_{13} \\ y_{21} & y_{22} & y_{23} \\ \end{Bmatrix}

And so multiplying them (2x3) * (2x3) wouldn't work bc 3 and 2 are not equal: each row of X has 3 entries but each column of Y only has 2, so there is no third entry to pair with!

\begin{split} X \cdot Y &= \begin{Bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \end{Bmatrix} \cdot \begin{Bmatrix} y_{11} & y_{12} & y_{13} \\ y_{21} & y_{22} & y_{23} \\ \end{Bmatrix} \\ \end{split}

What we expect is the result

\begin{split} &= \begin{Bmatrix} x_{11}y_{11} & x_{12}y_{21} & x_{13}? \\ x_{21}y_{11} & x_{22}y_{21} & x_{23}? \end{Bmatrix} \end{split}

Since we multiply row * column we never touch y_{13} and y_{23}, and so our multiplication is incorrect.

But there is a solution!

We can represent our Y matrix as its transpose to make the ns match, using the special transpose operator T

and from here we can perform the multiplication of row * column :D

I'm going to use actual numbers so you can see the effect of T

\begin{split} X \cdot Y^T &= \begin{Bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \end{Bmatrix} \cdot \begin{Bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{Bmatrix}^T \\ &= \begin{Bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \end{Bmatrix} \cdot \begin{Bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{Bmatrix} \\ &= \begin{Bmatrix} x_{11} \cdot 1 + x_{12} \cdot 3 + x_{13} \cdot 5 & x_{11} \cdot 2 + x_{12} \cdot 4 + x_{13} \cdot 6 \\ x_{21} \cdot 1 + x_{22} \cdot 3 + x_{23} \cdot 5 & x_{21} \cdot 2 + x_{22} \cdot 4 + x_{23} \cdot 6 \end{Bmatrix} \end{split}

Yay!

Our new matrix is 2x2, as predicted from (2x3) * (3x2).
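A small numpy sketch of the same shape rule (assuming numpy; the arrays mirror X and Y above): multiplying two 2x3 arrays fails, but transposing the right factor makes the inner dimensions match.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2x3
Y = np.array([[1, 3, 5],
              [2, 4, 6]])   # 2x3

# (2x3)(2x3) raises an error: the inner dimensions (3 and 2) don't match
try:
    X @ Y
except ValueError as e:
    print(e)

# (2x3)(3x2) works once Y is transposed, and the result is 2x2
print((X @ Y.T).shape)   # (2, 2)
```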

Symmetric Matrices

These matrices are square matrices that are symmetric around their main diagonals.

For our example the diagonal would be \{ 1, 2, 3 \}

\begin{Bmatrix} 1 & 5 & 7 \\ 5 & 2 & 8 \\ 7 & 8 & 3 \\ \end{Bmatrix}

bc of the symmetry the matrix is always equal to its transpose!

Triangular Matrices

Upper + Lower triangular matrices are square matrices where the elements either above the main diagonal or below it are all equal to zero, e.g.

\begin{Bmatrix} 1 & 5 & 7 \\ 0 & 2 & 8 \\ 0 & 0 & 3 \\ \end{Bmatrix}

or

\begin{Bmatrix} 1 & 0 & 0 \\ 5 & 2 & 0 \\ 7 & 8 & 3 \\ \end{Bmatrix}

Diagonal Matrices

These are square matrices where the only non-zero elements are on the main diagonal!

\begin{Bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \\ \end{Bmatrix}

Identity Matrices

These are diagonal matrices with every diagonal element equal to 1

\begin{Bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{Bmatrix}
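A hedged numpy sketch of these special shapes, using the example matrices above (np.triu, np.tril, np.diag, and np.eye are the standard constructors):

```python
import numpy as np

S = np.array([[1, 5, 7],
              [5, 2, 8],
              [7, 8, 3]])

print(np.array_equal(S, S.T))   # True: a symmetric matrix equals its transpose

U = np.triu(S)          # upper triangular: zeros below the main diagonal
L = np.tril(S)          # lower triangular: zeros above the main diagonal
D = np.diag([1, 2, 3])  # diagonal matrix
I = np.eye(3)           # identity matrix

print(np.array_equal(S @ I, S))  # True: multiplying by the identity changes nothing
```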

Gaussian Elimination

Gaussian elimination is about trying to get the identity matrix, not about solving for variables.

Let's say we have

\begin{Bmatrix} 3 & 1 & 1 \\ 1 & 2 & 0 \\ \end{Bmatrix}

We want to isolate each variable. So for our first row we want to isolate the 3, first by eliminating the 1 in row 1, column 2.

We start by multiplying the top row by 2 since we want to eliminate the 2nd column of row 1.

\begin{Bmatrix} 6 & 2 & 2 \\ 1 & 2 & 0 \\ \end{Bmatrix}

Then we subtract the 2nd row from the first

\begin{Bmatrix} 5 & 0 & 2 \\ 1 & 2 & 0 \\ \end{Bmatrix}

Now that we have isolated row 1 column 1 we want to isolate row 2 column 2.

We apply the same technique, first by multiplying the 2nd row by the number we want to isolate, 5.

\begin{Bmatrix} 5 & 0 & 2 \\ 5 & 10 & 0 \\ \end{Bmatrix}

Then we subtract the first row from the second

\begin{Bmatrix} 5 & 0 & 2 \\ 0 & 10 & -2 \\ \end{Bmatrix}

Then we divide each row by their isolated element

\begin{Bmatrix} 1 & 0 & \dfrac{2}{5} \\ \\ 0 & 1 & \dfrac{-2}{10} \\ \end{Bmatrix}

and we can see our identity matrix and the solution values

\begin{Bmatrix} x_{11} \\ x_{22} \\ \end{Bmatrix} = \begin{Bmatrix} \dfrac{2}{5} \\ \\ \dfrac{-2}{10} \\ \end{Bmatrix}

But we really only care about the identity matrix.

When one of the factors is the identity, the order of the factors doesn't matter: the product is always the other factor.

We can verify this via

\begin{Bmatrix} 1 & 0 \\ 0 & 1 \\ \end{Bmatrix} \begin{Bmatrix} \dfrac{2}{5} & 0 \\ \\ 0 & \dfrac{-2}{10} \\ \end{Bmatrix} = \begin{Bmatrix} \dfrac{2}{5} & 0 \\ \\ 0 & \dfrac{-2}{10} \\ \end{Bmatrix} \begin{Bmatrix} 1 & 0 \\ 0 & 1 \\ \end{Bmatrix}
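A minimal sketch of the same elimination in plain Python, using exact fractions so the values come out cleanly; the three row operations mirror the steps above:

```python
from fractions import Fraction

# augmented matrix [3 1 | 1; 1 2 | 0]
M = [[Fraction(3), Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2), Fraction(0)]]

# 2*R1 - R2 eliminates the 1 in row 1, column 2
M[0] = [2 * a - b for a, b in zip(M[0], M[1])]
# 5*R2 - R1 eliminates the 1 in row 2, column 1
M[1] = [5 * b - a for a, b in zip(M[0], M[1])]
# divide each row by its pivot to reach the identity
M = [[entry / row[i] for entry in row] for i, row in enumerate(M)]

print(M)  # [[1, 0, 2/5], [0, 1, -1/5]]  (-1/5 is -2/10 reduced)
```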

Determinants

A determinant is essentially the area or volume spanned by the matrix. If we have a 2x2 matrix we can think of it as a square (or parallelogram). The determinant tells us its scale and how much we can fit inside of it.

The inverse of a matrix exists as long as its determinant isn’t zero.

det \begin{Bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{Bmatrix} = a_{11}a_{22} - a_{12}a_{21}

Think of the main diagonal as positive and the anti-diagonal as negative.

so for example,

det \begin{Bmatrix} 3 & 0 \\ 0 & 2 \\ \end{Bmatrix} = 3 \cdot 2 - 0 \cdot 0 = 6

Since its determinant isn't 0 it has an inverse!
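A two-line check of the same computation (np.linalg.det works in floating point, so expect 6.0 up to rounding):

```python
import numpy as np

A = np.array([[3, 0],
              [0, 2]])

det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]   # a11*a22 - a12*a21
print(det)                # 6
print(np.linalg.det(A))   # 6.0, up to floating-point error
print(det != 0)           # True: A has an inverse
```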

Sarrus’ Rule

But what about for a 3x3 matrix? How do we get the determinant of that?

We essentially multiply the right-leaning diagonals as positive terms and then subtract the left-leaning diagonals.

det \begin{Bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ \end{Bmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}

Dimensions > 3

There are 3 rules for dimensions higher than 3.

Firstly, notice how the first (row) index in each term always runs from 1 up to the number of dimensions.

Secondly, the right (column) indexes run over all the permutations:

  • a_{11}a_{22}a_{33} = \{ 1, 2, 3 \}
  • a_{12}a_{23}a_{31} = \{ 2, 3, 1 \}
  • a_{13}a_{21}a_{32} = \{ 3, 1, 2 \}
  • a_{13}a_{22}a_{31} = \{ 3, 2, 1 \}
  • a_{12}a_{21}a_{33} = \{ 2, 1, 3 \}
  • a_{11}a_{23}a_{32} = \{ 1, 3, 2 \}

Lastly, we need to keep track of all the column indexes and count the switches needed to put them in increasing order, e.g. for [3, 1, 2] we would say:

  1. 3 and 1 must switch, giving [1, 3, 2]
  2. 3 and 2 must switch, giving [1, 2, 3]

And if the total number of switches is even, we write the term as positive. If it is odd we write it as negative.
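These three rules are exactly the permutation (Leibniz) formula for the determinant; a hedged sketch that counts the switches of each column ordering to decide the sign:

```python
from itertools import permutations

def det(A):
    """Determinant via the permutation rule: one term per column ordering,
    positive for an even number of switches, negative for odd."""
    n = len(A)
    total = 0
    for cols in permutations(range(n)):
        # count the switches needed to put `cols` in increasing order
        switches = sum(1 for i in range(n) for j in range(i + 1, n)
                       if cols[i] > cols[j])
        sign = 1 if switches % 2 == 0 else -1
        term = 1
        for row, col in enumerate(cols):
            term *= A[row][col]   # the first index walks the rows 1..n
        total += sign * term
    return total

print(det([[3, 0], [0, 2]]))  # 6, matching the 2x2 formula above
```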


Vectors

Vectors can be thought of as points from the origin, e.g. (3, 5) is an arrow from the origin, (0, 0). Then we can move this arrow, aka a vector, and it will still have the same length and direction.

We can also represent (3, 5) as \begin{Bmatrix} 3 \\ 5 \end{Bmatrix}, which we call a column vector. A row vector would be \begin{Bmatrix} 3 & 5 \end{Bmatrix}.

We also call the set of all n x 1 matrices \R^n, e.g.

\R^2 = \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix}, \quad \R^3 = \begin{Bmatrix} a_1 \\ a_2 \\ a_3 \end{Bmatrix}, \quad \R^n = \begin{Bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{Bmatrix}

When we think of \R^2 this means anything in the set of 2 dimensions, e.g. \{ (1, 0), (9, 27), (69, 420) \}, bc they're all in the 2-dimensional plane.

Independence & Dependence

Linear independence is when there is only one solution: no matter what the vectors do, they can never combine back to the origin, except by multiplying each vector by 0 (the trivial solution). Sometimes this is called one-dimensional independence.

For example,

c_1 \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix} + c_2 \begin{Bmatrix} b_1 \\ b_2 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} \implies c_1 = c_2 = 0

Linear dependence is the opposite. The vectors are able to work together to get back to the origin by manipulating the directions.
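A hedged numpy check: stack the vectors as columns and compare the matrix rank to the number of vectors; full rank means the only combination reaching the origin is the all-zero one.

```python
import numpy as np

def independent(*vectors):
    """True when only the trivial (all-zero) combination gives the zero vector."""
    M = np.column_stack(vectors)
    return np.linalg.matrix_rank(M) == len(vectors)

print(independent([1, 0], [0, 1]))  # True: linearly independent
print(independent([1, 2], [2, 4]))  # False: second = 2 * first, dependent
```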

Basis

In ML, principal component analysis (PCA) is used to transform data to a new basis. This is why this is so important to understand.

A basis for linearly independent vectors is the set of vectors without the coefficients being multiplied (if there are any).

E.g.,

\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = c_1 \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix} + c_2 \begin{Bmatrix} b_1 \\ b_2 \end{Bmatrix}

Our basis would be of rank 2

\text{ basis of } \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \left\{ \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix}, \begin{Bmatrix} b_1 \\ b_2 \end{Bmatrix} \right\}

If we had

\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = c_1 \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix} + c_2 \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix} = (c_1 + c_2) \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix}

Then our rank would be 1

The rank is the number of linearly independent vectors among the columns of the matrix M.


If there are multiple solutions to the matrix then there is no basis — think of it as no solid foundation to build upon bc the numbers aren’t set in stone, like concrete, so the basis, aka our foundation, falls apart.

Tldr; the basis is a set of vectors that can create any vector within the realm of \R^m. All of them are linearly independent. The way you create any vector within that realm is by multiplying the set of vectors with coefficients.

How do we turn the following basis into \begin{Bmatrix} 9 \\ 27 \end{Bmatrix}?

\text{ basis of } = \left\{ \begin{Bmatrix} 3 \\ 0 \end{Bmatrix}, \begin{Bmatrix} 0 \\ 3 \end{Bmatrix} \right\}

We have to use scalars as the coefficients of the basis vectors and solve for \begin{Bmatrix} 9 \\ 27 \end{Bmatrix}

This will look like

a \begin{Bmatrix} 3 \\ 0 \end{Bmatrix} + b \begin{Bmatrix} 0 \\ 3 \end{Bmatrix} = \begin{Bmatrix} 9 \\ 27 \end{Bmatrix}

Where a = 3 and b = 9 to give us

\begin{split} \begin{Bmatrix} 9 \\ 27 \end{Bmatrix} &= 3 \begin{Bmatrix} 3 \\ 0 \end{Bmatrix} + 9 \begin{Bmatrix} 0 \\ 3 \end{Bmatrix} \\ &= \begin{Bmatrix} 9 \\ 0 \end{Bmatrix} + \begin{Bmatrix} 0 \\ 27 \end{Bmatrix} \\ &= \begin{Bmatrix} 9 \\ 27 \end{Bmatrix} \end{split}
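The coefficients a and b can also be found by putting the basis vectors as the columns of a matrix and solving the linear system (a minimal numpy sketch):

```python
import numpy as np

B = np.array([[3, 0],
              [0, 3]])          # basis vectors as columns
target = np.array([9, 27])

coeffs = np.linalg.solve(B, target)
print(coeffs)   # [3. 9.], i.e. a = 3 and b = 9
```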

If there are multiple ways to create an arbitrary vector within \R^n then there is no basis bc a basis is linearly independent, not linearly dependent.


To solidify this, let's look at the set

\text{ our proposed basis } = \left\{ \begin{Bmatrix} 1 \\ 0 \\ 0 \end{Bmatrix}, \begin{Bmatrix} 0 \\ 1 \\ 0 \end{Bmatrix} \right\}

This set could never create

A = \begin{Bmatrix} 0 \\ 0 \\ 1 \end{Bmatrix}

Bc no scalar could multiply the z dimension (3rd row) in either of the set's vectors to create the 1 in A.

Just bc a set of vectors is linearly independent doesn't mean it forms a basis. However, a basis implies that the vectors are linearly independent:

\text{basis } \implies \text{ linearly independent}

Basis tldr; can you make an arbitrary vector within \R^m with the set of vertical vectors and their coefficient scalars? If you can make it in multiple ways, it's not a basis; if you can make it in exactly one way, the set is linearly independent and a basis!

Linear independence is about finding a path back to the origin. Bases are about finding a path to a vector in \R^m

Subspaces

A subspace is a subset of vectors within a set of vectors, for instance within R3\R^3

So our format is this,

\left\{ \begin{Bmatrix} x \\ x^2 \\ 0 \end{Bmatrix} \right\}

For a vector to be in the subspace it must conform to this standard. So if we had, for example, c = 2,

\begin{split} v &= c \begin{Bmatrix} 1 \\ 1^2 \\ 0 \end{Bmatrix} \\ &= 2 \begin{Bmatrix} 1 \\ 1 \\ 0 \end{Bmatrix} \\ &= \begin{Bmatrix} 2 \\ 2 \\ 0 \end{Bmatrix} \end{split}

v doesn't adhere to the standard in row 2, x^2, since 2 \not = 2^2

v \notin \left\{ \begin{Bmatrix} x \\ x^2 \\ 0 \end{Bmatrix} \right\}

Whereas if our subspace standard were

\left\{ \begin{Bmatrix} x \\ x \\ 0 \end{Bmatrix} \right\}

Then v would adhere to the standard and thus be a part of the subspace.
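A small sketch of this membership test; `in_parabola_set` and `in_line_set` are hypothetical helpers for the two standards above:

```python
def in_parabola_set(v):
    """Is v of the form (x, x^2, 0)?"""
    x, y, z = v
    return y == x ** 2 and z == 0

def in_line_set(v):
    """Is v of the form (x, x, 0)?"""
    x, y, z = v
    return y == x and z == 0

v = (1, 1, 0)
scaled = tuple(2 * c for c in v)   # (2, 2, 0), i.e. c = 2

print(in_parabola_set(v), in_parabola_set(scaled))  # True False: 2 != 2^2
print(in_line_set(v), in_line_set(scaled))          # True True: still in the set
```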

Linear Span

A linear span is the set of all possible linear combinations of the vectors within \R^n using scalar coefficients from the field

?????


Linear Transformation

Linear transformations take vectors from V and turn them into vectors in W. If we transform a group of vectors from V we start to "map out" some vectors in W.

If S is a subspace of V then L(S) is the image of S. Think of it as shining a light through the vector space V and then seeing how much of the vector space behind it, W, is lit up — like a shadow. This is also called the range of L.

Let's say we want to transform 3 dimensions to 2

L : \R^3 \to \R^2

One such transformation from \R^3 to \R^2 is

L(v) = \begin{Bmatrix} v_1 \\ v_2 - v_3 \\ \end{Bmatrix}

Where V = \begin{Bmatrix} v_1 \\ v_2 \\ v_3 \end{Bmatrix} and S = \begin{Bmatrix} c \\ 2c \\ 0 \\ \end{Bmatrix}

Then

L\begin{Bmatrix} c \\ 2c \\ 0 \\ \end{Bmatrix} = \begin{Bmatrix} c \\ 2c - 0 \\ \end{Bmatrix} = \begin{Bmatrix} c \\ 2c \\ \end{Bmatrix}

Where

\begin{Bmatrix} c \\ 2c \\ \end{Bmatrix}

Is the image of the subspace S


The kernel, \ker(L), is the set of vectors in V that give the zero vector in W, \overrightharpoon{0}_w

So, for

L : \R^3 \to \R^2

We want to find which vectors in \R^3 give us the zero vector \overrightharpoon{0}_w

L\begin{Bmatrix} v_1 \\ v_2 \\ v_3 \\ \end{Bmatrix} = \begin{Bmatrix} v_1 \\ v_2 - v_3 \\ \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \\ \end{Bmatrix}

So v_1 = 0 and v_2 = v_3
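A hedged sketch of this transformation and its kernel condition:

```python
import numpy as np

def L(v):
    """L : R^3 -> R^2, sending (v1, v2, v3) to (v1, v2 - v3)."""
    return np.array([v[0], v[1] - v[2]])

print(L(np.array([5, 10, 0])))  # [ 5 10]: the image of (c, 2c, 0) with c = 5
print(L(np.array([0, 7, 7])))   # [0 0]: v1 = 0 and v2 = v3, so it's in ker(L)
```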


If we multiply a 2x3 matrix by a 3x1 matrix, remembering mn * np, we end up with an mp, a 2x1 matrix, dropping a dimension!

Computer graphics stores transformations as small square matrices (often 3x3, or 4x4 once translations are included).

Images

For example, let's find the image of

c_1 \begin{Bmatrix} 3 \\ 1 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 1 \\ 2 \\ \end{Bmatrix}

by using the 2x2 linear transformation matrix

\begin{Bmatrix} 8 & -3 \\ 2 & 1 \\ \end{Bmatrix}

We can solve it like this

\begin{split} f\left(c_1 \begin{Bmatrix} 3 \\ 1 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 1 \\ 2 \\ \end{Bmatrix} \right) &= \begin{Bmatrix} 8 & -3 \\ 2 & 1 \\ \end{Bmatrix} \left\lbrack c_1 \begin{Bmatrix} 3 \\ 1 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 1 \\ 2 \\ \end{Bmatrix} \right\rbrack \\ &= c_1 \begin{Bmatrix} 8 & -3 \\ 2 & 1 \\ \end{Bmatrix} \begin{Bmatrix} 3 \\ 1 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 8 & -3 \\ 2 & 1 \\ \end{Bmatrix} \begin{Bmatrix} 1 \\ 2 \\ \end{Bmatrix} \\ &= c_1 \begin{Bmatrix} 8 \cdot 3 + (-3) \cdot 1 \\ 2 \cdot 3 + 1 \cdot 1 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 8 \cdot 1 + (-3) \cdot 2 \\ 2 \cdot 1 + 1 \cdot 2 \\ \end{Bmatrix} \\ &= c_1 \begin{Bmatrix} 21 \\ 7 \\ \end{Bmatrix} + c_2 \begin{Bmatrix} 2 \\ 4 \\ \end{Bmatrix} \\ &= c_1 \left\lbrack 7 \begin{Bmatrix} 3 \\ 1 \\ \end{Bmatrix} \right\rbrack + c_2 \left\lbrack 2 \begin{Bmatrix} 1 \\ 2 \\ \end{Bmatrix} \right\rbrack \\ \end{split}

And so we can see the answer can be expressed using multiples of the original two vectors — transforming all points of the x_1x_2 plane.
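A quick numpy check that the transformation really just rescales the two original vectors by 7 and 2:

```python
import numpy as np

A = np.array([[8, -3],
              [2,  1]])

print(A @ np.array([3, 1]))  # [21  7], which is 7 * (3, 1)
print(A @ np.array([1, 2]))  # [2 4], which is 2 * (1, 2)
```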

Kernel


Rank

We can see in this matrix

V = \begin{Bmatrix} 1 & 4 & 4 \\ 2 & 5 & 8 \\ 3 & 6 & 12 \\ \end{Bmatrix}

that the third column = first column * 4

\begin{Bmatrix} 4 \\ 8 \\ 12 \end{Bmatrix} = 4 \begin{Bmatrix} 1 \\ 2 \\ 3 \end{Bmatrix}

which means that V has a rank of 2: two linearly independent vectors.
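numpy agrees, as a one-line check:

```python
import numpy as np

V = np.array([[1, 4,  4],
              [2, 5,  8],
              [3, 6, 12]])

print(np.linalg.matrix_rank(V))  # 2: only two linearly independent columns
```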


Eigenvalues & Eigenvectors

These come in handy when doing physics and statistics.

An eigenvalue, \lambda, is a scalar by which a matrix scales its eigenvector, e.g. A\overrightharpoon{x} = \lambda\overrightharpoon{x}, \lambda \in \R
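As a hedged pointer back to the Images section: np.linalg.eig recovers exactly the scalars 7 and 2 we found there, where A(3,1) = 7(3,1) and A(1,2) = 2(1,2):

```python
import numpy as np

A = np.array([[8, -3],
              [2,  1]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)  # [7. 2.] (order may vary): each scales its own eigenvector
```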



Good Explanations

Multiply Matrices

One way is to take the dot product, e.g. u \cdot v = u_1v_1 + u_2v_2 + ... + u_nv_n

Linear combinations combine scalars and vectors in their own matrices. They are the same as the dot product except u is replaced with scalars in a matrix k and v with vectors in a matrix of \R^n.

Extra

  • Vectors express magnitude + direction
  • Scalars express only magnitude. They are numbers that “scale” up or down.

Fractions

\begin{split} -1 + \dfrac{3}{2} &= -\dfrac{2}{2} + \dfrac{3}{2} \\ \\ &= \dfrac{-2 + 3}{2} \\ \\ &= \dfrac{1}{2} \\ \end{split}

Another one is

\dfrac{1}{3} \cdot \dfrac{1}{3} = \dfrac{1}{9}

This one tripped me up — what does this equal?

20\left(\dfrac{5}{4}\right)

Really means 1 of itself plus \dfrac{1}{4} of it, since

\begin{cases} 20\left(\dfrac{4}{4}\right) = 20 \\ \\ 20\left(\dfrac{1}{4}\right) = 5 \end{cases}

Then add them together and you get 25.


If you had -\sqrt{2} and you wanted to turn it into 1 then you would need to

\begin{split} \dfrac{-1}{\sqrt{2}} \cdot -\sqrt{2} &= \dfrac{-1 \cdot -\sqrt{2}}{\sqrt{2}} \\ \\ &= \dfrac{\sqrt{2}}{\sqrt{2}} \\ \\ &= 1 \end{split}

bc the -\sqrt{2} moves into the numerator


If we had

\begin{split} \dfrac{-1}{3} \cdot 3 &= \dfrac{-1}{3} \cdot \dfrac{3}{1} \\ \\ &= \dfrac{-1 \cdot 3}{3 \cdot 1} \\ \\ &= \dfrac{-3}{3} \\ \\ &= \dfrac{-1}{1} \\ \\ &= -1 \end{split}

bc multiplying and dividing by the same number cancels the denominator out to 1.


Matrix Multiplication

Matrix multiplication

\begin{split} \begin{Bmatrix} a & b \\ c & d \\ \end{Bmatrix} \cdot \begin{Bmatrix} e & f \\ g & h \\ \end{Bmatrix} \end{split}

is really saying

\begin{split} \begin{Bmatrix} a & b \\ \end{Bmatrix} \cdot \begin{Bmatrix} e \\ g \\ \end{Bmatrix} \end{split}

And when we think of mn * np, this multiplication is (1x2) * (2x1), which means the resulting matrix will be mp, or 1x1.

So we think of row * column.

Unintuitively, we need to think of the multiplication as the dot product instead of the intuitive "a * e is one element, then b * g is the other". This is why we sum everything up for the new element.

The dot product is the sum of all the pairwise multiplications, aka "dots", since the multiplication sign is written as a dot (\cdot).

As a side note

\begin{split} A = \begin{Bmatrix} a & b \\ c & d \\ \end{Bmatrix} \end{split}

Can be represented as A = [a b; c d].

  • An identity matrix (always a square) is what x \cdot 1 = x is for matrices, AI = A, i.e.,

\begin{split} \begin{Bmatrix} a & b \\ c & d \\ \end{Bmatrix} \cdot \begin{Bmatrix} 1 & 0 \\ 0 & 1 \\ \end{Bmatrix} &= \begin{Bmatrix} a & b \\ c & d \\ \end{Bmatrix} \end{split}

Even for different sized matrices the identity needs to be square

\begin{split} \begin{Bmatrix} a & b & c \\ d & e & f \\ \end{Bmatrix} \cdot \begin{Bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{Bmatrix} &= \begin{Bmatrix} a & b & c \\ d & e & f \\ \end{Bmatrix} \end{split}

Matrix multiplication isn't commutative bc the order we perform multiplication in matters, AB != BA: if we have a 2x3 * 3x3 we can multiply that, but if it's reversed to 3x3 * 2x3 then mn * np doesn't work bc n doesn't match.

Similarly, we cannot define A^n for non-square matrices bc if we have a 2x3 and we multiply 2x3 * 2x3 * 2x3, we know from mn * np that 2x3 * 2x3 won't work because the ns don't match (2, 3).

However, by convention,

A^0 = I \text{ (Identity matrix)}

We cannot divide matrices

Factor Out

For

\begin{split} \begin{Bmatrix} \dfrac{1}{3} & \dfrac{1}{3} & \dfrac{1}{3} \\ \\ \dfrac{1}{3} & \dfrac{1}{3} & \dfrac{1}{3} \\ \\ \dfrac{1}{3} & \dfrac{1}{3} & \dfrac{1}{3} \end{Bmatrix} = \dfrac{1}{3} \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{Bmatrix} \\ \end{split}

If we had A^2 then, knowing what we have above,

\begin{split} \dfrac{1}{3} \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} \cdot \dfrac{1}{3} \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} &= \left( \dfrac{1}{3} \cdot \dfrac{1}{3} \right) \cdot \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} \\ &= \dfrac{1}{9} \cdot \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} \begin{Bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{Bmatrix} \end{split}

Sigma

The sigma notation \displaystyle\sum_{k=1}^{10} 2^k = 2^1 + 2^2 + ... + 2^{10} is a for loop in programming.

Whatever is to the right of sigma means "perform this operation per loop". It means summing the term a_{ik} b_{kj} from k = 1 to k = n. Think of k as the current number in the iteration and n as the max number. For some reason math fellas use k instead of the engineering notation of i.
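The for-loop reading, as a short sketch:

```python
total = 0
for k in range(1, 11):  # k runs from 1 up to n = 10
    total += 2 ** k     # the term to the right of the sigma
print(total)            # 2046 = 2^1 + 2^2 + ... + 2^10
```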

Inverse Matrix

Some matrices have a corresponding inverse matrix.

We know scalars have an inverse, for example 2 \cdot \dfrac{1}{2} = \dfrac{2}{1} \cdot \dfrac{1}{2} = 1 or \dfrac{1}{6} \cdot 6 = 1

But what is the matrix equivalent of 1?

The identity matrix, a matrix with 1s across its diagonal and 0s everywhere else.

So, A \cdot A^{-1} = A^{-1} \cdot A = I

But we can also go back

(A^{-1})^{-1} = A

There is another interesting property

(AB)^{-1} = B^{-1}A^{-1}

But why are they swapped?

If we had f(\vec{x}) = AB\vec{x} then we would be multiplying B\vec{x} first

Since the last thing the original function did was multiply by A, the first thing the inverse must do is undo A, i.e. multiply by A^{-1} first:

f^{-1}(\vec{x}) = B^{-1}A^{-1}\vec{x}

Calculating Inverse Matrices 2x2

If we had a matrix then we find the inverse by

\begin{pmatrix} a & b \\ c & d \\ \end{pmatrix}^{-1} = \dfrac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \\ \end{pmatrix}

Then we can get the identity matrix via

\begin{pmatrix} a & b \\ c & d \\ \end{pmatrix} \begin{pmatrix} a & b \\ c & d \\ \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ \end{pmatrix}

bc this is the same as, remember, the original value multiplied by its inverse, \dfrac{2}{1} \cdot \dfrac{1}{2}.
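A hedged sketch of the 2x2 formula, checked by multiplying back to the identity:

```python
import numpy as np

def inverse_2x2(A):
    """1/(ad - bc) * [[d, -b], [-c, a]]; only valid when ad - bc != 0."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return (1 / det) * np.array([[d, -b],
                                 [-c, a]])

A = np.array([[3.0, 0.0],
              [0.0, 2.0]])
print(inverse_2x2(A))      # [[1/3, 0], [0, 1/2]]
print(A @ inverse_2x2(A))  # the identity matrix, up to floating point
```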

3x3

What if we had a larger matrix? Then what?

A = \begin{pmatrix} 4 & 3 & 0 \\ 1 & 2 & 0 \\ 0 & 0 & 1 \\ \end{pmatrix}

We put it in an augmented matrix

[A | I] \to [I | A^{-1}]

Let's explain this. When we row reduce the left side to match the identity, the right side of the augmented matrix becomes our inverse matrix.

= \begin{pmatrix} 4 & 3 & 0 & | & 1 & 0 & 0 \\ 1 & 2 & 0 & | & 0 & 1 & 0 \\ 0 & 0 & 1 & | & 0 & 0 & 1 \\ \end{pmatrix}

We first swap R_2 and R_1 to put a 1 at x_{11}, which is easier to work with

= \begin{pmatrix} 1 & 2 & 0 & | & 0 & 1 & 0 \\ 4 & 3 & 0 & | & 1 & 0 & 0 \\ 0 & 0 & 1 & | & 0 & 0 & 1 \\ \end{pmatrix}

To turn the 4 in row 2 into 0 we need R_2 - 4R_1

= \begin{pmatrix} 1 & 2 & 0 & | & 0 & 1 & 0 \\ 0 & -5 & 0 & | & 1 & -4 & 0 \\ 0 & 0 & 1 & | & 0 & 0 & 1 \\ \end{pmatrix}

Then we multiply R_2 by -\dfrac{1}{5} to turn the -5 into 1

= \begin{pmatrix} 1 & 2 & 0 & | & 0 & 1 & 0 \\ \\ 0 & 1 & 0 & | & -\dfrac{1}{5} & \dfrac{4}{5} & 0 \\ \\ 0 & 0 & 1 & | & 0 & 0 & 1 \\ \end{pmatrix}

Then to remove the 2 in the first row we need R_1 - 2R_2

And now we get the identity matrix on the left and the inverse of our original matrix, A^{-1}, on the right, so AA^{-1} = I

= \begin{pmatrix} 1 & 0 & 0 & | & \dfrac{2}{5} & -\dfrac{3}{5} & 0 \\ \\ 0 & 1 & 0 & | & -\dfrac{1}{5} & \dfrac{4}{5} & 0 \\ \\ 0 & 0 & 1 & | & 0 & 0 & 1 \\ \end{pmatrix}

If the reduced row echelon form (rref) of the matrix has a row of zeros then the given matrix is non-invertible.
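numpy confirms the hand computation (in floating point: 0.4 = 2/5, -0.6 = -3/5, and so on):

```python
import numpy as np

A = np.array([[4, 3, 0],
              [1, 2, 0],
              [0, 0, 1]])

print(np.linalg.inv(A))
# [[ 0.4 -0.6  0. ]
#  [-0.2  0.8  0. ]
#  [ 0.   0.   1. ]]
```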


Gaussian Elimination

\begin{split} \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ -1 & 3 & -5 & | & 7 \\ 2 & -2 & 7 & | & 1 \\ \end{bmatrix} \end{split}

To get to a row echelon form we would start by trying to make x_{21} = -1 into x_{21} = 0

We start by R_2 + R_1 bc -1 + 1 = 0 so

\begin{split} &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ (-1 + 1) & (3 + 1) & (-5 + 2) & | & (7 + 3) \\ 2 & -2 & 7 & | & 1 \\ \end{bmatrix} \\ &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ 0 & 4 & -3 & | & 10 \\ 2 & -2 & 7 & | & 1 \end{bmatrix} \end{split}

Then we want x_{31} = 2 to be 0. So, R_3 - 2R_1 bc 2 - (2*1) = 0 so

\begin{split} &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ 0 & 4 & -3 & | & 10 \\ (2 - 2(1)) & (-2 - 2(1)) & (7 - 2(2)) & | & (1 - 2(3)) \\ \end{bmatrix} \\ &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ 0 & 4 & -3 & | & 10 \\ 0 & -4 & 3 & | & -5 \end{bmatrix} \end{split}

Then we want x_{32} = -4 to be 0.

Since we don’t want to interfere with our progress in the first column we have to use the 2nd row bc it also has a 0 in the first column.

So,

R_3 + R_2

bc -4 + 4 = 0

\begin{split} &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ 0 & 4 & -3 & | & 10 \\ 0 & (-4 + 4) & (3 + (-3)) & | & (-5 + 10) \end{bmatrix} \\ &= \begin{bmatrix} 1 & 1 & 2 & | & 3 \\ 0 & 4 & -3 & | & 10 \\ 0 & 0 & 0 & | & 5 \end{bmatrix} \end{split}

And bang! That's Gaussian elimination to transform the matrix into row echelon form.

But wait! We see the final row is all zeros on the left yet has a non-zero constant on the right, 0 = 5, so we call this linear system inconsistent, which really means that the planes (equations) do not meet — there is no point common to all 3, meaning there is no solution!

Consistent systems are those where there is a common point between the equations. If instead we had R_{33} = 1 we would know z directly from that row, or if the entire final row were zeros (constant included) that would have worked too, bc z could then be any number, giving infinite solutions
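A hedged programmatic check: a system is consistent exactly when the coefficient matrix and the augmented matrix have the same rank (the Rouché-Capelli test), and here they don't:

```python
import numpy as np

A = np.array([[ 1,  1,  2],
              [-1,  3, -5],
              [ 2, -2,  7]])
b = np.array([3, 7, 1])

rank_A  = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
print(rank_A, rank_Ab)  # 2 3: the ranks differ, so the system is inconsistent
```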

Modifying A Single Matrix Row

If we had

\begin{split} &= \begin{bmatrix} -1 & 2 & 3 & | & 0 \\ 0 & -2 & -10 & | & 0 \\ 0 & 0 & 0 & | & 0 \end{bmatrix} \\ \end{split}

We want to turn the -2 into 1. We can do this by -\dfrac{R_2}{2}, meaning we flip the row's sign and halve it, i.e. divide the row by -2.

\begin{split} &= \begin{bmatrix} -1 & 2 & 3 & | & 0 \\ 0 & 1 & 5 & | & 0 \\ 0 & 0 & 0 & | & 0 \end{bmatrix} \\ \end{split}

Then we can multiply the top row by -1 to turn the -1 into 1

\begin{split} &= \begin{bmatrix} 1 & -2 & -3 & | & 0 \\ 0 & 1 & 5 & | & 0 \\ 0 & 0 & 0 & | & 0 \end{bmatrix} \\ \end{split}

This tells us y + 5z = 0, which gives y = -5z, which means we can provide any number for z, thus resulting in infinite solutions.

We can say this matrix is (A | O), meaning the | splits the right and left sides. The left side is the non-zeros A and the O is the zero matrix.

Since all the constants on the far right are zero we can say this linear system is homogeneous, meaning the constant terms are all zero. In non-square matrices there will be a row with no leading 1, so there are infinite solutions: reduced row echelon form has a diagonal of 1s but rectangles don't, so there will be an open 0 row. We can think of the unknowns (x, y, z) relative to the 1s; if they don't match then there are an infinite number of solutions.

Pivots

Before we get into vector form.

In each row, the first 1 from the left of the row is called the pivot.

\begin{bmatrix} x & y & z & w & | & b \\ \hline 1 & 1 & 0 & -10 & | & -9 \\ 0 & 0 & 1 & -7 & | & -7 \\ 0 & 0 & 0 & 0 & | & 0 \end{bmatrix}

See how we have a 1 at R_{12} behind R_{11}? This isn't a pivot because it's after the 1 at R_{11}, and so that column is actually a free variable.

Each pivot indicates the column variable you're solving for. So, the first row's pivot R_{11} is in x's column, meaning we're solving for x.

So this equation would be x = -y + 10w - 9 bc we move the y and -10w to the right-hand (augmented) side of the matrix; this is why the -9 remains the same.

And since the y column has no pivot we actually say y can be any number, or y = y.

The next pivot is R_{23}

Matrix To Vector Form

This augmented matrix in reduced row echelon form has 2 pivots: in column x_1 with R_{11} and column x_2 with R_{22}

Columns x_3 and x_4 have what we call free variables bc they have no pivots

\begin{split} \begin{bmatrix} x_1 & x_2 & x_3 & x_4 & | & b \\ \hline 1 & 0 & 3 & -1 & | & -3 \\ 0 & 1 & 2 & -1 & | & -2 \\ 0 & 0 & 0 & 0 & | & 0 \end{bmatrix} \\ \end{split}

And so we want to turn this augmented matrix back into the written equations by grabbing each row and isolating the pivot of the column it's in, x_n. We isolate it by moving all the other vars into the b column with just normal algebra. Let's show what the equations would look like pre-movement to the b column:

\begin{cases} x_1 + 3x_3 - x_4 = -3 \\ x_2 + 2x_3 - x_4 = -2 \end{cases}

Then let's move them over to the b column where the pivot var will be isolated.

The non-pivot columns have free variables, meaning they can be whatever they want. So x_3 and x_4 can be 1 to make it easier for ourselves :)

\begin{cases} x_1 = -3x_3 + x_4 - 3 \\ x_2 = -2x_3 + x_4 - 2 \\ x_3 = x_3 \text{ or } s \\ x_4 = x_4 \text{ or } t \end{cases}

And now we need to write them as one equation as a vector.

\begin{split} \vec{x} &= \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \\ \\ &= \begin{bmatrix} -3x_3 + x_4 - 3 \\ -2x_3 + x_4 - 2 \\ s \\ t \end{bmatrix} \\ \\ \end{split}

Then we turn them into their own vectors of x_3, x_4 and constants (since that's all we have).

\begin{split} &= \begin{bmatrix} -3x_3 \\ -2x_3 \\ x_3 \\ 0 \end{bmatrix} + \begin{bmatrix} x_4 \\ x_4 \\ 0 \\ x_4 \end{bmatrix} + \begin{bmatrix} -3 \\ -2 \\ 0 \\ 0 \end{bmatrix} \end{split}

And then we can find our solution by factoring out the like terms

\begin{split} &= x_3 \begin{bmatrix} -3 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} -3 \\ -2 \\ 0 \\ 0 \end{bmatrix} \end{split}

x_3 and x_4 can be 1 or 0, remember, since they're free variables! :)
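A hedged numpy check that the vector form really solves the original system no matter which free values we pick:

```python
import numpy as np

A = np.array([[1, 0, 3, -1],
              [0, 1, 2, -1]])
b = np.array([-3, -2])

s, t = 1.0, 1.0   # the free variables x3 and x4; any values work
x = (s * np.array([-3, -2, 1, 0])
     + t * np.array([1, 1, 0, 1])
     + np.array([-3, -2, 0, 0]))

print(A @ x)  # [-3. -2.]: matches b for every choice of s and t
```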

Dot / Inner Product

\begin{split} u = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix}, v = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \end{split}

The dot product of u and v,

u \cdot v

Can be viewed as the transpose matrix multiplication

= u^{T}v = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix}^T \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = u_1 v_1 + \cdots + u_n v_n

In mn * np terms, this is (1xn) * (nx1) = 1x1

It's really just a short-hand way of saying "do matrix multiplication", BUT it results in a real number scalar, not a vector!

If the dot product is zero then the vectors are perpendicular (right angle in 2D and 3D) or orthogonal (right angle in n > 3 dimensions)!
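A minimal numpy sketch of the dot product and the perpendicularity test:

```python
import numpy as np

u = np.array([1, 2])
v = np.array([2, -1])

print(u @ v)         # 0: the pairwise products summed, 1*2 + 2*(-1)
print(np.dot(u, v))  # same thing
print(u @ v == 0)    # True: u and v are perpendicular
```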

Vector Norm

The norm or length of a vector is written with the absolute-value sign doubled up.

Let's say we want to know the norm of vector v = \begin{bmatrix} x \\ y \\ \end{bmatrix}

||v|| = \sqrt{x^2 + y^2}

What about a vector of n dimensions, such as

v = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}

We would say

||v|| = \sqrt{(v_1)^2 + \cdots + (v_n)^2}

But we can also write it as the sqrt of the dot product, because if the vector multiplies itself then the values within it are all squared

||v|| = \sqrt{v \cdot v}

This is called the Euclidean norm.

But why wouldn't squaring v work?

||v|| \not = \sqrt{v^2}

bc we don't output a scalar! We would merely have squared all the values in the matrix, so it still remains a vector, not our desired scalar used for our distance. We can't have a matrix distance :P


If we wanted to find the distance between two vectors we would say

||u - v|| = d(u, v)

This is what GPS systems and satellites would use to detect their relative distances.

If we had

||u||^2

Then to evaluate it we know

||u|| = \sqrt{u \cdot u}

Therefore, we remove the sqrt of the dot product which means the square is reversed

||u||^2 = u \cdot u
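A short check of both identities, ||v|| = sqrt(v . v) and ||u||^2 = u . u (floating point, so expect exact-looking values only for nice inputs):

```python
import numpy as np

v = np.array([3.0, 4.0])

print(np.sqrt(v @ v))               # 5.0: the sqrt of the dot product with itself
print(np.linalg.norm(v))            # 5.0: numpy's built-in Euclidean norm
print(np.linalg.norm(v)**2, v @ v)  # 25.0 25.0: ||v||^2 equals v . v
```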

Norm Difference

||u - v|| = ||(7, -2) - (-5, 3)|| = ||(12, -5)|| = \sqrt{12^2 + (-5)^2}

Mental model for the norm.

Let's say u = (2, -7)^T

How would we think about this?

\left|\left| \dfrac{1}{||u||} u \right|\right|

First we would want to solve for ||u||

||u|| = \sqrt{u \cdot u} = \sqrt{2^2 + (-7)^2} = \sqrt{53}

And so we have

\dfrac{1}{\sqrt{53}}\begin{pmatrix} 2 \\ -7 \end{pmatrix}

And so we simply want to find the norm of

\left|\left| \dfrac{1}{\sqrt{53}}\begin{pmatrix} 2 \\ -7 \end{pmatrix} \right|\right|

Which when thinking of the dot product is

= \sqrt{\dfrac{1}{\sqrt{53}}\begin{pmatrix} 2 \\ -7 \end{pmatrix} \cdot \dfrac{1}{\sqrt{53}}\begin{pmatrix} 2 \\ -7 \end{pmatrix}}

And using the commutative property

= \sqrt{\dfrac{1}{\sqrt{53}} \cdot \dfrac{1}{\sqrt{53}} \left( \begin{pmatrix} 2 \\ -7 \end{pmatrix} \cdot \begin{pmatrix} 2 \\ -7 \end{pmatrix} \right)}

\dfrac{1}{\sqrt{53}} \cdot \dfrac{1}{\sqrt{53}} = \dfrac{1}{53} bc \sqrt{53} \cdot \sqrt{53} = 53, and we use the dot product for our vectors

= \sqrt{\dfrac{1}{53}(2^2 + (-7)^2)}

Simplify

= \sqrt{\dfrac{1}{53}(53)} = \sqrt{1} = 1

Scalar Modulus

For scalar k we define the modulus of k, denoted |k|, as

|k| = \sqrt{k^2}

So then we can

\begin{split} ||ku|| &= \sqrt{ku \cdot ku} \\ &= \sqrt{k^2 (u \cdot u)} \\ &= \sqrt{k^2} \sqrt{u \cdot u} \\ &= |k| \, ||u|| \end{split}

Normalising Vectors

The process of finding a unit vector in the direction of the given vector u is called normalising.

The unit vector in the direction of the vector u is normally denoted by \hat{u} (pronounced 'u hat'), meaning it is a vector of length 1, that is

\hat{u} = \dfrac{1}{||u||}u
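A hedged sketch normalising the u = (2, -7) from the norm example and confirming the unit length:

```python
import numpy as np

u = np.array([2.0, -7.0])
u_hat = u / np.linalg.norm(u)   # (1 / ||u||) * u

print(u_hat)                    # u scaled down to length 1
print(np.linalg.norm(u_hat))    # 1.0, within floating-point error
```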