In hopes of understanding linear algebra, I rewrite what I learn in ways that helped me understand what’s happening.
Prerequisites
Set Notation
Let’s say we have the set
$x = \{1, 2, 3\}$
Using the membership notation ∈ we can say an element belongs to the set, e.g.
$1 \in x$
We can ask “what is the subset of x containing only the even integers?” which would be
$y = \{2\}$
We say it’s a subset of x with the notation ⊂
$y \subset x$
And for anything not in the set we use the notation ∉
$4 \notin x$
Implication & Equivalence
We can say a burger implies it contains pork
burger ⟹ contains pork
But that doesn’t mean anything containing pork is a burger
contains pork ⇏ burger
If we had something like
degatchi’s super hot ⟺ degatchi is never not hot
Then the two statements imply each other in both directions, and we call this equivalence.
Matrices
Multiplication
To get the product of 2 matrices, the number of columns in the left factor must match the number of rows in the right factor; otherwise we end up not having corresponding variables to multiply with.
So if we have
$(m \times n) \cdot (n \times p)$
The inner dimensions, n, must be equal because the way we multiply matrices is
row ⋅ column
and if there isn’t a column to multiply our row by then we cannot create the product.
The dimensions of the new matrix are
$m \times p$
As an example,
X is a 2×3 matrix
$X = \begin{pmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \end{pmatrix}$
and Y is also 2×3
$Y = \begin{pmatrix} y_{11} & y_{12} & y_{13} \\ y_{21} & y_{22} & y_{23} \end{pmatrix}$
And so multiplying them, (2×3) ⋅ (2×3), wouldn’t work bc 3 and 2 are not equal: each row of X has 3 entries but each column of Y only has 2, so the entries don’t pair up!
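A minimal sketch of the (m×n) ⋅ (n×p) rule in Python (the helper name `can_multiply` is hypothetical): multiplication only works when the left matrix’s column count equals the right matrix’s row count.

```python
def can_multiply(A, B):
    # columns of the left factor must equal rows of the right factor
    return len(A[0]) == len(B)

X = [[1, 2, 3],
     [4, 5, 6]]        # 2x3
Y = [[1, 2, 3],
     [4, 5, 6]]        # 2x3
Z = [[1, 2],
     [3, 4],
     [5, 6]]           # 3x2

print(can_multiply(X, Y))  # False: 3 columns vs 2 rows
print(can_multiply(X, Z))  # True: 3 columns vs 3 rows -> result is 2x2
```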
A determinant is essentially the (signed) area or volume spanned by the matrix. If we have a 2×2 matrix we can think of it as a square (more generally, a parallelogram). The determinant tells us its scale and how much we can fit inside of it.
The inverse of a matrix exists as long as its determinant isn’t zero.
$\det\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = a_{11}a_{22} - a_{12}a_{21}$
Think of the main-diagonal product as positive and the anti-diagonal product as negative.
So, for example,
$\det\begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix} = 3 \cdot 2 - 0 \cdot 0 = 6$
Since its determinant isn’t 0 it has an inverse!
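The 2×2 rule is short enough to sketch directly (the helper name `det2` is made up for illustration):

```python
def det2(m):
    # main diagonal minus anti-diagonal
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = [[3, 0],
     [0, 2]]
print(det2(A))                  # 6 -> nonzero, so A has an inverse
print(det2([[1, 2], [2, 4]]))   # 0 -> no inverse
```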
Sarrus’ Rule
But what about for a 3x3 matrix? How do we get the determinant of that?
We essentially multiply along the right-leaning (top-left to bottom-right) diagonals as positive terms and then subtract the left-leaning diagonals.
There are 3 rules that let us handle dimensions of 3 and higher.
Firstly, notice how in each term the row indices always run in order from 1 up to the number of dimensions.
Secondly, the column indices run through all permutations:
$a_{11}a_{22}a_{33} = \{1,2,3\}$
$a_{12}a_{23}a_{31} = \{2,3,1\}$
$a_{13}a_{21}a_{32} = \{3,1,2\}$
$a_{13}a_{22}a_{31} = \{3,2,1\}$
$a_{12}a_{21}a_{33} = \{2,1,3\}$
$a_{11}a_{23}a_{32} = \{1,3,2\}$
Lastly, we need to keep track of the column indices and count how many switches it takes to put them in increasing order, e.g. for [3, 1, 2]:
3 and 1 must switch, giving [1, 3, 2]
3 and 2 must switch, giving [1, 2, 3]
And if the total number of switches is even, we write the term as positive. If it is odd we write it as negative. Here we used 2 switches, so $a_{13}a_{21}a_{32}$ gets a positive sign.
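These three rules are exactly the Leibniz (permutation) formula for the determinant, which can be sketched in a few lines of Python (helper names are mine):

```python
from itertools import permutations

def sign(perm):
    # count pairwise inversions; even count -> +1, odd -> -1
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return 1 if inversions % 2 == 0 else -1

def prod_entries(m, p):
    # one term: a_{1,p1} * a_{2,p2} * ... (rows in order, columns permuted)
    result = 1
    for row, col in enumerate(p):
        result *= m[row][col]
    return result

def det(m):
    n = len(m)
    return sum(sign(p) * prod_entries(m, p) for p in permutations(range(n)))

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
print(det(A))  # -3
```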
Vectors
Vectors can be thought of as points from the origin, e.g. (3, 5) is an arrow from the origin, (0, 0). Then we can move this arrow, aka a vector, and it will still have the same length and direction.
We can also represent (3, 5) as $\begin{pmatrix}3\\5\end{pmatrix}$, which we call a column vector. The row vector version would be $(3\ \ 5)$.
We also call the set of all n × 1 matrices $\mathbb{R}^n$. When we think of $\mathbb{R}^2$ this means anything in the set of 2 dimensions, e.g., {(1,0), (9,27), (69,420)} bc they’re all in the 2-dimensional plane.
Independence & Dependence
Linear independence is when there is only one solution for combining the vectors to reach the zero vector: no matter what the vectors do, they can never get back to the origin unless you multiply each vector by 0.
The rank is the number of linearly independent vectors among the columns of the matrix M.
If there are multiple solutions to the matrix then there is no basis — think of it as no solid foundation to build upon bc the numbers aren’t set in stone, like concrete, so the basis, aka our foundation, falls apart.
Tldr; a basis is a set of vectors that can create any vector within the realm of $\mathbb{R}^m$. All of them are linearly independent. The way you create any vector in that realm is by multiplying each vector in the set by a coefficient and adding the results up.
How do I turn the following basis into $\begin{pmatrix}9\\27\end{pmatrix}$?
basis $= \left\{\begin{pmatrix}3\\0\end{pmatrix}, \begin{pmatrix}0\\3\end{pmatrix}\right\}$
We have to use scalars as the coefficients of the vectors and solve for $\begin{pmatrix}9\\27\end{pmatrix}$
This will look like
$a\begin{pmatrix}3\\0\end{pmatrix} + b\begin{pmatrix}0\\3\end{pmatrix} = \begin{pmatrix}9\\27\end{pmatrix}$
Where a = 3 and b = 9 to give us
$\begin{pmatrix}9\\27\end{pmatrix} = 3\begin{pmatrix}3\\0\end{pmatrix} + 9\begin{pmatrix}0\\3\end{pmatrix} = \begin{pmatrix}9\\0\end{pmatrix} + \begin{pmatrix}0\\27\end{pmatrix} = \begin{pmatrix}9\\27\end{pmatrix}$
If there are multiple ways of creating an arbitrary vector within $\mathbb{R}^n$ then there is no basis, bc a basis is linearly independent, not linearly dependent.
Bc no scalar could multiply the z dimension (3rd row) in either of the set’s vectors to create a 1 there — some vectors stay unreachable.
Just bc a set of vectors is linearly independent doesn’t mean it forms a basis. However, a basis implies that the vectors are linearly independent:
basis ⟹ linearly independent, e.g. $\left\{\begin{pmatrix}1\\0\\0\end{pmatrix}, \begin{pmatrix}0\\1\\0\end{pmatrix}\right\}$ is linearly independent but not a basis of $\mathbb{R}^3$.
Basis tldr; can you make an arbitrary vector within $\mathbb{R}^m$ with the set of column vectors and their coefficient scalars? If there are multiple ways, it’s not a basis; if there is exactly one way, the set is linearly independent and a basis!
Linear independence is about finding a path back to the origin. Bases are about finding a path to any vector in $\mathbb{R}^m$.
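For the 2D case there’s a quick test sketch: two vectors are linearly independent exactly when the determinant of the matrix with those vectors as columns is nonzero (the helper name `independent_2d` is hypothetical).

```python
def independent_2d(u, v):
    # 2x2 determinant with u and v as columns
    return u[0] * v[1] - u[1] * v[0] != 0

print(independent_2d((3, 0), (0, 3)))  # True  -> forms a basis of R^2
print(independent_2d((1, 2), (2, 4)))  # False -> (2,4) is 2*(1,2)
```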
Subspaces
A subspace is a subset of vectors within a set of vectors, for instance within $\mathbb{R}^3$.
So our format is this,
$\left\{\begin{pmatrix}x\\x^2\\0\end{pmatrix}\right\}$
For a subspace to be established it must keep conforming to this standard under scaling. So if we had, for example, c = 2,
$v = c\begin{pmatrix}1\\1^2\\0\end{pmatrix} = 2\begin{pmatrix}1\\1\\0\end{pmatrix} = \begin{pmatrix}2\\2\\0\end{pmatrix}$
v doesn’t adhere to the standard in row 2 since $2 \neq 2^2$
$v \notin \left\{\begin{pmatrix}x\\x^2\\0\end{pmatrix}\right\}$
Whereas if we had our subspace standard be
$\left\{\begin{pmatrix}x\\x\\0\end{pmatrix}\right\}$
Then v would adhere to the standard and thus be part of the subspace.
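A small sketch checking closure under scalar multiplication for the two candidate sets above (the function names are made up):

```python
def in_parabola_set(v):      # vectors of the form (x, x^2, 0)
    return v[1] == v[0] ** 2 and v[2] == 0

def in_line_set(v):          # vectors of the form (x, x, 0)
    return v[0] == v[1] and v[2] == 0

v = (1, 1, 0)                # in both sets (x = 1)
w = tuple(2 * c for c in v)  # scale by c = 2 -> (2, 2, 0)

print(in_parabola_set(w))  # False: 2 != 2^2, so not a subspace
print(in_line_set(w))      # True: closed under scaling
```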
Linear Span
A linear span is the set of all possible linear combinations of a set of vectors, using scalar coefficients from the field:
$\operatorname{span}\{v_1, \dots, v_k\} = \{c_1 v_1 + \dots + c_k v_k : c_i \in \mathbb{R}\}$
Linear Transformation
Linear transformations take vectors from V and turn them into vectors in W. If we transform a group of vectors from V we start to “map out” some vectors in W.
If S is a subspace of V then L(S) is the image of S. Think of it as shining a light through the vector space V and seeing how much of the vector space W behind it is lit up — like a shadow. This is also called the range of L.
Let’s say we want to transform 3 dimensions to 2
$L : \mathbb{R}^3 \to \mathbb{R}^2$
defined by
$L(v) = \begin{pmatrix}v_1\\v_2 - v_3\end{pmatrix}$
Where $v = \begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix}$
And $S = \left\{\begin{pmatrix}c\\2c\\0\end{pmatrix}\right\}$
Then
$L\begin{pmatrix}c\\2c\\0\end{pmatrix} = \begin{pmatrix}c\\2c - 0\end{pmatrix} = \begin{pmatrix}c\\2c\end{pmatrix}$
Where
$\begin{pmatrix}c\\2c\end{pmatrix}$
is the image of the subspace S.
The kernel, ker(L), is the set of vectors in V that give the zero vector in W, $0_W$.
So, for
$L : \mathbb{R}^3 \to \mathbb{R}^2$
We want to find which vectors in $\mathbb{R}^3$ give us the zero vector $0_W$
$L\begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix} = \begin{pmatrix}v_1\\v_2 - v_3\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix}$
So $v_1 = 0$ and $v_2 = v_3$.
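A sketch of this exact map $L(v) = (v_1, v_2 - v_3)$, checking the image of S and a vector from the kernel:

```python
def L(v):
    # L : R^3 -> R^2, drops to 2 dimensions
    v1, v2, v3 = v
    return (v1, v2 - v3)

c = 5
print(L((c, 2 * c, 0)))  # (5, 10) -- the image of S is {(c, 2c)}
print(L((0, 7, 7)))      # (0, 0)  -- v1 = 0 and v2 = v3 lands in ker(L)
```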
If we multiply a 2×3 matrix by a 3×1 matrix, remembering (m×n) ⋅ (n×p), we end up with an m×p, i.e. 2×1, matrix — dropping a dimension!
Computer graphics store transformations as matrices — commonly 4×4 homogeneous matrices, so translations can be represented along with rotations and scaling.
And so we can see the result can be expressed using multiples of the original two vectors — transforming all points of the $x_1x_2$ plane.
Kernel
Rank
We can see in this matrix
$V = \begin{pmatrix}1 & 4 & 4\\2 & 5 & 8\\3 & 6 & 12\end{pmatrix}$
that the third column = first column × 4
$\begin{pmatrix}4\\8\\12\end{pmatrix} = 4\begin{pmatrix}1\\2\\3\end{pmatrix}$
which means that V has a rank of 2 — two linearly independent columns.
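A quick sketch verifying the dependence that drops the rank (the `col` helper is mine):

```python
V = [[1, 4, 4],
     [2, 5, 8],
     [3, 6, 12]]

def col(m, j):
    # extract column j as a list
    return [row[j] for row in m]

# the third column is exactly 4 times the first
print(col(V, 2) == [4 * x for x in col(V, 0)])  # True
```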
Eigenvalues & Eigenvectors
These come in handy when doing physics and statistics.
An eigenvalue, λ, is a scalar by which a matrix stretches its eigenvector x, i.e. $Ax = \lambda x$, $\lambda \in \mathbb{R}$.
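A tiny sketch of $Ax = \lambda x$ with a diagonal example I picked for illustration: A = [[2, 0], [0, 3]] has eigenvalues 2 and 3, each scaling its own eigenvector.

```python
def matvec(A, v):
    # multiply matrix A by column vector v
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

A = [[2, 0],
     [0, 3]]
print(matvec(A, [1, 0]))  # [2, 0] == 2 * [1, 0], eigenvalue 2
print(matvec(A, [0, 1]))  # [0, 3] == 3 * [0, 1], eigenvalue 3
```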
Good Explanations
Multiply Matrices
One way is to take the dot product, e.g. $u \cdot v = u_1v_1 + u_2v_2 + \dots + u_nv_n$
Linear combinations combine scalars and vectors. They are the same as the dot product except u is replaced with scalars from a set k and v with vectors in $\mathbb{R}^n$.
Extra
Vectors express magnitude + direction
Scalars express only magnitude. They are numbers that “scale” up or down.
Fractions
$-1 + \frac{3}{2} = -\frac{2}{2} + \frac{3}{2} = \frac{-2 + 3}{2} = \frac{1}{2}$
Another one is
$\frac{1}{3} \cdot \frac{1}{3} = \frac{1}{9}$
This one tripped me up — what does this equal?
$20\left(\frac{5}{4}\right)$
It really means 1 of itself + $\frac{1}{4}$ of it, since
$20\left(\frac{4}{4}\right) = 20, \qquad 20\left(\frac{1}{4}\right) = 5$
Then add them together and you get 25.
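The fraction arithmetic above can be double-checked with Python’s exact `Fraction` type:

```python
from fractions import Fraction

print(-1 + Fraction(3, 2))              # 1/2
print(Fraction(1, 3) * Fraction(1, 3))  # 1/9
print(20 * Fraction(5, 4))              # 25
```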
If you had −2 and you wanted to turn it into 1 then you would need to
$-\frac{1}{2} \cdot -2 = \frac{-1 \cdot -2}{2} = \frac{2}{2} = 1$
bc the product takes the place of the numerator.
If we had
$-\frac{1}{3} \cdot 3 = -\frac{1}{3} \cdot \frac{3}{1} = \frac{-1 \cdot 3}{3 \cdot 1} = \frac{-3}{3} = -\frac{1}{1} = -1$
bc dividing and multiplying by the same number cancels the denominator out to become 1.
Matrix Multiplication
Matrix multiplication
$\begin{pmatrix}a & b\\c & d\end{pmatrix} \cdot \begin{pmatrix}e & f\\g & h\end{pmatrix}$
is really saying, for the first entry,
$\begin{pmatrix}a & b\end{pmatrix} \cdot \begin{pmatrix}e\\g\end{pmatrix}$
And when we think of (m×n) ⋅ (n×p), this row-times-column step is (1×2) ⋅ (2×1), which means the resulting entry is m×p, or 1×1.
So we think of row * column.
Unintuitively, we need to think of the multiplication as the dot product instead of the intuitive “a ⋅ e is one element, then b ⋅ g is another element”. This is why we sum everything up for the new element.
The dot product is the sum of all the multiplications, aka “dots”, since the multiplication sign ⋅ is called a “dot”.
As a side note
$A = \begin{pmatrix}a & b\\c & d\end{pmatrix}$
Can be represented as A = [a b; c d].
An identity matrix (always square) is to matrices what 1 ⋅ x = x is to numbers: AI = A, i.e.,
$\begin{pmatrix}a & b\\c & d\end{pmatrix} \cdot \begin{pmatrix}1 & 0\\0 & 1\end{pmatrix} = \begin{pmatrix}a & b\\c & d\end{pmatrix}$
Even for non-square matrices the identity needs to be square
$\begin{pmatrix}a & b & c\\d & e & f\end{pmatrix} \cdot \begin{pmatrix}1 & 0 & 0\\0 & 1 & 0\\0 & 0 & 1\end{pmatrix} = \begin{pmatrix}a & b & c\\d & e & f\end{pmatrix}$
Matrix multiplication isn’t commutative bc the order we perform multiplication in matters: AB ≠ BA. If we have a (2×3) ⋅ (3×3) we can multiply that, but if it’s reversed, (3×3) ⋅ (2×3), then from (m×n) ⋅ (n×p) it doesn’t work bc the inner dimensions don’t match.
Similarly, we cannot define $A^n$ for non-square matrices bc if we have a 2×3 and try (2×3) ⋅ (2×3) ⋅ (2×3), we know from (m×n) ⋅ (n×p) that (2×3) ⋅ (2×3) won’t work because the inner dimensions (3 and 2) don’t match.
The sigma notation $\sum_{n=1}^{10} 2^n = 2^1 + 2^2 + \dots + 2^{10}$ is a for loop in programming.
Whatever is to the right of sigma is the operation performed per loop iteration. Matrix multiplication means summing the terms $a_{ik}b_{kj}$ from k = 1 to k = n. Think of k as the current number in the iteration and n as the max number. For some reason math fellas use k instead of the programming convention of i.
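That sigma-as-a-for-loop idea is literal: $c_{ij} = \sum_k a_{ik}b_{kj}$ is a loop over k. A minimal sketch (the `matmul` name is mine):

```python
def matmul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    # c_ij = sum over k of a_ik * b_kj -- the sigma is the inner loop
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(p)]
            for i in range(m)]

A = [[1, 2],
     [3, 4]]
I = [[1, 0],
     [0, 1]]
print(matmul(A, I))  # [[1, 2], [3, 4]] -- AI = A
```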
Inverse Matrix
Some matrices have a corresponding inverse matrix.
We know scalars have an inverse, for example $2 \cdot \frac{1}{2} = 1$ or $\frac{1}{6} \cdot 6 = 1$
But what is the matrix equivalent of 1?
The identity matrix — a matrix with 1s across its diagonal and 0s everywhere else.
So, $A \cdot A^{-1} = A^{-1} \cdot A = I$
But we also we can go back
$(A^{-1})^{-1} = A$
There is another interesting property
$(AB)^{-1} = B^{-1}A^{-1}$
But why are they swapped?
If we had $f(x) = ABx$ then we would be multiplying x by B first, then by A.
Since the last thing the original function did was multiply by A, the first thing the inverse must do is undo A, i.e. multiply by $A^{-1}$.
So $A^{-1}$ has to come first:
$f^{-1}(x) = B^{-1}A^{-1}x$
Calculating Inverse Matrices 2x2
If we had a 2×2 matrix then we find the inverse by
$\begin{pmatrix}a & b\\c & d\end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix}d & -b\\-c & a\end{pmatrix}$
Then we can get the identity matrix via
$\begin{pmatrix}a & b\\c & d\end{pmatrix}\begin{pmatrix}a & b\\c & d\end{pmatrix}^{-1} = \begin{pmatrix}1 & 0\\0 & 1\end{pmatrix}$
bc — remember — this is the same as the original value multiplied by its inverse, like $2 \cdot \frac{1}{2} = 1$.
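The 2×2 formula translates to a few lines (the `inverse2` name is mine; `Fraction` keeps the arithmetic exact):

```python
from fractions import Fraction

def inverse2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    assert det != 0, "no inverse when the determinant is zero"
    k = Fraction(1, det)
    # swap the diagonal, negate the anti-diagonal, scale by 1/det
    return [[k * d, -k * b],
            [-k * c, k * a]]

A = [[4, 3],
     [1, 2]]
Ainv = inverse2(A)  # det = 5 -> [[2/5, -3/5], [-1/5, 4/5]]
print(Ainv == [[Fraction(2, 5), Fraction(-3, 5)],
               [Fraction(-1, 5), Fraction(4, 5)]])  # True
```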
3x3
What if we had a larger matrix? Then what?
$A = \begin{pmatrix}4 & 3 & 0\\1 & 2 & 0\\0 & 0 & 1\end{pmatrix}$
We put it in an augmented matrix
$[A \mid I] \to [I \mid A^{-1}]$
Let’s explain this. When we row reduce the left side to the identity, the right side of the augmented matrix becomes our inverse matrix.
$= \left(\begin{array}{ccc|ccc}4 & 3 & 0 & 1 & 0 & 0\\1 & 2 & 0 & 0 & 1 & 0\\0 & 0 & 1 & 0 & 0 & 1\end{array}\right)$
We first swap R2 and R1 to get a 1 at $a_{11}$, making it easier to work with
$= \left(\begin{array}{ccc|ccc}1 & 2 & 0 & 0 & 1 & 0\\4 & 3 & 0 & 1 & 0 & 0\\0 & 0 & 1 & 0 & 0 & 1\end{array}\right)$
To turn the 4 in row 2 into 0 we need $R_2 - 4R_1$
$= \left(\begin{array}{ccc|ccc}1 & 2 & 0 & 0 & 1 & 0\\0 & -5 & 0 & 1 & -4 & 0\\0 & 0 & 1 & 0 & 0 & 1\end{array}\right)$
Then we multiply $-\frac{1}{5}R_2$ to turn −5 into 1
$= \left(\begin{array}{ccc|ccc}1 & 2 & 0 & 0 & 1 & 0\\0 & 1 & 0 & -\frac{1}{5} & \frac{4}{5} & 0\\0 & 0 & 1 & 0 & 0 & 1\end{array}\right)$
Then to remove the 2 in the first row we need $R_1 - 2R_2$
$= \left(\begin{array}{ccc|ccc}1 & 0 & 0 & \frac{2}{5} & -\frac{3}{5} & 0\\0 & 1 & 0 & -\frac{1}{5} & \frac{4}{5} & 0\\0 & 0 & 1 & 0 & 0 & 1\end{array}\right)$
And now we have the identity matrix on the left and the inverse of our original matrix, $A^{-1}$, on the right, so $AA^{-1} = I$.
If the reduced row echelon form (rref) of the matrix has a row of zeros then the given matrix is non-invertible.
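We can sanity-check the row-reduction result by multiplying A against the inverse we found — the product should be the identity:

```python
from fractions import Fraction as F

A = [[4, 3, 0],
     [1, 2, 0],
     [0, 0, 1]]
Ainv = [[F(2, 5), F(-3, 5), 0],
        [F(-1, 5), F(4, 5), 0],
        [0, 0, 1]]

# (A * Ainv)_ij = sum over k of A_ik * Ainv_kj
product = [[sum(A[i][k] * Ainv[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
print(product == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # True
```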
Gaussian Elimination
$\left[\begin{array}{ccc|c}1 & 1 & 2 & 3\\-1 & 3 & -5 & 7\\2 & -2 & 7 & 1\end{array}\right]$
To get to row echelon form we would start by trying to turn $a_{21} = -1$ into $a_{21} = 0$.
And bang! That’s Gaussian elimination transforming the matrix into row echelon form.
But wait! We end up with a row whose coefficients are all zero but whose constant isn’t, so we call this linear system inconsistent, which really means that the planes (equations) do not all meet — there is no point common to all 3, meaning there is no solution!
Consistent systems are those where there is a common point between the equations. If we had $a_{33} = 1$ we could read z off directly (say z = −15), or if the entire bottom row were zeros (constant included) that would have worked too, bc z could then be any number, giving infinite solutions.
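The elimination on the system above can be sketched step by step; it ends in the contradictory row 0x + 0y + 0z = 5, which is exactly what “inconsistent” means (the `add_multiple` helper is mine):

```python
rows = [[1, 1, 2, 3],
        [-1, 3, -5, 7],
        [2, -2, 7, 1]]

def add_multiple(target, source, k):
    # return target + k * source, entry by entry
    return [t + k * s for t, s in zip(target, source)]

rows[1] = add_multiple(rows[1], rows[0], 1)    # R2 + R1  -> [0, 4, -3, 10]
rows[2] = add_multiple(rows[2], rows[0], -2)   # R3 - 2R1 -> [0, -4, 3, -5]
rows[2] = add_multiple(rows[2], rows[1], 1)    # R3 + R2  -> [0, 0, 0, 5]

print(rows[2])  # [0, 0, 0, 5]: all-zero coefficients but nonzero constant
```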
Modifying A Single Matrix Row
If we had
$= \left[\begin{array}{ccc|c}-1 & 2 & 3 & 0\\0 & -2 & -10 & 0\\0 & 0 & 0 & 0\end{array}\right]$
We want to turn −2 into 1. We can do this with $-\frac{1}{2}R_2$, meaning we flip the row’s sign and halve it.
$= \left[\begin{array}{ccc|c}-1 & 2 & 3 & 0\\0 & 1 & 5 & 0\\0 & 0 & 0 & 0\end{array}\right]$
Then we can multiply the top row by −1 to turn the −1 into 1
$= \left[\begin{array}{ccc|c}1 & -2 & -3 & 0\\0 & 1 & 5 & 0\\0 & 0 & 0 & 0\end{array}\right]$
This tells us y + 5z = 0, which gives y = −5z, which means z can be any number, thus resulting in infinite solutions.
We can say this matrix is $(A \mid O)$, meaning the | splits the left and right sides: the left side is the coefficient matrix A and O is the zero column.
Since all the constants on the far right are zero we say this linear system is homogeneous — the right-hand sides are all zero. In non-square matrices there will be a column with no leading 1: reduced row echelon form puts 1s down a diagonal, but rectangles can’t complete that diagonal, so some variable is left without a pivot. We can think of the unknowns (x, y, z) relative to the 1s: if they don’t match up one-to-one then there are an infinite number of solutions.
Pivots
Before we get into vector form:
In each row, the first nonzero entry from the left (a 1 in rref) is called the pivot.
$\begin{array}{cccc|c}x & y & z & w & b\\\hline 1 & 1 & 0 & -10 & -9\\0 & 0 & 1 & -7 & -7\\0 & 0 & 0 & 0 & 0\end{array}$
See how we have a second 1 in row 1, in y’s column, right after the pivot at $a_{11}$? That 1 isn’t a pivot because it isn’t the first entry of its row, and so the y column is actually a free variable.
Each pivot indicates that’s the column variable you’re solving for. So the first row’s pivot at $a_{11}$ is in x’s column, meaning we’re solving for x.
So this equation would be $x = -y + 10w - 9$ bc we move the y and −10w terms to the right-hand side; the −9 from the augmented column stays as it is.
And since the y column has no pivot we say y can be any number, or y = y.
The next pivot is at $a_{23}$, in z’s column, giving $z = 7w - 7$.
Matrix To Vector Form
This augmented matrix in reduced row echelon form has 2 pivots: in column $x_1$ (at $a_{11}$) and column $x_2$ (at $a_{22}$).
Columns $x_3$ and $x_4$ have what we call free variables bc they have no pivots.
And so we want to turn this augmented matrix back into written equations by grabbing each row and isolating the pivot variable of its column $x_n$. We isolate it by moving all the other variables over to the b column with normal algebra. Let’s show what the equations look like pre-movement to the b column:
$\begin{cases}x_1 + 3x_3 - x_4 = -3\\x_2 + 2x_3 - x_4 = -2\end{cases}$
Then let’s move things over to the b column so each pivot variable is isolated.
The non-pivot columns hold free variables, meaning they can be whatever they want. So we give $x_3$ and $x_4$ their own parameter names, s and t, to make it easier for ourselves :)
$\begin{cases}x_1 = -3x_3 + x_4 - 3\\x_2 = -2x_3 + x_4 - 2\\x_3 = x_3 \text{ or } s\\x_4 = x_4 \text{ or } t\end{cases}$
And now we need to write them as one equation as a vector.
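Working that out with $x_3 = s$ and $x_4 = t$, the whole solution set becomes one vector equation:

```latex
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
= \begin{pmatrix} -3 \\ -2 \\ 0 \\ 0 \end{pmatrix}
+ s \begin{pmatrix} -3 \\ -2 \\ 1 \\ 0 \end{pmatrix}
+ t \begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \end{pmatrix}
```

Plugging s = t = 0 back into the original equations checks out: $x_1 + 3x_3 - x_4 = -3$ and $x_2 + 2x_3 - x_4 = -2$.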
Dot Product
The dot product is really just a short-hand way of saying do matrix multiplication (a row times a column) BUT it results in a real number scalar, not a vector!
If the dot product is zero then the vectors are perpendicular (right angle in 2D and 3D) or orthogonal (right angle in n > 3 dimensions)!
Vector Norm
The norm, or length, of a vector is written with the absolute-value sign doubled up: ||v||.
Let’s say we want to know the norm of vector $v = \begin{pmatrix}x\\y\end{pmatrix}$
$||v|| = \sqrt{x^2 + y^2}$
What about a vector of n dimensions, such as
$v = \begin{pmatrix}v_1\\\vdots\\v_n\end{pmatrix}$?
We would say
$||v|| = \sqrt{(v_1)^2 + \dots + (v_n)^2}$
But we can also write it as the square root of the dot product, because a vector dotted with itself squares and sums all its values
$||v|| = \sqrt{v \cdot v}$
This is called the Euclidean norm.
But why wouldn’t squaring v work?
$||v|| \neq v^2$
bc we don’t output a scalar! We have merely squared all the values in the vector, so it still remains a vector, not our desired scalar used for distance. We can’t have a “vector distance”!
If we wanted to find the distance between two vectors we would say
$d(u, v) = ||u - v||$
This is what GPS systems and satellites would use to detect their relative distances.
If we had
$||u||^2$
Then to evaluate it we know
$||u|| = \sqrt{u \cdot u}$
Therefore, squaring removes the square root of the dot product
$||u||^2 = u \cdot u$
Norm Difference
$||u - v|| = ||(7, -2) - (-5, 3)|| = ||(12, -5)|| = \sqrt{12^2 + (-5)^2} = \sqrt{169} = 13$
Mental model for the norm.
Let’s say $u = (2, -7)^T$
How would we think about this?
$\left|\left|\frac{1}{||u||}u\right|\right|$
First we would want to solve for $||u||$
$||u|| = \sqrt{u \cdot u} = \sqrt{2^2 + (-7)^2} = \sqrt{53}$
And so we have
$\frac{1}{\sqrt{53}}\begin{pmatrix}2\\-7\end{pmatrix}$
And so we simply want to find the norm of
$\left|\left|\frac{1}{\sqrt{53}}\begin{pmatrix}2\\-7\end{pmatrix}\right|\right|$
Which, when thinking of the dot product, is
$= \sqrt{\frac{1}{\sqrt{53}}\begin{pmatrix}2\\-7\end{pmatrix} \cdot \frac{1}{\sqrt{53}}\begin{pmatrix}2\\-7\end{pmatrix}}$
And using the commutative property
$= \sqrt{\frac{1}{\sqrt{53}} \cdot \frac{1}{\sqrt{53}}\left(\begin{pmatrix}2\\-7\end{pmatrix} \cdot \begin{pmatrix}2\\-7\end{pmatrix}\right)}$
$\frac{1}{\sqrt{53}} \cdot \frac{1}{\sqrt{53}} = \frac{1}{53}$ bc $\sqrt{53} \cdot \sqrt{53} = 53$, and the dot product squares and sums the entries
$= \sqrt{\frac{1}{53}\left(2^2 + (-7)^2\right)}$
Simplify
$= \sqrt{\frac{1}{53}(53)} = \sqrt{1} = 1$
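The whole normalisation above can be sketched numerically: divide u by its norm, then check the result has length 1 (up to floating-point error):

```python
import math

u = (2, -7)
norm = math.sqrt(sum(c * c for c in u))   # sqrt(53)
u_hat = tuple(c / norm for c in u)        # the normalised vector

# the squared length of u_hat should come out to 1
print(math.isclose(sum(c * c for c in u_hat), 1.0))  # True
```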
Scalar Modulus
For a scalar k we define the modulus of k, denoted $|k|$, as
$|k| = \sqrt{k^2}$
So then we can say
$||ku|| = \sqrt{ku \cdot ku} = \sqrt{k^2(u \cdot u)} = \sqrt{k^2}\sqrt{u \cdot u} = |k|\,||u||$
Normalising Vectors
The process of finding a unit vector in the direction of the given vector u is called normalising.
The unit vector in the direction of the vector u is normally denoted by $\hat{u}$ (pronounced ‘u hat’), meaning it is a vector of length 1, that is
$\hat{u} = \frac{1}{||u||}u$