Section 2.2 Matrix multiplication and linear combinations
The previous section introduced vectors and linear combinations and demonstrated how they provide a way to think about linear systems geometrically. In particular, we saw that the vector \(\bvec\) is a linear combination of the vectors \(\vvec_1,\vvec_2,\ldots,\vvec_n\) precisely when the linear system corresponding to the augmented matrix \(\left[\begin{array}{rrrr|r} \vvec_1 \amp \vvec_2 \amp \ldots \amp \vvec_n \amp \bvec \end{array}\right]\) is consistent.
Our goal in this section is to introduce matrix multiplication, another algebraic operation that deepens the connection between linear systems and linear combinations.
Subsection 2.2.1 Scalar multiplication and addition of matrices
We first thought of a matrix as a rectangular array of numbers. If we say that the shape of a matrix is \(m\times n\text{,}\) we mean that it has \(m\) rows and \(n\) columns. For instance, the shape of the matrix below is \(3\times4\text{:}\)
This means that we may define scalar multiplication and matrix addition operations using the corresponding column-wise vector operations. For instance,
The matrix \(I_n\text{,}\) which we call the identity matrix, is the \(n\times n\) matrix whose entries are zero except for the diagonal entries, all of which are 1. For instance, \(I_3 = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{array}\right]\text{.}\)
As this preview activity shows, the operations of scalar multiplication and addition of matrices are natural extensions of their vector counterparts. Some care, however, is required when adding matrices. Since we need the same number of vectors to add and since those vectors must be of the same dimension, two matrices must have the same shape if we wish to form their sum.
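To make these operations concrete, here is a short Sage sketch; the two matrices are small examples of our own choosing, not ones from the text.
A = matrix([[1, 2], [3, 4]])    # a 2x2 matrix
B = matrix([[5, 0], [-1, 2]])   # another 2x2 matrix, so the sum A + B is defined
print(3*A)                      # scalar multiplication scales every column
print(A + B)                    # addition combines corresponding columns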
Subsection 2.2.2 Matrix-vector multiplication and linear combinations
A more important operation is matrix multiplication, which allows us to express linear systems compactly. We now introduce the product of a matrix and a vector with an example.
Because \(A\) has two columns, we need two weights to form a linear combination of those columns, which means that \(\xvec\) must have two components. In other words, the number of columns of \(A\) must equal the dimension of the vector \(\xvec\text{.}\)
Similarly, the columns of \(A\) are 3-dimensional so any linear combination of them is 3-dimensional as well. Therefore, \(A\xvec\) will be 3-dimensional.
The product of a matrix \(A\) and a vector \(\xvec\) is the linear combination of the columns of \(A\) using the components of \(\xvec\) as weights. More specifically, if \(A=\left[\begin{array}{rrrr} \vvec_1 \amp \vvec_2 \amp \ldots \amp \vvec_n \end{array}\right]\) and \(\xvec = \left[\begin{array}{c} c_1 \\ c_2 \\ \vdots \\ c_n \end{array}\right]\text{,}\) then \(A\xvec = c_1\vvec_1 + c_2\vvec_2 + \ldots + c_n\vvec_n\text{.}\)
If \(A\) is an \(m\times n\) matrix, then \(\xvec\) must be an \(n\)-dimensional vector, and the product \(A\xvec\) will be an \(m\)-dimensional vector.
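To see the definition in action, here is a minimal Sage sketch; the \(3\times2\) matrix is hypothetical, chosen only to match the shapes discussed above.
A = matrix([[-2, 3], [0, 2], [3, 1]])    # a hypothetical 3x2 matrix
x = vector([2, 3])                       # x is 2-dimensional since A has 2 columns
print(A*x)                               # the product is 3-dimensional
print(2*A.column(0) + 3*A.column(1))     # the same linear combination of the columns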
Suppose that \(I = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{array}\right]\) is the identity matrix and \(\xvec=\threevec{x_1}{x_2}{x_3}\text{.}\) Find the product \(I\xvec\) and explain why \(I\) is called the identity matrix.
Multiplication of a matrix \(A\) and a vector is defined as a linear combination of the columns of \(A\text{.}\) However, there is a shortcut for computing such a product. Let's look at our previous example and focus on the first row of the product.
To find the first component of the product, we consider the first row of the matrix. We then multiply the first entry in that row by the first component of the vector, the second entry by the second component of the vector, and so on, and add the results. In this way, we see that the third component of the product would be obtained from the third row of the matrix by computing \(2(3) + 3(1) = 9\text{.}\)
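In Sage, this shortcut is simply the dot product of each row with the vector; a quick check, again using a hypothetical matrix whose third row is \(\left[\begin{array}{rr} 3 \amp 1 \end{array}\right]\text{:}\)
A = matrix([[-2, 3], [0, 2], [3, 1]])   # hypothetical matrix; only its third row matters here
x = vector([2, 3])
print(A.row(2) * x)                     # third row dotted with x: 2(3) + 3(1) = 9
print((A*x)[2])                         # agrees with the third component of A*x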
You are encouraged to evaluate the product in Item a of the previous activity using this shortcut and compare the result to what you found while completing that activity.
Subsection 2.2.3 Matrix-vector multiplication and linear systems
So far, we have begun with a matrix \(A\) and a vector \(\xvec\) and formed their product \(A\xvec = \bvec\text{.}\) We would now like to turn this around: beginning with a matrix \(A\) and a vector \(\bvec\text{,}\) we will ask if we can find a vector \(\xvec\) such that \(A\xvec = \bvec\text{.}\) This will naturally lead back to linear systems.
To see the connection between the matrix equation \(A\xvec = \bvec\) and linear systems, let's write the matrix \(A\) in terms of its columns \(\vvec_i\) and \(\xvec\) in terms of its components.
We know that the matrix product \(A\xvec\) forms a linear combination of the columns of \(A\text{.}\) Therefore, the equation \(A\xvec = \bvec\) is merely a compact way of writing the equation for the weights \(c_i\text{:}\) \(c_1\vvec_1 + c_2\vvec_2 + \ldots + c_n\vvec_n = \bvec\text{.}\)
We have seen this equation before: remember that Proposition 2.1.12 says that the solutions of this equation are the same as the solutions to the linear system whose augmented matrix is \(\left[\begin{array}{rrrr|r} \vvec_1 \amp \vvec_2 \amp \ldots \amp \vvec_n \amp \bvec \end{array}\right]\text{.}\)
If \(A=\left[\begin{array}{rrrr} \vvec_1\amp\vvec_2\amp\ldots\amp\vvec_n \end{array}\right]\) and \(\xvec=\left[\begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_n \end{array}\right]\text{,}\) then the following statements are equivalent.
The vector \(\xvec\) satisfies the equation \(A\xvec = \bvec\text{.}\)
The vector \(\bvec\) is a linear combination of the columns of \(A\) with weights \(x_i\text{:}\) \(x_1\vvec_1 + x_2\vvec_2 + \ldots + x_n\vvec_n = \bvec\text{.}\)
The components of \(\xvec\) form a solution to the linear system corresponding to the augmented matrix \(\left[\begin{array}{rrrr|r} \vvec_1 \amp \vvec_2 \amp \ldots \amp \vvec_n \amp \bvec \end{array}\right]\text{.}\)
The equation \(A\xvec = \bvec\) gives a notationally compact way to write a linear system. Moreover, this notation will allow us to focus on important features of the system that determine its solution space.
Since we originally asked to describe the solutions to the equation \(A\xvec = \bvec\text{,}\) we will express the solution in terms of the vector \(\xvec\text{:}\)
This shows that the solutions \(\xvec\) may be written in the form \(\vvec + x_3\wvec\text{,}\) for appropriate vectors \(\vvec\) and \(\wvec\text{.}\) Geometrically, the solution space is a line in \(\real^3\) through \(\vvec\) moving parallel to \(\wvec\text{.}\)
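Sage can carry out this kind of description as well; a minimal sketch with a hypothetical consistent system, using the solve_right method for one particular solution and the reduced row echelon form of the augmented matrix to read off the full solution space:
A = matrix([[1, 0, 2], [0, 1, -1]])   # a hypothetical 2x3 matrix
b = vector([3, 4])
print(A.solve_right(b))               # one particular solution of Ax = b
print(A.augment(b).rref())            # rref of [A | b] describes every solution
Here \(x_3\) is a free variable, so the solutions again form a line in \(\real^3\text{.}\)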
If \(A\) and \(\bvec\) are as below, write the linear system corresponding to the equation \(A\xvec=\bvec\) and describe its solution space, using a parametric description if appropriate:
In this section, we have developed some algebraic operations on matrices with the aim of simplifying our description of linear systems. We now introduce a final operation, the product of two matrices, that will become important when we study linear transformations in Section 2.5.
It is important to note that we can only multiply matrices if the shapes of the matrices are compatible. More specifically, when constructing the product \(AB\text{,}\) the matrix \(A\) multiplies the columns of \(B\text{.}\) Therefore, the number of columns of \(A\) must equal the number of rows of \(B\text{.}\) When this condition is met, the number of rows of \(AB\) is the number of rows of \(A\text{,}\) and the number of columns of \(AB\) is the number of columns of \(B\text{.}\)
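As a small illustration of these shape rules, here is a Sage sketch with two matrices of our own choosing:
A = matrix([[1, 2, 0], [-1, 3, 1]])     # a 2x3 matrix
B = matrix([[1, 0], [2, 1], [0, -1]])   # a 3x2 matrix: columns of A match rows of B
print((A*B).dimensions())               # AB is 2x2: rows of A by columns of B
print((B*A).dimensions())               # BA is defined too, but it is 3x3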
Before computing, first explain why the shapes of \(A\) and \(B\) enable us to form the product \(AB\text{.}\) Then describe the shape of \(AB\text{.}\)
Sage can multiply matrices using the * operator. Define the matrices \(A\) and \(B\) in the Sage cell below and check your work by computing \(AB\text{.}\)
Are we able to form the matrix product \(BA\text{?}\) If so, use the Sage cell above to find \(BA\text{.}\) Is it generally true that \(AB = BA\text{?}\)
In this section, we have found an especially simple way to express linear systems using matrix multiplication.
If \(A\) is an \(m\times n\) matrix and \(\xvec\) an \(n\)-dimensional vector, then \(A\xvec\) is the linear combination of the columns of \(A\) using the components of \(\xvec\) as weights. The vector \(A\xvec\) is \(m\)-dimensional.
The solution space to the equation \(A\xvec = \bvec\) is the same as the solution space to the linear system corresponding to the augmented matrix \(\left[ \begin{array}{r|r} A \amp \bvec \end{array}\right]\text{.}\)
If \(A\) is an \(m\times n\) matrix and \(B\) is an \(n\times p\) matrix, we can form the product \(AB\text{,}\) which is an \(m\times p\) matrix whose columns are the products of \(A\) and the columns of \(B\text{.}\)
Suppose that \(A\) is a \(135\times2201\) matrix, and that \(\xvec\) is a vector. If \(A\xvec\) is defined, what is the dimension of \(\xvec\text{?}\) What is the dimension of \(A\xvec\text{?}\)
What is the product \(A\twovec{1}{0}\) in terms of \(\vvec_1\) and \(\vvec_2\text{?}\) What is the product \(A\twovec{0}{1}\text{?}\) What is the product \(A\twovec{2}{3}\text{?}\)
Suppose that the matrix \(A = \left[\begin{array}{rr} \vvec_1 \amp \vvec_2 \end{array}\right]\text{,}\) where \(\vvec_1\) and \(\vvec_2\) are shown in Figure 2.2.9.
The operations that we perform in Gaussian elimination can be accomplished using matrix multiplication. This observation is the basis of an important technique that we will investigate in a subsequent chapter.
Verify that \(SA\) is the matrix that results when the second row of \(A\) is scaled by a factor of 7. What matrix \(S\) would scale the third row by -3?
Verify that \(PA\) is the matrix that results from interchanging the first and second rows. What matrix \(P\) would interchange the first and third rows?
Verify that \(L_1A\) is the matrix that results from multiplying the first row of \(A\) by \(-2\) and adding it to the second row. What matrix \(L_2\) would multiply the first row by 3 and add it to the third row?
When we performed Gaussian elimination, our first goal was to perform row operations that brought the matrix into a triangular form. For our matrix \(A\text{,}\) find the row operations needed to find a row equivalent matrix \(U\) in triangular form. By expressing these row operations in terms of matrix multiplication, find a matrix \(L\) such that \(LA = U\text{.}\)
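Because the activity's matrix \(A\) is not reproduced here, the sketch below uses a hypothetical \(3\times3\) matrix to illustrate how each of the matrices described above acts by multiplication:
A = matrix([[1, 2, 1], [2, 0, -2], [-1, 2, -1]])   # a hypothetical 3x3 matrix
S = matrix([[1, 0, 0], [0, 7, 0], [0, 0, 1]])      # scales the second row by 7
P = matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]])      # interchanges the first and second rows
L1 = matrix([[1, 0, 0], [-2, 1, 0], [0, 0, 1]])    # multiplies row 1 by -2 and adds it to row 2
print(S*A)
print(P*A)
print(L1*A)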
In this exercise, you will construct the inverse of a matrix, a subject that we will investigate more fully in the next chapter. Suppose that \(A\) is the \(2\times2\) matrix:
Suppose that we want to solve the equation \(A\xvec = \bvec\text{.}\) We know how to do this using Gaussian elimination; let's use our matrix \(B\) to find a different way:
Consider the equation \(A\xvec = \twovec{5}{-2}\text{.}\) Find the solution in two different ways, first using Gaussian elimination and then as \(\xvec = B\bvec\text{,}\) and verify that you have found the same result.
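Since the specific matrices \(A\) and \(B\) are not reproduced here, the following minimal Sage sketch uses a hypothetical invertible \(2\times2\) matrix to compare the two approaches; in Sage, the matrix \(B\) can be obtained with the inverse method.
A = matrix([[1, 2], [1, 3]])   # a hypothetical 2x2 matrix
b = vector([5, -2])
B = A.inverse()                # the matrix B satisfying BA = I
print(A.solve_right(b))        # solve Ax = b directly
print(B*b)                     # the same solution, computed as x = Bb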
The solution space to the equation \(A\xvec = \bvec\) is equivalent to the solution space to the linear system whose augmented matrix is \(\left[\begin{array}{r|r} A \amp \bvec \end{array}\right]\text{.}\)
If a linear system of equations has 8 equations and 5 unknowns, then the shape of the matrix \(A\) in the corresponding equation \(A\xvec = \bvec\) is \(5\times8\text{.}\)
Suppose \(A=\left[\begin{array}{rrrr} \vvec_1 \amp \vvec_2 \amp \vvec_3 \amp \vvec_4 \end{array}\right]\text{.}\) Explain why every four-dimensional vector can be written as a linear combination of the vectors \(\vvec_1\text{,}\) \(\vvec_2\text{,}\) \(\vvec_3\text{,}\) and \(\vvec_4\) in exactly one way.
Describe the solution space to the homogeneous equation \(A\xvec = \zerovec\) using a parametric description, if appropriate. What does this solution space represent geometrically?
Describe the solution space to the equation \(A\xvec=\bvec\) where \(\bvec = \threevec{-3}{-4}{1}\text{.}\) What does this solution space represent geometrically and how does it compare to the previous solution space?
We will now explain the relationship between the previous two solution spaces. Suppose that \(\xvec_h\) is a solution to the homogeneous equation; that is, \(A\xvec_h=\zerovec\text{.}\) Suppose also that \(\xvec_p\) is a solution to the equation \(A\xvec = \bvec\text{;}\) that is, \(A\xvec_p=\bvec\text{.}\)
Use the Linearity Principle expressed in Proposition 2.2.3 to explain why \(\xvec_h+\xvec_p\) is a solution to the equation \(A\xvec = \bvec\text{.}\) You may do this by evaluating \(A(\xvec_h+\xvec_p)\text{.}\)
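For reference, the evaluation is a single application of linearity: \(A(\xvec_h+\xvec_p) = A\xvec_h + A\xvec_p = \zerovec + \bvec = \bvec\text{.}\)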
That is, if we find one solution \(\xvec_p\) to an equation \(A\xvec = \bvec\text{,}\) we may add any solution to the homogeneous equation to \(\xvec_p\) and still have a solution to the equation \(A\xvec = \bvec\text{.}\) In other words, the solution space to the equation \(A\xvec = \bvec\) is given by translating the solution space to the homogeneous equation by the vector \(\xvec_p\text{.}\)
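A minimal Sage sketch of this translation principle, using a hypothetical matrix with a nontrivial homogeneous solution space:
A = matrix([[1, 0, 1], [0, 1, 1], [1, -1, 0]])   # a hypothetical singular 3x3 matrix
b = vector([-3, -4, 1])                          # chosen so that Ax = b is consistent
xp = A.solve_right(b)                            # a particular solution
xh = A.right_kernel().basis()[0]                 # a solution to the homogeneous equation
print(A*(xp + 3*xh) == b)                        # translating xp by any homogeneous solution still solves Ax = b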
Suppose that a city is starting a bicycle sharing program with bicycles at locations \(B\) and \(C\text{.}\) Bicycles that are rented at one location may be returned to either location at the end of the day. Over time, the city finds that 80% of bicycles rented at location \(B\) are returned to \(B\) with the other 20% returned to \(C\text{.}\) Similarly, 50% of bicycles rented at location \(C\) are returned to \(B\) and 50% to \(C\text{.}\)
where \(B_k\) is the number of bicycles at location \(B\) at the beginning of day \(k\) and \(C_k\) is the number of bicycles at \(C\text{.}\) The information above tells us how to determine the distribution of bicycles the following day: \(B_{k+1} = 0.8B_k + 0.5C_k\) and \(C_{k+1} = 0.2B_k + 0.5C_k\text{,}\) which we may write in matrix form as \(\xvec_{k+1} = A\xvec_k\) where \(A = \left[\begin{array}{rr} 0.8 \amp 0.5 \\ 0.2 \amp 0.5 \end{array}\right]\text{.}\)
Suppose that there are 1000 bicycles at location \(B\) and none at \(C\) on day 1. This means we have \(\xvec_1 = \twovec{1000}{0}\text{.}\) Find the number of bicycles at both locations on day 2 by evaluating \(\xvec_2 = A\xvec_1\text{.}\)
Suppose that there are 1000 bicycles at location \(C\) and none at \(B\) on day 1. Form the vector \(\xvec_1\) and determine the number of bicycles at the two locations the next day by finding \(\xvec_2 = A\xvec_1\text{.}\)
Suppose that one day there are 1050 bicycles at location \(B\) and 450 at location \(C\text{.}\) How many bicycles were there at each location the previous day?
Suppose that there are 500 bicycles at location \(B\) and 500 at location \(C\) on Monday. How many bicycles are there at the two locations on Tuesday? on Wednesday? on Thursday?
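Though not part of the original activity, here is a minimal Sage sketch of this day-by-day iteration, with the matrix \(A\) assembled from the percentages given above:
A = matrix(QQ, [[8/10, 5/10], [2/10, 5/10]])   # columns record where bicycles from B and from C end up
x = vector(QQ, [500, 500])                     # Monday's distribution
for day in ["Tuesday", "Wednesday", "Thursday"]:
    x = A*x                                    # the next day's distribution is A times today's
    print(day, x)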
Suppose that \(\xvec_1 = c_1 \vvec_1 + c_2 \vvec_2\) where \(c_1\) and \(c_2\) are scalars. Use the Linearity Principle expressed in Proposition 2.2.3 to explain why
Suppose that there are initially 500 bicycles at location \(B\) and 500 at location \(C\text{.}\) Write the vector \(\xvec_1\) and find the scalars \(c_1\) and \(c_2\) such that \(\xvec_1=c_1\vvec_1 + c_2\vvec_2\text{.}\)