The past few sections introduced us to matrix-vector multiplication as a means of thinking geometrically about the solutions to a linear system. In particular, we rewrote a linear system as a matrix equation \(A\xvec = \bvec\) and developed the concepts of span and linear independence in response to our two fundamental questions.
In this section, we will explore how matrix-vector multiplication defines certain types of functions, which we call matrix transformations, similar to those encountered in previous algebra courses. In particular, we will develop some algebraic tools for thinking about matrix transformations and look at some motivating examples. In the next section, we will see how matrix transformations describe important geometric operations and how they are used in computer animation.
We will begin by considering a more familiar situation; namely, the function \(f(x) = x^2\text{,}\) which takes a real number \(x\) as an input and produces its square \(x^2\) as its output.
Remember that composing two functions means we use the output from one function as the input into the other; that is, \((g\circ h)(x) = g(h(x))\text{.}\) What function results from composing \((g\circ h)(x)\text{?}\)
In the preview activity, we considered familiar linear functions of a single variable, such as \(g(x) = 2x\text{.}\) We construct a function like this by choosing a number \(m\text{;}\) when given an input \(x\text{,}\) the output \(g(x) = mx\) is formed by multiplying \(x\) by \(m\text{.}\)
In this section, we will consider functions whose inputs are vectors and whose outputs are vectors defined through matrix-vector multiplication. That is, if \(A\) is a matrix and \(\xvec\) is a vector, the function \(T(\xvec) = A\xvec\) forms the product \(A\xvec\) as its output. Such a function is called a matrix transformation.
The matrix transformation associated to the matrix \(A\) is the function that assigns to the vector \(\xvec\) the vector \(A\xvec\text{;}\) that is, \(T(\xvec) = A\xvec\text{.}\)
Notice that the input to \(T\) is a two-dimensional vector \(\twovec{x_1}{x_2}\) and the output is a three-dimensional vector \(\cthreevec{3x_1 - 2x_2}{x_1 +
2x_2}{3x_2}\text{.}\) As a shorthand, we will write
If we describe this transformation as \(T:\real^n\to\real^m\text{,}\) what are the values of \(n\) and \(m\) and how do they relate to the shape of \(A\text{?}\)
If \(A\) is the matrix \(A=\left[\begin{array}{rr} \vvec_1 \amp \vvec_2
\end{array}\right]\text{,}\) what is \(T\left(\twovec10\right)\) in terms of the vectors \(\vvec_1\) and \(\vvec_2\text{?}\) What about \(T\left(\twovec{0}{1}\right)\text{?}\)
Let's discuss a few of the issues that appear in this activity. First, notice that the shape of the matrix \(A\) and the dimension of the input vector \(\xvec\) must be compatible if the product \(A\xvec\) is to be defined. In particular, if \(A\) is an \(m\times n\) matrix, \(\xvec\) needs to be an \(n\)-dimensional vector, and the resulting product \(A\xvec\) will be an \(m\)-dimensional vector. For the associated matrix transformation, we therefore write \(T:\real^n\to\real^m\) meaning \(T\) takes vectors in \(\real^n\) as inputs and produces vectors in \(\real^m\) as outputs. For instance, if
Second, we can often reconstruct the matrix \(A\) if we only know some output values from its associated matrix transformation \(T\) by remembering that matrix-vector multiplication constructs linear combinations. For instance, if \(A\) is an \(m\times2\) matrix \(A=\left[\begin{array}{rr} \vvec_1
\amp \vvec_2 \end{array}\right]\text{,}\) then
That is, we can find the first column of \(A\) by evaluating \(T\left(\twovec{1}{0}\right)\text{.}\) Similarly, the second column of \(A\) is found by evaluating \(T\left(\twovec{0}{1}\right)\text{.}\)
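The column-by-column reconstruction described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `T`, `matrix_from_transformation`, and `matvec` are hypothetical, and `T` implements the earlier example \(T\left(\twovec{x_1}{x_2}\right) = \cthreevec{3x_1 - 2x_2}{x_1 + 2x_2}{3x_2}\text{.}\)

```python
def T(x):
    """Example transformation from R^2 to R^3, taken from the text above."""
    x1, x2 = x
    return [3 * x1 - 2 * x2, x1 + 2 * x2, 3 * x2]


def matrix_from_transformation(transform, n):
    """Build the matrix of `transform` by evaluating it on e_1, ..., e_n.

    Assumes `transform` really is a matrix transformation on R^n.
    The matrix is returned as a list of rows.
    """
    columns = []
    for j in range(n):
        # e_j has a 1 in position j and 0 elsewhere.
        e_j = [1 if i == j else 0 for i in range(n)]
        # Column j of A is T(e_j).
        columns.append(transform(e_j))
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


def matvec(A, x):
    """Matrix-vector product A x, with A stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = matrix_from_transformation(T, 2)
print(A)                  # [[3, -2], [1, 2], [0, 3]]
print(matvec(A, [2, 5]))  # agrees with T([2, 5])
```

The matrix has three rows and two columns, which matches the fact that this \(T\) takes two-dimensional inputs and produces three-dimensional outputs.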
If \(T:\real^n\to\real^m\) is a matrix transformation given by \(T(\xvec) = A\xvec\text{,}\) then the matrix \(A\) has columns \(T(\evec_j)\text{;}\) that is,
Let's look at some examples and apply these observations.
To begin, suppose that \(T\) is the matrix transformation that takes a two-dimensional vector \(\xvec\) as an input and outputs \(T(\xvec)\text{,}\) the two-dimensional vector obtained by rotating \(\xvec\) counterclockwise by \(90^\circ\text{,}\) as shown in Figure 2.5.7.
Figure 2.5.7. The matrix transformation \(T\) takes two-dimensional vectors on the left and rotates them by \(90^\circ\) counterclockwise into the vectors on the right.
We will see in the next section that many geometric operations like this one can be performed by matrix transformations.
If we write \(T:\real^n\to\real^m\text{,}\) what are the values of \(m\) and \(n\text{,}\) and what is the shape of the associated matrix \(A\text{?}\)
If \(\vvec=\twovec{-2}{-1}\) as shown on the left in Figure 2.5.7, use your matrix to determine \(T(\vvec)\) and verify that it agrees with that shown on the right of Figure 2.5.7.
Suppose that we work for a company that makes baked goods, including cakes, doughnuts, and eclairs. The company operates two bakeries, Bakery 1 and Bakery 2. In one hour of operation,
Bakery 1 produces 10 cakes, 50 doughnuts, and 30 eclairs.
If Bakery 1 operates for \(x_1\) hours and Bakery 2 for \(x_2\) hours, we will use the vector \(\xvec=\twovec{x_1}{x_2}\) to describe the operation of the two bakeries.
We would like to describe a matrix transformation \(T\) where \(\xvec\) describes the number of hours the bakeries operate and \(T(\xvec)\) describes the total number of cakes, doughnuts, and eclairs produced. That is, \(T(\xvec) = \threevec{y_1}{y_2}{y_3}\) where \(y_1\) is the number of cakes, \(y_2\) is the number of doughnuts, and \(y_3\) is the number of eclairs produced.
If \(T:\real^n\to\real^m\text{,}\) what are the values of \(m\) and \(n\text{,}\) and what is the shape of the associated matrix \(A\text{?}\)
We can determine the matrix \(A\) using Proposition 2.5.6. For instance, \(T\left(\twovec10\right)\) will describe the number of cakes, doughnuts, and eclairs produced when Bakery 1 operates for one hour and Bakery 2 sits idle. What is this vector?
Suppose that the company receives an order for a certain number of cakes, doughnuts, and eclairs. Can you guarantee that you can fill the order without having leftovers?
In these examples, we glossed over an important point: how do we know these functions \(T:\real^n\to\real^m\) can be expressed as matrix transformations? We will take up this question in detail in the next section and not worry about it for now.
It sometimes happens that we want to combine matrix transformations by performing one and then another. In the last activity, for instance, we considered the matrix transformation where \(T(\xvec)\) is the result of rotating the two-dimensional vector \(\xvec\) by \(90^\circ\text{.}\) Now suppose we are interested in rotating that vector twice; that is, we take a vector \(\xvec\text{,}\) rotate it by \(90^\circ\) to obtain \(T(\xvec)\text{,}\) and then rotate the result by \(90^\circ\) again to obtain \(T(T(\xvec))\text{.}\)
This process is called function composition and likely appeared in an earlier algebra course. For instance, if \(g(x) = 2x + 1\) and \(h(x) = x^2\text{,}\) the composition of these functions obtained by first performing \(g\) and then performing \(h\) is denoted by
Composing matrix transformations is similar. Suppose that we have two matrix transformations, \(T:\real^n\to\real^m\) and \(S:\real^m\to\real^p\text{.}\) Their associated matrices will be denoted by \(A\) and \(B\) so that \(T(\xvec) =
A\xvec\) and \(S(\xvec) = B\xvec\text{.}\) If we apply \(T\) to a vector \(\xvec\) to obtain \(T(\xvec)\) and then apply \(S\) to the result, we have
Notice that this implies that the composition \((S\circ T)\) is itself a matrix transformation and that the associated matrix is the product \(BA\text{.}\)
If \(T:\real^n\to\real^m\) and \(S:\real^m\to\real^p\) are matrix transformations with associated matrices \(A\) and \(B\) respectively, then the composition \((S\circ
T)\) is also a matrix transformation whose associated matrix is the product \(BA\text{.}\)
Notice that the matrix transformations must be compatible if they are to be composed. In particular, the vector \(T(\xvec)\text{,}\) an \(m\)-dimensional vector, must be a suitable input vector for \(S\text{,}\) which means that the inputs to \(S\) must be \(m\)-dimensional. In fact, this is the same condition we need to form the product \(BA\) of their associated matrices, namely, that the number of columns of \(B\) is the same as the number of rows of \(A\text{.}\)
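As a quick numerical check of this statement, the following Python sketch compares applying \(T\) and then \(S\) with applying the single matrix \(BA\text{.}\) The matrices `A` and `B` and the helpers `matvec` and `matmul` are hypothetical choices made for illustration; they do not come from the text.

```python
def matvec(A, x):
    """Matrix-vector product A x, with A stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def matmul(B, A):
    """Matrix product BA; defined only if columns of B equal rows of A."""
    if len(B[0]) != len(A):
        raise ValueError("number of columns of B must equal number of rows of A")
    columns_of_A = list(zip(*A))
    return [[sum(b * a for b, a in zip(row, col)) for col in columns_of_A]
            for row in B]


# Hypothetical matrices: A is 3x2, so T maps R^2 to R^3;
# B is 2x3, so S maps R^3 to R^2.
A = [[1, 2],
     [0, 1],
     [3, -1]]
B = [[2, 0, 1],
     [1, -1, 0]]

x = [4, -3]
step_by_step = matvec(B, matvec(A, x))  # S(T(x))
all_at_once = matvec(matmul(B, A), x)   # (BA)x

print(step_by_step)  # [11, 1]
print(all_at_once)   # [11, 1]
```

With these shapes, `matmul(A, A)` raises an error because a \(3\times2\) matrix has two columns but three rows. This mirrors the fact that \(T\) cannot be composed with itself: its outputs are three-dimensional while its inputs must be two-dimensional.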
We will explore the composition of matrix transformations by revisiting the matrix transformations from Activity 2.5.3.
Let's begin with the matrix transformation \(T:\real^2\to\real^2\) that rotates a two-dimensional vector \(\xvec\) by \(90^\circ\) to produce \(T(\xvec)\text{.}\) We saw in the earlier activity that the associated matrix is \(A = \begin{bmatrix}
0 \amp -1 \\
1 \amp 0 \\
\end{bmatrix}
\text{.}\) Suppose that we compose this matrix transformation with itself to obtain \((T\circ T)(\xvec) =
T(T(\xvec))\text{,}\) which is the result of rotating \(\xvec\) by \(90^\circ\) twice.
What is the matrix associated to the composition \((T\circ T)\text{?}\)
Write the two-dimensional vector \((T\circ
T)\left(\twovec xy\right)\text{.}\) How might this vector be expressed in terms of scalar multiplication and why does this make sense geometrically?
In the previous activity, we imagined a company that operates two bakeries. We found the matrix transformation \(T:\real^2\to\real^3\) where \(T\left(\twovec{x_1}{x_2}\right)\) describes the number of cakes, doughnuts, and eclairs produced when Bakery 1 runs for \(x_1\) hours and Bakery 2 runs for \(x_2\) hours. The associated matrix is \(A = \begin{bmatrix}
10 \amp 20 \\
50 \amp 30 \\
30 \amp 30 \\
\end{bmatrix}
\text{.}\)
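To see how this matrix acts, the short Python sketch below forms \(T(\xvec)\) as a linear combination of the columns of \(A\text{.}\) The function name `production` and the sample operating times are hypothetical and chosen only for illustration.

```python
# Matrix from the text: column 1 is Bakery 1's hourly output,
# column 2 is Bakery 2's hourly output (cakes, doughnuts, eclairs).
A = [[10, 20],
     [50, 30],
     [30, 30]]


def production(hours):
    """Return [cakes, doughnuts, eclairs] for hours = [x1, x2].

    Computes x1 * (column 1 of A) + x2 * (column 2 of A).
    """
    x1, x2 = hours
    column_1 = [row[0] for row in A]
    column_2 = [row[1] for row in A]
    return [x1 * c1 + x2 * c2 for c1, c2 in zip(column_1, column_2)]


print(production([1, 0]))  # [10, 50, 30], the first column of A
print(production([2, 3]))  # [80, 190, 150]
```

Running only Bakery 1 for one hour returns the first column of \(A\text{,}\) which is the hourly production of Bakery 1 given earlier.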
We will describe a matrix transformation \(S:\real^3\to\real^2\) where \(S\left(\threevec{y_1}{y_2}{y_3}\right)\) is a two-dimensional vector describing the number of cups of flour and sugar required to make \(y_1\) cakes, \(y_2\) doughnuts, and \(y_3\) eclairs.
Use Proposition 2.5.6 to write the matrix \(B\) associated to the transformation \(S\text{.}\)
Suppose that Bakery 1 operates for 75 hours and Bakery 2 operates for 53 hours. How many cakes, doughnuts, and eclairs are produced? How many cups of flour and sugar are required?
Suppose we run a company that has two warehouses, which we will call \(P\) and \(Q\text{,}\) and a fleet of 1000 delivery trucks. Every morning, a delivery truck goes out from one of the warehouses and returns in the evening to one of the warehouses. It is observed that
70% of the trucks that leave \(P\) return to \(P\text{.}\) The other 30% return to \(Q\text{.}\)
The distribution of trucks is represented by the vector \(\xvec=\twovec{x_1}{x_2}\) when there are \(x_1\) trucks at location \(P\) and \(x_2\) trucks at \(Q\text{.}\) If \(\xvec\) describes the distribution of trucks in the morning, then the matrix transformation \(T(\xvec)\) will describe the distribution in the evening.
Suppose that all 1000 trucks begin the day at location \(P\) and none at \(Q\text{.}\) How many trucks are at each location that evening? Using our vector representation, what is \(T\left(\ctwovec{1000}{0}\right)\text{?}\)
In the same way, suppose that all 1000 trucks begin the day at location \(Q\) and none at \(P\text{.}\) How many trucks are at each location that evening? What is the result \(T\left(\ctwovec{0}{1000}\right)\) and what is \(T\left(\twovec01\right)\text{?}\)
Suppose that \(S\) is the matrix transformation that transforms the distribution of trucks \(\xvec\) one morning into the distribution of trucks in the morning one week (seven days) later. What is the matrix that defines the transformation \(S\text{?}\)
As we will see later, this type of situation occurs frequently. We have a vector \(\xvec\) that describes the state of some system; in this case, \(\xvec\) describes the distribution of trucks between the two locations at a particular time. Then there is a matrix transformation \(T(\xvec) = A\xvec\) that describes the state at some later time. We call \(\xvec\) the state vector and \(T\) the transition function, as it describes the transition of the state vector from one time to the next.
We call this situation where the state of a system evolves from one time to the next according to the rule \(\xvec_{k+1}=A\xvec_k\) a discrete dynamical system. In Chapter 4, we will develop a theory that enables us to make long-term predictions about the evolution of the state vector.
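The rule \(\xvec_{k+1}=A\xvec_k\) is easy to iterate by machine. The Python sketch below is a minimal illustration under stated assumptions: the first column of `A` uses the 70% and 30% figures given for trucks leaving \(P\text{,}\) while the second column (40% and 60%) is an assumed value, and the helper names `step` and `evolve` are hypothetical. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Column 1: trucks leaving P (70% return to P, 30% go to Q), from the text.
# Column 2: trucks leaving Q; these percentages are ASSUMED for illustration.
A = [[Fraction(7, 10), Fraction(4, 10)],
     [Fraction(3, 10), Fraction(6, 10)]]


def step(A, x):
    """Apply the transition function once: return A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def evolve(A, x0, k):
    """Return the list of states x_0, x_1, ..., x_k."""
    states = [x0]
    for _ in range(k):
        states.append(step(A, states[-1]))
    return states


states = evolve(A, [Fraction(1000), Fraction(0)], 2)
for day, state in enumerate(states):
    print(day, [int(v) for v in state])
# 0 [1000, 0]
# 1 [700, 300]
# 2 [610, 390]
```

Applying `step` twice gives the same result as multiplying by \(A^2\) once, which is why powers of \(A\) describe the state after several time steps.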
This section introduced matrix transformations, functions that are defined by matrix-vector multiplication, such as \(T(\xvec) = A\xvec\) for some matrix \(A\text{.}\)
If \(A\) is an \(m\times n\) matrix, then \(T:\real^n\to\real^m\text{.}\)
A discrete dynamical system consists of a state vector \(\xvec\) along with a transition function \(T(\xvec) =
A\xvec\) that describes how the state vector evolves from one time to the next. Powers of the matrix \(A\) determine the long-term behavior of the state vector.
If \(T:\real^n\to\real^m\text{,}\) what are the values of \(m\) and \(n\text{?}\) What values of \(m\) and \(n\) are appropriate for the transformation \(S\text{?}\)
Suppose \(T:\real^3\to\real^2\) is a matrix transformation with \(T(\evec_j) = \vvec_j\) where \(\vvec_1\text{,}\) \(\vvec_2\text{,}\) and \(\vvec_3\) are as shown in Figure 2.5.10.
In Example 2.5.5 and Example 2.5.4, we wrote matrix transformations in terms of the components of \(T(\xvec)\text{.}\) This exercise makes use of that form.
Let's return to the example in Activity 2.5.3 concerning the company that operates two bakeries. We used a matrix transformation with input \(\xvec\text{,}\) which recorded the amount of time the two bakeries operated, and output \(T(\xvec)\text{,}\) the number of cakes, doughnuts, and eclairs produced. The associated matrix is \(A=\begin{bmatrix}
10 \amp 20 \\
50 \amp 30 \\
30 \amp 30 \\
\end{bmatrix}
\text{.}\)
If \(\xvec=\twovec{x_1}{x_2}\text{,}\) write the output \(T(\xvec)\) as a three-dimensional vector in terms of \(x_1\) and \(x_2\text{.}\)
Suppose that a bicycle sharing program has two locations \(P\) and \(Q\text{.}\) Bicycles are rented from some location in the morning and returned to a location in the evening. Suppose that
60% of bicycles that begin at \(P\) in the morning are returned to \(P\) in the evening while the other 40% are returned to \(Q\text{.}\)
If \(x_1\) is the number of bicycles at location \(P\) and \(x_2\) the number at \(Q\) in the morning, write an expression for the number of bicycles at \(P\) in the evening.
If \(T:\real^2\to\real^3\) is a matrix transformation, then it is possible that every equation \(T(\xvec) = \bvec\) has a solution for every vector \(\bvec\text{.}\)
Suppose that \(x_1\text{,}\) \(x_2\text{,}\) and \(x_3\) record the amounts of time that the three plants are operated and that \(M\) and \(Y\) record the amount of milk and yogurt produced. If we write \(\xvec=\threevec{x_1}{x_2}{x_3}\) and \(\yvec = \twovec{M}{Y}\text{,}\) find the matrix \(A\) that defines the matrix transformation \(T(\xvec) =
\yvec\text{.}\)
If we write the vector \(\zvec = \twovec{E}{L}\) to record the required amounts of electricity \(E\) and labor \(L\text{,}\) find the matrix \(B\) that defines the matrix transformation \(S(\yvec) =
\zvec\text{.}\)
If \(\xvec = \threevec{30}{20}{10}\) describes the amounts of time that the three plants are operated, how much milk and yogurt is produced? How much electricity and labor are required?
Find the matrix \(C\) that describes the matrix transformation \(R(\xvec)=\zvec\) that gives the required amounts of electricity and labor when each plant is operated for an amount of time given by the vector \(\xvec\text{.}\)
Suppose that two species \(P\) and \(Q\) interact with one another and that we measure their populations every month. We record their populations in a state vector \(\xvec = \twovec{p}{q}\text{,}\) where \(p\) and \(q\) are the populations of \(P\) and \(Q\text{,}\) respectively. We observe that there is a matrix
such that the matrix transformation \(T(\xvec)=A\xvec\) is the transition function describing how the state vector evolves from month to month. We also observe that, at the beginning of July, the populations are described by the state vector \(\xvec=\twovec{1}{2}\text{.}\)
What will the populations be at the beginning of August?
We will record the number of present students \(p\) and the number of absent students \(a\) in a state vector \(\xvec=\twovec{p}{a}\) and note that this state vector evolves from one day to the next according to the transition function \(T:\real^2\to\real^2\text{.}\) On Tuesday, the state vector is \(\xvec=\ctwovec{1700}{100}\text{.}\)
Suppose we initially have 1000 students who are present and none absent. Find \(T\left(\ctwovec{1000}{0}\right)\text{.}\)