Thursday, June 23, 2005

Duality and Tensors

A little while ago, I wrote about the idea of dual spaces, and how the idea of a dual vector could be represented directly in languages like LISP and Lua, where functions can be treated like any other sort of data. If you remember that entry (or if your page-down button works), you'll know that for any real vector space V, the dual space V* is the space of all linear functions from V into the real numbers R. We can also talk about the double dual space V**, which is the space of linear functionals on V*. There is a natural way to embed the original space V into the double dual space V** by means of the evaluate-at operation: if v is in V and f is in V*, then eval-at[v] in V** is defined by

eval-at[v](f) = f(v).
(Here, and for the rest of this entry, I will use square brackets for the arguments to a function which produces another function.) For most spaces of interest, including all finite-dimensional spaces, the eval-at map is one-to-one and onto. That is, the two spaces are isomorphic: every element of V** can be represented as evaluation at some element of V. This property is called reflexivity.
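In case that earlier entry has scrolled out of reach, here is a minimal sketch of the idea in Lua; the names below (eval_at, f, and friends) are just illustrative choices for this post, not code lifted from anywhere.

    -- Vectors are tables of numbers; dual vectors are Lua functions
    -- that map such tables to numbers.  (Names here are illustrative.)
    local v = {1, 2, 3}

    -- A sample dual vector: f(x) = 2*x[1] - x[3].
    local f = function(x) return 2*x[1] - x[3] end

    -- eval_at sends a vector v to the double-dual element eval_at(v),
    -- which takes a functional f and returns f(v).
    local function eval_at(v)
      return function(f) return f(v) end
    end

    print(f(v))          --> -1
    print(eval_at(v)(f)) --> -1, the same number, as the definition demands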

If you like Dirac notation, the vectors are kets and the dual vectors are bras. If you don't like Dirac notation, you may here take a moment to chortle in amusement at the notion that theoretical physicists spend a lot of time staring at bras. You've heard the joke about the state vector for Schrödinger's cat, right? The cat in the ket? Anyhow, if you prefer to think about representations of vectors as rows and columns of numbers, then you might write the components of an ordinary vector as a column, and those of a dual vector as a row.

So: I have vectors, and I have dual vectors, and I have double-dual vectors which are identified with ordinary vectors again. All I've done is to take something I know about, a vector space, and look at some special type of functions whose domain is that space. All I've done, I say, as if the idea of going from the examination of structures to the examination of functions on those structures wasn't one of the most pervasive and subtle ideas of modern algebra! So far, though, I've only looked at functions of one variable. What happens if I start doing the same manipulations with functions of more than one variable?

What happens is that I get multilinear algebra, sometimes known as tensor algebra. A linear functional from V into R is a dual vector, or a type of rank-one tensor. A multilinear function from V cross V into R (that is, a function which is linear in each slot independently) is a type of rank-two tensor. More generally, I can talk about multilinear functions mapping a list of vectors and dual vectors into R; these are tensors. The rank of a tensor is just the number of arguments (there is a completely different concept of rank that also comes up in tensor analysis, so if this doesn't match what you thought rank was, relax).
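Sticking with the Lua picture, here is a minimal sketch of a rank-two tensor on V times V as a function of two vectors; make_bilinear is a helper name I've invented for the occasion.

    -- A rank-two tensor on V x V: a function of two vectors, linear in
    -- each slot separately.  Built here from a coefficient matrix M.
    -- (make_bilinear is an illustrative name, not a standard function.)
    local function make_bilinear(M)
      return function(u, v)
        local total = 0
        for i = 1, #M do
          for j = 1, #M[i] do
            total = total + M[i][j] * u[i] * v[j]
          end
        end
        return total
      end
    end

    -- The ordinary dot product on R^2 is the rank-two tensor built from
    -- the identity matrix.
    local dot = make_bilinear({{1, 0}, {0, 1}})
    print(dot({3, 4}, {1, 2}))  --> 3*1 + 4*2 = 11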

What happens if we think about a multi-linear function one variable at a time? What happens is that we see very familiar things. Suppose I have a tensor A which maps V* times V into R. What happens if I fix the first component? That is, what if I define

g[u](v) = A(u,v)?
The function g[u] is a linear function from V into R; that is, g[u] is in V*. Similarly, suppose I have another tensor B on V* times V; this time I can freeze the second component to get a function in V**:
h[w](v) = B(v,w).
Now I have something in the vector space V* (namely g[u]) and something in its dual space V** (namely h[w]); my knee-jerk response is to evaluate the one at the other. That is, can I say anything interesting about the function from V* times V into R defined by
C(u,w) = h[w](g[u])?
Yes, I can say something very interesting: C is again a rank-two tensor on V* times V, exactly the same type of object as A and B (the V** that appeared along the way gets folded back into V by the natural isomorphism). We call C a contraction of A and B. You have seen this before, though you may not recognize it as written here. A, B, and C can be viewed as linear operators from V into V; for example, we have
linear-op[B](w) = eval-at^-1(h[w]).
Viewed this way, linear-op[C] is the composition of linear-op[A] and linear-op[B]. If you prefer to choose a basis and think in terms of coordinates, matrix(C) = matrix(A)*matrix(B).
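Here is the whole construction written out in the same Lua conventions as the earlier sketches: vectors are tables, dual vectors are functions, and a rank-two tensor on V* times V is a function of a dual vector and a vector. The helper names (apply, tensor_from_matrix, and so on) are mine, and the two matrices are chosen arbitrarily so that the final check has some numbers in it.

    -- Multiply a matrix by a vector: the linear operator hiding inside
    -- a rank-two tensor on V* x V.
    local function apply(M, v)
      local out = {}
      for i = 1, #M do
        out[i] = 0
        for j = 1, #v do out[i] = out[i] + M[i][j] * v[j] end
      end
      return out
    end

    -- Build the tensor A(u, v) = u(M v) from a matrix M.
    local function tensor_from_matrix(M)
      return function(u, v) return u(apply(M, v)) end
    end

    local A = tensor_from_matrix({{1, 2}, {3, 4}})
    local B = tensor_from_matrix({{0, 1}, {1, 0}})

    -- Freezing a slot is just a closure: g(u) lands in V*, h(w) in V**.
    local function g(u) return function(v) return A(u, v) end end
    local function h(w) return function(u) return B(u, w) end end

    -- The contraction C(u, w) = h[w](g[u]).
    local function C(u, w) return h(w)(g(u)) end

    -- matrix(A)*matrix(B) = {{2, 1}, {4, 3}}, so C(e1_star, e1) should
    -- be the (1,1) entry, namely 2.
    local e1_star = function(x) return x[1] end
    print(C(e1_star, {1, 0}))  --> 2

The closures g and h are doing exactly what the square-bracket notation was doing in the prose; partial evaluation is the only new trick, and the contraction falls out of it.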

The ideas that I've sketched above work for tensors with more than two components, and you can apply them repeatedly in order to talk about double contractions and so forth. You can even contract different components of a single tensor against each other. For a rank-two tensor (one with a slot for a dual vector and a slot for a vector), this self-contraction is called a trace -- in matrix terms, it can be thought of as the sum of the diagonal elements, or as the sum of the eigenvalues. One advantage to a coordinate-free development of tensor ideas is that statements like "the trace of a matrix is invariant under similarity transformations" become tautologies. Of course the trace doesn't depend on the coordinate system; you don't even need a choice of basis to define it! Besides taking traces, another common tensor operation is to raise or lower some index -- that is, to change a tensor whose kth argument is in V into a tensor whose kth argument is in V*, or vice-versa. If V has a dot product, there's a natural way to achieve this transformation: the dot product is a rank-two tensor on V times V, and lowering an index just involves contracting with this tensor (raising one involves contracting with its inverse). Or, if you prefer, you can use the dot product to define an isomorphism between V and V*, and raising or lowering the kth index just involves composing the tensor with this isomorphism (or its inverse) in the kth argument.
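Of these operations, lowering an index is the easiest to write down in the Lua picture. Here is a minimal sketch using the ordinary Euclidean dot product on R^3; dot and lower are names invented for this sketch.

    -- The Euclidean dot product, written componentwise for concreteness:
    -- a rank-two tensor on V x V.
    local function dot(v, w)
      local s = 0
      for i = 1, #v do s = s + v[i] * w[i] end
      return s
    end

    -- lower(v) is the dual vector "dot against v": contracting a vector
    -- with the dot-product tensor turns it into an element of V*.
    local function lower(v)
      return function(w) return dot(v, w) end
    end

    local covector = lower({1, 2, 3})
    print(covector({1, 0, 0}))  --> 1
    print(covector({0, 1, 1}))  --> 5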

Isn't this fun? I could do it all day... and some days I do. And all this beautiful mess rests on a single idea: the idea of a dual space. The difficulty of tensor algebra is not that the basic idea is difficult: it's a deep idea, but ultimately not a complex idea. The real difficulty of tensor analysis is notation. We usually write from left to right, and we have a simple mental model of the notation for function evaluation: inputs come in on the right, and results come out on the left. If you're comfortable with linear algebraic notation, you might not even bother with any special symbol to indicate that operations are composed: you might just write the composition of operators A and B as AB. We could draw a graph with arrows pointing from inputs to outputs, and it would form a straight line; for more complicated problems in which functions can have multiple inputs, it would form a tree (a parse tree for the expression, if you like). When we introduce the idea of duality, this mental picture gets confused: if I have a functional f and a vector v, should I draw an arrow to indicate that the vector is an input to the functional, or (using eval-at) should I draw an arrow to indicate that the functional is an input to the vector? When I introduce higher-rank tensors, things get even more complicated. If I were to draw a graph to denote the flow of information in a general tensor contraction, I would not only have no idea which direction the arrows should point, but I would have to allow the arrows to form a topology much more general than a tree (if done in the obvious way).

So how do you actually compute with tensors, when it's so difficult to denote what's acting on what? You give labels to the arguments, and give rules for how the labels connect to each other. Usually this is done at the same time one chooses a basis: the result is indicial algebra, which is very natural for computation, but which hides the real nature of tensor algebra (and wreaks havoc on one's eyesight). Another way to denote how tensor contractions work is to draw a graph of how arguments are attached to each other by contraction; this approach gives you tensor diagrams. Perhaps unsurprisingly, I learned about tensor diagrams from a computer scientist, not a mechanician or a mathematician: if we can turn a problem into a graph operation, we probably will.
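For comparison, here is the contraction from a few paragraphs back dressed in its indicial clothes, which is roughly what a computation looks like once a basis is fixed. The name contract_indices is mine; the function is nothing but matrix multiplication spelled out index by index.

    -- C[i][j] = sum over k of A[i][k] * B[k][j]: indicial algebra in code.
    -- (contract_indices is an illustrative name.)
    local function contract_indices(A, B)
      local C = {}
      for i = 1, #A do
        C[i] = {}
        for j = 1, #B[1] do
          local s = 0
          for k = 1, #B do s = s + A[i][k] * B[k][j] end
          C[i][j] = s
        end
      end
      return C
    end

    local C = contract_indices({{1, 2}, {3, 4}}, {{0, 1}, {1, 0}})
    print(C[1][1], C[1][2], C[2][1], C[2][2])  --> 2  1  4  3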

The other thing that I think causes immense confusion is that tensor algebra is often described only briefly, as a first step toward tensor calculus. That next step, too, looks and sounds much more complicated than it really is. But this is a topic for another day.