A little while ago, I wrote about the idea of dual spaces, and how
the idea of a dual vector could be represented directly in languages
like LISP and Lua, where functions can be treated like any other
sort of data. If you remember that entry (or if your page down
button works), you'll know that for any real vector space V the
dual space V* is the space of all linear
functions from V into the real numbers R. We can also talk
about the double dual space V**, which is the space of
linear functionals on V*. There is a natural way to
embed the original space V into the double dual space V**
by means of the evaluate-at
operation: if v is in V and f is
in V*, then eval-at[v] in V** is defined by eval-at[v](f) = f(v).
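Since the whole point of the earlier entry was that a dual vector can
literally be a function in a programming language, here is the same idea
again as a little Python sketch (Python passes functions around as data
just as happily as Lisp or Lua does); the names dual_from_row and
eval_at are mine, invented for this example:

    # Vectors are tuples of numbers; dual vectors are functions that map
    # vectors to numbers.  The names and numbers are just for illustration.

    def dual_from_row(row):
        """A linear functional f(v) = sum_i row[i] * v[i]."""
        return lambda v: sum(r * x for r, x in zip(row, v))

    def eval_at(v):
        """Embed v into V**: the result is a functional that eats functionals."""
        return lambda f: f(v)

    v = (1.0, 2.0, 3.0)
    f = dual_from_row((10.0, 0.0, -1.0))

    print(f(v))           # 7.0 -- the dual vector acting on the vector
    print(eval_at(v)(f))  # 7.0 -- the double-dual vector acting on the dual vector

Note that eval_at(v) does nothing at all except wait for a functional to
act on, which is exactly what makes the embedding of V into V** so natural.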
If you like Dirac notation, the vectors are kets and the dual vectors are bras. If you don't like Dirac notation, you may here take a moment to chortle in amusement at the notion that theoretical physicists spend a lot of time staring at bras. You've heard the joke about the wave vector for Schrodinger's cat, right? The cat in the ket? Anyhow, if you prefer to think about representations of vectors as rows and columns of numbers, then you might write the components of an ordinary vector as a column, and those of a dual vector as a row.
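If you do prefer the rows-and-columns picture, the pairing of a dual
vector with a vector is just a row times a column; here is the same
pairing as above in numpy (the numbers are arbitrary):

    import numpy as np

    v = np.array([[1.0], [2.0], [3.0]])   # ordinary vector: a column
    f = np.array([[10.0, 0.0, -1.0]])     # dual vector: a row

    print(f @ v)   # [[7.]] -- a 1-by-1 matrix, which is to say a number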
So: I have vectors, and I have dual vectors, and I have double-dual vectors which are identified with ordinary vectors again. All I've done is to take something I know about, a vector space, and look at a special type of function whose domain is that space. All I've done, I say, as if the idea of going from the examination of structures to the examination of functions on those structures wasn't one of the most pervasive and subtle ideas of modern algebra! So far, though, I've only looked at functions of one variable. What happens if I start doing the same manipulations with functions of more than one variable?
What happens is that I get multilinear algebra, sometimes known as
tensor algebra. A linear functional from V into R is just a dual
vector; a multilinear functional, one which takes several arguments
from V or V* and is linear in each argument when the others are held
fixed, is a tensor.
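In code, then, a tensor is nothing more than a function of several
vector (and dual vector) arguments that is linear in each argument
separately. Here is a sketch of a rank two tensor on V times V, with
coefficients I made up on the spot:

    # A rank-two tensor on V x V: linear in each argument when the other
    # argument is held fixed.  The coefficients are arbitrary.
    A = [[1.0, 2.0],
         [3.0, 4.0]]

    def tensor_A(u, v):
        """A(u, v) = sum_ij A[i][j] * u[i] * v[j]."""
        return sum(A[i][j] * u[i] * v[j]
                   for i in range(2) for j in range(2))

    u, v, w = (1.0, 0.0), (0.0, 1.0), (2.0, 5.0)

    # Linearity in the first slot: A(u + w, v) == A(u, v) + A(w, v).
    print(tensor_A((u[0] + w[0], u[1] + w[1]), v))   # 26.0
    print(tensor_A(u, v) + tensor_A(w, v))           # 26.0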
What happens if we think about a multilinear function one variable
at a time? What happens is that we see very familiar things.
Suppose I have a tensor A which maps V* times V into R. If I hold the
second argument fixed at some vector v, what's left is a linear
functional on V* -- which, by the identification above, is just a
vector in V again. So partially evaluating A at v sends vectors to
vectors: it is a linear operator, the thing I would ordinarily write
as a matrix, and the partial evaluation is an ordinary matrix-vector
product. Feeding in both arguments at once, A(f, v), is the sort of
thing I would write as a row times a matrix times a column. This
business of using up a pair of arguments by plugging a vector into a
dual-vector slot (or vice versa) is called a contraction.
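Here is that one-variable-at-a-time game as a sketch in code, with the
tensor's coefficients kept in a numpy array; representing dual vectors
as rows is just a convenience, and the numbers are arbitrary:

    import numpy as np

    M = np.array([[1.0, 2.0],
                  [3.0, 4.0]])    # coefficients of the tensor A

    def A(f_row, v_col):
        """A(f, v) = f * M * v, a number."""
        return float(f_row @ M @ v_col)

    def A_partial(v_col):
        """Fix the second slot.  The result eats a dual vector and returns
        a number -- that is, it is (identified with) the vector M @ v."""
        return lambda f_row: float(f_row @ (M @ v_col))

    f = np.array([1.0, -1.0])
    v = np.array([2.0, 5.0])

    print(A(f, v))          # -14.0
    print(A_partial(v)(f))  # -14.0; and M @ v is just a matrix-vector product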
The ideas that I've sketched above work for tensors with more than
two components, and you can apply them repeatedly in order to talk
about double contractions and so forth. You can even contract
different components of a single tensor against each other. For a
rank two tensor, this self-contraction is called a trace -- in
matrix terms, it can be thought of as a sum of the diagonal
elements, or as a sum of eigenvalues. One advantage to a
coordinate-free development of tensor ideas is that statements like
"the trace of a matrix is invariant under similarity
transformations of the matrix"
become tautologies. Of course the
trace doesn't depend on the coordinate system; you don't even need a
choice of basis to define it! Besides taking traces, another common
tensor operation is to raise or lower some index -- that is, change
a tensor whose kth argument is in V into a tensor whose kth argument
is in V*, or vice-versa. If V has a dot product, there's
a natural way to achieve this transformation: the dot product is a
rank two tensor on V times V, and raising and lowering indices just
involves contracting with this tensor. Or, if you prefer, you
can use the dot product to define an isomorphism between V and
V*, and raising or lowering the kth index just involves
composing the tensor with this isomorphism in the kth argument.
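If you want to see the coordinate version of those two operations, here
is a quick numerical check with numpy; the matrix A, the change of basis
S, and the inner product g are all made up:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3))   # a rank-two tensor, written in some basis
    S = rng.standard_normal((3, 3))   # a change of basis (invertible, almost surely)

    # "The trace is invariant under similarity transformations", numerically:
    print(np.trace(A), np.trace(np.linalg.inv(S) @ A @ S))   # agree to roundoff

    # Lowering the first index is contraction with the inner product g;
    # raising it again is contraction with the inverse of g.
    g = np.diag([1.0, 2.0, 3.0])
    A_lowered = g @ A
    print(np.allclose(A, np.linalg.inv(g) @ A_lowered))      # True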
Isn't this fun? I could do it all day... and some days I do. And all this beautiful mess rests on a single idea: the idea of a dual space. The difficulty of tensor algebra is not that the basic idea is difficult: it's a deep idea, but ultimately not a complex idea. The real difficulty of tensor analysis is notation. We usually write from left to right, and we have a simple mental model of the notation for function evaluation: inputs come in on the right, and go out on the left. If you're comfortable with linear algebraic notation, you might not even bother with any special symbol to indicate that operations are composed: you might just write the composition of operators A and B as AB. We could draw a graph with arrows pointing from inputs to outputs, and it would form a straight line; for more complicated problems in which functions can have multiple inputs, it would form a tree (a parse tree for the expression, if you like). When we introduce the idea of duality, this mental picture gets confused: if I have a functional f and a vector v, should I draw an arrow to indicate that the vector is an input to the functional, or (using eval-at) should I draw an arrow to indicate that the functional is an input to the vector? When I introduce higher-rank tensors things get even more complicated. If I were to draw a graph to denote the flow of information in a general tensor contraction, I would not only have no idea which direction the arrows should point, but I would have to allow the arrows to form a topology much more general than a tree (if done in the obvious way).
So how do you actually compute with tensors, when it's so difficult to denote what's acting on what? You give labels to the arguments, and give rules for how the labels connect to each other. Usually this is done at the same time one chooses a basis: the result is indicial algebra, which is very natural for computation, but which hides the real nature of tensor algebra (and wreaks havoc on one's eyesight). Another way to denote how tensor contractions work is to draw a graph of how arguments are attached to each other by contraction; this approach gives you tensor diagrams. Perhaps unsurprisingly, I learned about tensor diagrams from a computer scientist, not a mechanician or a mathematician: if we can turn a problem into a graph operation, we probably will.
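For what it's worth, numpy's einsum is about as close as everyday code
gets to indicial algebra: you label each argument's indices, and a label
that appears twice is summed over, which is to say contracted. A small
sketch with arbitrary arrays:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 3))
    B = rng.standard_normal((3, 3))
    v = rng.standard_normal(3)

    # A_ij v_j: contract the second index of A against v.
    print(np.allclose(np.einsum('ij,j->i', A, v), A @ v))      # True
    # A_ij B_jk: contract adjacent indices -- matrix multiplication.
    print(np.allclose(np.einsum('ij,jk->ik', A, B), A @ B))    # True
    # A_ii: the self-contraction of a rank-two tensor -- the trace.
    print(np.isclose(np.einsum('ii->', A), np.trace(A)))       # True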
The other thing that I think causes immense confusion is that tensor algebra is often described only briefly, as a first step toward describing tensor calculus. That step, too, looks and sounds much more complicated than it really is. But this is a topic for another day.