I'm writing the first draft of my thesis introduction and -- like anyone else -- I'm utterly frustrated by the task. Much of the middle of the thesis is written or partly written, but this is the first time I've seriously attempted to write an introduction. At least, this is the first time I've tried to write an introduction to my research as a coherent whole; I've certainly written introductions to pieces of my research. Whatever the case, I'm suffering writer's block, and this seems like a very good time to spend a few minutes on a tangent.
From a high level, I'm frustrated because I'm having a hard time explaining something which fundamentally isn't complicated. Parlett's book The Symmetric Eigenvalue Problem begins with the words
Vibrations are everywhere ... and so, too, are the eigenvalues (or frequencies) associated with them.
And so it is. There are typically two steps to finding the frequencies and mode shapes -- the eigenvalues and eigenvectors -- for a vibrating mechanical object. First, the continuous motion has to be approximated by something finite; and second, the frequencies of the finite problem must be computed numerically. My research concentration for the past couple of years has mostly dealt with different types of eigenvalue problems, infinite or finite, that have interesting structures. If you pay attention to that structure, you can find resonant frequencies more quickly and accurately than you could if you proceeded blindly; if you ignore too much problem structure, you'll often end up with no information at all! I often apply my methods to problems in microsystem (MEMS) engineering, and the special mathematical structures correspond to special physical structures in the devices; but the ideas apply much more broadly. As a computer scientist, I spend a lot of time not just in developing the mathematical theory needed to solve structured eigenvalue problems, but also in writing codes that implement the ideas. My thesis research, then, comes in three parts: there are the engineering problems that drive my development; there are the software packages I write to solve those problems; and there is the fundamental mathematics and physics framework that underpins both the software development and the interpretation of computed results.
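As a toy illustration of those two steps (and emphatically not the structured methods the thesis is about), here is a minimal sketch for a vibrating string with fixed ends, unit length, and unit wave speed. Everything in it is an assumption chosen for illustration: the string model, the finite-difference discretization, the Sturm-sequence bisection used for the finite problem, and the function names `count_below`, `kth_eigenvalue`, and `string_frequencies`.

```python
import math

def count_below(diag, off, x):
    """Count eigenvalues below x for a symmetric tridiagonal matrix.

    Uses the signs of the pivots of (T - x*I): the number of negative
    pivots equals the number of eigenvalues less than x.
    """
    count = 0
    d = 1.0
    for i, a in enumerate(diag):
        b2 = off[i - 1] ** 2 if i > 0 else 0.0
        d = (a - x) - b2 / d
        if d == 0.0:
            d = 1e-30  # nudge an exactly zero pivot so the next division is safe
        if d < 0.0:
            count += 1
    return count

def kth_eigenvalue(diag, off, k, lo, hi, tol=1e-10):
    """Bisect for the k-th smallest eigenvalue (k = 1, 2, ...) in [lo, hi]."""
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def string_frequencies(n, count):
    """Lowest `count` angular frequencies of a fixed-fixed string.

    Step 1 (make it finite): replace -u'' = lambda * u on (0, 1) by the
    second-difference matrix on n interior grid points.
    Step 2 (compute): find the smallest eigenvalues of that matrix and
    take square roots to get frequencies.
    """
    h = 1.0 / (n + 1)
    diag = [2.0 / h**2] * n
    off = [-1.0 / h**2] * (n - 1)
    upper = 4.0 / h**2  # Gershgorin bound on the largest eigenvalue
    return [math.sqrt(kth_eigenvalue(diag, off, k, 0.0, upper))
            for k in range(1, count + 1)]
```

For this model problem the exact frequencies are integer multiples of pi, so `string_frequencies(200, 3)` should return values close to pi, 2*pi, and 3*pi, with the discretization error growing for the higher modes. Bisection is only one of many ways to handle the finite problem; it is used here because it fits in a few lines without any libraries.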
That's a reasonable nutshell description. But those sentences could
describe a lot of possible theses. As always, the devil is
in the details. What do I mean by
structure? I mean certain symmetries of the devices, either
in the geometry of the domain or in the structure of the governing
equations. I mean continuous dependence on parameters, so that a
single computation can be interpreted in the context of a family of
very similar computations. I mean the existence of different
physical effects -- like elasticity and heat flow -- which interact,
but have very different time scales. I mean having a nice partial
differential equation with not-so-nice boundary conditions (where
"nice" and "not-so-nice" are obviously stand-ins for more
complicated technical conditions). These structures are more subtle
than, say, self-adjointness (a very strong sort of symmetry
condition which the equations of both classical and quantum
mechanics often have). And what do I mean when I say I apply my
methods to problems in MEMS engineering? MEMS is a huge field: I've really
been working specifically on studying very high frequency (high
MHz-GHz) mechanical resonators of the type that might one day end up
in your cell phone -- which already uses a mechanical filter to
process signals, by the way. And as far as software goes, I write
my own codes for solving PDEs using finite elements and spectral
elements. I write my own implementations for my new numerical
ideas, and otherwise I try to use other people's codes as much as
possible -- which presents software interfacing problems which are
of both academic and practical interest.
The trouble I'm having is essentially one of balance. How do I
convey both the strategy and the tactics of my work? In the
introduction, the strategy is key -- but the interplay between
strategy and tactics is, in my not-so-humble opinion, what makes or
breaks any sort of numerical computation. One of my favorite
elementary numerics books is Numerical Methods That (Usually)
Work by Forman Acton. The book is bound in red cloth, and the
title is printed on the front in a metallic shade -- except the word
Usually, which is indented into the cover but not otherwise
filled in, making it impossible to see unless you look very closely.
Chapter 7 of Acton's book bears the title
Tactics, and it begins with the following words:
When we first approach a numerical problem we must take care lest we become immersed too soon in the detail without having thought through clearly the general strategy of our attack. There is, of course, the alternative danger that we will concern ourselves exclusively with the grand design while ignoring inconvenient details that turn out to be decisive -- thereby wrecking our plans with a thoroughness that borders on the catastrophic. The second danger, however, seems less severe than the first, against which we must accordingly warn more loudly.
Exactly! Without really understanding the engineering application,
I would probably end up solving the wrong problem (even if it was
the problem that was originally posed to me). I could easily pitch
this as a thesis about computer-aided design of MEMS; I could pitch
it as a computational mechanics thesis; or I could pitch this as a
thesis about numerical linear algebra; or I could pitch it as a
thesis about numerical software engineering. None of these global
views matches how I think about these problems, though. The
different results are independently interesting, but I know people
who are more competent than me in each of those areas
independently. The thing that excites me about this work -- and
about most of the problems I work on, even as
-- is that they are interesting across multiple disciplines. Don't
get me wrong: I think it's a lot of fun to prove theorems for
purely aesthetic purposes, and I've been known to indulge myself in
such at times; and while I'm not a strong designer and will never be
known for the widgets I've dreamed up, I can appreciate people who
do pure engineering design work. But I dip a little
into a lot of things -- that's simply how I make progress on
So now I'm supposed to write an introduction which is comprehensible
to a general audience in my field. But which field is it,
exactly? Can I assume my readers will know something about MEMS,
computational mechanics, PDE theory and functional analysis, linear
algebra, numerical analysis, and software engineering? Many people
-- maybe most -- who would find this work interesting are perhaps
more expert than I in at least one or two of those areas, but not
I am, of course, far from alone in facing this difficulty. Besides
Acton's book, I brought home with me two other books that I admire
for the way they combine mathematically deep ideas,
physically-guided intuition, and a generally accessible
presentation. Those books are Applied Analysis by Lanczos
and Chebyshev and Fourier Spectral Methods by Boyd. I'm
still on a
reader's honeymoon with that last book, and I
don't know if my current admiration will weather the test of time as
my admiration for the books of Lanczos and Acton has -- but I
suspect it will. Whether or not I continue to refer to Boyd's book
regularly, though, I think I will continue to agree with these
sentiments from his introduction:
It is not that spectral concepts are difficult, but rather that they link together as the components of an intellectual and computational ecology. Those who come to the course with no previous adventures in numerical analysis will be like urban children abandoned in the wilderness. Such innocents will learn far more than hardened veterans of the arithmurgical wars, but emerge from the forests with a lot more bruises.
I have also tried, not as successfully as I would have wished, to stress the importance of analyzing the physics of the problem before, during, and after computation. This is partly a reflection of my own scientific style: like a sort of mathematical guerrilla, I have ambushed problems with Pade approximants and perturbative derivatives of the KdV equations as well as with Chebyshev polynomials; numerical papers are only half my published articles.
However, there is a deeper reason: the numerical agenda is always set by the physics. The geometry, the boundary layers and fronts, and the symmetries are the topography of the computation. He or she who would scale Mt. Everest is well-advised to scout the passes before beginning the climb.
Perhaps my audience is what Lanczos would call
parexic analysts. Parexic analysis, in Lanczos's parlance, is the range
of analysis techniques that lie between pure analysis (analysis of
infinite processes) and numerical analysis (analysis of numerical
algorithms). Lanczos describes his terminology clearly in the introduction
to Applied Analysis:
The Greek word "parexic" (with the roots para = almost, quasi, and ek = out) means "nearby." Hence the term "parexic analysis" can well be adopted to mean that we do not want an exact but only a "nearby" determination of a certain quantity. We can then speak of parexic methods, parexic expansions, parexic viewpoints, in contradistinction to the corresponding methods, expansions, and viewpoints of "pure analysis" which aim at arbitrary accuracy with the help of infinite processes.
In my experience, most mathematically-educated computational scientists and engineers appreciate this sort of analysis; the unwary -- who often rush where angels fear to tread -- never seem to realize that the computer cannot be trusted to deal gracefully with infinite sums. I know no effective way to caution those brash souls to prudence before their computations go awry (and often no way of convincing them afterward that there are some computations they can trust).
As far as the mathematics goes, it needn't be that frightening. I think most of what I do can readily be absorbed by someone with no strong understanding of functional analysis -- or even of numerical analysis! All that's needed is enough mathematical maturity to follow the high-level pictures. This is another thing I like about Lanczos's writing; he explicitly assumes such an audience. Quoting from the preface to Applied Analysis:
Furthermore, the author has the notion that mathematical formulas have their "secret life," behind their Golem-like appearance. To bring out the "secret life" of mathematical relations by an occasional narrative digression does not appear to him a profanation of the sacred rituals of formal analysis but merely an attempt to a more integrated way of understanding. The reader who has to struggle through a maze of "lemmas," "corollaries," and "theorems," can easily get lost in formalistic details, to the detriment of the essential elements of the results obtained. By keeping his mind on the principal points he gains in depth, though he may lose in details. The loss is not serious, however, since any reader equipped with the elementary tools of algebra and calculus can easily interpolate the missing details. It is a well-known experience that the only truly enjoyable and profitable way of studying mathematics is the method of "filling-in details" by one's own efforts. This additional work, the author hopes, will stir the reader's imagination and may easily lead to stimulating discussions and further explorations, on both the university and the research levels.
I agree! I agree! So tell me again... how did I plan to write this introduction?
- Currently drinking: Vanilla-flavored tea