Saturday, February 05, 2005

Strategy and Tactics

I'm writing the first draft of my thesis introduction and -- like anyone else -- I'm utterly frustrated by the task. Much of the middle of the thesis is written or partly written, but this is the first time I've seriously attempted to write an introduction. At least, this is the first time I've tried to write an introduction to my research as a coherent whole; I've certainly written introductions to pieces of my research. Whatever the case, I'm suffering writer's block, and this seems like a very good time to spend a few minutes on a tangent.

From a high level, I'm frustrated because I'm having a hard time explaining something which fundamentally isn't complicated. Parlett's book The Symmetric Eigenvalue Problem begins with the words

Vibrations are everywhere ... and so, too, are the eigenvalues (or frequencies) associated with them.

And so it is. There are typically two steps to finding the frequencies and mode shapes -- the eigenvalues and eigenvectors -- for a vibrating mechanical object. First, the continuous motion has to be approximated by something finite; and second, the frequencies of the finite problem must be computed numerically. My research concentration for the past couple of years has mostly dealt with different types of eigenvalue problems, infinite or finite, that have interesting structures. If you pay attention to that structure, you can find resonant frequencies more quickly and accurately than you could if you proceeded blindly; if you ignore too much problem structure, you'll often end up with no information at all! I often apply my methods to problems in microsystem (MEMS) engineering, and the special mathematical structures correspond to special physical structures in the devices; but the ideas apply much more broadly. As a computer scientist, I spend a lot of time not just developing the mathematical theory needed to solve structured eigenvalue problems, but also writing codes that implement the ideas. My thesis research, then, comes in three parts: there are the engineering problems that drive my development; there are the software packages I write to solve those problems; and there is the fundamental mathematics and physics framework that underpins both the software development and the interpretation of computed results.
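
To make the two steps concrete, here is a minimal sketch for a toy problem of my own choosing: a taut string with fixed ends, unit length, and unit wave speed. It is not one of the thesis problems or thesis codes. Step one replaces -u'' = lambda u by a finite-difference matrix on n interior grid points; step two finds the smallest eigenvalues of that matrix by bisection on a Sturm sequence count, a standard technique that uses the fact that the matrix is symmetric and tridiagonal. The function names and the choice n = 100 are illustrative assumptions.

```python
import math

def count_eigs_below(diag, off, x):
    """Count eigenvalues of a symmetric tridiagonal matrix that are < x.

    diag holds the diagonal entries, off the off-diagonal entries.
    The count is the number of negative pivots in the LDL^T
    factorization of (T - x I), i.e. a Sturm sequence count.
    """
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        coupling = off[i - 1] ** 2 / q if i > 0 else 0.0
        q = d - x - coupling
        if q == 0.0:
            q = 1e-300  # nudge an exactly zero pivot to avoid dividing by zero
        if q < 0.0:
            count += 1
    return count

def kth_eigenvalue(diag, off, k, lo, hi, iterations=60):
    """Bisect for the k-th smallest eigenvalue (k = 1, 2, ...) in [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_eigs_below(diag, off, mid) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Step 1: approximate the continuous problem by something finite.
# Second differences on n interior points give tridiag(-1, 2, -1) / h^2.
n = 100
h = 1.0 / (n + 1)
diag = [2.0] * n
off = [-1.0] * (n - 1)

# Step 2: compute eigenvalues of the finite problem numerically.
# Gershgorin's theorem puts every eigenvalue of tridiag(-1, 2, -1) in [0, 4];
# dividing by h^2 afterwards rescales to the discretized operator.
eigenvalues = [kth_eigenvalue(diag, off, k, 0.0, 4.0) / h**2 for k in (1, 2, 3)]
frequencies = [math.sqrt(lam) for lam in eigenvalues]  # angular frequencies

for k, w in zip((1, 2, 3), frequencies):
    print(f"mode {k}: computed {w:.5f}, continuous string {k * math.pi:.5f}")
```

For the continuous string the angular frequencies are exactly k*pi, so the printed columns show how much the discretization step alone costs. The bisection never forms a dense matrix and costs O(n) work per count, which is a small instance of the point about paying attention to structure.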

That's a reasonable nutshell description. But those sentences could describe a lot of possible theses. As always, the devil is in the details. What do I mean by structure? I mean certain symmetries of the devices, either in the geometry of the domain or in the structure of the governing equations. I mean continuous dependence on parameters, so that a single computation can be interpreted in the context of a family of very similar computations. I mean the existence of different physical effects -- like elasticity and heat flow -- which interact, but have very different time scales. I mean having a nice partial differential equation with not-so-nice boundary conditions (where nice and not-so-nice are obviously stand-ins for more complicated technical conditions). These structures are more subtle than, say, self-adjointness (a very strong sort of symmetry condition which the equations of both classical and quantum mechanics often have). And what do I mean when I say I apply my ideas to MEMS engineering? MEMS is a huge field: I've really been working specifically on studying very high frequency (high MHz-GHz) mechanical resonators of the type that might one day end up in your cell phone -- which already uses a mechanical filter to process signals, by the way. And as far as software goes, I write my own codes for solving PDEs using finite elements and spectral elements. I write my own implementations for my new numerical ideas, and otherwise I try to use other people's codes as much as possible -- which presents software interfacing problems that are of both academic and practical interest.

The trouble I'm having is essentially one of balance. How do I convey both the strategy and the tactics of my work? In the introduction, the strategy is key -- but the interplay between strategy and tactics is, in my not-so-humble opinion, what makes or breaks any sort of numerical computation. One of my favorite elementary numerics books is Numerical Methods That (Usually) Work by Forman Acton. The book is bound in red cloth, and the title is printed on the front in a metallic shade -- except the word Usually, which is indented into the cover but not otherwise filled in, making it impossible to see unless you look very closely. Chapter 7 of Acton's book bears the title Strategy versus Tactics, and it begins with the following words:

When we first approach a numerical problem we must take care lest we become immersed too soon in the detail without having thought through clearly the general strategy of our attack. There is, of course, the alternative danger that we will concern ourselves exclusively with the grand design while ignoring inconvenient details that turn out to be decisive -- thereby wrecking our plans with a thoroughness that borders on the catastrophic. The second danger, however, seems less severe than the first, against which we must accordingly warn more loudly.

Exactly! Without really understanding the engineering application, I would probably end up solving the wrong problem (even if it was the problem that was originally posed to me). I could easily pitch this as a thesis about computer-aided design of MEMS; I could pitch it as a computational mechanics thesis; or I could pitch this as a thesis about numerical linear algebra; or I could pitch it as a thesis about numerical software engineering. None of these global views matches how I think about these problems, though. The different results are independently interesting, but I know people who are more competent than me in each of those areas independently. The thing that excites me about this work -- and about most of the problems I work on, even as side projects -- is that they are interesting across multiple disciplines. Don't get me wrong: I think it's a lot of fun to prove theorems for purely aesthetic purposes, and I've been known to indulge myself in such at times; and while I'm not a strong designer and will never be known for the widgets I've dreamed up, I can appreciate people who do really pure engineering design work. But I dip a little into a lot of things -- that's simply how I make progress on problems.

So now I'm supposed to write an introduction which is comprehensible to a general audience in my field. But which field is it, exactly? Can I assume my readers will know something about MEMS, computational mechanics, PDE theory and functional analysis, linear algebra, numerical analysis, and software engineering? Many people -- maybe most -- who would find this work interesting are perhaps more expert than I in at least one or two of those areas, but not all.

I am, of course, far from alone in facing this difficulty. Besides Acton's book, I brought home with me two other books that I admire for the way they combine mathematically deep ideas, physically-guided intuition, and a generally accessible presentation. Those books are Applied Analysis by Lanczos and Chebyshev and Fourier Spectral Methods by Boyd. I'm still on a reader's honeymoon with that last book, and I don't know if my current admiration will weather the test of time as my admiration for the books of Lanczos and Acton has -- but I suspect it will. Whether or not I continue to refer to Boyd's book regularly, though, I think I will continue to agree with these sentiments from his introduction:

It is not that spectral concepts are difficult, but rather that they link together as the components of an intellectual and computational ecology. Those who come to the course with no previous adventures in numerical analysis will be like urban children abandoned in the wilderness. Such innocents will learn far more than hardened veterans of the arithmurgical wars, but emerge from the forests with a lot more bruises.

...

I have also tried, not as successfully as I would have wished, to stress the importance of analyzing the physics of the problem before, during, and after computation. This is partly a reflection of my own scientific style: like a sort of mathematical guerrilla, I have ambushed problems with Pade approximants and perturbative derivatives of the KdV equations as well as with Chebyshev polynomials; numerical papers are only half my published articles.

However, there is a deeper reason: the numerical agenda is always set by the physics. The geometry, the boundary layers and fronts, and the symmetries are the topography of the computation. He or she who would scale Mt. Everest is well-advised to scout the passes before beginning the climb.

Perhaps my audience is what Lanczos would call parexic analysts. Parexic analysis, in Lanczos's parlance, is the range of analysis techniques that lie between pure analysis (analysis of infinite processes) and numerical analysis (analysis of numerical algorithms). Lanczos describes his terminology clearly in the introduction to Applied Analysis:

The Greek word parexic (with the roots para = almost, quasi, and ek = out) means nearby. Hence the term parexic analysis can well be adopted to mean that we do not want an exact but only a nearby determination of a certain quantity. We can then speak of parexic methods, parexic expansions, parexic viewpoints, in contradistinction to the corresponding methods, expansions, and viewpoints of pure analysis which aim at arbitrary accuracy with the help of infinite processes.

In my experience, most mathematically-educated computational scientists and engineers appreciate this sort of analysis; the unwary -- who often rush where angels fear to tread -- never seem to realize that the computer cannot be trusted to deal gracefully with infinite sums. I know no effective way to caution those brash souls to prudence before their computations go awry (and often no way of convincing them afterward that there are some computations they can trust).
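
For anyone who doubts the warning about infinite sums, here is a small illustration of my own choosing (it is not taken from Lanczos, Acton, or Boyd). The Taylor series for exp(x) converges for every x in pure analysis, but summing it term by term in floating point at x = -30 fails badly: the terms grow to around 1e11 before they shrink, and the cancellation between them swamps a true answer near 1e-13. Summing the all-positive series at x = +30 and taking the reciprocal avoids the cancellation. The function name and the cutoff of 200 terms are illustrative assumptions.

```python
import math

def exp_taylor(x, terms=200):
    """Sum the first `terms` terms of the Taylor series of exp(x), in order."""
    total = 0.0
    term = 1.0  # x**0 / 0!
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # turn x**k / k! into x**(k+1) / (k+1)!
    return total

x = 30.0
exact = math.exp(-x)

naive = exp_taylor(-x)        # alternating terms: catastrophic cancellation
stable = 1.0 / exp_taylor(x)  # all terms positive: no cancellation

naive_rel_err = abs(naive - exact) / exact
stable_rel_err = abs(stable - exact) / exact

print(f"math.exp(-30)       = {exact:.6e}")
print(f"naive series        = {naive:.6e}  (relative error {naive_rel_err:.1e})")
print(f"reciprocal of series = {stable:.6e}  (relative error {stable_rel_err:.1e})")
```

Both computations use the same series and the same number of terms; only the arrangement differs. Deciding which arrangement can be trusted is the kind of judgment that sits between pure analysis and numerical analysis.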

As far as the mathematics goes, it needn't be that frightening. I think most of what I do can readily be absorbed by someone with no strong understanding of functional analysis -- or even of numerical analysis! All that's needed is enough mathematical maturity to follow the high-level pictures. This is another thing I like about Lanczos's writing; he explicitly assumes such an audience. Quoting from the preface to Applied Analysis:

Furthermore, the author has the notion that mathematical formulas have their secret life, behind their Golem-like appearance. To bring out the secret life of mathematical relations by an occasional narrative digression does not appear to him a profanation of the sacred rituals of formal analysis but merely an attempt to a more integrated way of understanding. The reader who has to struggle through a maze of lemmas, corollaries, and theorems, can easily get lost in formalistic details, to the detriment of the essential elements of the results obtained. By keeping his mind on the principal points he gains in depth, though he may lose in details. The loss is not serious, however, since any reader equipped with the elementary tools of algebra and calculus can easily interpolate the missing details. It is a well-known experience that the only truly enjoyable and profitable way of studying mathematics is the method of filling-in details by one's own efforts. This additional work, the author hopes, will stir the reader's imagination and may easily lead to stimulating discussions and further explorations, on both the university and the research levels.

I agree! I agree! So tell me again... how did I plan to write this introduction?

  • Currently drinking: Vanilla-flavored tea