Monday, November 15, 2004

Perturbations, Parallelism, Performance

How can you improve the speed of a numerical calculation?

Ask many computer scientists this question, and they'll start thinking about ways to improve the code that does the calculating. If they're well educated, the first question they'll ask is "does it really need to be optimized?" Then they'll ask where the time is being spent, and what available high-performance tools can be brought to bear. Would a faster processor or more memory help? If they're sufficiently over-educated, or simply sufficiently brave, they might suggest running the computation in parallel.

Alas, even among those who should know better -- anyone with an undergrad education in computer science, and probably people with undergrad degrees in a variety of other disciplines, too -- the first question often doesn't get enough attention. Does it really need to be optimized? The question is vaguer, and less obvious, than it might first appear.

Numerical calculations are not done in a vacuum; as Hamming once said, "The purpose of computing is insight, not numbers" (to which some later wit replied that the purpose of computing is not yet in sight). Somebody wants to better understand some phenomenon, perhaps physical. The phenomenon is described by a mathematical model, usually simplified; the mathematical model is discretized; and a computer code is employed to solve the resulting problem. If the question "does this really need to be optimized?" is first asked after the computer code is in place, it is asked too late.

Consider the physical problem. Is doing a computation the fastest way to understanding? If a five-minute experiment can supplant five hours of setting up a model and computing with it, the five-minute experiment is probably worth doing. At least do a thought experiment first to ensure that you're asking a reasonable question.

Consider the construction of the mathematical model. How complex must the model be to accurately predict the behavior of interest? Can the real problem be viewed as a simple problem for which the answer is known? Or as a perturbation from such a simple problem? Is there a simple approximation which can be used as a starting point for more refined estimates (or even as an ending point, if not much accuracy is needed)? Ideally, the person setting up the model should ask these questions. Certainly the computer will not ask them.
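
To make the perturbation idea concrete, here is a small illustrative sketch (mine, not the post's): for a symmetric matrix A with a simple eigenvalue and unit eigenvector v, first-order perturbation theory says the corresponding eigenvalue of A + eps*E is approximately lambda + eps*(v'Ev). One solve of the simple problem then yields cheap estimates for a whole family of nearby problems.

```python
import numpy as np

# Sketch of first-order eigenvalue perturbation theory: for symmetric A
# with simple eigenvalue lam and unit eigenvector v, the corresponding
# eigenvalue of A + eps*E is lam + eps*(v' E v) + O(eps^2).
rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n)); A = (A + A.T) / 2   # the "simple" base problem
E = rng.standard_normal((n, n)); E = (E + E.T) / 2   # a perturbation direction
eps = 1e-4

lam, V = np.linalg.eigh(A)       # solve the base problem once
v = V[:, -1]                     # unit eigenvector for the largest eigenvalue

estimate = lam[-1] + eps * (v @ E @ v)        # cheap first-order estimate
exact = np.linalg.eigh(A + eps * E)[0][-1]    # full recomputation, for comparison

print(f"perturbation estimate: {estimate:.10f}")
print(f"full solve:            {exact:.10f}")
print(f"difference:            {abs(estimate - exact):.2e}")  # O(eps^2)
```

The gap between the estimate and the full solve is O(eps^2); the refined answer costs one inner product rather than another eigendecomposition.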

Consider the construction of a discretized model for use on a computer. How faithful must the computer model be to the original mathematical model? The question applies even if the original model is discrete; sometimes it's possible to approximate one discrete problem by another discrete problem to very high accuracy. Can prior knowledge of how the system should behave be used to get the solution more easily? Does one formulation or another offer particularly nice properties, in terms of error analysis or performance or both?
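
One way to exploit such prior knowledge, as an illustration of my own rather than the post's: if a discretization's error is known to behave like C*h^2 + O(h^4), two coarse solves at step sizes h and h/2 can be combined by Richardson extrapolation to cancel the leading error term, buying extra accuracy without a fine grid. A minimal sketch with the composite trapezoid rule:

```python
import numpy as np

def trapezoid(f, a, b, n):
    """Composite trapezoid rule on n subintervals; error is O(h^2), h = (b-a)/n."""
    x = np.linspace(a, b, n + 1)
    y = f(x)
    h = (b - a) / n
    return h * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)

f = np.sin
exact = 1 - np.cos(1.0)   # integral of sin over [0, 1]

coarse = trapezoid(f, 0.0, 1.0, 16)
fine = trapezoid(f, 0.0, 1.0, 32)

# Knowing the error is C*h^2 + O(h^4), halving h cuts it by about 4;
# combining the two estimates cancels the leading term (Richardson extrapolation).
extrapolated = fine + (fine - coarse) / 3

for name, val in [("coarse", coarse), ("fine", fine), ("extrapolated", extrapolated)]:
    print(f"{name:13s} error = {abs(val - exact):.2e}")
```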

And now consider the construction of a computer program to solve the problem. Can someone else's code be of use? If the available code is too slow, is the right response to speed the code up, or to reconsider the problem? If it's necessary to speed the code up, where is the critical bottleneck -- the place in the computation where most of the time is going? Only then is it time to reach for the computer scientist's bag of tricks to make things faster.
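
Finding the critical bottleneck is a job for a profiler, not for intuition. A minimal sketch using Python's standard cProfile module; simulate and slow_report are hypothetical stand-ins for whatever code is under suspicion:

```python
import cProfile
import io
import pstats

def simulate(n):
    # hypothetical stand-in for the numerical kernel
    total = 0.0
    for i in range(n):
        total += (i % 7) * 0.5
    return total

def slow_report(n):
    # hypothetical driver mixing numerics with formatting and bookkeeping
    lines = []
    for i in range(n):
        lines.append("step %d: value %f" % (i, simulate(100)))
    return "\n".join(lines)

profiler = cProfile.Profile()
profiler.enable()
slow_report(5000)
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())  # the top entries show where the time actually goes
```

The profile settles empirically whether the time goes into simulate or into the bookkeeping around it -- which, as the anecdote below suggests, is not always where you expect.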

I've helped a colleague do a probability calculation which he thought would require extended precision by doing a back-of-the-envelope Taylor expansion. The expansion not only required no coding, it was also more accurate than the method he had in mind. I've taken calculations of my own which originally took two hours to run and re-run them in milliseconds, by replacing a generic finite element model with a semi-analytical formula based on finding singularities in a certain matrix whose entries were Bessel functions. I've used nondimensionalization arguments to reduce problems in two or three dimensions to problems in one dimension. I've used special structure in my problems, in my models, and sometimes in my formulations to help me or other people untangle problems that initially seemed difficult. I've taken a code that I thought I would have to run in parallel, profiled it, and discovered that all the time was going into file management and text processing tasks which weren't really needed; replacing that slow bottleneck yielded something which suddenly seemed blazingly fast on an ordinary desktop machine.
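
The post doesn't say what that probability calculation was, but here is a made-up example in the same spirit: computing the chance of at least one rare event, 1 - (1-q)^n, naively in double precision loses most of its digits to rounding, while two terms of a binomial expansion (or the library routines log1p and expm1, which are built on the same idea) recover full accuracy with no extended precision at all.

```python
import math

# Hypothetical stand-in for the kind of calculation described above:
# probability of at least one success in n trials, 1 - (1 - q)^n,
# for a tiny per-trial probability q.
q, n = 1e-12, 1000

naive = 1.0 - (1.0 - q) ** n                 # (1 - q) rounds away most digits of q
series = n * q - math.comb(n, 2) * q * q     # two terms of the binomial expansion
stable = -math.expm1(n * math.log1p(-q))     # library routines, same Taylor idea

print(f"naive:  {naive:.15e}")
print(f"series: {series:.15e}")
print(f"stable: {stable:.15e}")
```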

And some days -- many days -- I've been able to use existing software that gave me the answers I wanted fast enough that spending any more thought on a smarter solver or a cleverer problem formulation would have been inefficient in terms of total time to solution.

How can you improve the speed of a numerical calculation? You can improve it by remembering that speed often means time to useful numbers, not megaflops (millions of floating-point operations per second).