In *How to Solve It*, Polya lists four steps for mathematical
problem-solving: understand the problem, devise a plan, carry out the plan,
and then look back. It seems like I spend most of my time -- or at least
most of my interesting time -- in the *understand the problem* or the
*look back* phases.

In some sense, I spend *way* more time in the phases of initial
understanding and later reflection than in any of the other phases. I spend
years on them! The thing is, *devise a plan* and *carry out the plan* are
crucial components in solving individual problems. But figuring out a problem
that's simultaneously interesting and tractable -- ah, this is not so much a matter
of planning as a matter of observation.

I've already mentioned my experience with geometry: took a graduate Riemannian
geometry course, felt completely bewildered, came out... and *got it* over
the next several years in a series of *ah!* moments (where *moments* should
be very loosely interpreted as time periods ranging from a second to twelve
hours). I'm less sure whether I've mentioned other such experiences.
The time between initial exposure and any personal feeling of competence is
often a couple years, at least: for linear algebra, for numerical analysis, for
functional analysis, for ODEs, for PDEs... the list goes on. But what really
constitutes a satisfying feeling of basic competence? I don't think it's the
ability to ace an exam, or even to do difficult problems; those are side-effects
at best. I've received good grades in all my mathematical courses, really;
I saw stuff, I did the problems, I put together the pieces in my head,
and at the end of those courses, I knew the material as well as anyone could
expect. *But* there's a difference between knowing something at the end
of a semester and knowing something so well that you can use it intuitively in
a complicated argument, so well that it's not just something that you remember,
but something you couldn't possibly forget.

Of course, effortless recall of mathematical concepts is no more sufficient to
do interesting mathematics than effortless use of English vocabulary is sufficient
to craft good prose. It's a starting point. It is, to me, the real meat of
the *understanding the problem* phase. And to do it right takes either
someone much smarter than me, or someone willing to devote years of thought
-- though not continuous thought -- to really grok the concepts. Sometimes I run into people
who claim expertise before they've spent the time to get to what I would consider
a starting point (for whatever reason this happens more in CS than in math),
and who are undone by a few simple questions. I've been in that position, too!
But I feel a little bad when I see it happen to someone else, and then I see that
person walk away still feeling like an expert. It's not so much that fake experts
hurt others, or that they hurt the reputation of their field of expertise,
though unfortunately such things do happen; but when I've met such folks, they
seem to have a certain smugness which insulates them from the feeling of
accomplishment which only comes (for me) after years of feeling like the
brain knows what's going on, but the guts haven't quite caught up.

For the record: I consider myself expert in certain aspects of numerical software development. I would grudgingly grant that I might become an expert in certain aspects of numerical linear algebra. There's no way that I'd let you call me an expert in Riemannian geometry; it's just that I've spent enough years with certain ideas floating around in my mind -- things like manifolds and atlases, differential forms and vector bundles and tensor bundles -- that I feel a certain comfort with the material. And the same is true of other things.

Looking back is just as important, and just as time consuming, as the initial
effort of understanding mathematical ideas. The initial understanding and
the looking back blur together. Let me give an example, one that partly inspired
this whole line of thought. One of the first topics covered in a compilers class
is lexical analysis: how does one break up a string like `1 + a12(b*c)` into
meaningful tokens, things like the number 1 or the identifier `a12`? The usual way
of approaching this is through finite automata: one starts with regular
expressions, which are translated into nondeterministic finite automata, which
are then converted into deterministic automata, which are then compacted. Then
you generate some executable representation of the automata -- a state table and
an interpreter, perhaps -- and away you go, tokenizing files with gleeful abandon.
Or maybe you just say *to heck with this* and do it in an ad-hoc way, in
which case you'll probably re-invent either state machines or recursive-descent
parsing. For the lexer the students were required to write, we gave them a
code base which was written in a recursive-descent style; but it turns out that
a DFA can be converted directly into an LL(1) grammar, so... six of one, half
a dozen of the other.
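To make the ad-hoc style concrete, here is a minimal sketch of a hand-written lexer for strings like the `1 + a12(b*c)` example. The token names and character classes are my own illustrative choices, not taken from any particular course code base:

```python
# A minimal hand-written lexer in the ad-hoc style described above.
# Each branch of the if/elif chain plays the role of a state in an
# implicit state machine. Token names here are illustrative choices.

def tokenize(text):
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():                      # skip whitespace
            i += 1
        elif ch.isdigit():                    # NUMBER: a maximal run of digits
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(("NUMBER", text[i:j]))
            i = j
        elif ch.isalpha():                    # IDENT: letter, then letters/digits
            j = i
            while j < len(text) and text[j].isalnum():
                j += 1
            tokens.append(("IDENT", text[i:j]))
            i = j
        elif ch in "+-*/()":                  # single-character operators
            tokens.append(("OP", ch))
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return tokens
```

Note how the "longest match" loops inside the digit and letter branches are exactly the self-loops a DFA would have on those character classes; whether you draw the automaton or write the loops by hand, it's the same machine.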

So all these ideas come into a seemingly simple task of chopping up a file
into individual tokens. And it's all very beautiful, but it can be sort of
confusing, particularly since the students were really writing an LL(1) parser
(a recursive-descent type parser) before they were ever formally introduced to
the ideas of LL(1) parsing. So one day, as we were walking out of section,
one of my students asked why we should learn the NFA -> DFA algorithm if
we're not coding it directly and there are well-established tools that
will do such conversions for us.

It was a fair question, but while my brain was sorting among the reasons I
could give, my gut interjected. *If you can understand NFA -> DFA
conversion,* I said, *then you'll be able to understand the construction of
state machines for LR(k) parsing. And you want to understand the
construction of those state machines, even if you're using an automated
tool; otherwise, you'll have no idea how to resolve conflicts when they
occur.*

My brain agreed that this was a pretty good reason,
and started to expand; but by that time the student had thanked me and we'd
parted ways.

Guess what? Until he asked, I'd never really thought about it. But it really is true. The technical point: the NFA -> DFA algorithm and the LR(1) parser construction algorithm are similar in flavor, in that both involve manipulating states that represent sets of things (sets of NFA states in one case, and sets of partially parsed productions -- LR items -- in the other). Further, you can mimic the effects of the NFA -> DFA algorithm by converting an NFA into a certain type of grammar and then generating an LR(k) parse table.
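The set-of-states flavor is easiest to see in the subset construction itself. Here is a sketch, assuming an NFA given as a nested dict mapping state to symbol to successor states, with `None` standing for the epsilon label (the representation and the function names are my own conventions):

```python
# NFA -> DFA subset construction. Each DFA state is a frozenset of NFA
# states -- the same "states as sets of things" flavor that shows up
# again in LR(1) item-set construction.

def epsilon_closure(states, transitions):
    """All NFA states reachable from `states` via epsilon (None) moves."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in transitions.get(s, {}).get(None, set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def nfa_to_dfa(start, accept, transitions, alphabet):
    """Return (DFA start state, accepting DFA states, DFA transitions)."""
    dfa_start = epsilon_closure({start}, transitions)
    dfa_trans, worklist, seen = {}, [dfa_start], {dfa_start}
    while worklist:
        S = worklist.pop()
        for a in alphabet:
            # "move": every NFA state reachable from S on symbol a...
            moved = set()
            for s in S:
                moved |= transitions.get(s, {}).get(a, set())
            # ...then close under epsilon moves to get the DFA successor.
            T = epsilon_closure(moved, transitions)
            if not T:
                continue
            dfa_trans[(S, a)] = T
            if T not in seen:
                seen.add(T)
                worklist.append(T)
    dfa_accept = {S for S in seen if accept & S}
    return dfa_start, dfa_accept, dfa_trans
```

The worklist discipline -- pull out a set-state, compute its successors, add any genuinely new set-states back -- is structurally the same loop you write when computing the canonical collection of LR(1) item sets, with epsilon closure playing the role that item closure plays there.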

And stuff like this keeps happening! But sometimes it goes off in odd directions.
I write an LL(1) parser generator, I learn something interesting about Lisp.
I try to come up with a non-trivial regular expression for a class exercise,
I end up reminding myself of some interesting number theory... and then wishing
that I had time to go learn a bit more about analytic number theory, since I've
been thinking about analytic function theory in a different context. I think
about code optimization shortly after thinking about a problem in floating point
error analysis, and I find myself thinking *gee, I could probably write an
optimizer that knew my favorite floating point error analysis tricks, and was
able to generate code for maximal accuracy rather than for maximal performance*.

Oh, so many wonderful connections there are! The best thing is that I've learned
that I can mention some of these things in class, and at this point most of the
students seem to understand what I've said -- and a few of them get curious enough
that *they* start playing with the ideas and making connections and asking
questions. O, brave new world, that has such people in it!

But what is this activity, really? It's not problem-solving, exactly, though sometimes it results in good problems. It's something of a hybrid: exploration to discover new problems, or to discover how to understand problems, followed by reflection to steer further exploration. Were I a poet, I might mention the phoenix here, with new attacks and approaches rising from the ashes of old problems. But I am not a poet, so I shan't mention the phoenix. Except I've mentioned the phoenix just in order to bring attention to what I'm not mentioning. So am I waxing poetic?

Ah, the joy of it all!