Tuesday, November 15, 2005

Moving

I'm moving to a new bat place and new bat format: Numerical Notes. This move is spurred by a combination of technical and personal considerations.

On the technical side: I mostly spend my evenings away from computer networks, which means that the times I'm most likely to take a break and blog are also the times that I'm least likely to have the connectivity to post anything. Also, I've decided that I'd really like to be able to write mathematical ideas, and it's very difficult to do that with plain HTML. The new tools give me more control.

On the personal side: I would rather correspond with my friends and family (by letters or e-mail) than broadcast news through this blog. On the other hand, essay writing to an audience -- even an unknown audience -- has proven to be a valuable way for me to clarify my thoughts. Expect further essays and computations at the new site, but not so much personal news. I hope I will be better about sending letters as a result!

Thanks for reading.

Tuesday, October 04, 2005

Understanding and reviewing

In How to Solve It, Polya lists four steps for mathematical problem-solving: understand the problem, devise a plan, carry out the plan, and then look back. It seems like I spend most of my time -- or at least most of my interesting time -- in the understand the problem or the look back phases.

In some sense, I spend way more time in the phases of initial understanding and later reflection than in any of the other phases. I spend years on them! The thing is, devise a plan and carry out the plan are crucial components in solving individual problems. But figuring out a problem that's simultaneously interesting and tractable -- ah, this is not so much a matter of planning as a matter of observation.

I've already mentioned my experience with geometry: took a graduate Riemannian geometry course, felt completely bewildered, came out... and got it over the next several years in a series of ah! moments (where moments should be very loosely interpreted as time periods ranging between a second and twelve hours). I'm less sure if I've mentioned other such experiences. The time between initial exposure and any personal feeling of competence is often a couple years, at least: for linear algebra, for numerical analysis, for functional analysis, for ODEs, for PDEs... the list goes on. But what really constitutes a satisfying feeling of basic competence? I don't think it's the ability to ace an exam, or even to do difficult problems; those are side-effects at best. I've received good grades in all my mathematical courses, really; I saw stuff, I did the problems, I put together the pieces in my head, and at the end of those courses, I knew the material as well as anyone could expect. But there's a difference between knowing something at the end of a semester and knowing something so well that you can use it intuitively in a complicated argument, so well that it's not just something that you remember, but something you couldn't possibly forget.

Of course, effortless recall of mathematical concepts is no more sufficient to do interesting mathematics than effortless use of English vocabulary is sufficient to craft good prose. It's a starting point. It is, to me, the real meat of the understanding the problem phase. And to do it right either takes someone much smarter than me, or someone willing to devote years of thought -- though not continuous thought -- to really grok the concepts. Sometimes I run into people who claim expertise before they've spent the time to get to what I would consider a starting point (for whatever reason this happens more in CS than in math), and who are undone by a few simple questions. I've been in that position, too! But I feel a little bad when I see it happen to someone else, and then I see that person walk away still feeling like an expert. It's not so much that fake experts hurt others, or that they hurt the reputation of their field of expertise, though unfortunately such things do happen; but when I've met such folks, they seem to have a certain smugness which inures them to the feeling of accomplishment which only comes (for me) after years of feeling like the brain knows what's going on, but the guts haven't quite caught up.

For the record: I consider myself expert in certain aspects of numerical software development. I would grudgingly grant that I might become an expert in certain aspects of numerical linear algebra. There's no way that I'd let you call me an expert in Riemannian geometry; it's just that I've spent enough years with certain ideas floating around in my mind -- things like manifolds and atlases, differential forms and vector bundles and tensor bundles -- that I feel a certain comfort with the material. And the same is true of other things.

Looking back is just as important, and just as time consuming, as the initial effort of understanding mathematical ideas. The initial understanding and the looking back blur together. Let me give an example, one that partly inspired this whole line of thought. One of the first topics covered in a compilers class is lexical analysis: how does one break up a string like 1 + a12(b*c) into meaningful tokens, things like the number 1 or the identifier a12? The usual way of approaching this is through finite automata: one starts with regular expressions, which are translated into nondeterministic finite automata, which are then converted into deterministic automata, which are then compacted. Then you generate some executable representation of the automata -- a state table and an interpreter, perhaps -- and away you go, tokenizing files with gleeful abandon. Or maybe you just say to heck with this, and do it in an ad-hoc way, in which case you'll probably re-invent either state machines or recursive-descent parsing. For the lexer the students were required to write, we gave them a code base which was written in a recursive-descent style; but it turns out that a DFA can be converted directly into an LL(1) grammar, so... six of one, half a dozen of the other.

So all these ideas come together in the seemingly simple task of chopping up a file into individual tokens. And it's all very beautiful, but it can be sort of confusing, particularly since the students were really writing an LL(1) parser (a recursive-descent type parser) before they were ever formally introduced to the ideas of LL(1) parsing. So one day, as we were walking out of section, one of my students asked why we should learn the NFA -> DFA algorithm if we're not coding it directly, and if there are well-established tools that will do such conversions for us. It was a fair question, but while my brain was sorting among the reasons I could give, my gut interjected. If you can understand NFA -> DFA conversion, I said, then you'll be able to understand the construction of state machines for LR(k) parsing. And you want to understand the construction of those state machines, even if you're using an automated tool; otherwise, you'll have no understanding of how to avoid conflicts when they occur. My brain agreed that this was a pretty good reason, and started to expand on it; but by that time the student had thanked me and we'd parted ways.

Guess what? Before he asked, I'd never really thought about it. But it really is true. The technical point: the NFA -> DFA algorithm and the LR(1) parser construction algorithm are similar in flavor, in that both involve manipulating states that represent sets of things (sets of NFA states in one case, and sets of dotted productions, or items, in the other). Further, you can mimic the effects of the NFA -> DFA algorithm by converting an NFA into a certain type of grammar and then generating an LR(k) parse table.
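
The NFA -> DFA half of that comparison is compact enough to sketch in code. Here's a small, self-contained C illustration (the toy NFA, which recognizes strings over {a, b} ending in ab, and all of the names in it are mine, not anything from the class): sets of NFA states live in bitmasks, each DFA state is just one of those sets, and the loop keeps taking epsilon-closures and moves until no new sets show up.

  #include <stdio.h>

  #define NSTATES 8                   /* max NFA states (fits in a byte mask) */
  #define NSYMS   2                   /* alphabet {a, b}                      */
  #define MAXDFA  (1 << NSTATES)

  typedef unsigned char set_t;        /* bitmask of NFA states */

  /* Toy NFA for (a|b)*ab: state 0 loops on a and b and also guesses the
     final "ab" by moving to state 1 on a; state 2 is the accepting state. */
  static const set_t nfa_delta[3][NSYMS] = {
    /* on a */          /* on b */
    { (1<<0) | (1<<1),  (1<<0) },     /* from state 0 */
    { 0,                (1<<2) },     /* from state 1 */
    { 0,                0      }      /* from state 2 */
  };
  static const set_t nfa_eps[3]  = { 0, 0, 0 };  /* no epsilon moves here */
  static const int   nfa_nstates = 3;
  static const set_t nfa_accept  = (1 << 2);

  /* epsilon-closure of a set of NFA states, computed to a fixed point */
  static set_t eps_closure(set_t s)
  {
    set_t old;
    do {
      old = s;
      for (int i = 0; i < nfa_nstates; ++i)
        if (s & (1 << i))
          s |= nfa_eps[i];
    } while (s != old);
    return s;
  }

  /* all NFA states reachable from some state in s on symbol c */
  static set_t move_on(set_t s, int c)
  {
    set_t t = 0;
    for (int i = 0; i < nfa_nstates; ++i)
      if (s & (1 << i))
        t |= nfa_delta[i][c];
    return t;
  }

  int main(void)
  {
    set_t dstates[MAXDFA];            /* DFA state -> set of NFA states */
    int   ddelta[MAXDFA][NSYMS];      /* DFA transition table           */
    int   ndfa = 0;

    dstates[ndfa++] = eps_closure(1 << 0);         /* start state */

    /* process each DFA state once, adding new state sets as they appear */
    for (int i = 0; i < ndfa; ++i) {
      for (int c = 0; c < NSYMS; ++c) {
        set_t t = eps_closure(move_on(dstates[i], c));
        int j;
        for (j = 0; j < ndfa; ++j)                 /* seen this set before? */
          if (dstates[j] == t)
            break;
        if (j == ndfa)
          dstates[ndfa++] = t;
        ddelta[i][c] = j;
      }
    }

    for (int i = 0; i < ndfa; ++i)
      printf("DFA state %d (NFA set %02x)%s: a -> %d, b -> %d\n",
             i, (unsigned) dstates[i],
             (dstates[i] & nfa_accept) ? " [accept]" : "",
             ddelta[i][0], ddelta[i][1]);
    return 0;
  }

The LR(1) construction has the same shape: replace bitmasks of NFA states with sets of items, and replace move-and-close with goto-and-close.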

And stuff like this keeps happening! But sometimes it goes off in odd directions. I write an LL(1) parser generator, I learn something interesting about Lisp. I try to come up with a non-trivial regular expression for a class exercise, I end up reminding myself of some interesting number theory... and then wishing that I had time to go learn a bit more about analytic number theory, since I've been thinking about analytic function theory in a different context. I think about code optimization shortly after thinking about a problem in floating point error analysis, and I find myself thinking gee, I could probably write an optimizer that knew my favorite floating point error analysis tricks, and was able to generate code for maximal accuracy rather than for maximal performance. Oh, so many wonderful connections there are! The best thing is that I've learned that I can mention some of these things in class, and at this point most of the students seem to understand what I've said -- and a few of them get curious enough that they start playing with the ideas and making connections and asking questions. O, brave new world, that has such people in it!

But what is this activity, really? It's not problem-solving, really, though sometimes it results in good problems. It's something of a hybrid: exploration to discover new problems, or to discover how to understand problems, followed by reflection to steer further exploration. Were I a poet, I might mention the phoenix here, with new attacks and approaches rising from the ashes of old problems. But I am not a poet, so I shan't mention the phoenix. Except I've mentioned the phoenix just in order to bring attention to what I'm not mentioning. So am I waxing poetic?

Ah, the joy of it all!

Monday, October 03, 2005

Another week in books

  • Adventures of a Mathematician (S. Ulam)

    Did I finish this two weeks ago? In any case, it was interesting to read. In addition to the autobiographical and historical comments, Ulam says a great deal about mathematics and mathematical ways of thinking. Though he sometimes mentions specific technical areas, almost all of the book is accessible to a general audience.

  • Fool's Errand (R. Hobb)

    The books in the Farseer trilogy kept me up past when I should have gone to sleep. This one did, too. But I think I will postpone picking up Golden Fool, the second book in this new trilogy. I think the bleakness of the characters is a bit overdone.

  • Forty Signs of Rain (K. S. Robinson)

    I can't think of any books by KSR that I haven't enjoyed (although Icehenge wasn't as much fun as Years of Rice and Salt, for instance). I have a few quibbles, mostly involving his description of the idea of an algorithm, but I like his cast of characters and of ideas, and I really like the writing style. There wasn't much plot, yet, but it didn't feel like a deficiency; and anyhow, I expect the plot will develop further in the other books.

    I do not think the similarity of the events in the book to current political and meteorological events is particularly coincidental.

  • The Education of Henry Adams (H. Adams)

    I haven't gotten more than thirty pages into it, but so far I've enjoyed it. There's a dry humor there that appeals to me.

  • Best American Science Writing 2005 (Edited by A. Lightman)

    Haven't started it yet, but I'm looking forward to it. I usually enjoy anthologies of popular science writing.

  • Fundamentals of Electroacoustics (Fischer)

    A short, plainly written description of different types of electromechanical interactions, and of circuit-style models of coupled electrical and mechanical systems (speakers and microphones, mostly). Translated from the German original. Why couldn't I have found this book a couple years ago? But it's on my shelf now.

  • Mathematics of Classical and Quantum Physics (F.W. Byron and R.W. Fuller)

    A Dover reprint with two volumes bound as one. Includes sections on variational calculus, Hilbert spaces and spectral theory, analytic function theory, Green's functions and integral equations, and symmetry groups. I have other books that treat most of these topics in greater detail, but many of those books make little or no useful mention of physical applications or motivations. At the same time, Byron and Fuller have written an essentially mathematical book: there are no rabbit-from-a-hat tricks (or when there are, they are preceded by an apology and a reference), and certain details which seem subject to frequent abuse in physics and engineering texts -- like the treatment of the Dirac delta -- are handled with appropriate rigor.

    An aside: I'm much better as a mathematician than as a physicist. When I think about physical effects, I tend to think of them as concrete approximations of certain mathematical abstractions; and for the most part, my intuition for the mathematical things is better than my intuition for the physical things. This is a constant source of frustration for me when I'm learning about physics, since hand-waving appeals to physical intuition generally confuse me more than they help, at least initially. Nonetheless, I like learning about physics, and about connections between mathematics and physics. Furthermore, I like to learn about mathematical and scientific history, and a great deal of mathematics has historically been inspired by descriptions of physics: the calculus of variations coming from classical mechanics, many ideas in functional analysis coming from quantum mechanics, many ideas in group theory coming from mechanical symmetries, many ideas in graph theory connecting to circuit analysis, and so on.

  • Modulated Waves (Ostrovsky and Potapov)

    I've mentioned this book before. I read the first several chapters over the course of a couple weeks a while back (early summer), and since then I've been re-reading chapters or reading new chapters as the whim or the need takes me. This, too, is a mathematical book; but like Byron and Fuller's book, it contains a nice infusion of physical ideas and applications. I picked up my copy of Ostrovsky and Potapov this weekend to compare their presentation of dispersion relations to the corresponding presentation in Byron and Fuller.

Sunday, October 02, 2005

Windows woes

I wish I had a better idea how to effectively use a Windows box at a distance. Now that my old laptop is retired, I no longer have such easy access to a Windows machine. The department has a server that I can use with rdesktop, but it's of limited usefulness: I can't run MATLAB on it to compile my codes; I can't print from it; and for security reasons, I can't access it from the wireless network.

Even were I able to get remote access to a Windows machine which was not so restricted, I know that I'd still find it an irksome thing to use. No, I don't think Windows is counter-intuitive, nor that it's immensely buggy. Since the introduction of Windows 2000, I think the operating system has become immensely more stable; and Microsoft does well enough at making most of its software usable. I just wish the system was a little less interactive!

Effective computing is largely about constructive laziness. For the compilers class for which I'm an instructor this semester, we have a lot of code that's distributed to students in compiled form. The compiled files vary from vendor to vendor, platform to platform, and version to version. So, in the spirit of constructive laziness, I wrote a build script that shuttles files back and forth between three different computers, then runs Lisp programs to generate compiled files for seven different platforms, then moves all those files to the appropriate places on the class web page.

Seven different platforms, just by typing make all! But then there's an eighth platform, which is Allegro CL 7.0 under Windows; and to produce those compiled files, I defer to the professor. It would be much more convenient for both of us if I could just add another line or two to my build file; but I don't know how to do that, and while I could probably write more code to get around the issue, it's not worth the bother.

(Incidentally, this is also a good reason for distributing programs as source code -- then you can defer to someone else the work of building the program on whatever platform is of interest! Clearly, though, this is not such a good option when you're trying to provide a student with a model solution without actually giving away all the details.)

I think it may be possible to run a PC emulator on my Mac laptop, and to install enough UNIX-like utilities under the emulated Windows that I can autobuild there. Also, in the unlikely event that someone reading this has experience running gcc as a cross-compiler with a Windows target, I'd like to hear about it.

Wednesday, September 28, 2005

Frequently searched questions

  1. The word you're thinking of is teetotal. It's an adjective that describes one who does not drink. It is a pun, since I prefer tea to alcohol (though I've decided a cup of cider from time to time is okay, too).
  2. There are two aspects of MATLAB EXternal interface (MEX) programming that cause most of your headaches. First: when you use the mex script, the compiler flags are set in such a way that most of the warnings are turned off. This is bad mojo. Where possible, put most of the functionality for your program into a separate file, so that you can compile it to object files (and perhaps even link it to a testing stub) without going through mex at all. Then use mex to link everything together. Second: mex creates dynamically loadable modules, and some of the dependency checks for those modules are not performed until load time or (if you're on OS X) possibly even until the unlinked routines are hit at run time. That's why you get errors about unresolved symbols and invalid mex files. Check the order of the libraries on your link line, and check to make sure you have all the libraries you need.
  3. You don't want to write your own eigensolver, LU factorization routine, etc. Not in C++, not in C, not in Fortran. For dense numerical linear algebra, go look at LAPACK and a good BLAS implementation (BLAS = Basic Linear Algebra Subroutines). If you're writing in C++, and you have trouble linking to a Fortran code like LAPACK, make sure you use the extern "C" directive in order to turn off the C++ compiler's name-mangling. If you're writing in C on most Unix-y platforms, you can call Fortran code by simply appending an underscore; e.g. call dgemm_ from C instead of dgemm (see the sketch just after this list). Of course, this is somewhat platform dependent. There is a C translation of LAPACK, which I was responsible for a while back, but I think it's worth the effort of learning to link to Fortran. Still, you might also check out the other packages on Netlib, to see if any of them fit your needs.
  4. For large, sparse systems, you'll want something different from what you use for dense stuff. I use Tim Davis's UMFPACK system for sparse linear solves, and ARPACK for computing eigenvalues of sparse matrices.
  5. When I wrote before about tensors and duality, the code was there purely for illustration. It is not how I would typically structure a computation, or at least it's not how I'd structure it if I wanted to run something big in a reasonable amount of time. You might check out the Matlisp bindings if you're interested in something serious.
  6. Am I really so pessimistic about my code? For those searching on crash, check out Valgrind. This tool will help you. It surely helps me.
  7. If you really are interested in code to do number-theoretic computations, check out Magma, Maxima, or perhaps Mathematica or Maple. If you want to crunch numbers, consider MATLAB, Octave, or perhaps R. There exist C++ number theory libraries, but I'm not sure how much use they'll be (unless you're trying to write a distributed code-cracking system or something -- in which case it's already been done, and you should probably use existing sources).
  8. For information on the Euler totient function, the lengths of repeating fractions, or a variety of other such things, I refer you to Eric Weisstein's Mathworld.
  9. If you want a Lisp lexer generator or LL(1) parser generator, you can grab one from my software page. But for the parser generator, you may prefer something that handles LALR grammars, and for the lexer -- why bother?
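
Since a couple of the answers above involve linking C or C++ to Fortran, here is a minimal, hedged sketch of the underscore convention mentioned in item 3. The prototype is hand-written and the 2-by-2 matrices are mine; some compilers also append hidden string-length arguments for the character parameters, so treat this as a sketch of the common Unix convention rather than something guaranteed to work everywhere. In C++, wrap the prototype in extern "C".

  #include <stdio.h>

  /* Hand-written prototype for Fortran DGEMM as seen from C: the name gets
     a trailing underscore, and every argument is passed by reference. */
  void dgemm_(const char *transa, const char *transb,
              const int *m, const int *n, const int *k,
              const double *alpha, const double *a, const int *lda,
              const double *b, const int *ldb,
              const double *beta, double *c, const int *ldc);

  int main(void)
  {
    /* Compute C = A*B for 2x2 matrices.  Fortran expects column-major
       storage, so a[0], a[1] form the first *column* of A. */
    double a[4] = { 1, 3, 2, 4 };     /* A = [1 2; 3 4] */
    double b[4] = { 5, 7, 6, 8 };     /* B = [5 6; 7 8] */
    double c[4] = { 0, 0, 0, 0 };
    int    n    = 2;
    double one  = 1.0, zero = 0.0;

    dgemm_("N", "N", &n, &n, &n, &one, a, &n, b, &n, &zero, c, &n);

    printf("C = [%g %g; %g %g]\n", c[0], c[2], c[1], c[3]);  /* 19 22; 43 50 */
    return 0;
  }

On a typical system this links with something like cc demo.c -lblas, possibly plus your Fortran compiler's runtime library; the exact set of libraries is, again, platform dependent.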

Monday, September 19, 2005

Time out for reading

There is a new Half Price books in Berkeley, along Shattuck Avenue a couple blocks west of campus. Curiously, they're in the same space as the dollar store where I got some pans and utensils just after I moved to Berkeley. As part of their opening celebration, they have an additional 20% off. So I wandered in, saw some familiar faces, and picked up a few books:

  • Adventures of a Mathematician (S. Ulam) --

    This autobiographical book is the only one that I've started. There's a preface that describes Ulam's work, which covers a broader range of pure and applied mathematics than I'd realized. And I thought, excellent, this will be very interesting.

    Then I went to the Au Coquelet cafe to read the prologue and sip something warm, and I realized that this will be very interesting. This book didn't start life as an autobiography; initially, Ulam thought to write a biography of von Neumann. But (on page 5):

    When I started to organize my thoughts, I realized that up to that time -- it was about 1966, I think -- there existed few descriptions of the unusual climate in which the birth of the atomic age took place. Official histories do not give the real motivations or go into the inner feelings, doubts, convictions, determinations, and hopes of the individuals who for over two years lived under unusual conditions. A set of flat pictures, they give at best only the essential facts.

    Thinking of all this in the little plane from Albuquerque to Los Alamos, I remembered how Jules Verne and H. G. Wells had influenced me in my childhood in books I read in Polish translation. Even in my boyish dreams, I did not imagine that some day I would take part in equally fantastic undertakings.

    The result of all these reflections was that instead of writing a life of von Neumann, I have undertaken to describe my personal history, as well as what I know of a number of other scientists who also became involved in the great technological achievements of this age.

    What follows is a remarkable book, part autobiography, part biography, and part introspection on the workings of memory and of the mathematical mind.

  • The Education of Henry Adams (H. Adams) --

    I'm not sure where I first heard about this book. I think it was from reading Jacques Barzun, either Teacher in America or A Stroll with William James; that it would be from Barzun is a pretty good guess, though, since Barzun's books are usually crammed with references to books that I decide I want to read (and then quickly lose in my list). Either way, I know in advance to expect more than basic autobiography from this one.

  • Fool's Errand (R. Hobb) --

    A little lighter reading. I enjoyed the first trilogy, but every time I looked for the start to the second trilogy on the local bookshelves, I could only find the second book and on. So now I have something for the next time I feel like a novel.

Right now I'm switching back and forth between Ulam's book and the most recent SIAM Review for my evening reading. This issue of SIAM Review is a really good one, both for technical content and for style. J.P. Boyd, who has written a book on spectral methods which I've mentioned before, has an article on Hyperasymptotics and the Linear Boundary Layer Problem which is both informative and highly entertaining. To quote again:

Unfortunately, asymptotics is usually taught very badly when taught at all. When a student asks, What does one do when x is larger than the radius of convergence of the power series?, the response is a scowl and a muttered asymptotic series!, followed by a hasty scribbling of the inverse power series for a Bessel function. But of course, that's all built-in to MATLAB, so one never has to use it any more.

Humbug! ... Arithmurgy [number-crunching] hasn't replaced asymptotics; rather, number-crunching and asymptotic series are complementary and mutually enriching.

The article refers several times to a longer article (98 pages) entitled The Devil's invention: Asymptotics, superasymptotics, and hyperasymptotics, which I think I would have to put on my reading list just for the title, even if I didn't know that I found the author so entertaining. But given the state of my reading list right now, perhaps it will have to go onto the longer list rather than the shorter one.

Most of my reading time recently has gone to technical material (excepting Tools for Teaching by B. Davis, which is the textbook for the teaching course). This is unfortunate, because what I read tends to strongly influence what things I think about for casual conversation. So if I'm only reading about number theory, Gershgorin disks, and the error analysis of non-self-adjoint PDE eigenvalue problems, I tend to end up talking about those things to anyone who will listen. Those topics are really cool, and I do have buddies who are willing to go have a coffee, or a cider and some pizza, and have a conversation that wanders in and out of technical realms. Nevertheless: a little light reading in the evenings seems to make conversations in the day go more smoothly.

Multiples of q

You probably learned this trick in grade school: to see whether a number is divisible by three, add all the digits and check if the sum is divisible by three. The same thing works for nine. Ever wonder why?

Actually, three and nine are just special cases of something very general. What does it mean if q divides n evenly? It means that there is ultimately no remainder in the division. So if r represents the remainder at each step of long division, then I should be able to write

  r = 0
  for d = digits
    r = mod(r * 10 + d, q)
  end

If r = 0 at the end of the loop, then the number represented by the given string of digits is evenly divisible by q. Now, it's not too difficult to show that I could rewrite the update of r as

  r = mod(r * (10 + k * q) + d, q)

for any integer k. In particular, this means that I could let p = mod(10,q), and write the update formula as

  r = mod(r * p + d, q)

The trick with division by three and division by nine works because mod(10,3) = mod(10,9) = 1.

Once you realize how this works, you can think of quick checks for all sorts of divisibility properties. For example, take the number 86834; is it evenly divisible by 11? Yes! I can tell because 10 and -1 are equivalent modulo 11, so that my remainder update looks like

  r = mod(-r + d, 11)

I can thus check divisibility by 11 by looking at alternating sums of the digits. So since 8 - 6 + 8 - 3 + 4 = 11 is divisible by 11, so also is 86834. Or if I wanted to know whether a number written in octal (base 8) was divisible by 7, I could add the octal digits and see whether seven divided the sum.
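
Here is the whole trick as a few lines of C (the function name and the examples are mine): run the remainder update over the digit string in whatever base you like, and the number is evenly divisible by q exactly when the final remainder is zero.

  #include <stdio.h>

  /* Run the update r = (r*base + d) mod q over a string of digits.
     The number is evenly divisible by q exactly when the result is 0. */
  static int remainder_of(const char *digits, int base, int q)
  {
    int r = 0;
    for (const char *p = digits; *p; ++p)
      r = (r * base + (*p - '0')) % q;
    return r;
  }

  int main(void)
  {
    printf("86834 mod 11 = %d\n", remainder_of("86834", 10, 11));  /* 0 */
    printf("86834 mod  3 = %d\n", remainder_of("86834", 10, 3));   /* 2 */
    printf("16 (octal) mod 7 = %d\n", remainder_of("16", 8, 7));   /* 0 */
    return 0;
  }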

Or I could play around for five minutes to find that the binary representation of any multiple of three matches the regular expression

  (0 | 1 (01*0)* 1)*

Hooray! A meaningful regular expression which isn't absolutely trivial to parse! So such toy problems in number theory not only amuse me, but they also serve as a useful source of exercises for my students. We'll see whether any of them appreciate the entertainment value as much as I do.
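
For what it's worth, that regular expression is just one way of writing down the three-state machine that tracks a binary number's value mod 3 as it reads bits from the most significant end. A quick C sanity check (the function and the driver are mine) runs the machine over 0 through 999 and compares against n % 3:

  #include <stdio.h>

  /* Reading a binary number most-significant bit first, bit b sends the
     state s (the value so far, mod 3) to (2*s + b) mod 3.  State 0 is both
     the start state and the accepting state; leading zeros leave it alone. */
  static int dfa_says_multiple_of_3(unsigned n)
  {
    int s = 0;
    for (int bit = 31; bit >= 0; --bit)
      s = (2 * s + (int) ((n >> bit) & 1)) % 3;
    return s == 0;
  }

  int main(void)
  {
    for (unsigned n = 0; n < 1000; ++n)
      if (dfa_says_multiple_of_3(n) != (n % 3 == 0))
        printf("mismatch at %u\n", n);             /* should never print */
    printf("checked 0..999\n");
    return 0;
  }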

  • Currently drinking: Mint tea

Friday, September 16, 2005

Penguins

A friend pointed out this, which I think is one of the most entertaining uses of GIF animation that I've seen in a long time.

Monday, September 12, 2005

A little long division

Suppose p and q are relatively prime integers, and 0 < p < q. If q has any prime factors besides 2 and 5, then a decimal expansion of p/q will eventually start to repeat. How long will the repeating part be?

Looking at the sequence of digits is actually a very difficult way to approach this problem. It's much simpler to look at the sequence of remainders. Remember how long division works. At every step, we have a remainder r, 0 <= r < q. To get the next digit in the expansion, we figure out how many times q goes into 10*r; this gives us a digit and a new remainder. In C++ code, I might write the long division algorithm this way:

  /* Compute ndigits decimal digits of p/q */
  r = p;
  for (i = 0; i < ndigits; ++i) {
    d[i] = (10*r) / q;   /* next digit of the expansion */
    r    = (10*r) % q;   /* remainder carried into the next step */
  }
where those unfamiliar with C should remember that C's integer division gives an integer result (in C, we have 1/2 == 0, but 1.0/2 == 0.5). The operator % gives the remainder of the division. If we keep computing digits, we will eventually see one of two things: a zero remainder, or a repeat of a remainder seen in a previous step. In the former case, p/q can be written as a non-repeating decimal; in the latter case, we have a repeating fraction, where the length of the repeating part is the same as the distance between occurrences of the same remainder in the division algorithm.
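
Here is a hedged C sketch of that observation (the function name and the sample fractions are mine): record the step at which each remainder first appears, and the first repeated remainder tells you both where the repeating part starts and how long it is.

  #include <stdio.h>
  #include <stdlib.h>

  /* Track when each remainder first appears while expanding p/q in base 10.
     A zero remainder means the expansion terminates; a repeated remainder
     marks the start and the length of the repeating part.  (A sketch: q is
     assumed to be small enough that an array of q ints is no burden.) */
  static void expansion_shape(int p, int q)
  {
    int *seen = calloc(q, sizeof(int));   /* 0 = not seen; else step + 1 */
    int r = p % q;
    for (int step = 0; ; ++step) {
      if (r == 0) {
        printf("%d/%d terminates after %d digits\n", p, q, step);
        break;
      }
      if (seen[r]) {
        printf("%d/%d: non-repeating prefix of %d digits, period %d\n",
               p, q, seen[r] - 1, step - (seen[r] - 1));
        break;
      }
      seen[r] = step + 1;
      r = (10 * r) % q;
    }
    free(seen);
  }

  int main(void)
  {
    expansion_shape(1, 6);   /* 0.1666...   : prefix 1, period 1 */
    expansion_shape(1, 7);   /* 0.142857... : prefix 0, period 6 */
    expansion_shape(3, 8);   /* 0.375       : terminates         */
    return 0;
  }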

The operator % is sometimes called the modulo operator, but it's actually an odd name. In much of computer science, we speak of the mod operator, and mean operator that computes the remainder in a division. By contrast, if you're working with abstract algebra or number theory and you say that two things are equivalent modulo something, you mean that there is an underlying relation that divides your structure into equivalence classes, and the two things under investigation happen to lie in the same equivalence class. The usual example of this is time: 11 pm today and 11 pm yesterday are not the same instant in time, but they are equivalent modulo the twenty four hour cycle we use for time keeping. The connection between modulo-as-operation and modulo-an-equivalence is that when one takes a remainder after dividing by q, one gets a single canonical representative of a whole equivalence class of numbers which differ from each other by some multiple of q. It's like using 11 pm today to refer to 11 pm any day; they're all equivalent modulo a 24 hour day.

Anyhow. The C code for long division computes a sequence of remainders, which look like r[k] = (p*10^k) % q. Any factors of 2 or 5 in q only contribute a non-repeating prefix to the expansion, so let's divide them out and assume from here on that q is relatively prime to 10. Then every remainder is prime relative to q, and the repeating part of the fraction starts right away. Saying the remainders are prime relative to q is the same as saying they are units in the ring of integers modulo q (Z_q) -- that is, they have inverses in Z_q. For instance, 4 and 7 are relatively prime, so 4 is a unit in Z_7; the inverse is 2, since 4*2 = 7+1 = 1 modulo 7. The set of units in Z_q is an important object in number theory, so important that its size is given a special name: the number of natural numbers less than q which are prime relative to q is written phi(q); it is called Euler's phi function or Euler's totient function.

Hmm. I assume totient is related to quotient, but I actually don't know the origin of this word. I'll have to look that up.

One reason why phi(q) is so important is that it connects the structure of multiplication and addition in Z_q. For example, for any m which is a unit in Z_q, I have m^phi(q) = 1; for instance, phi(7) = 6, and 4^6 = 4096 = 1 modulo 7. That means, for instance, that I have an easy way to write m^(-1); it's just m^(phi(q)-1). There are all sorts of other entertaining things you can do with this sort of manipulation -- including figuring out something about the question we originally asked, which is how long the repeating part in the decimal expansion of p/q will be. Without loss of generality, I can just consider 1/q; for more general p/q, I will generate a sequence of remainders which has the same period as for 1/q. Anyhow, if I start from the remainder 1 and run long division, the remainders will generate a cyclic subgroup of the group of units in Z_q -- I'll keep generating new remainders until at some point I get the remainder 1 back. Since I'm always multiplying by 10, the cycle thus formed is sometimes called the orbit of 10 in the group of units in Z_q. The order of this cycle -- the number of elements -- is the number of digits in the repeating part of the decimal expansion. There is a wonderfully versatile little theorem, due to Lagrange, which says that the order of a subgroup must divide the order of the group in which it is contained; in this case, that means that the number of digits in the repeating part must divide phi(q).

Cool, no? So the number of digits in the repeating part of a decimal expansion of p/q must divide phi(q). Actually, if you walk through the arguments I just made, you find that this is true for any base. So, for example, take q = 47; since phi(q) = 46 = 2*23, I can take an expansion in any base and find that the length of the repeating part is either 0, 1, 2, 23, or 46. Base 2, base 10, base 16 -- they all have to generate repetitions with a period that divides 46. Which divisor of 46 you actually get depends on the base in which you do the expansion, though.
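
To watch Lagrange's theorem do its work, here is a small brute-force C check (entirely mine, and only sensible for small q): compute phi(q) by counting, compute the period of 1/q in base b as the multiplicative order of b modulo q (which assumes gcd(b, q) = 1), and note that the period always lands on a divisor of phi(q).

  #include <stdio.h>

  static int gcd(int a, int b) { return b ? gcd(b, a % b) : a; }

  /* Euler's totient by brute force: count 1 <= k < q with gcd(k, q) = 1. */
  static int phi(int q)
  {
    int count = 0;
    for (int k = 1; k < q; ++k)
      if (gcd(k, q) == 1)
        ++count;
    return count;
  }

  /* Multiplicative order of b mod q, assuming gcd(b, q) = 1: this is the
     length of the repeating part of 1/q written in base b. */
  static int order(int b, int q)
  {
    int r = b % q, n = 1;
    while (r != 1) {
      r = (r * b) % q;
      ++n;
    }
    return n;
  }

  int main(void)
  {
    int q = 47;                       /* phi(47) = 46 = 2 * 23 */
    for (int b = 2; b <= 16; ++b)
      if (gcd(b, q) == 1)
        printf("base %2d: period of 1/%d is %2d, a divisor of phi(%d) = %d\n",
               b, q, order(b, q), q, phi(q));
    return 0;
  }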

So while predicting the length of the repeating part of p/q is difficult in general, we can quickly say that it divides phi(q). When q is prime, phi(q) = q-1, and so we could potentially get a repeating part as long as q-1 digits. In fact, we often do get a repeating part as long as q-1 digits... which is why writing numbers as repeating decimals isn't a very good choice of representation for most q. If it takes log10(p) + log10(q) < 2*log10(q) digits to represent a rational number in the form p/q, but as many as q-1 digits to represent the number as a repeating decimal, then the repeating decimal representation is exponentially longer than the rational number representation. Not good! At least, it's not good if you want to actually do arithmetic; if you want to play with number theory on the other hand, or with finite state machines, then this is lots of fun.

As an amusing aside, you can use the development above to write a wholly impractical scheme for cracking the RSA cryptosystem. You see, the RSA system is secure because we don't have good algorithms for computing the Euler totient function for large numbers. Yes, I know you've heard that the reason that RSA is secure has something to do with factoring, but the difficulty of factoring large numbers is really just a proxy for the difficulty of computing Euler's totient. The actual connection is that if m and n are large prime numbers then there's a simple formula for the totient of the product: phi(m*n) = (m-1)*(n-1). But if we do long division in a few different bases, the lengths of the repeating parts together reveal phi(q), or at least a close enough cousin of it to break the key. So the totient function is actually easy to compute, and any talented grade schooler who knows about long division and repeating decimals can be taught how to crack RSA, right? Well, not exactly. The fact that computing phi(m*n) in this way would take time on the order of m*n -- rather than on the order of log(m*n) -- means that your precocious grade schooler would take at least 130 years to crack a 32-bit key, assuming that he never slept and could get a digit a second right.

A fast computer using a more effective attack algorithm could break a 32-bit RSA key in a matter of minutes (maybe less, I'm not sure). But those algorithms tend to run into the same explosive cost increases as the long-division cracking algorithm that I just proposed, so the key doesn't have to get that much longer to be much more secure. Currently, most transactions are guarded by keys at least 128 bits long, and the keys I use for remote login have either 1024 or 2048 bits. So unless you want to go into number theory as a career, you probably will find that the easiest way to compromise my keys is to completely go around them -- by finding my office and kicking me in the shins until I'm willing to hand them over, for instance.

Tuesday, September 06, 2005

Honest exercise

Alas, we didn't get through all the Common Lisp exercises I'd planned for section today. This is too bad, because I would have liked to have people think about the last exercise: given a collection of (programmer program) pairs, how can you find the sets of programmers with structurally identical codes? It took me about five minutes to do this one (though this may be because I already had a solution in mind when I wrote the question). I mention the solution here:


(defun code-to-struct (code)
  "Replace all atoms in a code with 'foo in order to compare structures"
  (when code
    (if (listp code)
        `(,(code-to-struct (car code)) . ,(code-to-struct (cdr code)))
        'foo)))

(defun cluster-structure (codes)
  "Find programmers who have structurally identical codes."
  (let ((ht (make-hash-table :test #'equal))
        (groups nil))
    (dolist (code codes)
      (push (car code) (gethash (code-to-struct (cdr code)) ht)))
    (maphash #'(lambda (key val) (push val groups)) ht)
    groups))

Let me restate that in case it went by too fast: it's possible to write a Common Lisp code which will find cheaters, and to do it in under 20 lines of code. Furthermore, that code is reasonably efficient: unless the Lisp implementation chooses a really terrible hash function, the cost of the above code is linear in the sum of the lengths of the programs. It doesn't take long to write, it doesn't take long to run, it doesn't take long to improve.

When I took compilers as an undergrad, a bunch of folks in the class got caught plagiarizing each other's codes. This is, I think, one of the stupidest possible plagiarism cases: here you are in a class, the whole point of which is to automatically analyze programs. You are being taught by someone who is, presumably, very good at writing tools to automatically analyze programs. What, it seems like writing a program to find similar programs would be an impossible task for this guy? In Common Lisp, it takes an overworked graduate student five minutes to write a detector for structurally identical codes; it would take me another fifteen minutes to modify code-to-struct to catch workarounds (e.g. re-ordering of instructions or other superficial changes to program structure). It would take somewhat longer for other languages, but not so long as you might think -- this sort of detector doesn't need to know about every syntactic bell and whistle in a given language (and anyhow, it's possible to find complete grammars for many languages for free on the net).

Don't cheat in a compilers class seems to me to be only slightly more difficult to figure out than don't piss on the electric fence. Here's hoping that this class feels the same...

Monday, September 05, 2005

Law and linear algebra

The Economist has an article on using network analysis tools based on linear algebra to identify important legal cases. This is the same technology used by some search engines to find important web sites (hubs and authorities).

I learned about this from a post on NA Digest.

  • Currently drinking: Coffee

Cycling for digits

There's a computer language called Nickle which grew out of something Keith Packard started working on two decades ago because he needed to do a little arbitrary-precision calculation. One of the entertaining features of the language, particularly if you're trying to teach people about compilers, is the treatment of rational numbers. Every rational number can be written as a decimal expansion that starts with some fixed digit string, and then has a repeating part, and this is the form Nickle uses for printing and (I assume) reading rational numbers. For example, the number 5/6 in Nickle would print as 0.8{3}, where the stuff between braces indicates the repeating part.

An entertaining exercise is to write a routine to read a Nickle number and convert it into the form p/q, where p and q are (relatively prime) integers. This turns out to be pretty simple to do in Lisp. Also entertaining is to write a routine that prints a Nickle number. If you find these entertaining, too, you can check out the class page in a couple weeks. I'm not going to post my solution until I've given the exercise for folks to think on between weeks three and four, but I will say that it fits in a page and a half of Lisp, including comments -- the front of the page for reading Nickle strings into Lisp, the back for generating Nickle strings from Lisp rationals.

I think it would be highly entertaining to work through the consequences of different internal representations of these numbers -- for example, if you represented Nickle numbers internally as strings like 0.{3}, then even something as simple as equality becomes nontrivial (since 0.{3} = 0.{33}, even though the two strings are different). You'd probably want to convert the strings into a unique minimal form for comparison, which boils down to a problem in state machine compression. But then you might want to do something different for actual arithmetic. Unfortunately, while I might find it entertaining, these sorts of issues are not likely to be of much interest to the class. So I won't say anything about it unless someone asks.

What I might ask about -- because it actually provides a very entertaining exercise in manipulating finite state machines -- is how you can recognize that a repeating decimal expansion can be written as p/q for some specified q. And what if I make the strings finite by allowing rounding? If I allow rounding, it's most natural to introduce nondeterministic finite automata (NFAs) and then convert them to deterministic equivalents (DFAs); this is also the usual chain of tools one uses in building tokenizers for programming languages. If you find such things amusing, I recommend the following two special cases: write a finite state machine to recognize decimal expansions of p/6 which are correctly rounded in the last digit, and another for p/7. You may be able to do the first case in your head, but the latter isn't so obvious. Or is it? I claim that, having done it once, I can now recognize by inspection a correctly-rounded decimal expansion of p/7 for any p. Your calculator probably displays enough digits that you can see the pattern just by playing around with it.

A similar sort of pattern holds in the case q = 17, actually, and in the case q = 47. But observing those patterns on a pocket calculator might be a little harder, unless your calculator displays more digits than does mine. And while I can now remember enough to pretty easily write down what the decimal expansion of p/7 looks like for any given p, remembering the same information for 17 or 47 is not so simple or useful.

But the washer just buzzed, which means that I should move my quilt over to the drier and get back to work.

Saturday, September 03, 2005

On the radio

The AC power provided by the local power company runs at a bit over 60 Hz. Consequently, most of the clocks in the house (excepting the battery-powered ones) end up running fast. They all got reset to the correct time when daylight saving time started; now they're about fifteen minutes out of sync, so my alarm goes off at 7:15, even though the clock thinks it's 7:30. I usually don't mind this too much. I don't get out of bed any earlier, but the extra few minutes gives me a chance to wake up gradually while I listen to Morning Edition on NPR.

Listening to the radio in the mornings is a lot more fun when there's happy news. For the past few days, I've awakened in the morning to news of the aftermath of Katrina. I was in New Orleans for a conference less than a month ago; there were a lot of friendly folks there, and I wish I knew that they were still there and still doing okay. But I don't. So I make my monetary contribution and hope that it can help someone, even if I fear that it won't.

Listening to the aftermath of Katrina scares me. It took too long for help to get there, and that means that those without the resources to get out are in a pickle. Listening to the radio this morning, I heard an interview with one man who had carefully planned in advance what he would do if the big one hit. The big one to me has come to mean the next big earthquake; New Orleans has just had its big one, and in the Bay Area we all spend a little time waiting and planning for ours. Hearing about the slow response to the situation in New Orleans makes me think poor people; it also makes me think are we next?

Thursday, August 25, 2005

Fall 05

Classes start next week. How about that?

A couple years ago, I was the TA for the local graduate parallel computing course. That was a lot of fun, though it was also a lot of work. Kathy Yelick has done a lot with performance tuning and language design for parallel machines; and I'm a numerical analyst who happens to know something about systems. So I think the folks taking the class were exposed to a pretty good range of topics, more than they might have been if (for example) I was the TA for my advisor.

Since the parallel computing course, I haven't been formally assigned as a TA. I've had a number of students come by my office to ask questions related to projects in numerical linear algebra, and would have probably been the TA for the numerical linear algebra course with Kahan a couple semesters ago if Jim hadn't nixed the idea (perhaps correctly). However, there is -- correctly -- a teaching requirement for graduate students in the CS PhD program; and because it was a graduate course, the parallel computing class didn't fulfill that requirement for me. So I will be teaching this coming semester, in addition to wrapping up my thesis, writing job applications, and trying to make forward progress on my research.

It looks like it will be a busy semester. And it starts next week! And I'm delighted about it. I'm sure I'll be busy, but won't it be fun? I work well when I have several distinct tasks to occupy my time: a few hours proving theorems, a few hours writing code, and a few hours scribbling notes. And yes, a few hours to eat and exercise goes in there, too. A few hours of teaching will fit nicely, I think.

The class for which I'm the TA is CS 164: Compilers and Language Design. Yes, this is a somewhat peculiar assignment: if we went by background, I would probably first be assigned to teach a numerical computing course, an algorithms course, or even an operating systems course. But all the numerical computing courses at Berkeley live in the math department, and thus wouldn't fulfill my requirement; and I'm certainly sufficiently broadly qualified to be a TA for many courses in CS, including compilers. It's been a while since my undergrad course, but I do know the material -- I've even used it for my work (academic and commercial) a few times. Besides, I really like this material. The theory is elegant, accessible, and obviously and immediately practical (with the great advantage that I'm thus in a position to be able to crank out a big chunk of the class project code in one or two evenings -- maybe I'll write later about writing an LL(1) parser generator in Common Lisp). And the tradeoffs in language design often center around aesthetic issues of how things are most naturally expressed, issues that can be appreciated by almost anyone who has written a program.

There are about fifty students in the class. So I'll be responsible for two sections of about twenty-five each. With that many people, I would be very surprised if there weren't other people who took just as much joy in the subject as I do. I'd probably also be surprised if there weren't a few who turn out to be the most irritating sorts of language lawyers, but it takes all types -- and I have quite a few friends among the irritating language lawyer crowd.

Anyhow, one of the current requirements for GSIs at Berkeley is that we should take an online course on Professional Standards and Ethics for Graduate Student Instructors. So I came to the cafe to sit and work through the first module. It was, I suppose, an educational experience. But I've already decided that I have some comments for the course feedback system, once I finish the last section:

  • Yes, we're not supposed to discriminate based on race, creed, sexual preference, ethnicity, hair style, eye color, shoe size, taste in music, etc. Do you have to list all those things more than once? At some point, the eye tires of lists; I don't know about other readers, but I have to force myself to actually pay any attention beyond the first item or two, and in this case I see no point in bothering. You're supposed to be teaching about teaching -- do you really think that legalese is the best language for pedagogy?
  • Do not write about groups of individuals. What else would make up the groups? Garden vegetables, perhaps? You don't clarify anything with the extra words. This is supposed to be a course on pedagogy, not a course on advertising or professional business obfuscation. Let your yes be your yes, and let your group be your group! Otherwise people like me will get bees in their bonnets and run in circles crying Ow! Ow! Ow! rather than actually absorbing the rest of your sentence.
  • Similarly, what's the point in writing that discrimination is the granting of preferential treatment to some but not to all? It's a bad definition. Preference implies exclusivity; the phrase to some but not to all is just so much extra verbiage. Even worse is the phrase granting of: treatment is already a noun, you know. And if you remove the excess, this sentence defines discrimination as preferential treatment, which seems to me to be a pretty obvious definition. Bah!
  • Bees! Ow! Ow! Ow!

With this sort of training, I'm sure that I'll be enabled to act effectively as a more useful and pedagogically sound instrument toward student learning and fulfillment. Or maybe I'll just become a crotchety ass.

  • Currently drinking: Black coffee

Wednesday, August 24, 2005

Local names

Modern programming languages support the notion of restricted lexical scope: the meaning assigned to a name can be made to hold only over restricted sections of the program text, allowing the programmer to re-use the same symbol to refer to different things depending on the context. It's hard to overstate how useful this is. It's so useful, in fact, that we fake it when we don't have it; if you've ever seen an old C program with function names like my_package_foo() and my_package_bar(), this is exactly what's going on.
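
In C terms (the names below are mine), the blunt instruments the language does provide are static, which restricts a name to one source file, and braces, which restrict a name to one block; the my_package_foo() prefixes are what people fall back on when those aren't enough.

  #include <stdio.h>

  static double running_total = 0.0;  /* file scope: invisible to other files */

  static void add_sample(double x)    /* also local to this file */
  {
    running_total += x;
  }

  int main(void)
  {
    add_sample(1.5);
    add_sample(2.5);

    { /* block scope: this x exists only between these braces */
      double x = running_total / 2.0;
      printf("mean so far: %g\n", x);
    }
    /* here the name x is free to be reused for something else */

    return 0;
  }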

It's more difficult to restrict the scope of a definition in an ordinary prose document. We can fake it by saying for this section only, let x be ..., but it's awkward, and readers get to the start of the next section and forget that you no longer mean the same thing by x. This is, I think, a problem with how math is often presented: usually we have an informal way of introducing local variables (anything defined inside the proof of a theorem is usually local to that proof, unless the author makes noises about it being important elsewhere). But sometimes I want a definition just for a few lines, and it's hard to do that with ordinary English text.

When you don't have enough control over where a particular symbol means what, you're left constantly grabbing for more symbols. Programmers sometimes refer to this problem as namespace pollution. A lot of the tedium of mathematical writing -- and mathematical reading, for that matter -- boils down to managing namespace pollution. After a few years, most people get used to it: yes, my paper involves five different concepts, all of them attached to the label B, but I can figure out from context which is which -- and if not, I can always grab another letter. Right? Except that, in any piece of writing longer than a few pages and dealing with more than a few concepts, it is terrifyingly easy to run through all of the Greek and English alphabet (lower case and upper case). It's even easier to run out of symbols if you choose not to bewilder your readers by letting n be a small quantity, epsilon be an integer going to infinity, t be a quantity with units of length, and x be a quantity with units of time.

This issue of notation and namespace pollution is a constant nuisance to me when I'm writing proofs and when I'm reading codes written by mathematicians who aren't experienced programmers. When I'm writing proofs, I run out of symbols too quickly, because I start with a quasi-English notation in which the scope of the assigned meanings is clear. Let C mean this thing in block 1; let it mean that thing in block 2; and let it mean something else in block 3. But when blocks turn into paragraphs, I have a problem. When I'm reading code written by inexperienced programmers (not just mathematicians, but also engineers and physicists), I see the same problem in reverse: they clearly started from a mental model in which they didn't have any locality of definitions, and so their codes are crowded with global variables, or with local variables with single-character names which are used for seven different purposes over the course of a subroutine. Bah! It makes a code difficult to understand and nearly impossible to debug.

I dare say that, if we had some way of indicating locality of definitions in spoken and written English, it would be a lot easier to convince people when they were making logical fallacies (particularly those involving circular reasoning). But perhaps this is just wishful thinking.

Tuesday, August 23, 2005

Essays interspersed

Richard Gabriel, Paul Graham, and Peter Norvig all have sites of essays. Norvig has more technical stuff than Graham or Gabriel, but all three collections are worth browsing. I've already mentioned Graham's essays, but if you're unfamiliar with the other two: Norvig is Director of Search Quality at Google, but I know him better as the author of The AI Book (together with Russell); and I first learned of Richard Gabriel from his essay on the rise of Worse is Better, which you can learn about from his site. All three are also Lisp hackers extraordinaire, which is why I was fishing through their writing.

I particularly liked Norvig's essay Teach Yourself Programming in Ten Years. Exactly. Norvig's essay is reminiscent of the sentiment expressed by Keith Devlin in his essay on staying the course. The difference between the popular misunderstanding of mathematics and of computer science as disciplines is that so many people who know nothing of mathematics think that it must be inhumanly difficult, while so many people who know nothing of computer science think it's possible to become an expert in three days. Or perhaps thirty days at the outside. Going through a formal computer science education, I've been told, is pointless: you learn so many things that you'll never use! The same logic -- or lack thereof -- would indicate that a professional carpenter really only needs a screwdriver and a hammer, or that a professional writer in the English language needn't be able to form compound or complex sentences. There is such a thing as useless knowledge, which is either so contorted or so fundamentally mistaken (or both) that it represents a backward step with no potential for interesting development or discovery. But for the most part, discarding a misunderstood tool as useless is a sign of a certain lack of imagination.

You needn't be a professional to enjoy and appreciate either mathematics or computation. In the case of computer science, in fact, being a computing professional or having a computer science degree seems to be surprisingly uncorrelated with enjoying and appreciating computer science. There are a variety of folks without formal CS degrees who are nevertheless highly competent both as programmers and as computer scientists -- in fact, CS is a sufficiently young discipline that most of the older generation of CS faculty can be held up as an example of this. And there are also a variety of folks with formal CS degrees who are nevertheless fairly incompetent both as programmers and as computer scientists. If you spend all day sitting in front of a computer and writing tedious, redundant code, you should really wonder: couldn't I make the computer do this? But a lot of folks don't (or, rather worse, they do, but they're overruled by management). Yes, in twenty-one days you can probably learn to write simple programs. But that type of programming is no more the end goal than summing columns of numbers is the end goal of mathematics, or pounding nails into boards is the end goal of carpentry.

Take a look, too, at Mob Software, by Richard Gabriel. I think he has it right, too. And this is the hidden potential of open source as a software model: that it can inspire amateur efforts, in the original and best sense of an amateur as one who does a thing for the sheer love of it. Up with open source projects, with ham radio, with mathematical puzzles and with the coffee mugs that children make for their parents! The drive toward professionalism may make money, but the drive toward amateurism makes things worth doing.

Tuesday, August 16, 2005

Day in the life

  • Get up. Coffee. Tease flatmate.
  • Taylor's theorem, integration by parts, Cauchy-Schwarz. Rinse and repeat. Get error estimate.
  • Check numerically whether estimate holds for simple test case. Discover algebra errors in the proof. Correct proof.
  • Discover algebra errors in the test program. Correct test program.
  • Discover errors in the compiler. Swear. Let out cat, who is spooked. Lower optimization level.
  • Decide result looks too simple not to be standard. Search through references for something similar.
  • Discover promising reference. Wade through ten pages of dense functional analysis, decide that it's not quite the same, but close enough. Add citation to my notes. Let the cat in.
  • Add page to thesis. Get up, stretch, dance in circles to a song on the radio. Spook the cat, who I let out again.
  • Lunch. Let cat back in.
  • Go onto campus. Check mail. Read blogs for half an hour.
  • Add feature to HiQLab.
  • Index error, index error, type error, core dump. Swear. Rinse and repeat.
  • Fix index errors. Run test case. It fails. Swear.
  • Discover error in the test case. Fix it, discover a sign error meant I was computing the negative of the quantity I wanted. Fix sign error. Test passed! Code crashes on exit.
  • Swear. Spend half an hour figuring out and working around quirk in the dynamic linking system which led to the crash.
  • Walk home. Talk to Winnie on the phone as I go. Make dinner, which is salad, since there's nothing much in the fridge and I actually remembered to eat lunch.
  • Recommence work on HiQLab. Stall out. Take a tea break.
  • Tease flatmate.
  • Decide to work on compilers stuff for a while. Write an LL(1) parser generator in Lisp. Chortle after I reduce the code size to under 200 lines. All tests passed! Realize that it's 1 am.
  • Brush teeth. Fall into bed.

Sunday, August 14, 2005

Starts with P

  • Public transportation

    I visit Winnie in San Jose most Saturdays. The full trip proceeds in three legs: a BART ride to Fremont, a bus ride to a transfer station at the Great Mall transit center, and a light rail ride to a stop near Winnie's apartment. Depending on how long I spend waiting between each leg (and whether I meet Winnie at her apartment or at Great Mall), the trip takes between two and three hours. Usually it takes about two and a half hours, which means that on Saturdays I spend a total of five hours or so riding the local public transportation.

    I do different things in those five hours, depending on my mood and on the contents of my backpack. Some days, I read; some days, I listen to the radio; some days, I sleep; and some days, I stare out the window and ponder. When I walked out the door this morning, it felt like a reading day, so I tucked David Copperfield into my bag.

    I didn't touch David Copperfield after that. Here's the story of what I did instead.

  • Projections

    Pretend for the moment that you're interested in the solution to a large system of equations. The system is so large that it's inconvenient to do anything direct -- there are just too many equations and too many unknowns. What do you do to get an approximate solution?

    There really is only one thing to do: get rid of some unknowns. Perhaps you know a little about your solution in advance, so that you're able to say that some components are going to be almost zero; so you pretend that they are zero. Now, if you started with equal numbers of equations and unknowns, you create a problem by assuming some of the unknowns are zero: there are too many equations now, and they can't all be satisfied. So you throw away some equations, too, until you have a system that you can solve again. Solving that smaller system gives you an approximate solution, which you can then evaluate.

    Now, you can usually rearrange the unknowns in your problem before doing anything else, so that you really have some confidence that the unknowns you discard will be almost zero. And you can usually rearrange the equations, too, so that the equations you discard are the ones that don't tell you much. For example, your original system of equations might be the stationarity conditions that have to be satisfied at a minimum of some energy (i.e. the conditions that say derivative equals zero at an extremum); in this case, your reduced set of equations might be the stationarity conditions for minimizing the energy subject to the constraint that the components you threw away are zero. But the idea behind this method of approximation, of assuming something about the solution so that you can throw away a bunch of the equations and unknowns, is very general. The high-level idea goes under many names, but I usually call any method of this form a projection method (or sometimes a Galerkin projection method). And projection methods, together with perturbation methods, make up a toolkit that I use almost constantly. (There's a bare-bones linear-algebra sketch of the idea at the end of this item.)

    Projection methods are used for solving partial differential equations (which are infinite-dimensional problems), as well as for solving large finite-dimensional linear systems of equations. If you've heard of the finite element method, or of the conjugate gradient method, they're both examples. If you've ever used a truncated Fourier series to approximate an ODE or PDE solution, you used a projection method there, too. The cool thing about classifying these methods together is that you can analyze lots of different methods in a unified framework -- which, if you believe in the power of carefully directed laziness, is a very good thing.

    If you don't believe in the power of carefully directed laziness, then neither mathematics nor computer science is probably the discipline for you.

    Anyhow. The task of analyzing any of these projection methods really can be partitioned into two sub-tasks. First, one figures out how nearly the approximate solution matches the best approximation consistent with the simplifying assumptions: how well do you approximate the unknowns that you didn't throw away? Then one figures out whether the approximating subspace contains a good estimate for the true solution: are the unknown values that you discarded really small? The second step is problem- and method-dependent, but the first step can be done in a general setting.

    At least, the first step looks like it can be done in a general setting. In practice, you can't get very far without using some additional hypotheses. The strongest results hold when the system of equations is symmetric and positive definite (in a PDE setting, where positive definite systems can nevertheless be arbitrarily close to singular, you need coercivity). If the equations are nonsymmetric but you still have coercivity, then you can say a little less, but the result is mostly the same. But once you've lost coercivity, suddenly there's much less you can say. If you make an additional assumption (called an inf-sup or Babuska-Brezzi condition in the PDE setting; a lower bound on a minimum singular value in the finite-dimensional setting), then you can at least prove that your approximate problem has a unique solution which is within some factor of being optimal -- but that factor may be much larger than it was in the positive-definite case. Besides, proving an inf-sup condition often involves problem-specific work, which is work that we'd like to avoid as much as possible in the interest of constructive laziness.

    There's something bothersome about changing to a much weaker bound the instant that coercivity is lost. There are cases when a minor change makes the difference between an easy problem and an impossible one; but this is not such a case. For the problems I think about, a mild loss of coercivity should not be a mortal sin. How can I develop error bounds which transition gradually between the theory that holds in the coercive case and the theory that holds for general indefinite problems?

    This is what I was thinking about for a big chunk of the train rides today, and I think I have a reasonable answer. It generalizes and unifies a hodge-podge of results that I know about in the finite- and infinite-dimensional case, and it involves some mathematical ideas that I find very beautiful (involving perturbation theory and the behavior of the field of values of an operator). I doubt any of it is new in a fundamental way, but I'm not sure these results have been cleanly tied together before. At any rate, I'm excited about it.
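
    Postscript: here is the bare-bones linear-algebra sketch promised above. The notation is mine, and this is a caricature rather than a theorem. Start from a big square system A x = b, with A of size n-by-n. Choose a tall, skinny matrix V (n-by-m, with m much smaller than n) whose columns span the unknowns you decide to keep, and a matrix W of the same shape whose columns pick out the combinations of equations you decide to trust. The reduced problem is the m-by-m system

      (W^T A V) y = W^T b,      x \approx V y.

    Galerkin projection is the special case W = V. If A is symmetric and positive definite, the Galerkin approximation V y is the best approximation to x from the range of V as measured in the A-norm; with coercivity but without symmetry, the same statement holds up to a constant factor; and once coercivity is gone, you need a lower bound on the smallest singular value of W^T A V (the inf-sup condition, in the PDE setting) even to guarantee that y exists and is within some factor of optimal.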

  • Polya

    There was already a book in my backpack when I threw in David Copperfield. George Polya's How to Solve It is just as classic as Copperfield (though in a different setting). It's also much shorter, which is why, after pondering projections for a while, I chose to read it over the Dickens book. How to Solve It is about how one goes about solving mathematical problems -- or other problems, for that matter. The techniques that go into such problem-solving are collectively known as heuristic (which is distinguished from the usual definition of heuristic reasoning, since part of the problem-solving process is the rigorous argument that comes after the initial attack). Polya wrote particularly for teachers of mathematics, and I admit that my decision to buy my own copy was partly influenced by a quote from a review:

    Every prospective teacher should read it. In particular, graduate students will find it invaluable. The traditional mathematics professor who reads a paper before one of the Mathematical Societies might also learn something from the book: 'He writes a, he says b, he means c; but it should be d.'

    -- E.T. Bell, Mathematical Monthly, December 1945.

  • Practical Common Lisp

    I have a few Lisp books on my shelf, of which I only really use one: ANSI Common Lisp by Paul Graham. The other books aren't bad, but they're more dedicated to how to program in general, as opposed to how to program in Lisp. But there's something limiting about having just one book on a language, just as there's something limiting in only having a single dictionary, a single history book, or a single book on almost any other topic. The local book stores stock two Lisp books: Graham's book, and Practical Common Lisp by Peter Seibel. After discovering the latter book on the store shelf a couple months ago, I found that it was available online as well, and so I started reading it.

    It is an excellent book, one that I've already found very useful. For example, the chapter on Lisp packages and symbol resolution (Chapter 21) includes a section on Package Gotchas, which, in the course of about a page, revealed the source of the woes I'd been having with my attempts to use a Lisp lexer generator that I grabbed from the net. Excellent! Seibel also has good descriptions of the Common Lisp format and loop constructs, better than the corresponding descriptions in Graham's book (probably because neither construct is very Lispy). There's a small taste of both at the end of this entry.

    I am not good at resisting the call of good books. I got a paper version when we went to the book store.

    I spent some time on the return bus ride reading through the first several chapters of the book. At the end of the ninth chapter, there's a short section entitled Wrapping Up, which begins with a brief summary of the code in the chapter, and then goes on to review the development process:

    It's worth reviewing how you got here because it's illustrative of how programming in Lisp often goes.

    Illustrative of programming in Lisp? Illustrative of programming in general! Or, for that matter, illustrative of how one approaches a broad variety of problem-solving tasks. Remove all the specific references to Common Lisp and rewrite the text in Polya's distinctive style, and the last page of the chapter could come directly from How to Solve It.
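
    Postscript: since I mentioned format and loop, here is a small taste of each. These snippets are mine rather than Seibel's, but they are the flavor of thing his chapters explain well:

      ;; FORMAT directives for list iteration and English plurals:
      (format t "~{~a~^, ~}~%" '(1 2 3))   ; prints "1, 2, 3"
      (format t "~r cat~:p~%" 3)           ; prints "three cats"

      ;; LOOP reads almost like the sentence describing it:
      (loop for i from 0 below 10
            when (evenp i)
              collect (* i i))             ; => (0 4 16 36 64)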

Friday, August 12, 2005

Messiness and computing

Heidi recently wrote an entry in praise of messiness in science, which set me thinking about messiness in mathematics and in computing. While reading some of Dijkstra's old writings, I came across this tidbit, which seemed apropos:

We should never forget that programmers live in a world of artifacts, a fact that distinguishes them from most other scientists. A programmer should not ask how applicable the techniques of sound programming are, he should create a world in which they are applicable; it is his only way of delivering a high-quality design.

To which I should add a quotation from EWD898 (1984):

Machine capacities now give us room galore for making a mess of it. Opportunities unlimited for fouling things up! Developing the austere intellectual discipline of keeping things simple is in this environment a formidable challenge, both technically and educationally.

  • Currently drinking: Coffee

Three books

  1. The Structure of Scientific Revolutions. Thomas Kuhn

    You've heard of this book, even if you don't think you have. Kuhn gave the modern meaning to the phrase paradigm shift. You may recall that I mentioned this book -- a month ago, perhaps? It took me a while to read it, just as it took me a while to read it the first time I was exposed to it some ten years ago.

    I tried briefly to summarize my thoughts about the contents of this book. It was hard. The difficulty, I think, is that SSR was not written as a book; it was written as an essay, which just happens to be the length of a short book. Indeed, Kuhn refers repeatedly to this essay, and occasionally mentions a book that he hoped to write (and never did, as far as I know) with further details and evidence for all his arguments. The writing is dense, so that if my eyes glazed over briefly, I had to go back and re-read what I'd missed in order to keep the sense of the arguments.

    Bah! you may say, what use are you if you can't bring yourself to summarize a few highlights, at least? At least, you might say that if you were inside my head. The only good response I can give is that I will try again in another few years. If that's too long, you may read it and try for yourself.

  2. University, Inc: The Corporate Corruption of Higher Education. Jennifer Washburn

    Are student journalists taught that nobody will read their work unless they describe the physical appearance of any major characters? I ask this seriously, because this is not the first book I've read in which I thought why on earth are you taking time from your argument to tell me that this guy has blue eyes and wild hair? Unless the subject of such scrutiny is an Einstein (or maybe a Carrot-Top), I'll pass on the descriptions of hair styles.

    Apart from tickling that particular pet peeve, I enjoyed this book. By enjoyed, I mean that I thought it was coherent, well-argued, and worth reading. However, I was also troubled by it. The title neatly summarizes the book's thesis, which is that, in the past two decades, commercial influences have increasingly compromised the university's role as a home for educators, long-term scientific researchers, and independent experts. In particular, Washburn focuses mostly on medicine and bio-engineering (and a little on other areas of engineering), where the Bayh-Dole act had an enormous impact by allowing academic researchers to easily patent and commercialize ideas developed in a university setting.

    I know little of medical schools, so I will say simply that I was horrified by the behaviors documented in the chapter Are Conflicts of Interest Hazardous to Our Health. There seems to me to be something tawdry and unethical in a professor agreeing to sign off on a paper ghostwritten by a pharmaceutical company, and that was far from the worst of it. But in a way, I found the rest of the book more worrisome. I know and work with many people who are intensely interested in the commercialization of their widgets. It seems to me that many of these folks juggle their commercial and academic interests pretty well; the problem is that those commercial interests tend to creep into other areas of the academic landscape as well. An academic proposal is a sales pitch, but it bothers me when the sales pitch is not for an idea, but for profits to be had from an idea. And that seems to happen a lot.

    The problem is that what universities are good at -- or are supposed to be good at -- is building new tools and ideas, and then teaching the world about those tools and ideas. And, in some cases, teaching the world might involve developing a useful application and pushing it all the way to "market" (I use the scare quotes because the market need not be a commercial market -- witness the growth of various open software products). But I would prefer if the drive were not profits, but passion for making a good idea known. I'm reminded of something Edsger Dijkstra wrote, which summarizes my feelings nicely:

    I grew up with an advice of my grandfather's: Devote yourself to your most genuine interests, for then you will become a capable man and a place will be found for you. We were free not to try to become rich (which for mathematicians was impossible anyhow). When Ria and I had to decide whether we should try to start a family, we felt that we had to choose between children and a car; Ria had a Raleigh, I had a Humber, we decided that they were still excellent bicycles and chose the children. Today's young academics, whose study is viewed as an investment, should envy us for not having grown up with their financial distractions.

    Envy, indeed.

    I recommend this book to your reading lists.

  3. Flinch

    I was sitting at the bus stop, absorbed in a book, when he tapped me on the shoulder. Do you know anything about pre-Roman Italy? I said that I didn't, and recommended him to a library. Everyone I ask seems to say that, he said, but the books in the libraries are full of lies. Nobody knows the truth, so I'm writing my own book. He carried a yellow pad; I glanced down, and saw that Golgotha was printed in neat block capitals across the top, followed by several paragraphs of text written in a small, neat hand. The word Roswell was written in the corner and underlined.

    You see, he continued, all Europeans were cannibals back then. The rest of the world was fine, and everyone got along; then the Europeans went exploring, and spread their sickness throughout the world. I learned in public school that history only goes back 3000 years; how can that be? It's because of the cover-up. So I'm writing a book about it. I think people know about it; all the white people I tell flinch when I tell them about my book, because they've been trying to hide the truth. But there's signs of it everywhere, and when Bush cheers on the Texas Longhorns, that's really the sign of the devil that he's making. It makes me sick. I think my book will come out next September, if you'd like to hear the title?

    The bus had arrived as he told me this, and he asked the last question as I was waiting in line to board. I told him I thought he might be mistaken, paid my fare, and boarded. Sitting on the bus, I heard the question again through the window, asked to someone else: Do you know anything about pre-Roman Italian history?

    I don't recommend this one for your reading lists.

Thursday, August 11, 2005

Backstroke of West

There's an old story about the phrase the spirit is willing, but the flesh is weak being translated to Russian and back to English to yield the vodka is good, but the meat is rotten. I thought that was funny. I nearly hurt myself reading about this English-subtitled Chinese edition of Revenge of the Sith.

Back to Sobolev estimates.

Tuesday, August 09, 2005

Little things

I like pen and ink drawings, the kind in which every line is critical to the picture. I enjoy short poems, sonnets, and lyrics in which each sound and connotation is placed just so. I appreciate Strunk and White, Kernighan and Ritchie, and Rudin. I love ripe summer tomatoes with naught but a sprinkle of salt.

Romantic landscape paintings, Wagner's Ring Cycle, Gibbon's history, and extravagantly designed chocolate desserts have places of pride, and I would not wish to live in a world without them. But for the most part, I prefer simplicity. One graceful note leaves such room for the imagination, does it not?

Friday, August 05, 2005

Search terms 2

Searching for Lisp tokenizer or Lisp parser -- nearly useless.

Searching for Lisp regular expression -- much more useful.

Google -- priceless.

Macro expanding disclaimers

From the CLiki Common Lisp resource site:

Imagine a fearsomely comprehensive disclaimer of liability. Now fear, comprehensively.

Wednesday, August 03, 2005

Linear B

I gave my advisor an early draft of my thesis early in May; he returned the marked up version to me a couple weeks ago. There was a section in my draft, lifted from earlier documentation of one of my codes, in which I described the mixed-language structure of the code as a combination of C++, Fortran, Lua, MATLAB, and Minoan linear B script. I forgot to cut the last one from the list before giving the draft to Jim. If this is too obscure: linear B is a Bronze Age script found at Knossos and other Aegean sites (strictly speaking, it recorded Mycenaean Greek rather than the Minoan language), and it is now mostly of interest to archaeologists -- something which cannot, alas, be said of Fortran 77.

In the marked up version, Jim simply scribbled ref? beside that last language. I'm unsure whether this means he was exercising his own sense of humor, or just didn't recognize the reference. I'm also unsure whether I should remove that bit, or if I should put in the reference as suggested and see whether anyone notices...

Friday, July 29, 2005

Turtles in Austin

Texans believe in air conditioning.

I should rephrase that: there exist at least some Texans who believe in air conditioning at least part of the time. Presumably one of those Texans is involved in the administration of the Austin convention center. It was hot and humid outside, but I wished more than once that I'd paid closer attention to the advice to bring a blazer.

When I opened my bag on the first evening, I found the expected pad of paper, program, and pen, but there was a twist. The pen is a particularly fancy four-in-one widget: at one end there is a pen and a PDA stylus, and at the other end there is a penlight and a laser pointer. I'll have to hold on to this one. Assuming I don't wear down the batteries while playing with the cat, I imagine it will come in useful for future presentations.

It was a productive meeting. I learned a lot, I met some people, I caught up with a surprising number of Berkeley and ex-Berkeley folks, and I scribbled my way through several more pages of potential research leads. And I still haven't finished clean copies of my notes on leads from the other three workshops! An embarrassment of riches isn't a cause for complaints, but I'm frustrated that I have such difficulty keeping up with myself.

I also spent some time exploring Austin with Tsuyoshi. He took the picture of the turtle that now graces the top of this page. When I think of Texas, turtles are usually not the first thing that comes to mind -- but it's cute, no?

I'm glad to be done for a while with the business of traveling and giving talks. Time to buckle down and get some real work done!

  • Currently drinking: Black tea with strawberry

Saturday, July 23, 2005

In a Barrel

Searching for LISP-based implementations of different programming languages is a task akin to shooting a mixture of metaphors and pre-hatched chickens in a barrel.

Let's see who Google brings to this post based on that sentence.

I think I've written already that next semester I'll be the GSI (graduate student instructor -- a UC-ism for teaching assistant) for the local compilers class, CS 164. The plan is to implement everything in Common Lisp. That means that all the compiler stages (scanner, parser, optimizer, code generator) will be written in Lisp, as will be the virtual machine on which the compiled programs are to run. The language that we're compiling, though, is going to be a Java variant.

I'm looking forward to using Lisp for this class. The last time I used Lisp for very much was when I took an artificial intelligence course as an undergraduate, but I have more than a passing familiarity with it. Indeed, I think that someone who graduates with a degree in computer science and isn't familiar with Lisp and Forth has been robbed. These two languages are fundamental. The fact that they are so fundamental is actually the source of my current frustration, but let me return to that after I explain briefly what I mean when I say that these languages are fundamental.

Like human languages, programming languages have grammar. The grammar of a programming language describes the types of constructions which make up acceptable programs. As with human languages, the grammar is far from the whole picture; beyond the syntax, or the structure of an utterance, there is the semantics, or the formal meaning of the utterance. The sentence The blue flu blogs at midnight is syntactically valid, but semantically questionable. Programming language syntax is usually described in BNF (Backus-Naur form -- you may insert your own pun on Backus-Bacchus here). It's easiest to describe the notation by example. Suppose I wanted to write down the syntax of simple arithmetic expressions involving plus, minus, times, and divide. I might write the grammar like this:

  expression := 
    addend |
    expression + addend |
    expression - addend
  addend :=
    multiplicand |
    addend * multiplicand |
    addend / multiplicand
  multiplicand :=
    number |
    variable |
    ( expression ) 

In words, what I've written is that an expression consists of sums and differences of addends; an addend consists of products and quotients of multiplicands; and a multiplicand is a number, a variable, or a parenthesized expression. So if I wanted to decide whether the expression 1 + 2*(3+4) was valid, I'd have to see if I could describe it according to these rules. The description of the expression in terms of syntactic rules gives us a parse tree, which is sort of like a grade school sentence diagram, but more useful. Since I'm limited by HTML, let me use the following convention for writing parse trees: when I use a particular rule, I'll write the name of the thing I produce first, then the list of inputs to the rule; and I'll wrap the whole thing in parentheses to prevent confusion about where I am. In that case, the parse tree for 1 + 2*(3+4) is

  (expression
    (expression
      (addend (multiplicand (number 1))))
    +
    (addend
      (multiplicand (number 2))
      *
      (multiplicand
        oparen
        (expression
          (expression
            (addend (multiplicand (number 3))))
          +
          (addend (multiplicand (number 4))))
        cparen)))

I say that I write things this way because of the limitations of HTML, but even if I were a little more willing to draw sketches for you, I might decide to write the parse tree in the above way. Why? Because it's convenient for computer manipulation, and people have developed powerful tools for dealing with this sort of representation of expressions. It will not surprise you to learn that the above expression is a sort of proto-Lisp.

Of course, if I actually wanted to write the above expression in any dialect of Lisp that someone would use, it would look like

  (+ 1
    (* 2
       (+ 3 4)))

Nevertheless, the point remains: in essence, a Lisp programmer writes a direct representation of the syntax trees for his programs. And trees are things that computer scientists are happy manipulating as data, which means that Lisp programmers can easily write programs that analyze and rewrite programs. And a compiler is essentially a tool for analyzing and rewriting programs, so lots of compiler writers use Lisp, or something very Lisp-like, inside their compiler.
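
To make that concrete, here is a toy example of the kind of program-rewriting program I mean. It is a sketch of my own (the name fold-constants and all the details are mine, not from any particular compiler): a pass that walks an expression tree like the one above and evaluates any subtree whose arguments are all numbers.

  (defun fold-constants (expr)
    "Rewrite EXPR, evaluating any subexpression whose arguments are all numbers."
    (if (atom expr)
        expr
        (let ((args (mapcar #'fold-constants (rest expr))))
          (if (every #'numberp args)
              (apply (first expr) args)     ; all-constant subtree: compute it now
              (cons (first expr) args)))))  ; otherwise, rebuild the node

  ;; (fold-constants '(+ 1 (* 2 (+ 3 4))))  =>  15
  ;; (fold-constants '(+ x (* 2 (+ 3 4))))  =>  (+ X 14)

Ten lines, and it already smells like an optimization pass: the program is just data, so rewriting it is just list manipulation.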

Forth is what you get by turning Lisp on its head, so that instead of writing (+ 1 1) you write 1 1 +. The Forth model will be familiar to anyone who has used an HP calculator or looked at the specs for the Java Virtual Machine.
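
To connect the two: flattening that same expression tree into postfix order takes only a few more lines (again, a sketch of my own), and it is more or less what a code generator for a stack machine does.

  (defun to-postfix (expr)
    "Flatten a Lisp-style expression tree into a postfix instruction list."
    (if (atom expr)
        (list expr)
        (append (mapcan #'to-postfix (rest expr))
                (list (first expr)))))

  ;; (to-postfix '(+ 1 (* 2 (+ 3 4))))  =>  (1 2 3 4 + * +)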

The problem: since Lisp and Forth are so damn easy to parse, everyone writes parsers for them -- and Google isn't smart enough to distinguish writing about parsers written in Lisp for other languages from writing about parsers written in other languages for Lisp. The latter are ubiquitous, the former a little less so -- hence my problem.

Since my battery is running low, ruminations on the resolution to this problem will have to wait for another day.

Friday, July 22, 2005

Its Own Worst Enemy

I like Best Software Writing I (ed. Joel Spolsky) more the more of it I read. One of the things I like about it is that most of the selected pieces are available in electronic form, so that I can point people to them without saying Oh, but you need to buy the book to see it. That said, this is a book worth checking out, even if you're not a software developer.

I just finished reading Clay Shirky's essay A Group Is Its Own Worst Enemy, a thoughtful analysis of certain aspects of social software design. I may have more comments on this later, but right now I'll just say go read it! I will also mention that I hope Pete has the opportunity to read and respond to this one, since I suspect his comments on this essay would be far more entertaining than my own.

Thursday, July 21, 2005

Pen trip

I will fly to Austin on Sunday so that I can give a talk on MEMS resonator simulation on Monday morning at the computational mechanics conference. I'll be back in Berkeley come Friday, and that will be it for my summer travels. I'd originally planned to go home at the start of August, but I'm tired of travel, and want to spend the rest of my summer enjoying my own bed and my own cooking. And, of course, writing.

To my family and Maryland friends: I hope I'll get to see you -- at least some of you -- for Christmas or the New Year.

On an unrelated note, I am now the proud owner of one new fountain pen and one rehabilitated fountain pen. And I'm cackling over these things, I am. I've had the older pen for quite a while, but for the past couple years I've not used it. It has a fine nib, and it gets quirky when it's not very clean; unfortunately, it's hard to clean the nib when you only use cartridge ink. I got a converter for it and spent a quarter of an hour using the converter to flush water through the nib; at the end of that time, the water was running clear and I was able to write again. Hooray!

The new pen also has a cartridge converter. I filled both pens with blue ink. One day, I hope to be able to perform this task without turning my hands blue.

Tuesday, July 19, 2005

PowerBook accelerometers

The Sudden Motion Sensor (SMS) on the PowerBook is used to park the hard drive heads in the case of sudden accelerations (like those that occur when you drop the computer, for instance). The ThinkPads have a similar system. The device is almost certainly using a microelectromechanical system (MEMS) sensor, probably one of the ADXL series from Analog Devices -- the same type of sensor that triggers the air bags in your car.

It's possible to read the accelerometers at a user level, too. In general, this seems to be of dubious utility, unless you want to tilt your computer around as a way to control something. But I think it's utterly cool that I can download a little applet that will, in real time, show the orientation of my laptop based on the accelerometer readings. This demo is going into my where are MEMS used? slide for my talk on Monday, and probably for several other talks after that.

Monday, July 18, 2005

On Anonymity

Have you seen the old New Yorker cartoon in which two dogs are sitting at a computer, and one is telling the other On the Internet, nobody knows you're a dog?

Hmm. Let me share some reasons why online anonymity is a little more complicated than that.

First, let's dispense with the obvious. If you write a large body of information drawn from your personal experiences, then anyone reading with a reasonably analytical turn of mind will be able to deduce a great deal about you (unless you're deliberately deceptive). Gender, profession, hobbies, and location are the obvious things. This information may give a reader a shrewd guess as to who you are; if not, it may at least narrow you down to a small group of people.

Now, let's discuss the more interesting piece: personal style.

If you are reading this as an academic, you will appreciate that writing style, together with a few contextual clues, is often enough to render blind review a polite fiction. You send me a review in which you say not enough attention is paid to the following references and list twelve references with a common co-author; I may well guess who you are. Or I write a review of your paper after we have communicated verbally or in writing; you may well recognize me simply from the way that I choose my words. Blind review is really only hazy review, but that's fine. I may still have doubts about my guess, as might you. Similarly, if I were to read an anonymous blog written by someone I knew, I might well be able to guess that person's identity, but it would be hard for me to be sure.

What is it that you identify as a particular writer's style? My guess is that a large fraction of what you identify as style has to do with statistical patterns in language use. The most obvious things involve sentence length and complexity, the sort of metrics that lead one to declare that some author writes at an nth grade level. But many of us have our own personal catch-phrases, too. In music, people have discovered that by looking at the probabilities associated with very short sequences of notes, one can distinguish the music of Bach, say, from the music of Mozart. On Amazon, you may have noticed the introduction of Statistically Improbable Phrases (SIPs) on the book description pages. Many of the best spam filters work by training a classifier to distinguish spam from legitimate mail on the basis of automatically-determined statistical properties in the text. A blog is a large volume of text, which is exactly what's needed for such statistical text classifiers; given the text of a blog and a set of texts (e-mail messages or class papers, perhaps) written by possible authors of said blog, I dare say it would take me about a week to put together a good tool to decide which -- if any -- of my suspects was actually responsible for the writing. Given that there are people out there who know much more about this sort of thing than I do, I would be surprised to find that such a tool does not already exist. And I haven't even said anything about tracing the link structure!
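
If you want a sense of how little machinery the crude version takes, here is a sketch in Lisp. It is entirely my own and deliberately simple-minded; a serious tool would use character n-grams, proper smoothing, and some validation, but the skeleton would not be much bigger.

  (defun word-list (text)
    "Crude tokenizer: downcase TEXT and split on anything that isn't a letter."
    (let ((words '()) (current '()))
      (flet ((flush ()
               (when current
                 (push (coerce (nreverse current) 'string) words)
                 (setf current '()))))
        (loop for ch across (string-downcase text)
              do (if (alpha-char-p ch)
                     (push ch current)
                     (flush))
              finally (flush))
        (nreverse words))))

  (defun bigram-profile (text)
    "Hash table mapping each adjacent word pair in TEXT to its count."
    (let ((profile (make-hash-table :test #'equal)))
      (loop for (a b) on (word-list text)
            while b
            do (incf (gethash (list a b) profile 0)))
      profile))

  (defun similarity (profile-1 profile-2)
    "Unnormalized dot product of two bigram profiles."
    (let ((score 0))
      (maphash (lambda (bigram count)
                 (incf score (* count (gethash bigram profile-2 0))))
               profile-1)
      score))

  ;; Build one profile per suspect from text they are known to have written,
  ;; build one from the anonymous blog, and see which suspect scores highest.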

I think it would be interesting, actually, to try to build classifiers in this way to see if there are any two distinct authors who write in such a similar manner that it's impossible to distinguish their writing based only on the probabilities of two- or three-word sequences. Or could such a classifier be trained to automatically recognize regional dialects, identifying from what part of the US a particular writer might hail? How reliable would such a classification be? But I digress.

I suspect that most people would be shocked at how easily their privacy and anonymity could be removed if anyone cared to take the effort (though anyone who has stood in line at a grocery store knows full well that the private lives of the rich and the famous are often anything but private). But usually nobody cares to make the effort. If you wish to be anonymous, the polite thing for me to do when I recognize you is to pretend that I know nothing. But this assumes that I've paid enough attention to recognize you in the first place. It's a little like an inexpensive lock: a deterrent that tells most people that it would take some bother to steal the protected item, and tells the few people who might be able to pick the lock with no difficulty that you'd really consider it rude for them to run off with your stuff.

  • Currently drinking: Green tea