Arbtirary thoughts on nearly everything from a modernist poet, structural mathematician and functional programmer.

Sunday, July 27, 2008

On Lambda Calculus and Ordinal Numbers (part 1)

I'm still annoyed at the guy I posted about on Friday. Mostly, I'm annoyed about how he accuses mathematicians of being contrary to critical thought, and of being authoritarian. Those two things don't work in math-- you can preach something as dogma, but if you don't have proof (based on certain clearly stated axioms, and valid lines of reasoning), your words ring hollow.

The other thing I'm annoyed about is his callous rejection of transfinite theory. He seems to think Hilbert's Hotel was supposed to be an intuitive "proof" that infinities work the way they do, rather than a clarification of how they work. Also, he rejects infinities... absolutely and completely. As a formalist, I disagree with the rejection of anything interesting, but something else is crying out in pain at that idea-- perhaps I'm more of a Platonist than I think.

Anyway, I'd like to clarify my comment on the fixed point combinator (Also known as the Y-combinator), because... mostly because I'm a nerd. But also because I think it acts like an infinite ordinal.

I'm going to write this in two parts. The first will be on Lambda Calculus, the second on Ordinals and how to apply LC to ordinals. Hopefully, I can also discuss cardinals, but that may be a bit of a stretch.

First, a bit on (untyped) Lambda Calculus and Church Encoding: Lambda Calculus is a formalism for dealing with functions. For those who have heard of set theory, think of it as an analog, only everything is a function (or an argument of a function), rather than a set (or element of a set). The basics are as follows: You have some function f, and some argument a you apply the function to the argument and get a result ((f)a) -> b. So b is the result of evaluating f. It is defined based on how f is defined, so if we define ((f)a) -> a, then for any a we use we get the same value back (this function is called the identity, often just I.) When it isn't ambiguous, we can drop the parenthesis:
Ia is the same as ((I)a) which will evaluate a.

You can have any number of arguments:
(f)ab) -> c, or (..(f)a... z) -> A.
But, you can treat f as a function which returns a new function. Looking at the first example above [(f)ab -> c], we can say (f)a -> fa, where fa is a new function, which will take some new argument. In this way, a function taking any number of arguments can be turned into a series of "chained" functions, all returning new functions.

What's useful about this, is we can forget that non-functions exist (until it's convenient to do otherwise), and just work with functions that return other functions. Also, we can abuse our notation and use ((f)abcd...) to be ((((((f)a)b)c)d)...); the first is obviously cleaner.

Now we can also define what Church calls "abstractions". Abstraction is just a way of defining a function. It looks like:
f := λa.b
This is the same as what I showed earlier with ((f)a) -> b, but it's more precise notation (technically, my earlier notation means something different). What it says is f is a function which takes 1 argument and returns b. So the function I (the identity) is I := λa.a
We can do the same thing with multiple arguments:
f := λabc.d OR f := λa.λb.λc.d.
Notice how the second version can be interpreted to mean f takes one argument a, and returns a function which takes another argument, which returns a function....

I won't get into the gritty details, but there are only a few things we're allowed to do in lambda calculus:

*Define a function, using abstraction (and maybe a special notation... we'll cross that bridge when we come to it.),

*Rename variables (this is called alpha-conversion,). There are certain rules for this (which we don't need to pay attention to), but the only real purpose is to avoid confusion, and to make
proofs easier to follow.

*Apply abstract functions. (Called beta-reduction) This only comes into play with abstractions, but since we can only use functions which have been defined via abstraction (or in terms of already defined functions), this can be applied to any function. If we have some abstraction λx.fx (where f is some function we already know), applying the abstraction looks like this:
((λx.fx) a) -> fa.
All that really happens is we return what's after the "." with everything before the dot replaced with everything outside of the parenthesis.

*Replace equivalent expressions with each other (called eta-conversion). So, for example, I and λx.x can replace each other in any expression.
This comes into play when we have, for example ((I)a). To evaluate this:
((I)a) -> ((λx.x)a) , which is an application, that beta-reduces to a. This is, of course, mind-numbingly tedious, but in actual calculation, most of these steps are done implicitly

***
Now, it's a little more complicated than that, but not much. However, we can construct all of our numbers with it. I'll only focus on the natural numbers (0,1,2, etc), but it can be extended to integers, rationals and the reals using messy definitions in much the same way as in set theory.

Now, a "Church Numeral" is an encoding of a number using lambda calculus. Basically, the goal is to use LC to make definitions for all of our numbers (well, counting numbers) as well as multiplications, addition and exponentiation (exponentiation, just because it's easy). We'll ignore subtraction and division because they are actually rather messy. What we need are functions that interact the same way our numbers and operations interact.

Let's start with numbers. An easy way to define a number is to say that a number n takes a function, and applies it n times. This makes sense, but it isn't quite lambda calculus. So, we start with 0:
0 := λf.λa.a The astute reader will notice that what it returns is the identity:
0 := λf.I

1 is defined in the same way:
1 := λf.λa.fa Ignoring the a, we notice that this is the identity. The one difference is that we restrict the valid arguments to function values. (In general, f will always be a function. a can be, but whether it is necessarily a function is dependent on context, and will be clarified)

Now, we can define a successor. That is, the number after the number we're given. It takes (awkwardly) 3 arguments (see above about arguments, and see below about bracket notation):
SUCC := λn.λf.λa.f (nfa) (i.e. ((f) (((n)f)a))
What it says is given any number n, the successor is defined by applying the function argument of n (that is, f) to the other argument of n (that is, a) one extra time.
This is how we get to each number from the previous one. As an example:
[Don't worry if you can't follow my examples, they shouldn't actually be necessary.]
SUCC 2 -> (λn.λf.λa.f (nfa)) λf.λa.ffa -> λf.λa. f ((λf.λa.ffa) f a)
applying the "f a" at the end we get:
λf.λa. fffa; when we count the f's this means 3.

Now, using the same logic as the successor function, we can defined addition:
[M+N] := plus m n := λm.λn.λf.λa. m f (n f a) {remember that m and n are numbers}
The thing to notice about this is (n a) is the "a" in m. I.e. ((λf.λa.f...f a) f) (n a) -> (λa.f...fa) (n a) -> f....f...fa , where the first "f..." is m f's, and the "f...f" is n f's. So we have m+n f's, as we want.
Another example:
[2+3] -> (λm.λn.λf.λa. m f (n a)) 2 3 -> λf.λa. 3 f (2 f a) -> {dropping the leading λf.λa for cleanliness}
3 f ((λf.λa.ffa) f a) -> 3 f (ffa) -> ((λf.λa. fffa) f) ffa -> (λa.fffa) ffa ->{adding the λs back in} λf.λa.fffffa -> 5

Next,
[M*N] := times m n := λm.λn.λf. n (m f) {i.e. ((n) (m f))}
Notice that the function being applied n times is (m f). So, we have (remembering back to 2nd grade) n sets of m f's, or n*m f's. Again, just as we want.

Finally, exponentiation. I don't know if I'll work this into the second part... but we'll see.:
[M^N] := exp m n := λm.λn.λf n m f
This is very much like multiplication, only n is applied straight through (without the little "m f" detour), so What ends up happening is we get n copies of M all multiplied together.

Anyway, tomorrow I'll have a post about ordinals, and how to apply ordinal arithmetic to LC

No comments:

Creative Commons License Cory Knapp.