Learning Haskell: All You Need is Lambda

posted 2 years ago

All You Need is Lambda

A calculus is a method of calculation or reasoning.

Lambda calculus is Turing complete, so it can simulate any Turing machine.

A Turing machine is a really, really simple computer. Because it's so simple, computer scientists and mathematicians can use it to study the properties of computation.

Lambda calculus is the foundation of Haskell because Haskell is a lambda calculus.

Functional programming languages are all based on the lambda calculus.

Haskell is a pure functional language because it doesn't incorporate features that aren't translatable into lambda expressions.

What is Functional Programming?

Functional programming is a computer programming paradigm that relies on functions modeled on mathematical functions.

Programs are a combination of expressions. Expressions include concrete values, variables, and also functions.

Functions are expressions that are applied to an argument or input, and once applied, can be reduced or evaluated.

In Haskell, functions are first-class citizens: they can be values or passed as arguments to other functions.

Referential transparency means that the same function, given the same values to evaluate, will always return the same result (purity).

Haskell utilizes functions as building blocks that can be assembled and reassembled, emphasizing abstraction.

What is a Function?

A function is a relation between a set of possible inputs and a set of possible outputs.

Imagine a function named 'f' that defines:

f(1) = A
f(2) = B
f(3) = C

The input set (domain) is {1, 2, 3} and the output set (codomain) is {A, B, C}.

'f' will always return the value A given the input 1 with no exceptions.

In contrast, this is not a valid function:

f(1) = X
f(1) = Y
f(2) = Z

This function is not valid because given the same input, the output is not predictable.

This function is valid because having the same output for more than one input is valid:

f(1) = A
f(2) = A
f(3) = A

Imagine a function that takes an argument (defines the relationship):

f(x) = x + 1

The argument is named 'x'. The relationship between the input, x, and the output is described in the function's body.

The function will add 1 to whatever value x is and return that result.

f applied to 1:

f(1) = 1 + 1

f(1) = 2

The Structure of Lambda Terms

The lambda calculus has three basic components (lambda terms): expressions, variables, and abstractions.

An expression can be a variable name, an abstraction, or a combination of those things.

The simplest expression is a single variable.

An abstraction is a function. It is a lambda term that has a head (a lambda), and a body and is applied to an argument. The argument is the input value.

A simple function might look something like this:

πœ†π‘₯.π‘₯

This function is an anonymous function because it has no name.

Ξ» x . x
^_^
└──── extent of the head of the lambda.
Ξ» x . x
   ^────── the single parameter of the
    function. This binds any
    variables with the same name
    in the body of the function.
Ξ» x . x
     ^── body, the expression the lambda
     returns when applied. This is a
     bound variable.

The dot (.) separates the parameters of the lambda from the function body.

Alpha equivalence is the realization that the variable 'x' is not semantically meaningful except in its role in that single expression.

Alpha equivalence (these are all the same function):

πœ†π‘₯.π‘₯
πœ†π‘‘.𝑑
πœ†π‘§.𝑧

Beta Reduction

When we apply a function to an argument, we substitute the input expression for all instances of bound variables within the body of the abstraction.

You can also eliminate the head of the abstraction, since its only purpose was to bind a variable.

Reducing the function:

πœ†π‘₯.π‘₯

Let's do the first beta reduction using the number 2. We apply the function above to 2, substitute 2 for each bound variable in the body of the function, and eliminate the head:

(πœ†π‘₯.π‘₯) 2
2

The only bound variable is that single x, so applying this function to 2 returns 2.

This function is called the identity function. All it does is accept a single argument 'x' and return that same argument.

'x' has no inherent meaning, but, because it is bound in the head of this function, when the function is applied to an argument, all instances of 'x' within the function body must have the same value

An example that incorporates basic arithmetic:

(πœ†π‘₯.π‘₯ + 1)

We can also apply our identity function to another lambda abstraction:

(πœ†π‘₯.π‘₯)(πœ†π‘¦.𝑦)

In this case, we can substitute the entire abstraction in for 'x'.

Reducing the application:

(πœ†π‘₯.π‘₯)(πœ†π‘¦.𝑦)
[π‘₯ ∢= (πœ†π‘¦.𝑦)]
πœ†π‘¦.𝑦

Applications in lambda calculus are left associative. Unless specific parentheses suggest otherwise, they associate, or group, to the left:

Rewriting:

(πœ†π‘₯.π‘₯)(πœ†π‘¦.𝑦)𝑧

To:

((πœ†π‘₯.π‘₯)(πœ†π‘¦.𝑦))𝑧

Reduced:

((πœ†π‘₯.π‘₯)(πœ†π‘¦.𝑦))𝑧
[π‘₯ ∢= (πœ†π‘¦.𝑦)]
(πœ†π‘¦.𝑦)𝑧
[𝑦 ∢= 𝑧]
𝑧

This cannot be reduced any further as there is nothing left to apply, and we know nothing about 'z'.

Functions can have multiple heads and also free variables (variables in the body that are not bound by the head)

The process of beta reduction stops when there are either no more heads, or lambdas, left to apply or no more arguments to apply function to.

Take the following expression:

πœ†π‘₯.π‘₯𝑦

The 'x' in the body is a bound variable because it is named in the head of the function. 'y' is a free variable because it is not.

When we apply this function to an argument, nothing can be done with the y. It remains irreducible.

1. (πœ†π‘₯.π‘₯𝑦)𝑧
We apply the lambda to the argument 𝑧.

2. (πœ†[π‘₯ ∢= 𝑧].π‘₯𝑦)
Since π‘₯ is the bound variable, all instances of π‘₯ in the body of the
function will be replaced with 𝑧. The head will be eliminated,
and we replace any π‘₯ in the body with a 𝑧.

3. 𝑧𝑦
The head has been applied away, and there are no more heads
or bound variables. Since we know nothing about 𝑧 or 𝑦, we can
reduce this no further.

NOTE: Alpha equivalence does not apply to free variables.

Multiple Arguments

Each lambda can only bind one parameter and can only accept one argument.

Functions that require multiple arguments have multiple nested heads.

This is commonly known as currying.

Take the following:

πœ†π‘₯𝑦.π‘₯𝑦

This is a convenient shorthand for two nested lambdas (one for each argument, 'x' and 'y').

Written out:

πœ†π‘₯.(πœ†π‘¦.π‘₯𝑦)

A simple example with the identity function:

1. πœ†π‘₯.π‘₯

2. (πœ†π‘₯.π‘₯) 1

3. [π‘₯ ∢= 1]

4. 1

"Multiple" argument lambda:

1. πœ†π‘₯𝑦.π‘₯𝑦

2. (πœ†π‘₯𝑦.π‘₯𝑦) 1 2

3. (πœ†π‘₯.(πœ†π‘¦.π‘₯𝑦)) 1 2

4. [π‘₯ ∢= 1]

5. (πœ†π‘¦.1𝑦) 2

6. [𝑦 ∢= 2]

7. 1 2

Meaningful application:

1. πœ†π‘₯𝑦.π‘₯𝑦

2. (πœ†π‘₯𝑦.π‘₯𝑦)(πœ†π‘§.π‘Ž) 1

3. (πœ†π‘₯.(πœ†π‘¦.π‘₯𝑦))(πœ†π‘§.π‘Ž) 1

4. [π‘₯ ∢= (πœ†π‘§.π‘Ž)]

5. (πœ†π‘¦.(πœ†π‘§.π‘Ž)𝑦) 1

6. [𝑦 := 1]

7. (πœ†π‘§.π‘Ž) 1 We still can apply this one more time.

8. [𝑧 ∢= 1] But there is no 𝑧 in the body of the function, so there is
nowhere to put a 1. We eliminate the head, and the final result
is

9. π‘Ž

It is more common in academic lambda calculus materials to refer to abstract variables rather than concrete values.

The process of beta reduction is the same regardless of whether the variables are abstract or concrete.

Take a more difficult expression:

(πœ†π‘₯𝑦.π‘₯π‘₯𝑦)(πœ†π‘₯.π‘₯𝑦)(πœ†π‘₯.π‘₯𝑧)

To make the reduction easier to read and get rid of the convoluted 'x's, let's implement alpha reduction:

1. (πœ†π‘₯𝑦𝑧.π‘₯𝑧(𝑦𝑧))(πœ†π‘šπ‘›.π‘š)(πœ†π‘.𝑝)

Simplify:

2. (πœ†π‘₯.πœ†π‘¦.πœ†π‘§.π‘₯𝑧(𝑦𝑧))(πœ†π‘š.πœ†π‘›.π‘š)(πœ†π‘.𝑝)
We’ve not reduced or applied anything here, but made the
currying explicit.

3. (πœ†π‘¦.πœ†π‘§.(πœ†π‘š.πœ†π‘›.π‘š)𝑧(𝑦𝑧))(πœ†π‘.𝑝)
Our first reduction step was to apply the outermost lambda,
which was binding the π‘₯, to the first argument, (πœ†π‘š.πœ†π‘›.π‘š).

4. πœ†π‘§.(πœ†π‘š.πœ†π‘›.π‘š)(𝑧)((πœ†π‘.𝑝)𝑧)
We applied the 𝑦 and replaced the single occurrence of 𝑦 with the
next argument, the term πœ†π‘.𝑝. The outermost lambda binding 𝑧
is, at this point, irreducible because it has no argument to apply
to. What remains is to go inside the terms one layer at a time
until we find something reducible.

5. πœ†π‘§.(πœ†π‘›.𝑧)((πœ†π‘.𝑝)𝑧)
We can apply the lambda binding π‘š to the argument 𝑧. We keep
searching for terms we can apply. The next thing we can apply
is the lambda binding 𝑛 to the lambda term ((πœ†π‘.𝑝)𝑧).

6. πœ†π‘§.𝑧
In the final step, the reduction takes a turn that might look
slightly odd. Here the outermost, leftmost reducible term is πœ†π‘›.𝑧
applied to the entirety of ((πœ†π‘.𝑝)𝑧). As we saw in an example
above, it doesn’t matter what 𝑛 got bound to, πœ†π‘›.𝑧 uncondition-
ally tosses the argument and returns 𝑧. So, we are left with an
irreducible lambda expression.

Evaluation is Simplification

There are multiple normal forms in lambda calculus, but in the book when they refer to normal form they mean beta normal form.

Beta normal form is when you cannot beta reduce (apply lambdas to arguments) the terms any further.

This is important to know so that you know when you're done evaluating an expression.

For example, the reduced form of 2000/1000 is 2. 2 is considered the normal form because the expression has been fully evaluated.

The identity function πœ†π‘₯.π‘₯, is in normal form because it hasn't yet been applied to anything.

(πœ†π‘₯.π‘₯)𝑧 is not in beta normal form because the identity function hasn’t been applied to a free variable 𝑧 and hasn’t been reduced. If we did reduce it, the final result, in beta normal form, would be 𝑧.

Combinators

A combinator is a lambda term with no free variables.

Combinators serve only to combine the arguments they are given.

So the following are combinators because every term in the body occurs in the head:

1. πœ†π‘₯.π‘₯

π‘₯ is the only variable and is bound because it is bound by the enclosing lambda.

2. πœ†π‘₯𝑦.π‘₯

3. πœ†π‘₯𝑦𝑧.π‘₯𝑧(𝑦𝑧)

And the following are not because there’s one or more free variables:

1. πœ†π‘¦.π‘₯

Here 𝑦 is bound (it occurs in the head of the lambda) but π‘₯ is free.

2. πœ†π‘₯.π‘₯𝑧

π‘₯ is bound and is used in the body, but 𝑧 is free.

Divergence

Not all reducible lambda terms reduce neatly to a beta normal form.

This isn't because they're already fully reduced, but rather because they diverge.

Example of a lambda term called omega that diverges:

1. (πœ†π‘₯.π‘₯π‘₯)(πœ†π‘₯.π‘₯π‘₯)

π‘₯ in the first lambda’s head becomes the second lambda

2. ([π‘₯ ∢= (πœ†π‘₯.π‘₯π‘₯)]π‘₯π‘₯)

Using [π‘£π‘Žπ‘Ÿ ∢= 𝑒π‘₯π‘π‘Ÿ] to denote what π‘₯ has been bound to.

3. (πœ†π‘₯.π‘₯π‘₯)(πœ†π‘₯.π‘₯π‘₯)

Substituting (πœ†π‘₯.π‘₯π‘₯) for each occurrence of π‘₯. We’re back to where we started and this reduction process never ends β€” we can say omega diverges.

Terms that diverge are terms that don't produce an answer or meaningful result.

Summary

  • Functional programming is based on expressions that include variables or constant values, expressions combined with other expressions, and functions.

  • Functions have a head, and a body and are those expressions that can be applied to arguments and reduced, or evaluated, to a result.

  • Variables may be bound in the function declaration, and every time a bound variable shows up in a function, it has the same value.

  • All functions take one argument and return one result.

  • Functions are a mapping of a set of inputs to a set of outputs. Given the same input, they always return the same result.

Definitions

  • The lambda in lambda calculus is the Greek letter used to introduce, or abstract, arguments for binding in an expression.
  • A lambda abstraction is an anonymous function or lambda term.

    (πœ†π‘₯.π‘₯ + 1)

    The head of the expression, πœ†π‘₯., abstracts out the term π‘₯ + 1. We can apply it to any π‘₯ and recompute different results for each π‘₯ we applied the lambda to.

  • Application is how one evaluates or reduces lambdas, this binds the argument to whatever the lambda was applied to. Computations are performed in lambda calculus by applying lambdas to arguments until you run out of arguments to apply lambdas to.

    (πœ†π‘₯.π‘₯)1

    This example reduces to 1, the identity πœ†π‘₯.π‘₯ was applied to the value 1, π‘₯ was bound to 1, and the lambda’s body is π‘₯, so it just kicks the 1 out. In a sense, applying the πœ†π‘₯.π‘₯ consumed it. We reduced the amount of structure we had.

  • Lambda calculus is a formal system for expressing programs in terms of abstraction and application.

  • Normal order is a common evaluation strategy in lambda calculi. Normal order means evaluating (ie, applying or beta reducing) the leftmost outermost lambdas first, evaluating terms nested within after you’ve run out of arguments to apply. Normal order isn’t how Haskell code is evaluated - it’s call-by-need instead.

Sources

A Tutorial Introduction to the Lambda Calculus - http://www.inf.fu-berlin.de/lehre/WS03/alpi/lambda.pdf

Introduction to Lambda Calculus - http://www.cse.chalmers.se/research/group/logic/TypesSS05/Extra/geuvers.pdf

Proofs and Types - http://www.paultaylor.eu/stable/prot.pdf

Haskell Book - https://haskellbook.com/