2011-02-04 (updated 2014-03-04)

Can your programming language do this?

% Remco Bloemen % 2011-02-04, last updated 2014-03-04

I have occasionally criticized several programming languages. These critiques should be taken with a heap of salt, since languages I criticize most tend to be the ones I use and prever most (i.e. C++). But still, I should not complain but improve, so here is the start of my attempt at "The Perfect Language".

First off: some language features that I find inspiring and would like the language to have. Two thing should be noted: (1) I did not include anything that is related to syntax or easily implemented as syntax suggar, since I’m focussing on the core of the language right now and (2) some of these features can be implemented in terms of another, if this implementation is trivial and free of overhead then this implementation can replace the feature.

In order of subjectively increasing difficulty, I’d like the language to have:

Simple pure functions
Functions with side effects
Flow control, loops and recursion
Data structures
Multiple return values
First class functions
Closures
Continuations and coroutines
Reflection
Extensibility
Verification

Now what does this all mean? Let me give some examples:

1 Simple pure functions

Pure functions allow you to write a procedure once and use it many times elsewhere in the code, without having to be concerned with the internals. The return value of the function is completely determined by the function arguments, just like mathematical functions and hence they are called pure. The simple part refers to the fact that I have not allows and form of loops or recursion yet. These kinds of functions exist in any language that has only the vaguest notion of function, so only the most essoteric languages exclude them.

A nice example is the function that takes two numbers and returns the average of the numbers:

Example 1.1: Average of two numbers.

number Average (numer a, numer b):
	return (a + b) / 2

2 Side effects

In the majority of popular languages functions are not required to be pure, their return values may depend on some external information source. In fact, some form of impure functions are necessary for the language to be useful; a function that returns the current time would be impure by definition, or a function that asks the user for some value, or basically anything that involves the world outside of the language.

Some functional languages go to great lengths to avoid having impure functions and they user elaborate constructions to keep the language as pure as possible (uniqueness types, monads, etc.). I’d like to avoid these constructions, but I do share the strong preference for pure functions.

Example 2.1: Welcome.

This is a popular variation of the ubiquites "Hello World!" program:

Welcome ( ):
	name ≔ ask ("You name please?")
	write ("Hello ".name."!");

3 Flow control, loops and recursion

Example 3.1: Factorial.

Since input/output is a touchy subject in functional programming languages, they tend to substitute the following program:

number Factorial (number n):
	if n = 0 return 1
	else return n × Factorial (n - 1)

Example 3.2: Péter-Robinson-Ackermann function.

The Ackermann function is also popular in the functional language community, it is often used as a simple benchmark for recursion. The function is nice because it is not immediately obvious that it terminates on all values, however, this has been proven (in fact, we did so this week during the lectures).

number Ackermann (number m, number n):
	if m = 0 return n + 1
	if n = 0 return A(m - 1, 1)
	else return A(m - 1, A(m, n - 1))

I could go on and list a few other interesting examples, but I’ll keep it brief. A nice function would be one that calculates the Collatz sequence. Wheter the program terminates would be and open mathematical problem.

Data sctructures

Example 5: Trabb Pardo-Knuth algorithm.

This is another algorithm that is intended to demonstrate the language. It demonstrates all of the features seen so far and adds the concept of an array.

Trabb Pardo-Knuth():
	for i from 1 to 11
		S[i] ≔ readNumber()
	reverse S
	for each x in S
		y ≔ √|x| + 5 x³
		if y > 400
			print "TOO LARGE"
		else
			print result

Example 6: Complex numbers.

Most languages do not directly support complex numbers, but allow one to extend the language by introducing a complex type and operations to go with it:

type complex {number r, number i}

complex add(complex a, complex b):
	complex r
	r.r ≔ a.r + b.r
	r.i ≔ a.i + b.i
	return r

complex mul(complex a, complex b):
	complex r
	r.r ≔ a.r × b.r - a.i × b.i
	r.i ≔ a.r × b.i + a.i × b.r
	return r

Tagged unions

Multiple dispatch

Multiple return values

Most languages do not directly support for multiple return values. Theoretically every MRV-function can be split up into several functions, one for each return value. For example a function sincos that returns both the sine and cosine of an angle can always be replaced the functions sin and cos.

Consider an algorithm exists that would calculate both the sine and cosine of an angle simultaneously, implementing a single valued sin and cos using this algorithm has two disadvantages: first, the algorithm must executed twice, which takes more time, and second, there is code duplication.

These disadvanteges are sufficiently strong that methods have been developed to emulate proper MRV-functions. In C++ one could for example return a std::pair or take a non-constant reference as argument and store the result there. Both methods seem like an ugly hack to me.

Example 6: Division algorithm.

A simple example of an algorithm that computes two useful results at once is the naive division algorithm. In any immaginable case this particular algorithm would preform horrible, but some of the fastest division algorithms also calculate the remainder in the process. You may also rightly claim that in practice one would rather use the division and modulo instruction available on the CPU, but this does not extend to division with bigints, polynomials or general integral domains.

(result, remainder) Divide (numerator, denominator):
 result ≔ 0
	remainder ≔ numerator
	while (remainder > denominator):
		remainder ≔ remainder - denominator
		result ≔ result + 1

Example 7: Extended Euclidean algorithm.

The extended Euclidean algorithm hardly makes any sense when the x and y are treated differently, and this is not a consequence of the algorithm, it is a consequence of a symmetry in the function: (gcd, x, y) ≔ Euclid (a, b) is equivalent to (gcd, y, x) ≔ Euclid (b, a). It is therefore a strong case for multiple return values.

(gcd, x, y) Euclid (a, b):
	(d, r) ≔ Divide (a, b)
	if r = 0 return (b, 0, 1)
	(g', x', y') ≔ Euclid(b, r)
	return (g', x', x' - d * y')

First class functions

Closures

Constructs such as objects and control structures can thus be implemented with closures. Closures require garbage collection.

Example : Numerical differentiation.

function Derive (function f, number ε):
	number f'(number x):
	return (f(x + ε) - f(x)) / ε
	return f'

Continuations and coroutines

"In any language which supports closures and proper tail calls, it is possible to write programs in continuation-passing style and manually implement call/cc."

Reflection

Example 8: Symbolic differentiaton.

function Derive (function f, variable x):
	if f = c      return 0
	if f = x      return 1
	if f = a + b  return Derive(a, x) + Derive(b, x)
	if f = a * b  return Derive(a, x) * b + a * Derive(b, x)
	if f = a / b  return (Derive(a, x) * b + a * Derive(b, x)) / (b ^ 2)
	if f = a ^ b  return Derive(exp(ln(a) * b), x)
	if f = sin(a) return cos(a) * Derive(a, x)
	if f = cos(a) return -sin(a) * Derive(a, x)
	if f = exp(a) return exp(a) * Derive(a, x)
	if f = ln(a)  return Derive(a, x) / a
	if f = a(b)   return Derive(a(x), x)(b) * Derive(b, x)

Of course this is very sloppy notation, but it should be clear what the intended algorithm is: recursively apply differentiation rules untill one reaches either the variable or a constant. The derivative function is constructed in the returns. Note that there is a possiblity for infinite recursion when trying to differentiate functions like f(x) = tan(x), because there is no special rule for tan, it will indefinitely recurse on the last rule.

To implement it one needs to be able to express the algorithm without any ambiguety and sloppyness. Furthermore, the way it is written above uses a fair amount of pattern matching.

Extensibility

Example 9: Regular expressions.

astNode RegexpHandler (string sourceCode, …):
	… // Regular expression compiler

Compiler.AddSyntaxRule("s/*/*/", RegexpHandler)

write s/W.r.*ld/World/("Hello Wereld!")