On Scheme

Thoughts on Scheme and teaching Scheme

Archive for the ‘Exploring Scheme’ Category

Type Systems and Scheme (Summary)

Posted by Peter on May 11, 2006

This is not really a new post, but merely a convenient place to gather my old posts on type systems and how they might improve Scheme. The 5 parts are:

Part 1: My first misguided type system
Part 2: A critique of type systems
Part 3: A second critique of type systems
Part 3.5: The criterions for a good type system
Part 4: My second proposal for a Scheme type system

I am sorry that I haven’t been working on Scheme much lately, but I have been busy with philosophy.

Posted in Exploring Scheme | Leave a Comment »

Type Foolishness

Posted by Peter on May 4, 2006

Today’s post is meant as a warning to others: don’t make the same foolish implementation choices I did.

My original approach to the type system was the following: to create a type description for each type that was abstract, i.e. could contain self-references and unbound variables, so that type inference could be usefully done with it. And it will, when it is finished, serve this purpose admirably (at least I don’t see any major problems with it at the moment). The problem came when I assumed that this same system could be used to tag the objects generated at run time. Since it can code for any type, no matter how abstract, I figured that it would be simpler to simply label the type of run time objects in the same way. What I didn’t take into consideration was how an object at run-time is generated. For example consider a simple cons cell. In general we can describe the type of the cons function as follows (-> (a b) ((pair-of a b))), meaning that the type of it’s result is dependant on the type of its parameters. However when we generate the result of the cons operation we cannot label as a type containing two free variables, because eventually we might need to provide type checks on the result (for type safety), and a type of unbound gives us no useful information. What is needed then (or so I thought), was to look at the type of the parameters and then construct a new type, at run-time, using this information. This is only sensible as long as you don’t examine it that closely. For example consider adding a new element to a list already containing 100 members. The type of this new element would then contain type information for the 100 existing descending elements embedded in it. Clearly this is an unreasonable amount of overhead.

The correct solution is relatively simple, make the abstract types and the type encoding at run-time different. The run-time information simply needs to provide an indication how to obtain information for the type of related objects at runtime, then when type checks need to be made (rarely of course due to type inference optimizations) we can get all the information we need without duplicating it unnecessarily. Of course there still needs to be some glue between the purely abstract type system and the practical type system, for example when dispatching by type at runtime, but that is simply a detail.

Hopefully I’ll have the type system working soon, so that I can move onto the parser of my little project.

Posted in C/C++, Exploring Scheme | 5 Comments »

Real Criticisms of Scheme

Posted by Peter on April 30, 2006

Recently I have noticed a tendency in the Common Lisp community to criticize Scheme, possibly to make the Scheme community look equally bad in its handling of such viewpoints. So to head off silly and/or baseless attacks on Scheme I have created a brief list of things that Scheme actually does have problems with. Maybe someone will even turn them into a SRFI eventually, when we reach some kind of a consensus as to what the best solutions are. Remember these are problems with the language, not an implementation. Even if a specific implementation has fixed them it doesn’t fix the language itself. The language needs to address such issues so that the implementations don’t become too fragmented.

Types
Types can be used to catch stupid programmer errors (the kind mere mortal make all the time, i.e. “which order do those parameters go in again?”) at compile time as well as to optimize away unnecessary run-time checks. Unfortunately the only types that Scheme supports are purely run-time. A type system also makes overloading functions based on the type of their parameters possible (efficiently, checking all parameter types on each call is not efficient), which can be extremely useful.

Macros
The macro system, although being nice for having hygienic safety and pattern matching, is at times cumbersome. It should be easy to create Common Lisp style macros when you want them. The macro system is also poorly documented (in my opinion).

Modules
The module system doesn’t support versioning well, nor does it support module “trees” (i.e. modules which have sub-modules declared within them). The fact that you can’t set! variables provided by modules feels arbitrary as well.

Objects
Scheme doesn’t have objects integrated into the language itself. Although I personally have argued that this can be considered a positive feature (see: here), I admit it can be annoying when to work on five different projects you have to learn five different object systems.

Errors
Trapping errors thrown at runtime, such as type errors, can be cumbersome and implementation dependant. Additionally there is no built-in syntax for providing the try-catch features available in other languages, although you can define your own with continuations.

Threads
Once again threads should be provided by the language, not by implementations.

Multiple Values
The values construct is cumbersome to work with. It requires special procedures to handle multiple values when it should be as simple as (f (p)) if f expects two parameters and p returns two values.

Lack of Compiler Hinting
This has partly been addressed under the problems with types, but as a whole Scheme lacks methods of hinting to the compiler what you are trying to do so it can optimize rationally. For example if defun was made part of the language we might allow the compiler to make the assumption that symbols created with defun could not be set!, and thus such functions could be inlined safely, ect.

Too Minimalist
Many of these problems resolve to the complaint that Scheme is too minimalist for its own good. Problems with types, threads, errors, objects, and modules are all symptoms of Scheme’s desire to be simple. Minimalism is good of course, because it makes the language easy to learn, implement, and optimize, but it has to be balanced with functionality, so that implementations don’t diverge to far when they are forced to implement features that the language doesn’t support.

Why use Scheme if has all these problems?

From this long list of faults you may think that I am encouraging you to drop Scheme. Many of these deficiencies have been resolved by individual implementations, which means that you can work as though they had been resolved, although there is a possibility that eventually you will have to update you code to bring it in line with the official solutions. Also Scheme still has some cool features that make it shine in comparison with Common Lisp, for example real continuations, guarantees of tail call optimization (no need for nasty do loops), and of course it is a Lisp-1. Additionally there is a formal process for revision, the SRFI, which allows the community to have their input, ensuring that Scheme improvements are really improvements, not fiat from some language deity. Even though these problems are currently open hopefully Scheme can later implement the best solutions to them, leaving the language as elegant as it is currently, unlike the solutions in Common Lisp, which often feel like hacks (just my opinion). If anyone does propose a SRFI to address one of these issues I would be grateful if they would let me know.

Posted in Exploring Scheme | 7 Comments »

Actual Code

Posted by Peter on April 27, 2006

I know it seems like my blog has turned into a place for me to talk without actually accomplishing anything. To remedy this I am releasing a working draft of my asynchronous garbage collector. Unfortunately for you it is written in C++, and is meant to be compiled in windows. (Bless me father, for I have sinned. It has been 2 function calls and a continuation since my last garbage collection.) Porting it shouldn’t be that hard if you want to. Anything that is platform dependant is conveniently located in platform_dependant.cpp. Also you need to define the symbol GC_COUNT_POINTERS if you want the sample to compile. Finally I make use of the hash_map and hash_set classes, which may not be part of your compiler’s stl.

How it works

In this system there are two types of references, hard and soft. A hard reference means that you guarantee that the reference will not become part of a cyclic data structure. Usually you want to acquire a hard reference to an object you are using directly. Soft references are references from one object to another. If a cyclic data structure made up of soft references occurs the garbage collector will be able to collect it. When you alter an object’s member pointers you must call gc->ReferenceSoft and gc->DereferenceSoft as appropriate on the pointers as appropriate. Finally when altering member pointers of an object that has existing references pointing to it you must surround the code altering it with gc->BeginEdit and gc->FinishEdit calls.

For more specific details I suggest that you check out the example provided. Note that the functions defined in the sample do not reflect how I actually expect to define Lisp functions (in terms of parameters).

Source is available from Sourceforge (see sidebar ->) under “Garbage Collector”.

Posted in C/C++, Exploring Scheme, Windows | Leave a Comment »

Lisp Without Parentheses

Posted by Peter on April 16, 2006

Well it’s impossible to remove all of the parentheses in Lisp, but I can think of several methods to eliminate the bulk of them. One alternative to parentheses is being sensitive to white space, in a way that is reminiscent of Python. The simple description is as follows: each line automatically has an opening parenthesis added before it, and a closing parenthesis added after it, except under the following circumstances: if the following line has one extra level of indentation omit the closing parenthesis. If the line in question is indented two or more levels beyond the previous line add an extra opening parenthesis for every new level of indentation beyond the first. If the line following the line in question is less indented than the line we are examining add an extra closing parenthesis for each level less. (the end of the file should be treated as a line with 0 indentation of course)

Let’s apply these rules to some simple cases:

A cond statement of the following nature:

(cond
	((< x -1) (* x x))
	((> x 1) (* x x))
	((< x 0) (sqrt (abs x)))
	(else (sqrt x)))

would become:

cond
	(< x -1) (* x x)
	(> x 1) (* x x)
	(< x 0) (sqrt (abs x))
	else (sqrt x)

if we are willing to take up some more lines it could also become:

cond
	(< x -1)
		* x x
	(> x 1)
		* x x
	(< x 0)
		sqrt (abs x)
	else
		sqrt x

Admittedly this expansion looks rather verbose for such a simple cond statement, but if the cond was more complex, and had more instructions in each branch, it would not be too much of an expansion, because it is likely the programmer would have already indented it in this fashion.

Now consider the let statement:

(let
	((a (+ 1 v)) (b (car z)) (c (avg 8 3 w)) (d w))
	(body-goes-here))

This statement could be written as:

let
	(a (+ 1 v)) (b (car z)) (c (avg 8 3 w)) (d w)
	body-goes-here

or as:

let
		a (+ 1 v) 
		b (car z)
		c (avg 8 3 w)
		d w
	body-goes-here

So far everything looks pretty nice, but there are a few problems with this method. For example: consider the case where programmers break a function containing many, possibly complicated, arguments onto several lines:

(my-large-function
	(car (cdr y))
	(lambda (x) (/ x 2))
	my-list)

we want to break it up like so:

my-large-function
	car (cdr y)
	lambda (x) (/ x 2)
	my-list

but unfortunately the rules we have laid out dictate the my-list should be wrapped with parentheses, and thus will be called like a function. This can also be a problem when you want to put more than one function call on a single line, for example:

(lambda (x)
	(display x) (newline)
	(* x 42))

is not properly translated as:

lambda (x)
	(display x) (newline)
	* x 42

since the result of (display x) is not a function to be applied to the result of (newline).

There are several ways to overcome this, for example using values (in an ideal world, where it worked like you expected it to) and begin. However a simpler solution is to simply add a special prefix the language that when encountered before a line instructs the compiler/reader to add one less pair of parentheses around the line. I would suggest the \ character, since it has been used in other contexts as an escape character. Our examples then would be written (correctly) as follows:

my-large-function
	car (cdr y)
	lambda (x) (/ x 2)
	\ my-list


lambda (x)
	\ (display x) (newline)
	* x 42

Note: \ is not notation for the continuation of a line.

Well, it’s something to think about at least. I know some people will object for the same reasons they do to Python’s white space sensitivity, but consider this before you dismiss it out of hand: one of the biggest barriers to the adoption of Lisp is that the sheer bulk of parentheses can be intimidating, and a method such as this would hide the majority of them, especially the large mass of closing parenthesis that seems to end every complicated function.

As you may know I plan on experimenting with my own Scheme variant soon, and this is one of the ideas I think I will try out in it (as a compiler option of course, not a mandate), unless I hear a good reason not to.

Posted in Exploring Scheme | 23 Comments »

 
Follow

Get every new post delivered to your Inbox.