On Scheme

Thoughts on Scheme and teaching Scheme

Type Foolishness

Posted by Peter on May 4, 2006

Today’s post is meant as a warning to others: don’t make the same foolish implementation choices I did.

My original approach to the type system was the following: to create a type description for each type that was abstract, i.e. could contain self-references and unbound variables, so that type inference could be usefully done with it. And it will, when it is finished, serve this purpose admirably (at least I don’t see any major problems with it at the moment). The problem came when I assumed that this same system could be used to tag the objects generated at run time. Since it can code for any type, no matter how abstract, I figured that it would be simpler to simply label the type of run time objects in the same way. What I didn’t take into consideration was how an object at run-time is generated. For example consider a simple cons cell. In general we can describe the type of the cons function as follows (-> (a b) ((pair-of a b))), meaning that the type of it’s result is dependant on the type of its parameters. However when we generate the result of the cons operation we cannot label as a type containing two free variables, because eventually we might need to provide type checks on the result (for type safety), and a type of unbound gives us no useful information. What is needed then (or so I thought), was to look at the type of the parameters and then construct a new type, at run-time, using this information. This is only sensible as long as you don’t examine it that closely. For example consider adding a new element to a list already containing 100 members. The type of this new element would then contain type information for the 100 existing descending elements embedded in it. Clearly this is an unreasonable amount of overhead.

The correct solution is relatively simple, make the abstract types and the type encoding at run-time different. The run-time information simply needs to provide an indication how to obtain information for the type of related objects at runtime, then when type checks need to be made (rarely of course due to type inference optimizations) we can get all the information we need without duplicating it unnecessarily. Of course there still needs to be some glue between the purely abstract type system and the practical type system, for example when dispatching by type at runtime, but that is simply a detail.

Hopefully I’ll have the type system working soon, so that I can move onto the parser of my little project.


Posted in C/C++, Exploring Scheme | 5 Comments »

Watch Me Beg For Money

Posted by Peter on May 1, 2006

I am applying for Google Summer of Code. The following is my application this year (submitted to LispNYC and Google):

One Sentence Summary:
I would like to develop a new Lisp language/variant in order to experiment with various ideas that might be later used to improve any Lisp language, hopefully a mainstream one.

Why Another Lisp?
The purpose of this language wouldn’t be to complete with various Common Lisp or Scheme implementations, but to try out new language ideas. Hopefully by playing around with ideas in an actual language instead of on paper the good ones will be separated from the bad ones based on actual usefulness. There is a large list of features that I would like to explore, such as hygienic macros without define-syntax, a module system that supports nested modules and versioning, a transparent system for errors, threads, and multiple values, and probably other ideas that will occur to me later, or I run across in a research paper. The primary idea, which motivates me to write another Lisp from the ground up, instead of just adding macros to existing implementations, is to try out type inference. I want to see if strong type inference, as well as function overloading by type, would be useful to Lisp or simply a cumbersome burden. If you want to see a list of my complaints with Scheme in order to have a better sense of the “problems” that I want to tackle you can go here: https://pschombe.wordpress.com/2006/04/30/real-criticisms-of-scheme/.

Evidence that I can actually accomplish this:
Obviously implementing a Lisp language isn’t the easiest task in the universe, especially if you want that language to be able to compile to native code as I do. As proof that I know what I am doing, and that I can accomplish it in one summer I would direct you to Centum (http://centum.sourceforge.net), which is a language I designed and implemented two summers ago. Although I continued working on it (and published it publicly) during the year the majority of the work was accomplished over summer, despite the fact that I had a job and summer school. Centum was also a horrible hack of a language, and since Lisp can be written for the most part in Lisp it should be a much easier task. Finally, I have already begun to work on parts of the implementation, for example the asynchronous garbage collector (see here: https://pschombe.wordpress.com/2006/04/27/actual-code/).

Technical Details:
This is going to be a Windows project, unless I find someone to help me port it that is. Although I have been interested on working on Linux for some time, because this project is closely tied to the operating system’s capabilities (especially when compiled) I think it is best to stick with what I know, for I fear that the time I would spend leaning how to do the things I want under Linux might take as long as the rest of the project. Also the code itself will be in C++, since once again I need to get close to the hardware in order to accomplish such feats as run-time compilation and asynchronous garbage collection. Of course the primary goal of this project is to experiment with ideas, not with code, and ideas are always cross-platform. I plan on releasing the source under a BSD license.

Plan of attack:
Now – June 17: Framework code written: garbage collector, types, parser
June 17: School Ends
June 17 – June 26: Get a running version working (just the basics)
June 27 – July 12: Implement type inference, enable compilation
July 13 – July 22: Work with modules, threads, macros
July 23 – Aug 2: Work with error handling, debugging options (i.e. step mode)
Aug 3 – Aug 12: This time is budgeted for unforeseen circumstances, documentation
Aug 13 – Aug 22: Work on libraries, such as GUI, add remaining primitive functions, more documentation
Aug 23 – Sept 2: Final week for debugging, more documentation
Sept 2 – Sept 18: Polish code, write examples, libraries, finish documentation
September 18: School Begins

Why you might want to give me money:
Since I have already started working on this project you may wonder why you would want to give me money, because it is obvious that I will continue working on this project without funding. First funding would allow me to focus the majority of my efforts on the project, instead of worrying about part-time jobs or other money making ventures. Secondly my fear of actual deadlines will cause me to work much faster than you would expect, leaving more time to polish the result and to eliminate bugs. Obviously the resulting product will not make you any money, nor is it likely become famous or well loved. The benefit to you is that ideas that could improve Lisp, or any language, are actually put to the test, so that when the next “serious” Lisp language is developed you will know which features to bother with and which to throw away. Obviously, since everyone needs programming languages to get work done improvements in this area benefit businesses and hobbyists alike. Extra bonus: if you want I’ll let you pick the language’s name.

Thanks for your consideration,
Peter Schombert

Posted in Uncategorized | 17 Comments »

Real Criticisms of Scheme

Posted by Peter on April 30, 2006

Recently I have noticed a tendency in the Common Lisp community to criticize Scheme, possibly to make the Scheme community look equally bad in its handling of such viewpoints. So to head off silly and/or baseless attacks on Scheme I have created a brief list of things that Scheme actually does have problems with. Maybe someone will even turn them into a SRFI eventually, when we reach some kind of a consensus as to what the best solutions are. Remember these are problems with the language, not an implementation. Even if a specific implementation has fixed them it doesn’t fix the language itself. The language needs to address such issues so that the implementations don’t become too fragmented.

Types can be used to catch stupid programmer errors (the kind mere mortal make all the time, i.e. “which order do those parameters go in again?”) at compile time as well as to optimize away unnecessary run-time checks. Unfortunately the only types that Scheme supports are purely run-time. A type system also makes overloading functions based on the type of their parameters possible (efficiently, checking all parameter types on each call is not efficient), which can be extremely useful.

The macro system, although being nice for having hygienic safety and pattern matching, is at times cumbersome. It should be easy to create Common Lisp style macros when you want them. The macro system is also poorly documented (in my opinion).

The module system doesn’t support versioning well, nor does it support module “trees” (i.e. modules which have sub-modules declared within them). The fact that you can’t set! variables provided by modules feels arbitrary as well.

Scheme doesn’t have objects integrated into the language itself. Although I personally have argued that this can be considered a positive feature (see: here), I admit it can be annoying when to work on five different projects you have to learn five different object systems.

Trapping errors thrown at runtime, such as type errors, can be cumbersome and implementation dependant. Additionally there is no built-in syntax for providing the try-catch features available in other languages, although you can define your own with continuations.

Once again threads should be provided by the language, not by implementations.

Multiple Values
The values construct is cumbersome to work with. It requires special procedures to handle multiple values when it should be as simple as (f (p)) if f expects two parameters and p returns two values.

Lack of Compiler Hinting
This has partly been addressed under the problems with types, but as a whole Scheme lacks methods of hinting to the compiler what you are trying to do so it can optimize rationally. For example if defun was made part of the language we might allow the compiler to make the assumption that symbols created with defun could not be set!, and thus such functions could be inlined safely, ect.

Too Minimalist
Many of these problems resolve to the complaint that Scheme is too minimalist for its own good. Problems with types, threads, errors, objects, and modules are all symptoms of Scheme’s desire to be simple. Minimalism is good of course, because it makes the language easy to learn, implement, and optimize, but it has to be balanced with functionality, so that implementations don’t diverge to far when they are forced to implement features that the language doesn’t support.

Why use Scheme if has all these problems?

From this long list of faults you may think that I am encouraging you to drop Scheme. Many of these deficiencies have been resolved by individual implementations, which means that you can work as though they had been resolved, although there is a possibility that eventually you will have to update you code to bring it in line with the official solutions. Also Scheme still has some cool features that make it shine in comparison with Common Lisp, for example real continuations, guarantees of tail call optimization (no need for nasty do loops), and of course it is a Lisp-1. Additionally there is a formal process for revision, the SRFI, which allows the community to have their input, ensuring that Scheme improvements are really improvements, not fiat from some language deity. Even though these problems are currently open hopefully Scheme can later implement the best solutions to them, leaving the language as elegant as it is currently, unlike the solutions in Common Lisp, which often feel like hacks (just my opinion). If anyone does propose a SRFI to address one of these issues I would be grateful if they would let me know.

Posted in Exploring Scheme | 7 Comments »

Shameless Self Promotion

Posted by Peter on April 29, 2006

I have just started another blog which will focus on my other passion in life, philosophy. This is the only time I will mention that blog here, since I know for many of you philosophy and computer science don’t exactly overlap much.

Posted in Uncategorized | Leave a Comment »

Don’t Be An Expert

Posted by Peter on April 28, 2006

In this post I am going to share a lesson I learned too late with you: never become an expert in anything. Be good at it, but not an expert.

Who is an expert?

Not every programmer is an expert with the tools (environments and languages) they use. For convenience let us call someone an expert when they have mastered a large percentage of the features of the tool they are using, which means not only do they know that these features exist and what they do, but they are familiar with the specific details (for example a Java expert would be familiar with most of the standard libraries, and for the most used parts of the language they would also know possible causes of exceptions and the exact type of exception thrown, as well as the relevant class hierarchy). The expert probably owns books on the subject as well, and has invested money in commercial solutions if they are available (once again, an expert Java programmer has likely bought at least one expensive Java development tool and a whole shelf load of Java books). Personally I don’t think I am an expert in any language, but I am somewhat of an expert with respect to Windows programming. I have invested money in Visual Studio, and am well acquainted with the Win32 API, DirectX, device driver programming, as well as many other diverse topics.

Problems with being an expert

The primary problem with being an expert is that when you have reached this level every task is easiest using the tools you are already an expert with. For example I once dabbled in Perl, but despite Perl’s reputation at being great at dealing with text it took me longer to make a parser for a particular file format in Perl than in C++. This doesn’t mean that Perl is bad, it just means that I am too much of an expert in C++. Once you become an expert it becomes hard to pick up new languages and tools because to pick up these new tools you must work on projects with them. However it is always faster for you to use your old tools, and thus you don’t begin projects using the new tool you were initially interested in, as it always seems more important to get the project done than learn something new; after all you can always learn it later. Thus in the end you remain an expert in what you already know and only expand your horizons when it is literally impossible to accomplish what you want with your preferred tools.

Experts are harmful to the community

As you may know Windows Vista is going to be making a gradual move to a new API. As a Windows “expert” my gut reaction is be opposed to this move. It means I will have to recode my projects, relearn all the intricate details I have picked up, and repurchase new versions of my tools. If Windows was run by the community I am sure that the old API would have been supported forever. Fortunately for progress in Windows it’s not, but there are other languages and tools that are run by such a community. Here I am obviously making an oblique reference to Common Lisp. Common Lisp experts have invested money in the commercial tools, and much like me and Windows they are reluctant to leave that investment behind. Unlike me and Windows however they have a say in what changes are made with respect to Common Lisp, and thus can ensure that the language remains basically static.* Another example of an expert dominated community may be emacs, which hasn’t changed significantly for many years, but since I don’t personally use emacs I can’t say for sure.

How can I stop being an expert?

Good question, and if I knew the answer I would probably be running Linux on my other machine at the moment. I mean I could go and work on my new Lisp implementation under Linux, but that would mean learning how to use gcc, and how to: inline assembly, write naked functions, create dynamically shared code objects, create libraries, manipulate files, do multi-threading, figure out the object file format, invoke the linker within a program, write an installer, and many more tasks that I can already do without too much thinking in Windows. In fact learning how to do these things in Linux would probably take as long as the project itself. I think to myself “maybe I can port it when it is done,” but in reality when it is done I will want to either make improvements or start a new project, not port it to an operating system I rarely use, and thus the cycle reinforces itself. If you have any ideas on how to solve this problem feel free to voice them.

* How do they have such a say if the language has commercial implementations? Couldn’t the company implement new features? The reason the commercial implementations remain basically static is thus: if they wanted to do something radical, like change CLOS, the experts, who are the majority of their customers, would leave for one of the other vendors. Thus the companies are forced to make only little extensions that don’t break the standard.

Posted in Uncategorized | 1 Comment »