On Scheme

Thoughts on Scheme and teaching Scheme

Actual Code

Posted by Peter on April 27, 2006

I know it seems like my blog has turned into a place for me to talk without actually accomplishing anything. To remedy this I am releasing a working draft of my asynchronous garbage collector. Unfortunately for you it is written in C++, and is meant to be compiled in windows. (Bless me father, for I have sinned. It has been 2 function calls and a continuation since my last garbage collection.) Porting it shouldn’t be that hard if you want to. Anything that is platform dependant is conveniently located in platform_dependant.cpp. Also you need to define the symbol GC_COUNT_POINTERS if you want the sample to compile. Finally I make use of the hash_map and hash_set classes, which may not be part of your compiler’s stl.

How it works

In this system there are two types of references, hard and soft. A hard reference means that you guarantee that the reference will not become part of a cyclic data structure. Usually you want to acquire a hard reference to an object you are using directly. Soft references are references from one object to another. If a cyclic data structure made up of soft references occurs the garbage collector will be able to collect it. When you alter an object’s member pointers you must call gc->ReferenceSoft and gc->DereferenceSoft as appropriate on the pointers as appropriate. Finally when altering member pointers of an object that has existing references pointing to it you must surround the code altering it with gc->BeginEdit and gc->FinishEdit calls.

For more specific details I suggest that you check out the example provided. Note that the functions defined in the sample do not reflect how I actually expect to define Lisp functions (in terms of parameters).

Source is available from Sourceforge (see sidebar ->) under “Garbage Collector”.


Posted in C/C++, Exploring Scheme, Windows | Leave a Comment »

Theory and Application

Posted by Peter on April 25, 2006

No one seems to like the computer science curriculum offered at universities. The complaints I hear most often are that there isn’t enough theory, and that there isn’t enough practical application of concepts. Initially I thought that these goals were mutually exclusive, and that people complained simply because they were taking the wrong kind of courses. However, when you look at what actually is taught, you come to realize that some courses manage to leave out both theory and application when they could easily have had both. Before I show how both theory and practice can work together I probably need to defend the importance of each to computer scientists.

Let’s talk about theory first. When I think of theory I think immediately of math, and I think this connection tends to scare away some students. The theory side of computer science, ideally, deals with computation as an abstract process, which can be studied in order to prove, often with math, what is the best solution for a given problem. For example, theory teaches how to analyze algorithms under different conditions for speed and space efficiency. The theory side also encompasses topics such as how to mathematically model networks, lambda calculus, information theory, ect. In all honesty learning theory isn’t much fun, but it is necessary. The most important consequence of theory is that it encourages new “practical” advancements. For example the theory of lambda calculus gave birth (partly) to Lisp. Knowing more about information theory will help you design better network transport protocols and compression schemes. Advancements can be made without a theoretical background, but they tend to be made slowly with many revisions. Ideally theory can tell you directly what the best solution to a problem is, allowing you to implement that best solution in the beginning, not after 15 beta versions.

Practical knowledge however is also important. Obviously without actual programming experience you can’t put any of your wonderful theoretical knowledge to use. Programming and other practical applications also give insights into the capabilities of current software and hardware. Often the physical limits of the computer will determine what theories can be made into a real system, and you won’t be familiar with these capabilities unless you deal with them regularly. Finally, the desire to actually implement something is an excellent vehicle for learning more about computer science in general. For example I first learned how networks functioned before I even went to college because I was trying to create a chat program (in order to cheat on the computer science exams, but that is another story).

When I look over the classes offered at my affiliated university I notice some gaping holes, in both the applied and theoretical curriculums. The most noticeable missing applied class is one on debugging, which has always struck me as odd. Programmers can waste a lot of time trying to find bugs, I know I do, but for some reason students aren’t taught how to debug and are expected to learn on their own. As a result students new to programming often spend more time trying to find errors in their programs than they do writing them. In the theoretical side of the curriculum a class on information theory is clearly missing. Information theory has its fingers in every aspect of computer science, because from one point of view all any program does is manipulate information. Once again universities seem reluctant to actually teach students this material, expecting students to pick it up on their own.

Ideally application and theory should go hand in hand in the classroom. For example if you were teaching a class on mathematical approach to networks, with regards to speed, connectivity, ect it would be reasonable to have the students implement some of the best models that result from the mathematical study. Likewise purely practical classes, say a class on XML, could be enriched by applying that practical knowledge to an abstract domain, for example implementing abstract syntax trees in XML. Instead what we have are mediocre classes that do neither. For example take a look at any text book on operating systems. Inside you will find plenty of general information, but you will not find any code, not even pseudo-code. Nor will you find any theoretical underpinnings for why operating systems work the way they do. Instead these books stick to giving outlines on topics such as “what is a file system?”. Classes on networking, hardware, artificial intelligence, and many others, all follow this pattern, with students being exposed to little theory and less concrete application of the concepts.

Of course if you are a self motivated computer science student this won’t be too much of a problem for you. Now that you know what kinds of things your education is lacking it is easy enough to learn them from the internet (for the most part). Of course if you aren’t a self motivated student then this isn’t much help, but then again if you aren’t a self motivated student why are you in computer science?

Posted in Teaching Scheme | 2 Comments »

Miscellaneous Day

Posted by Peter on April 25, 2006

I have a couple points I want to bring up today, none of which are lengthy enough to warrant their own post.

Point 1: Debugging multi-threaded applications sucks (in C++ at least). As you may know I have been working on asynchronous garbage collection. I have been debugging some of the basic parts in the last few days, and am constantly being bit by multi-threading. Some of it admittedly is my own fault, for example I forgot how to write a blocking queue properly. Unfortunately any mistakes, no matter how small, turn into major bug hunting excursions, because they are affected by timing in the most unpredictable ways, making it is hard to repeat them. For example putting in print statements to find out where a bug was caused the bug to disappear. Secondly I can’t step through the threaded code with the debugger, meaning that I have to rely on print statements. At the moment I think I have the bugs on the defensive, but if any more pop up I am going to have to implement an asynchronous logging routine.

Point 2: My bile got posted to reddit; i.e. last time’s post “In Defense of Change”. Admittedly it is nice to have so many people hear your opinions, but at the same time it has a downside, since many of them will come to the conclusion that I am full of hate after reading it and will never visit the blog again.

p.s. If you want to take a look at the garbage collector I wouldn’t be opposed to sending the code out on request, but since it is not really ready for public consumption it isn’t on Sourceforge yet.

Posted in Uncategorized | Leave a Comment »

In Defense of Change

Posted by Peter on April 21, 2006

Recently there has been a burst of criticisms concerning Common Lisp. If you don’t know what I am talking about, or simply want to catch up on the current discussion before you read the rest of this piece you should visit: Steve Yegge’s blog, the usenet response, a usenet debate on CL suckage, and a usenet proposal for improving CL.

Although they have been accused of hating Lisp, my impression is that the majority of these critics love Lisp, and simply want to see it become better. For those of you who got lost in the flames let me provide a quick summary of the major complaints: 1- macros: there is no hygienic defmacro, but define-syntax may be too cumbersome for some, and also it is hard to hook some aspects of the source with macros. 2- CLOS: CLOS is becoming dated, looks too much like a hack, and is too hard to use. The object system needs to be rewritten, possibly from scratch. 3- There is no type system outside of run-time checking. 4- The Lisp community has become too fragmented, even Common Lisp implementations tend to provide features that makes them incompatible with each other. 5- Common Lisp has stagnated for too many years, and there is no way to implement changes that will be adopted outside of a single implementation, which ensures that no serious improvements are made.

In my opinion these are valid criticisms and, although they are directed primarily at Common Lisp, Scheme is not without its own flaws. I don’t believe that accepting these criticisms as valid implies that one should drop Lisp, even with its flaws I think that Lisp is still the best language, it just could become better.

However I am not writing this piece to discuss the criticisms or to propose solutions, what I want to address here is the response of the community to them, which I found appalling. Some people’s responses were scarily reminiscent of religious authorities trying to cast out heretics, which is not an appropriate attitude for a computer scientist. As scientists we should always be looking to improve our tools and methods, and this means trying new things to find out what changes constitute an improvement.

Let me discuss what makes some responses I have heard especially ridiculous.

The difficulty argument:
We should leave confusing / ill designed feature untouched because it makes Lisp harder, and thus keeps out the riff-raff. This is a flawed argument because a large volume of users is more important than a few brilliant users. Even mediocre programmers can write useful libraries / promote the language. Libraries especially make a language attractive to even more programmers, and for use in commercial applications. Basically if you want to make a living using Lisp then you should hope more programmers use Lisp, even if they aren’t as “133t” as you.

The fear of change argument:
We shouldn’t change feature X (often CLOS) because it is used so widely that changing it would force me to learn new things / update my code. This is a flawed argument because other languages (with supposedly dumber programmers) are able to change themselves without problems. If you depend on a removed feature you can always use an older compiler for the code, for example Visual Basic 6 is still used to build some older applications that aren’t compatible with Visual Basic .Net.

The perfection argument:
Feature X is perfect they way it is. This is unbelievable because perfection tends not to be humanly possible; we simply try to approximate it with a greater and greater degree of accuracy. It is possible that a change could be worse than the current way of doing things, but once we are open to the idea of change there is nothing stopping us from changing it again until we revert it to the original state or really do find something better.

The “it could be worse argument”
Sure Lisp has problems, but it is better than languages X, Y, Z and that is good enough for me. I admit that Lisp is better now, but you can’t really be sure that will always be the case. If Lisp doesn’t move forwards than we can rest assured that something better than Lisp will be invented eventually.

The diversity argument
Fragmentation gives you options. Sure it does, you just can’t have them all at once. If you want feature Y in implementation A you have to give up Z in implementation B. Experimenting with variants is important of course, but that is why people write toy versions of languages, to try out new ideas. Ideally the best features are then incorporated into the next version of the standard implementation(s). The problem with Common Lisp is that such features don’t make their way to the spec., and it looks like they never will. Also, confusing the versioning of a language with Common Lisp’s fragmentation is so ridiculous that I am not even going to address it.

The personal attack argument:
You don’t really use Common Lisp! Go out and code instead of bitching! God forbid we not be content with the status quo!

What should be done?

So what should you do if you think Lisp could use some fixing? Go out and fix it. Go write a toy version, it doesn’t have to be fast or efficient. However if you can use it to demonstrate something cool that fixes a problem people will either help you make it faster and smaller, or your feature will be adopted by a more professional implementation. Do this enough and you just might end up with a new popular Lisp dialect.

Posted in Common Lisp | 2 Comments »

Lisp Without Parentheses

Posted by Peter on April 16, 2006

Well it’s impossible to remove all of the parentheses in Lisp, but I can think of several methods to eliminate the bulk of them. One alternative to parentheses is being sensitive to white space, in a way that is reminiscent of Python. The simple description is as follows: each line automatically has an opening parenthesis added before it, and a closing parenthesis added after it, except under the following circumstances: if the following line has one extra level of indentation omit the closing parenthesis. If the line in question is indented two or more levels beyond the previous line add an extra opening parenthesis for every new level of indentation beyond the first. If the line following the line in question is less indented than the line we are examining add an extra closing parenthesis for each level less. (the end of the file should be treated as a line with 0 indentation of course)

Let’s apply these rules to some simple cases:

A cond statement of the following nature:

	((< x -1) (* x x))
	((> x 1) (* x x))
	((< x 0) (sqrt (abs x)))
	(else (sqrt x)))

would become:

	(< x -1) (* x x)
	(> x 1) (* x x)
	(< x 0) (sqrt (abs x))
	else (sqrt x)

if we are willing to take up some more lines it could also become:

	(< x -1)
		* x x
	(> x 1)
		* x x
	(< x 0)
		sqrt (abs x)
		sqrt x

Admittedly this expansion looks rather verbose for such a simple cond statement, but if the cond was more complex, and had more instructions in each branch, it would not be too much of an expansion, because it is likely the programmer would have already indented it in this fashion.

Now consider the let statement:

	((a (+ 1 v)) (b (car z)) (c (avg 8 3 w)) (d w))

This statement could be written as:

	(a (+ 1 v)) (b (car z)) (c (avg 8 3 w)) (d w)

or as:

		a (+ 1 v) 
		b (car z)
		c (avg 8 3 w)
		d w

So far everything looks pretty nice, but there are a few problems with this method. For example: consider the case where programmers break a function containing many, possibly complicated, arguments onto several lines:

	(car (cdr y))
	(lambda (x) (/ x 2))

we want to break it up like so:

	car (cdr y)
	lambda (x) (/ x 2)

but unfortunately the rules we have laid out dictate the my-list should be wrapped with parentheses, and thus will be called like a function. This can also be a problem when you want to put more than one function call on a single line, for example:

(lambda (x)
	(display x) (newline)
	(* x 42))

is not properly translated as:

lambda (x)
	(display x) (newline)
	* x 42

since the result of (display x) is not a function to be applied to the result of (newline).

There are several ways to overcome this, for example using values (in an ideal world, where it worked like you expected it to) and begin. However a simpler solution is to simply add a special prefix the language that when encountered before a line instructs the compiler/reader to add one less pair of parentheses around the line. I would suggest the \ character, since it has been used in other contexts as an escape character. Our examples then would be written (correctly) as follows:

	car (cdr y)
	lambda (x) (/ x 2)
	\ my-list

lambda (x)
	\ (display x) (newline)
	* x 42

Note: \ is not notation for the continuation of a line.

Well, it’s something to think about at least. I know some people will object for the same reasons they do to Python’s white space sensitivity, but consider this before you dismiss it out of hand: one of the biggest barriers to the adoption of Lisp is that the sheer bulk of parentheses can be intimidating, and a method such as this would hide the majority of them, especially the large mass of closing parenthesis that seems to end every complicated function.

As you may know I plan on experimenting with my own Scheme variant soon, and this is one of the ideas I think I will try out in it (as a compiler option of course, not a mandate), unless I hear a good reason not to.

Posted in Exploring Scheme | 25 Comments »