On Scheme

Thoughts on Scheme and teaching Scheme

Archive for the ‘Teaching Scheme’ Category

Theory and Application

Posted by Peter on April 25, 2006

No one seems to like the computer science curriculum offered at universities. The complaints I hear most often are that there isn’t enough theory, and that there isn’t enough practical application of concepts. Initially I thought that these goals were mutually exclusive, and that people complained simply because they were taking the wrong kind of courses. However, when you look at what actually is taught, you come to realize that some courses manage to leave out both theory and application when they could easily have had both. Before I show how both theory and practice can work together I probably need to defend the importance of each to computer scientists.

Let’s talk about theory first. When I think of theory I think immediately of math, and I think this connection tends to scare away some students. The theory side of computer science, ideally, deals with computation as an abstract process, which can be studied in order to prove, often with math, what is the best solution for a given problem. For example, theory teaches how to analyze algorithms under different conditions for speed and space efficiency. The theory side also encompasses topics such as how to mathematically model networks, lambda calculus, information theory, ect. In all honesty learning theory isn’t much fun, but it is necessary. The most important consequence of theory is that it encourages new “practical” advancements. For example the theory of lambda calculus gave birth (partly) to Lisp. Knowing more about information theory will help you design better network transport protocols and compression schemes. Advancements can be made without a theoretical background, but they tend to be made slowly with many revisions. Ideally theory can tell you directly what the best solution to a problem is, allowing you to implement that best solution in the beginning, not after 15 beta versions.

Practical knowledge however is also important. Obviously without actual programming experience you can’t put any of your wonderful theoretical knowledge to use. Programming and other practical applications also give insights into the capabilities of current software and hardware. Often the physical limits of the computer will determine what theories can be made into a real system, and you won’t be familiar with these capabilities unless you deal with them regularly. Finally, the desire to actually implement something is an excellent vehicle for learning more about computer science in general. For example I first learned how networks functioned before I even went to college because I was trying to create a chat program (in order to cheat on the computer science exams, but that is another story).

When I look over the classes offered at my affiliated university I notice some gaping holes, in both the applied and theoretical curriculums. The most noticeable missing applied class is one on debugging, which has always struck me as odd. Programmers can waste a lot of time trying to find bugs, I know I do, but for some reason students aren’t taught how to debug and are expected to learn on their own. As a result students new to programming often spend more time trying to find errors in their programs than they do writing them. In the theoretical side of the curriculum a class on information theory is clearly missing. Information theory has its fingers in every aspect of computer science, because from one point of view all any program does is manipulate information. Once again universities seem reluctant to actually teach students this material, expecting students to pick it up on their own.

Ideally application and theory should go hand in hand in the classroom. For example if you were teaching a class on mathematical approach to networks, with regards to speed, connectivity, ect it would be reasonable to have the students implement some of the best models that result from the mathematical study. Likewise purely practical classes, say a class on XML, could be enriched by applying that practical knowledge to an abstract domain, for example implementing abstract syntax trees in XML. Instead what we have are mediocre classes that do neither. For example take a look at any text book on operating systems. Inside you will find plenty of general information, but you will not find any code, not even pseudo-code. Nor will you find any theoretical underpinnings for why operating systems work the way they do. Instead these books stick to giving outlines on topics such as “what is a file system?”. Classes on networking, hardware, artificial intelligence, and many others, all follow this pattern, with students being exposed to little theory and less concrete application of the concepts.

Of course if you are a self motivated computer science student this won’t be too much of a problem for you. Now that you know what kinds of things your education is lacking it is easy enough to learn them from the internet (for the most part). Of course if you aren’t a self motivated student then this isn’t much help, but then again if you aren’t a self motivated student why are you in computer science?

Posted in Teaching Scheme | 2 Comments »

Convincing Programmers They Want To Use Lisp

Posted by Peter on April 7, 2006

A reader commented yesterday that one, unmentioned, barrier to the adoption of Lisp as a universal language is convincing programmers of other languages that Lisp is a better choice. Its hard enough to convince someone to move from one language to another when they look the similar, for example from C to C++, so convincing someone that they want to use Lisp is even harder than normal.

Everyone says how elegant Lisp syntax is and how useful functional programming is. This may be true but it is definitely not convincing to the experienced programmer. The experienced programmer is not intimidated by complicated code, which can have its own kind of elegance, what they want is safety, readability, and extensibility.

For example you may, badly, try to convince them that Lisp is better by showing them how a universal map function is easier to write.
Your example may look like the following:

(define map
  (lambda (function first rest combine list)
    (if (null? list)
        '()
        (combine (function (first list)) (my-map function first rest combine (rest list))))))

You point out the elegance of this, how its operation is extendable to any kind of data structure, for example it can be called on lists as follows: (map (lambda (x) (+ 1 x)) car cdr cons '(1 2 6 7)), and expect the beauty of first order functions to win the programmer over.

However the programmer responds with the following:

interface ConCell<T>
	{
	public T car();
	public ConsCell<T> cdr();
	static public ConsCell<T> cons(T a, ConsCell<T> b);
	}

interface ElementFunction<A, B>
	{
	static public A operate(B x);
 	}

class Map<A, B>
	{
	public static ConsCell<A> doMap(ElementFunction<A, B> func, ConsCell<B> list)
		{
		if (list)
			{
			return ConsCell<A>.cons(func.operate(list.car()), Map<A, B>.doMap(func, list.cdr()));
			}
		else
			{
			return null;
			}
		}
}

Yes, the definition is much longer, but it can be called in the same basic way: Map<int, int>.doMap(myAdditionFuncion, NumList.MakeList(1, 2, 6, 7));. The experienced programmer doesn’t care that the definition is substantially longer, since they can use it in their code in basically the same way. In fact they may prefer their definition because it is both type safe and extendable. For example if you wanted every member of the data structure to implement an index function you could add public int index(); to the cons cell definition, and either force structures to implement it or throw an exception. Yes the same extensibility, and possibly type safety as well, could be accomplished in Lisp using an object system, but then your definition of map is not much shorter or more elegant than theirs.

Now that I have demonstrated the wrong way of doing things what is the right way? The right way is to show them macros. Give the example of a networked file system, where when you open a remote file you send the in use message, and when you close it you send the finished message. Thus in the code you will see a lot of the following pattern:

SendInUseCommand(f);
…
SendClosedCommand(f);

All it takes is to forget to send the closed notification once and you have a hard to find bug on your hands. The experienced programmer knows that while they may always remember to do this, since they wrote the system, their colleagues may not. Now show them the following macro:

(defmacro with-net-file
  (lambda (file &rest body)
    `(progn (OpenNetFile file) ,@body (CloseNetFile file))))

Now to write their network file operations you write:

(with-net-file f
   …)

And now no one can forget to close the file, plus it’s shorter as well. There is no way to do this in a non Lisp language. The closest you can come is to create an object whose constructor sends the open command and destructor sends the closed command, but even this approach has problems. Languages such as Java don’t have reliable destructors, so this option is flat out for them. In other languages you can either create the object on the stack, which means that the file will be in use for the entire function call, which is definitely not desirable (maybe a function called within it wants to open the same file, and you only needed the file for the very beginning of the function), but the only other alternative is to dynamically create and destroy the object when you need it which has the same problem, the programmer needs to remember to dynamically destroy it.

Once you have caught the programmer’s interest with simple macros show them some other cool things they can do, like language extensions. Experienced programmers are always looking for more control, and macros give it to them. Later you can tell them that functional programming is cool too, but they probably won’t need that incentive.

On a personal note macros are what convinced me to move to Lisp. For the longest time I was a C++ programmer. Even though I had seen some Scheme and Lisp back in school it never grabbed me, probably because they didn’t mention macros. However once I read On Lisp, and saw how cool macros were I was hooked. Why I ended up favoring Scheme, with its ugly and complicated macro system instead of Common Lisp is a story for another day however.

Posted in Common Lisp, Scheme Community, Teaching Scheme | 29 Comments »

Java is Bad for Students

Posted by Peter on March 26, 2006

I’ll be back with more on type systems tomorrow, hopefully, but today I want to state my objections to Java being used to teach students computer science, especially in college. Joel on Software also wrote a piece on Java schools, and his conclusion was that Java was too easy for students, that it did too much for them and thus never forced them to learn the concepts. I think this is true in terms of the number of libraries built in, for example students never really need to know how to create a linked list, however in terms of the language itself I think that Java is just the opposite, it is too complex.

Not everyone begins programming at the college level knowing about variables and classes and functions, and because of this many promising students are turned away from programming. To create even a simple hello world program in Java you need to be able to import from a library, create a class, create a function in that class, and write a complicated prototype for the main function. On the other hand in C, for example, you just need to write a very simple main function and use printf. Most teachers simply tell students to simply copy the boiler plate code, which is fine, but to do anything in Java you need to use objects, so there is a lot of code students need to copy and paste. In even the simplest programs you are using System.out to print your results and to manipulate text you need Strings, both of which require the student to use objects and method calls before they are ready to. This alone adds more and more complexity that the students are just told to ignore, or to copy and paste without it being explained to them. Even using Strings can lead to confusing problems as students wonder why String s = new String() is possible but int i = new int() isn’t. Eventually some students are able to make sense of this, but for many weeks, before objects are introduced students are engaging in a copy-paste mentality for the majority of their work.

This copy-paste mentality makes the program seem much more confusing than it really is. By the time students are introduced to objects and the difference between a reference and a value most of them are thoroughly lost. Besides, how do you explain a reference to students who don’t know that data has addresses in memory, much less what a pointer is? The answer is that you just do some hand-waving, and students learn enough that they don’t make too many mistakes, and also have no idea what it really is. Because of this the bar has to be greatly lowered on introductory programming courses, and in subsequent courses, as students enter into the system without a real understanding of what the program is doing, but only with a handful of tricks that seem to make things work.

The benefits of beginning with C/C++/ect is that in these languages it is possible to write complete programs with only a very small portion of the language. Also it is possible to explain more complex features in terms of simpler ones, for example a struct can be explained in terms of an array with named slots. Also students who understand one of these languages have a much easier time understanding assembly language, and thus what the machine is actually doing when their programs run, which in turn leads to fewer bugs and faster programs. Scheme and Lisp also have the advantage that complete programs (at the REPL) can be run with only a small portion of the language. Basically all you need to do is explain define and function calls and you are ready to go. Admittedly Scheme and Lisp don’t give students an understanding of the low level operation of the computer, however they make up for this by making it easy to teach about complex algorithms, functional programming, and data structures. For example students can be taught about binary trees using only lists, without any need for complicated structures or objects.

Unfortunately Java, which is easy to learn for programmers who have already mastered the fundamentals of programming, has been adopted in the majority of colleges, probably because of the obsession with giving their students “practical” skills. Thus this generation of programmers is coming out of college barely able to code.

Symptoms of Java Education:
-Every bug is approached with the debugger or with print statements
-Programs are never optimized
-Difficulty learning new languages
-Difficulty leaning new styles of programming

Posted in Teaching Scheme | Leave a Comment »

The Importance of Graphics

Posted by Peter on March 12, 2006

Students learn best when they have something they want to do on the computer. Usually this means some kind of personal project. For example I bet many beginning programmers learned how to write network code not in a class, but because they wanted to make an online game or a chat program. Beginning programmers are often intrigued by the idea of making their own games, no matter how simple. Thus introducing them to the graphics libraries, or at least some limited subset of them is a good idea. For Scheme the best graphics for beginners are the viewport graphics. Nice features of the library include built-in ways to draw most simple graphics as well as a fairly straightforward interface for getting user keyboard and mouse input (although it could be simplified a bit). There are a couple of disadvantages to using this library however. One is that if the window is closed by using the operating system close commands (such as clicking the X in the upper right) the program has no way of determining this, and will continue indefinitely until the programmer manually breaks its operation. Another is that the drawing commands are implanted in an object oriented system (and usually one doesn’t talk about object oriented concepts until much much later). For example you don’t draw a line with (draw-line viewport pos1 pos2 color) instead you must write ((draw-line viewport) p1 p2 color). Admittedly I could eliminate some of this with wrapper functions, but students seem to be able to use the library well enough without them, even if they don’t understand it fully. Despite my satisfaction with it I may still write my own graphics library for beginners, simply because of the bug where the program will continue to run even though the viewport is closed, and if I do I promise to post it here.

One closing remark: As their first project I have students write a version of the snake game, and it seems to be fairly effective in sparking their interest in learning more on their own.

Posted in Teaching Scheme | 1 Comment »

Explaining Let in Terms of Lambda

Posted by Peter on March 11, 2006

Most explanations of let simply explain it as creating temporary symbols which can be assigned values, and that these symbols can only be reached within the body of the let. This is a perfectly valid and informative way of describing how let functions, but I decided to explain it in a different fashion, by appealing to programming constructs already understood by the student. (I assume that the student has already mastered the basics of the lambda statement, realizing that it can be returned from functions, passed to functions, as well as called immediately.)

To motivate the learning process I ask the student how they would go about printing user input, acquired from (read) twice. At first the student may think this is impossible, as if they call read again they will be prompting the user for input again, which is not what they want. So then I ask them if they can write a function taking one argument that prints its argument twice. This is easy enough to do, and once they have this function than printing the user input twice is no problem. Now, to make life more difficult I ask if they can accomplish the same thing in a single function. The brightest students may make the conceptual leap and see that they can do this with lambda, but after demonstrating it even those don’t make the conceptual leap seem to get it. So the program we are writing may look like this:

(defun printtwice ()
  ((lambda (val)
      (display val)
      (display val))
    (read)))

To make things sink in we then cover more examples using this same technique, which also has the benefit of reinforcing how lambda can be used. After the students seem to get it I then introduce let as a kind of shorthand for this construction. (now that I think of it writing let as a macro in terms of this construction may be a good exercise for macros.) I then rewrite our examples using let, so our first example becomes:

(defun printtwice()
  (let ((val (read)))
      (display val)
      (display val)))

The only real difficulty I have had is explaining why there are two parenthesis around ((val (read))) instead of one. Examples involving more than one name defined by the let statement show the students why it is necessary in order for the computer to understand their code, but even so they have a tendency to forget them.

Posted in Teaching Scheme | Leave a Comment »