Posted by Peter on May 4, 2006
Today’s post is meant as a warning to others: don’t make the same foolish implementation choices I did.
My original approach to the type system was the following: to create a type description for each type that was abstract, i.e. could contain self-references and unbound variables, so that type inference could be usefully done with it. And it will, when it is finished, serve this purpose admirably (at least I don’t see any major problems with it at the moment). The problem came when I assumed that this same system could be used to tag the objects generated at run time. Since it can code for any type, no matter how abstract, I figured that it would be simpler to simply label the type of run time objects in the same way. What I didn’t take into consideration was how an object at run-time is generated. For example consider a simple cons cell. In general we can describe the type of the cons function as follows
(-> (a b) ((pair-of a b))), meaning that the type of it’s result is dependant on the type of its parameters. However when we generate the result of the cons operation we cannot label as a type containing two free variables, because eventually we might need to provide type checks on the result (for type safety), and a type of unbound gives us no useful information. What is needed then (or so I thought), was to look at the type of the parameters and then construct a new type, at run-time, using this information. This is only sensible as long as you don’t examine it that closely. For example consider adding a new element to a list already containing 100 members. The type of this new element would then contain type information for the 100 existing descending elements embedded in it. Clearly this is an unreasonable amount of overhead.
The correct solution is relatively simple, make the abstract types and the type encoding at run-time different. The run-time information simply needs to provide an indication how to obtain information for the type of related objects at runtime, then when type checks need to be made (rarely of course due to type inference optimizations) we can get all the information we need without duplicating it unnecessarily. Of course there still needs to be some glue between the purely abstract type system and the practical type system, for example when dispatching by type at runtime, but that is simply a detail.
Hopefully I’ll have the type system working soon, so that I can move onto the parser of my little project.
Posted in C/C++, Exploring Scheme | 5 Comments »
Posted by Peter on April 27, 2006
I know it seems like my blog has turned into a place for me to talk without actually accomplishing anything. To remedy this I am releasing a working draft of my asynchronous garbage collector. Unfortunately for you it is written in C++, and is meant to be compiled in windows. (Bless me father, for I have sinned. It has been 2 function calls and a continuation since my last garbage collection.) Porting it shouldn’t be that hard if you want to. Anything that is platform dependant is conveniently located in
platform_dependant.cpp. Also you need to define the symbol
GC_COUNT_POINTERS if you want the sample to compile. Finally I make use of the
hash_set classes, which may not be part of your compiler’s stl.
How it works
In this system there are two types of references, hard and soft. A hard reference means that you guarantee that the reference will not become part of a cyclic data structure. Usually you want to acquire a hard reference to an object you are using directly. Soft references are references from one object to another. If a cyclic data structure made up of soft references occurs the garbage collector will be able to collect it. When you alter an object’s member pointers you must call
gc->DereferenceSoft as appropriate on the pointers as appropriate. Finally when altering member pointers of an object that has existing references pointing to it you must surround the code altering it with
For more specific details I suggest that you check out the example provided. Note that the functions defined in the sample do not reflect how I actually expect to define Lisp functions (in terms of parameters).
Source is available from Sourceforge (see sidebar ->) under “Garbage Collector”.
Posted in C/C++, Exploring Scheme, Windows | Leave a Comment »
Posted by Peter on April 15, 2006
I know I haven’t posted for a while, and I blame my persistent illness for making me too lazy to work seriously. I have been working on my own brand of asynchronous copying garbage collection. Unfortunately I now realize that the method I have been developing is too inefficient, and needs to be re-done.
This leaves me with basically three choices:
The write barrier method (with an asynchronous copying collector): The write barrier method has the advantage of being able to use plain pointers for access. Unfortunately it has the unattractive problem of needed to update the addresses of pointers (since it is a copying collector), which means the collector needs to be able to safely tamper with the stack and global data in the C source to update pointer values. This can be a bit complicated, and is one of the problems my own solution was struggling with.
The handle method (with an asynchronous copying collector): In this method you don’t deal with pointers but with unique handles that you can “lock” in order to get at the memory they currently reference. This method has the advantage of being relatively simple, and safe. Unfortunately it requires frequent hash lookups of handles, which means that it is effectively both a read and write barrier method, which means it is probably too slow.
Asynchronous reference counting: In this method you push the tasks of referencing, dereferencing, and checking for cycles into their own thread, which improves speed in the mutator section of the code. The big disadvantage of this method is that it is not a copying solution.
p.s. Don’t know what a write barrier / read barrier is? Go here for a quick definition.
Posted in C/C++, Design | Leave a Comment »
Posted by Peter on April 10, 2006
I apologize for my short, and still ongoing, hiatus from doing meaningful work, but I have been, and still am, a bit sick. Hopefully I will be able to begin work on an asynchronous copying garbage collector soon. This will be the first step in implementing my own version of Scheme so that I can test my ideas about language alterations instead of just talking about them. For example, I want to try out a type system. Although it would be easy to make a stop-start garbage collector or an asynchronous non-copying collector (which I have done already), I want to eliminate the need for the program to pause for garbage collection while at the same time compacting the memory used as much as possible to help the operating system. I also have some ideas about manually paging unused memory to disk in order to reduce the burden of paging (which is aggravated by garbage collection). I will discuss these issues more fully in the future, with hopefully an implantation to back me up. Until then if you are interested about garbage collection in general I suggest you read “Garbage Collection: Algorithms for Automatic Dynamic Memory Management” by Richard Jones and Rafael Lins, which despite its name is an excellent book, well written, informative, comprehensive, and compact.
Posted in C/C++, Exploring Scheme | Leave a Comment »
Posted by Peter on April 2, 2006
If mzc doesn’t work properly for you, and it doesn’t work (completely) for me, or if you simply want to use Visual Studio for your coding and compiling then you have a bit of work ahead of you. Actually the process itself isn’t that complicated, but good luck figuring out how to do it with the documentation, which is a bit sketchy on how to do things without mzc.
First create a new win32 project and choose the .dll option. (the finished extension is a dll file that Scheme will load at run-time).
Next go to tools->options->projects->VC++ directories (in Visual Studio .Net), and add the path for the PLT libraries and include files to the directories searched for .h and .lib files at compile time. (the directories to add are …\PLT\include and …\PLT\lib\msvc respectively).
Thirdly go to Project->Properties->Linker->Input and add the following three files to the additional dependencies: mzdyn.obj, libmzsch301_000.lib, and libmzgc301_000.lib. Actually you don’t need all of these, but it doesn’t hurt to include them just to be sure.
Finally create a new .def file (for example myproject.def) with a text editor. The body of this file should be the following:
LIBRARY [name of project (the output file minus the extension) goes here]
Now add this file to your project (as if it were a new source file), and go to Project->Properties->Linker->Input again and enter the name of this file for the Module Definition File entry. This .def file tells the compiler that these are functions which the .dll needs to export, and that they have to be named exactly as shown (because the Scheme interpreter expects to find them as such, and if they aren’t present the extension will not be loaded). The scheme_initialize_internal function is provided by a library, you don’t need to write it. The other three functions you must define (as is outlined in the documentation). You don’t need to mark them in any special way for export, the .def file takes care of that.
Now, if you want a tutorial on writing an extension, well that is the job for a different post.
Posted in C/C++, Windows | 3 Comments »