OPLspr11

91.301 Organization of Programming Languages, Spring 2011
Prof. Fred Martin
TA: Jian (Jeff) Lu
Mon/Wed/Fri, 10 am – 10:50 am, Olsen 402

We will be using the following book:

Structure and Interpretation of Computer Programs (2nd edition, 1996, ISBN 0070004846)
Hal Abelson and Jerry Sussman

The Abelson/Sussman book is available online (for free) here. If you like holding a book in your hands, used hard copies are available between $30 and $40. Make sure to get the 2nd edition, published in 1996. Here are links: bigwords, alibris, amazon, bookfinder

Rationale

Basically all departments that award bachelor's degrees in Computer Science have a course like OPL. Such a class is also required by CSAB, the professional organization that accredits CS degrees. So, everyone's got one—why?

In my opinion, this course exists to give you a different way of thinking about computing. A way that is really quite apart from the “professional” programming languages like C, C++, and Java, all of which are based on an edit/compile/debug/deploy model of computation.

There are basically two variants of the OPL-type course at CS departments. One variant is a survey of the ideas in many languages that have been created and implemented. The other variant is a deep-dive into a language favored by language researchers, often Scheme or CAML. Both of these languages are “meta-languages”—they are languages for making languages.

At UMass Lowell, we take the 2nd approach (deep dive) in our undergrad course, and the survey approach in our grad class. For many years here, 91.301 has been a close implementation of the famous “6.001” course at MIT, Structure and Interpretation of Computer Programs. This course was created in the 1970s and has been hugely influential.

Now, it so happens that MIT has just implemented a major overhaul of their undergrad EECS curriculum, and as of Fall 2008, the 6.001 course is no longer being offered. Given that, why are we still teaching it, you might ask?

Isn’t Scheme a Dead Language? (aka, Why do I have to take this class?)

Scheme isn't exactly dead. There is a committed community involved in ongoing development of the version of Scheme we'll be using (Racket). Scheme itself is a streamlined, pedagogically pure version of Lisp; Lisp is an expanded version of the language, with lots of libraries useful in real-world applications. While Lisp is not hugely popular, significant real-world systems are still being built in it. The Orbitz flight reservation system is a leading example.

More importantly, the ideas behind Scheme—e.g., functional programming—are valuable, even if you're not coding in Scheme.

  • Jane St. Capital, a Wall Street trading company, does a lot of programming in OCaml, an object-oriented functional programming language, and explicitly recruits Scheme programmers. (See their research paper at http://portal.acm.org/citation.cfm?id=1394798.)
  • It is now clear that processors aren't going to be getting faster at the rate they have been over the last 20 years, and that multi-core systems and their associated programming will be increasingly important in order to continue the performance gains we expect. Functional programming is much easier to parallelize than traditional imperative styles, and many believe that expertise in thinking in functional ways is becoming more and more important. See the recent Dr. Dobb's article, “It's Time to Get Good at Functional Programming.”
  • Apple Computer is supporting the open-source LLVM compiler project, including the Clang compiler front end. Based on these technologies, Apple has developed a language extension called “blocks” in their latest operating system release (Snow Leopard). As John Siracusa explains, blocks “add closures and anonymous functions to C and the C-derived languages C++, Objective-C, and Objective C++.” Clang's blocks allow a much more efficient programming style when writing callbacks and code with parallelism. Closures and anonymous functions are ideas that originated in Lisp and Scheme.
  • Aside from the particular concepts in Scheme or Lisp, they represent a fundamentally different approach to building languages than the dominant method, which is based on compilers and binary executables. Scheme and Lisp are implemented as interpreters, and interpreters allow a much more iterative and interactive style of code development. Python is presently the most popular language that is based on the interpreter approach. Also, interpreters are often built into domain-specific applications as powerful, accessible scripting environments within those applications; Tcl and Lua are other languages often built into systems as interpreters. AutoCAD and GIMP are two well-known systems that include Lisp or Scheme interpreters.

What Is The Big Idea Then?

There are actually several “big ideas” that we will bring out in OPL:

  • Program-as-data. In typical programming languages, there is a sharp distinction between what is code and what is data. Data structures are allocated explicitly, and code is written to manipulate them. The two things are of a different nature. (Of course, “bits are bits,” but you're not going to be placing executable code into an array unless you're implementing a buffer overrun attack.)
But in Scheme, the fundamental notation for describing code and data (known as the S-expression) is the same thing. Data structures and executable procedures are both nested lists. Furthermore, it is commonplace for code to produce data that is executed as code.
This leads to the next point...
  • Functions as first-class objects. In Scheme, procedures (also known as functions) can accept procedures as inputs (arguments). But they also can easily create procedures as outputs (return values). This leads to a style of programming known as functional programming, in which functions are composed and applied to lists of data to produce results, instead of the more prevalent approach of sequentially manipulating data structures.
Please don't confuse C/Java functions with Scheme-style functional programming. In C and Java, functions are really imperative programming—a series of commands—with lots of side-effects (mutation of data structures and variable values). In functional programming, given the same input, the result of evaluating a function is always the same (think mathematical functions). There is no global or external state that gets involved.
Functional programming has some advantages in transparency and simplicity, particularly in language and symbolic processing, and is facilitated by a language like Scheme in which code can easily construct and output code on the fly. Also, a functional program, with its lack of side-effects and mutation of data structures, is much easier to parallelize—you can run multiple functions separately and concurrently, without worrying about them fighting over the same shared data structures.
  • Data abstraction. This was one of the really big contributions of SICP—the idea of abstracting data structures from the interfaces for manipulating them. It then becomes possible to re-implement an underlying data structure without changing the code that uses it. For example, if there is a concatenate operation that appends one string to another, code that uses concatenate doesn't need to change even if the underlying representation of a string changes.
This idea, which is of course the basis of object-oriented programming, is so well-established that it may now seem obvious. But this was hardly the case in the 1960s and 1970s when SICP was developed. Scheme has a somewhat different way of bundling together data representations and the procedures (methods) for manipulating them, which allows a high degree of flexibility.
  • The environment and persistence. In typical programming environments, data and objects are created anew each time the program launches. If you need to return to a previous execution state, then you read in data (e.g., from files on the disk or off the net) and reconstitute the data structures that hold that data. Object-oriented languages typically provide some way to serialize objects—converting them into a flat-file format (e.g., XML) for saving and loading across execution runs.
Scheme handles things totally differently. Once you create an object, it's just there, existing in the environment in which it was created. As long as you have a “handle” to it (i.e., you've named it, or it's part of another object that you have access to), the object will persist. When you quit Scheme, the entire environment including all objects gets saved to disk. Next time you relaunch, you reload the environment file and everything is exactly as you left it. (There once were Lisp Machines, and the concept of “quitting Scheme” didn't exist.) Smalltalk, the language developed by Alan Kay as part of the Xerox Star project [a.k.a., the machine that led to the Macintosh and the WIMP windows-icon-mouse-pointer interface], also implemented code and data persistence through a saveable environment image. In fact, Squeak, a current implementation of Smalltalk, includes code and data objects that were created back in the 1970s during the original Smalltalk work, much as a “sourdough yeast starter” reproduces itself through the years.
As part of the implementation of the environment, garbage collection was introduced. Objects that had no way of being accessed (i.e., they had no names, no pointers to them) could be removed from memory, and the space they took up could then be freed for other purposes. Once controversial because of its complexity, automatic garbage collection is now considered an obvious part of a modern language design.
  • Interpretation and the Listener. Scheme was historically an interpreted language, and provided a “Listener” console for interactively constructing expressions and evaluating them. (At one time, the fact that it was interpreted was considered a significant performance liability, but compiled versions of Scheme and Lisp now exist, removing this as a concern.)
Because of the Listener, developing Scheme programs feels quite different than working in a typical edit-compile-test language. After a single procedure is defined, you can try it out interactively, giving it inputs and examining its outputs. Combined with the concept of the environment, you end up iteratively and alternately developing data structures and the code for operating on them.
When you become accustomed to the Listener, you feel stymied without it. It becomes annoying to write main functions simply for the purpose of exercising your routines—why can't you just talk to them directly? Similarly, the environment is a powerful construct—you build up a library of objects that are part of your project, and once created, they are part of your software system.
Indeed the whole experience of computing becomes one of building objects that of course persist and seem “alive.” Rather than writing recipes that only temporarily instantiate objects, you create them directly, knowing that they will be there for you later.
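
The first two ideas above (program-as-data and functions as first-class objects) can be sketched outside Scheme as well. The following is a minimal illustration in Python, not the Racket used in the course, in which nested lists stand in for S-expressions. The names ENV, evaluate, and make_adder_expr are hypothetical and chosen only for this sketch; it is not a full Scheme evaluator.

```python
import operator

# Hypothetical global environment: symbol names mapped to Python functions.
ENV = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, env=ENV):
    """Evaluate a nested-list expression such as ["+", 1, ["*", 2, 3]]."""
    if isinstance(expr, (int, float)):
        return expr                      # numbers evaluate to themselves
    if isinstance(expr, str):
        return env[expr]                 # symbols are looked up in the environment
    op, *args = expr
    if op == "lambda":                   # ["lambda", ["x"], body] builds a procedure
        params, body = args
        return lambda *vals: evaluate(body, {**env, **dict(zip(params, vals))})
    fn = evaluate(op, env)               # the operator position is itself evaluated
    return fn(*[evaluate(a, env) for a in args])


def make_adder_expr(n):
    """Code that produces code: a list describing (lambda (x) (+ x n))."""
    return ["lambda", ["x"], ["+", "x", n]]


program = ["+", 1, ["*", 2, 3]]          # plain data until it is evaluated
result = evaluate(program)

add5 = evaluate(make_adder_expr(5))      # a procedure returned as a value
mapped = list(map(add5, [1, 2, 3]))      # and passed as an argument to map

print(result)   # 7
print(mapped)   # [6, 7, 8]
```

Here program is an ordinary list that could be inspected or rewritten like any other data before being evaluated, and add5 is a procedure that was built by one function and handed to another.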

Course Structure and Grading

The class will have regular weekly assignments, which will be graded and returned. Cumulatively these assignments are worth 25% of your overall grade. Assignments will be accepted up to 1 week late with a 50% reduction in that assignment's value. If you fall behind on your homework, it is much better to cut your losses and work on the current assignment, instead of running behind trying to catch up.

There will be two in-class quizzes during the semester. Each is worth 10% of your overall grade.

There will be a cumulative final, worth 20% of your overall grade.

Classroom participation is worth 10% of your overall grade. In practice, if your other grades put you on a marking boundary, this will push it one way or the other.

You may notice that this leaves 25% remaining. Based on last semester's success, I am continuing with a course final project, which will be conducted in the last three weeks of the semester. We will do exploratory research and hold discussions before then, though, so you can start preparing for it.

In the final project, you will apply the ideas developed in the class in an original software implementation. You may thus connect the ideas of the class with your own interests—music, robotics, art, databases, the web, networking, gaming, etc. The learning goal of the project is to have you find some real-world relevance of the ideas in the class.

To summarize:

25% Weekly homeworks
20% Two quizzes
20% Final
25% Project
10% Classroom participation
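
As a rough illustration of how these weights combine, here is a small Python sketch. The function names are hypothetical, component scores are assumed to be on a 0 to 100 scale, and treating work more than a week late as earning no credit is an assumption, since the policy above only says assignments are accepted up to one week late.

```python
# Weights in percent, taken from the summary above.
WEIGHTS = {
    "homework": 25,
    "quizzes": 20,
    "final": 20,
    "project": 25,
    "participation": 10,
}


def overall_grade(scores):
    """Weighted average of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def late_homework_value(score, days_late):
    """Apply the late policy to a single assignment score."""
    if days_late <= 0:
        return score
    if days_late <= 7:
        return score * 0.5   # 50% reduction within one week
    return 0.0               # assumption: not accepted after one week


example = {
    "homework": 90,
    "quizzes": 80,
    "final": 70,
    "project": 100,
    "participation": 100,
}
print(sum(WEIGHTS.values()))       # 100
print(overall_grade(example))      # 87.5
print(late_homework_value(90, 3))  # 45.0
```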

Discussion Group / E-Mail List

We will use Google Groups for class conversation and announcements. Please join this group. I'd advise setting your preferences to immediate, individual delivery of messages—click the “Edit my membership” tab.

The group address is 91301-s11@googlegroups.com. You have to be a member to send to the list.

Collaboration Policy

You are welcome to discuss ideas in the class with your peers. However, pair programming or other sharing of code is not allowed. By turning in an assignment, you attest that you have written the code that it includes. Please be familiar with the university's academic integrity policy.

Honors Section

Students who are registered for the honors section of the class are expected to have exemplary work, including classroom participation, written work, and the course project. Additional and more difficult problems will be assigned in most of the weekly problem sets.

Acknowledgment

Much of this course design is based on work done by UML Prof. Holly Yanco.