
91.301 Organization of Programming Languages, Spring 2014
Prof. Fred Martin
TA: Swathi Kurunji, skurunji@cs.uml.edu. Office hours: 2 to 4 pm on Tuesdays.
Mon/Wed/Fri, 1p – 1:50p, Olsen 402

We will be using the following book:

Structure and Interpretation of Computer Programs (2nd edition, 1996, ISBN 0070004846)
Hal Abelson and Jerry Sussman

The Abelson/Sussman book is available online (for free) here. If you like holding a book in your hands, used hard copies are available for between $30 and $40. Make sure to get the 2nd edition, published in 1996. Here are links: bigwords, alibris, amazon, bookfinder

Rationale

Hal Abelson (in the preface to SICP):

“programs must be written for people to read and only incidentally for machines to execute.”

Fred Martin (email communication):

That is overstating things, isn't it?
Most code is written to do actual practical work in the world.
So the primary audience is the machine.

Franklyn Turbak (computer scientist at Wellesley College; email communication, edited slightly):

But it is the quote that inspired me to become a computer scientist, and it is the quote that has guided me in most of my teaching and research.
I divide up computer scientists into two groups. There are “machinists,” who view programming as a way to control a machine, and there are “linguists,” who view programming as a way of expressing ideas. The SICP quote sums up the linguist viewpoint.
Dividing people into two groups is a vast oversimplification, and certainly most people in the linguist camp write code to do practical work. But when they're writing that code, they're thinking about how to organize it with the right abstractions to make it easy for them and others to understand and reuse. In this case the code is not just a utilitarian thing to get a job done; it's a mechanism for conveying knowledge. For a linguist, the code not only has to work, it has to be beautiful.

Basically all departments that award bachelor's degrees in Computer Science have a course like OPL. Such a class is also required by CSAB, the professional organization that accredits CS degrees. So, everyone's got one—why?

In my opinion, this course exists to give you a different way of thinking about computing. A way that is really quite apart from the “professional” programming languages like C, C++, and Java, all of which are based on an edit/compile/debug/deploy model of computation.

There are basically two variants of the OPL-type course at CS departments. One variant is a survey of the ideas in many languages that have been created and implemented. The other variant is a deep-dive into a language favored by language researchers, often Scheme or CAML. Both of these languages are “meta-languages”—they are languages for making languages.

At UMass Lowell, we take the 2nd approach (deep dive) in our undergrad course, and the survey approach in our grad class. For many years here, 91.301 has been a close implementation of the famous “6.001” course at MIT, Structure and Interpretation of Computer Programs. This course was created in the 1970s and has been hugely influential.

Now, it so happens that MIT has just implemented a major overhaul of their undergrad EECS curriculum, and as of Fall 2008, the 6.001 course is no longer being offered. Given that, why are we still teaching it, you might ask?

Isn’t Scheme a Dead Language? (aka, Why do I have to take this class?)

Scheme isn't exactly dead. There is a committed community involved in ongoing development of the version of Scheme we'll be using (Racket). Scheme itself is a streamlined, pedagogically pure version of Lisp; Lisp is an expanded version of the language, with lots of libraries useful in real-world applications. While Lisp is not hugely popular, significant real-world systems are still being built in it. The Orbitz flight reservation system is a leading example.

More importantly, the ideas behind Scheme—e.g., functional programming—are valuable, even if you're not coding in Scheme.

  • Jane St. Capital, a Wall Street trading company, does a lot of programming in OCaml, an object-oriented functional programming language, and explicitly recruits Scheme programmers. (See their research paper at http://portal.acm.org/citation.cfm?id=1394798.)
  • It is now clear that processors aren't going to be getting faster at the rate they have been over the last 20 years, and that multi-core systems and their associated programming will be increasingly important in order to continue the performance gains we expect. Functional programming is much easier to parallelize than traditional imperative styles, and many believe that expertise in thinking in functional ways is becoming more and more important. See the recent Dr. Dobb's article, “It's Time to Get Good at Functional Programming.”
  • Many languages, including C++, now incorporate ideas that originated in Lisp/Scheme, including closures and anonymous functions.
  • Aside from the particular concepts in Scheme or Lisp, they represent a fundamentally different approach to building languages than the dominant method, which is based on compilers and binary executables. Scheme and Lisp are implemented as interpreters, and interpreters allow a much more iterative and interactive style of code development. Python is presently the most popular language that is based on the interpreter approach. Also, interpreters are often built into domain-specific applications as powerful, accessible scripting environments within those applications; Tcl and Lua are other languages often built into systems as interpreters. AutoCAD and GIMP are two well-known systems that include Lisp or Scheme interpreters.
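
To make the closures and anonymous functions mentioned in the list above concrete, here is a minimal sketch. It is written in Python rather than Scheme purely for illustration, and the names make_adder and compose are made up for this example; they are not part of any course material.

```python
from typing import Callable


def make_adder(n: int) -> Callable[[int], int]:
    """Return a new function that remembers n (a closure)."""
    def add(x: int) -> int:
        return x + n  # n is captured from the enclosing call
    return add


def compose(f: Callable, g: Callable) -> Callable:
    """Return the function x -> f(g(x)), built as an anonymous function."""
    return lambda x: f(g(x))


add_five = make_adder(5)
double_then_add_five = compose(add_five, lambda x: x * 2)

# The composed function is applied to a list of data; nothing is mutated.
results = list(map(double_then_add_five, [1, 2, 3]))
print(results)  # [7, 9, 11]
```

Because neither function touches any outside state, each call could in principle be run independently of the others, which is the property that makes this style easier to parallelize.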

What Is The Big Idea Then?

There are actually several “big ideas” that we will bring out in OPL:

  • Program-as-data. In typical programming languages, there is a sharp distinction between what is code and what is data. Data structures are allocated explicitly, and code is written to manipulate them. The two things are of a different nature. (Of course, “bits are bits,” but you're not going to be placing executable code into an array unless you're implementing a buffer overrun attack.)
But in Scheme, code and data share the same fundamental notation, known as the S-expression. Data structures and executable procedures are both nested lists. Furthermore, it is commonplace for code to produce data that is executed as code.
This leads to the next point...
  • Functions as first-class objects. In Scheme, procedures (also known as functions) can accept procedures as inputs (arguments). But they also can easily create procedures as outputs (return values). This leads to a style of programming known as functional programming, in which functions are composed and applied to lists of data to produce results, instead of the more prevalent approach of sequentially manipulating data structures.
Please don't confuse C/Java functions with Scheme-style functional programming. In C and Java, functions are really imperative programming—a series of commands—with lots of side-effects (mutation of data structures and variable values). In functional programming, given the same input, the result of evaluating a function is always the same (think mathematical functions). There is no global or external state that gets involved.
Functional programming has some advantages in transparency and simplicity, particularly in language and symbolic processing, and is facilitated by a language like Scheme in which code can easily construct and output code on the fly. Also, a functional program, with its lack of side-effects and mutation of data structures, is much easier to parallelize—you can run multiple functions separately and concurrently, without worrying about them fighting over the same shared data structures.
  • Data abstraction. This was one of the really big contributions of SICP—the idea of abstracting data structures from the interfaces for manipulating them. It then becomes possible to re-implement an underlying data structure without changing the code that uses it. For example, if there is a concatenate operation that appends one string to another, code that uses concatenate doesn't need to change even if the underlying representation of a string changes.
This idea, which is of course the basis of object-oriented programming, is so well-established that it may now seem obvious. But this was hardly the case in the 1960s and 1970s when SICP was developed. Scheme has a somewhat different way of bundling together data representations and the procedures (methods) for manipulating them, which allows a high degree of flexibility.
  • The environment and persistence. In typical programming environments, data and objects are created anew each time the program launches. If you need to return to a previous execution state, then you read in data (e.g., from files on the disk or off the net) and reconstitute the data structures that hold that data. Object-oriented languages typically provide some way to serialize objects—converting them into a flat-file format (e.g., XML) for saving and loading across execution runs.
Scheme handles things totally differently. Once you create an object, it's just there, existing in the environment in which it was created. As long as you have a “handle” to it (i.e., you've named it, or it's part of another object that you have access to), the object will persist. When you quit Scheme, the entire environment including all objects gets saved to disk. Next time you relaunch, you reload the environment file and everything is exactly as you left it. (There once were Lisp Machines, and the concept of “quitting Scheme” didn't exist.) Smalltalk, the language developed by Alan Kay as part of the Xerox Star project [a.k.a., the machine that led to the Macintosh and the WIMP windows-icon-mouse-pointer interface], also implemented code and data persistence through a saveable environment image. In fact, Squeak, a current implementation of Smalltalk, includes code and data objects that were created back in the 1970s during the original Smalltalk work, much as a “sourdough yeast starter” reproduces itself through the years.
As part of the implementation of the environment, garbage collection was introduced. Objects that had no way of being accessed (i.e., they had no names, no pointers to them) could be removed from memory, and the space they took up could then be freed for other purposes. Once controversial because of its complexity, automatic garbage collection is now considered an obvious part of a modern language design.
  • Interpretation and the Listener. Scheme was historically an interpreted language, and provided a “Listener” console for interactively constructing expressions and evaluating them. (At one time, the fact that it was interpreted was considered a significant performance liability, but compiled versions of Scheme and Lisp now exist, removing this as a concern.)
Because of the Listener, developing Scheme programs feels quite different than working in a typical edit-compile-test language. After a single procedure is defined, you can try it out interactively, giving it inputs and examining its outputs. Combined with the concept of the environment, you end up iteratively and alternately developing data structures and the code for operating on them.
When you become accustomed to the Listener, you feel stymied without it. It becomes annoying to write main functions simply for the purpose of exercising your routines—why can't you just talk to them directly? Similarly, the environment is a powerful construct—you build up a library of objects that are part of your project, and once created, they are part of your software system.
Indeed the whole experience of computing becomes one of building objects that of course persist and seem “alive.” Rather than writing recipes that only temporarily instantiate objects, you create them directly, knowing that they will be there for you later.
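
The program-as-data idea from the list above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python nested lists as a stand-in for S-expressions, and evaluate, PRIMITIVES, and make_sum_of_squares are hypothetical names invented here. In Scheme, the generated program would simply be written (+ (* 1 1) (* 2 2) (* 3 3)).

```python
import operator
from typing import Any

# Primitive procedures, stored as ordinary values in a table.
PRIMITIVES = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr: Any) -> Any:
    """Evaluate a tiny S-expression-like structure.

    A number evaluates to itself. A list [op, arg1, arg2, ...] evaluates
    each argument, then folds the primitive named by op over the values.
    """
    if isinstance(expr, (int, float)):
        return expr
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = PRIMITIVES[op](result, value)
    return result


def make_sum_of_squares(numbers: list) -> list:
    """Ordinary code that builds a program as a nested list."""
    return ["+"] + [["*", n, n] for n in numbers]


program = make_sum_of_squares([1, 2, 3])
print(program)            # ['+', ['*', 1, 1], ['*', 2, 2], ['*', 3, 3]]
print(evaluate(program))  # 14
```

The value bound to program is plain data that can be printed, stored, or transformed, and it becomes running code only when handed to evaluate.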

Course Structure and Grading

The class will have regular weekly assignments, which will be graded and returned. Cumulatively these assignments are worth 25% of your overall grade. Assignments will be accepted up to 1 week late with a 50% reduction in that assignment's value. If you fall behind on your homework, it is much better to cut your losses and work on the current assignment, instead of running behind trying to catch up.

There will be two in-class quizzes during the semester. Each is worth 10% of your overall grade.

There will be a cumulative final, worth 20% of your overall grade.

Classroom participation is worth 10% of your overall grade. In practice, if your other grades put you on a marking boundary, this will push it one way or the other.

You may notice that this leaves 25% remaining. Based on last semester's success, I am continuing with a course final project, which will be conducted in the last three weeks of the semester. We will do exploratory research and have discussions before then, though, so you can start preparing for it.

In the final project, you will apply the ideas developed in the class in an original software implementation. You may thus connect the ideas of the class with your own interests—music, robotics, art, databases, the web, networking, gaming, etc. The learning goal of the project is to have you find some real-world relevance of the ideas in the class.

To summarize:

25% Weekly homeworks
20% Two quizzes
20% Final
25% Project
10% Classroom participation
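
As a rough sketch of how these weights combine, the arithmetic could look like the following. The component names, the 0 to 100 score scale, and the sample scores are assumptions made for illustration. The late rule is read here as half credit within one week and, as a further assumption, no credit after that.

```python
# Weights taken from the summary above; they should sum to 100%.
WEIGHTS = {
    "homework": 0.25,
    "quizzes": 0.20,
    "final": 0.20,
    "project": 0.25,
    "participation": 0.10,
}


def overall_grade(scores: dict) -> float:
    """Weighted average of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def late_credit(score: float, days_late: int) -> float:
    """Credit for one assignment under the late rule.

    On time: full credit. Up to 7 days late: half credit.
    Assumption: work more than a week late earns nothing.
    """
    if days_late <= 0:
        return score
    if days_late <= 7:
        return score * 0.5
    return 0.0


# Hypothetical scores for one student.
example = {"homework": 80, "quizzes": 70, "final": 90,
           "project": 100, "participation": 100}
print(round(overall_grade(example), 2))  # 87.0
print(late_credit(90, 3))                # 45.0
```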

Autograder

We will use the “Bottlenose” autograder system for assignment submission and grading. Click on the Assignments tab at the top of the web page to see the Assignments.

An account has already been created for you in Bottlenose, using your @student.uml.edu email address. You should have a link in your inbox with a key to log in.

If you lose the email, you can just enter your email address at the Bottlenose main screen and it will email the link to you again.

We'll go over how to use Bottlenose in class.

Discussion Group / E-Mail List

We will use Google Groups for class conversation and announcements. Please join this group.

There are two ways to sign up: join on the web at http://groups.google.com/group/91301-s14 (where you can also browse the archives), or request a subscription by submitting your email address.

If you do the Google web method, I'd advise setting your preferences to immediate, individual delivery of messages—click the “Edit my membership” tab.

If you do the email method, and you provide an address that's not linked to your Google account, I'll choose those delivery options for you, and then if you don't like it you'll have to ask me to change it, and we'll both be annoyed, so don't do that.

In either case you can send email to the list at 91301-s14@googlegroups.com.

If you sign up on the web, you can browse conversations at the group link (see top of this page).

Lecture blog and lecture capture

I will strive to maintain a daily blog with highlights of each class meeting. These notes will be posted on the Lecture Blog page.

In-class activity will be recorded using the University's Echo360 lecture capture system. This material is intended for your use if you must miss class, or if you want to review something that was presented or discussed in class.

A link to the Echo360 recordings is at the top of the Lecture Blog page.

Echo360 makes a high quality recording of the classroom's data projector and the instructor's voice.

It also produces a low-resolution recording of the front of the room (i.e., me walking around writing stuff on the white board), and a low-volume capture of student remarks (depending how close to the microphone you're sitting, and how loud you are).

If you arrive at class after 1p, and walk across the front of the room, you'll be captured by the low-res overview cam.

If you're a talker and you're at a decent volume and near a mic, people will be able to hear you on the recording. (They won't see you sitting in your chair—maybe the back of your head if you're at the front of the room.)

If it bothers you that your voice will be captured, and you want to feel comfortable speaking up, please let our TA know and the TA will anonymously refer your concerns to me. We'll set up the captures in a more private way.

Collaboration policy

Individual work. Most assignments must be completed individually. You are welcome to discuss ideas in the class with your peers. You may not look at each other's code, nor allow others to look at your code. If you need to post code on our own course forum for help, or a public forum, do not post more than three lines.

When turning in an individual assignment, you attest that, beyond any starter code I have provided or code that has been provided in standard API and reference documentation, you are the sole author of the code that it includes.

Pair programming. A few specific assignments may allow pair programming. There will be highly structured rules for these (which are intended to make sure both partners have a substantial learning experience).

This will be discussed later in class, and this document will be updated at that time.

Academic integrity. Please be familiar with the university's policy on academic integrity.

Acknowledgment

Much of this course design is based on work done by UML Prof. Holly Yanco.