Data Structures, all the way Open

I’ve taught Data Structures (the 3rd programming course in most computer science programs) for many years. My course, like every other Data Structures course I know of, revolves around large programming projects. Students are required to build a data structure, usually with a very tightly prescribed interface.

For example, when I started teaching the course, I had students completely re-implement the “LinkedList” class from Java. The “List” interface is specified by Java, and so there is little need (or desire) for creativity. Back when I was a teaching assistant at the University of Washington we gave similar projects, such as one requiring students to build a space-partitioning tree that would be used in simulating the way multiple celestial bodies interact through gravity.

In each case, the instructor specifies the API or interface that the student is to implement, and each student creates a solution to the same problem. Inherent in this type of course design is an absolute ban on code sharing. Since each student is solving the exact same problem, in exactly the same pre-specified way, plagiarism is a serious risk.

Learn by copying

It took me a long time to notice something strange about these exercises that I give my students: what the students actually do to learn has almost no resemblance to how I learned to be a computer scientist.

In the early 90s, just before the web brought the internet into the mainstream, I was trying to teach myself how to make video games. How does one make 3D graphics, music, interactivity? My best friend and I worked in x86 assembly language on 386 and 486 machines. Already, by that time, there was a wealth of information about programming on local BBSes, and available via Gopher and USENET. We taught ourselves computer graphics by downloading a simple graphics program, and modifying it. We found examples of 3D particle systems, and even a program that played a passable imitation of the “Inspector Gadget” theme song. All of these we appropriated, re-designed, and re-purposed, to create our own unique programs.

We created a fully-working paint program, in assembly language. We created a 3D particle system that displayed a rippling American flag. We set our own goals, indulged our creativity … and learned by copying. Exactly the opposite of the methods used in most computer science classrooms.

No more secrets

This semester I am trying to reverse course, to turn all of these traditional practices over, and see what happens. Instead of giving students a detailed specification, I give a set of aesthetic goals, which they can satisfy any way they want. Each student’s project is very different from the others.

More importantly, all student code is published on GitHub.com. This means it is truly public, available for the world to see. No more secret source code, no more prohibition on sharing, no more implication that each student ought to be learning on his/her own. Instead, we are all in it together, all learning from each other.

This approach should have a myriad of benefits:

  1. Less temptation to plagiarize, because code is more distinctive
  2. More motivation to work on the project, because each person sets his/her own goals
  3. Grading is much more pleasant, because each project is unique
  4. Models the learning technique/experience of most top programmers
  5. Allows students to experience regular code-review (similar to critique in design courses) very early in the curriculum
  6. Encourages students to start building their portfolios (something most computer science programs don’t emphasize)

You can help!

I’m interested to know of other computer science instructors who have asked students to publicly post all of their code, and who encourage sharing and re-use. I have sent several requests over Twitter and Facebook, but so far I haven’t had much luck. Do you know someone who uses open code repositories in their teaching? Send them my way, @MiamiUBo on Twitter.

3 Responses to “Data Structures, all the way Open”

  1. As a student at Northeastern University, I had at least two professors who were perfectly happy to *allow* me to publish my code on a public website, although that was not part of the course design. For assignments that had a fixed goal (as opposed to student-design specs) we were asked to keep our code private until after grading, but otherwise we were free to publish.
     
    In Marsette Vona’s Computer Graphics class, our major project was self-designed as you describe (“incorporate at least N of these things we have learned about this semester”), and as you say, there was little or no incentive to plagiarize. Additionally, I learned a great deal from that project — since we had free choice of programming languages, I used it as an opportunity to learn Clojure. This was a highly satisfying way of putting my new knowledge into practice.
     
     

  2. Chris says:

    This sounds like an interesting experiment. Can’t wait to see how it turns out.

    What kind of “aesthetic goals” do you have in mind?

  3. Bo Brinkman says:

    You can see some examples of student work for the first project here: http://www.youtube.com/playlist?list=PLhnYJWF1EPXhqlBNUWOochovPmFb7oanR
     
    The main learning objective was to learn about the row-major order representation of 2D data using a 1D array. The secondary learning objective was to learn some C++ (most students in the course have only had Java at this point in their careers).
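
The row-major idea mentioned in the last comment can be sketched in a few lines of C++. This is only an illustration, not code from the course: the class name `Grid` and its member names are hypothetical, and it assumes a grid of `int` values. The rows are stored one after another in a single 1D array, so the cell at (row, col) lives at position row * cols + col.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example class (not from the course): a 2D grid of ints
// stored in a single 1D array, using row-major order.
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0) {}

    // Row-major mapping: whole rows are laid end to end, so skipping
    // `row` full rows costs row * cols_ slots, then `col` more.
    std::size_t index(std::size_t row, std::size_t col) const {
        assert(row < rows_ && col < cols_);
        return row * cols_ + col;
    }

    // Read/write access to one cell through the 1D storage.
    int& at(std::size_t row, std::size_t col) {
        return cells_[index(row, col)];
    }

    // Read-only access for const grids.
    int at(std::size_t row, std::size_t col) const {
        return cells_[index(row, col)];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> cells_;  // rows_ * cols_ values, row after row
};
```

For a grid with 3 rows and 4 columns, `index(1, 2)` is 1 * 4 + 2 = 6, and the last cell, (2, 3), maps to 11, the final slot of the 12-element array.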
