I teach Haskell as a programming language to our undergraduates. I'm sure it's a continual subject of debate in computer science department coffee rooms up and down the country: "Which programming languages should we teach in our CS degrees?" The module that I teach is called "Concepts in Programming", and the idea is that there are indeed concepts in programming that are fundamental to many languages, that you can articulate the differences between programming languages, and that these differences give each language its own characteristics: differences such as the type system, the order of evaluation, the options for abstraction, and the separation of data and functions.
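To make one of those differences concrete, here is a small illustrative sketch (my own example, not from the module notes): Haskell's lazy order of evaluation lets us define an infinite list and demand only the prefix we need, something that would loop forever under strict evaluation.

    -- An infinite list of Fibonacci numbers, defined in terms of itself.
    -- Laziness means only the demanded prefix is ever computed.
    fibs :: [Integer]
    fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

    main :: IO ()
    main = print (take 10 fibs)  -- prints [0,1,1,2,3,5,8,13,21,34]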
"We need more than one" is the title of a paper by Kathleen Fisher, a Professor in Computer Science at Tufts University. Her short, eloquent paper describes why we will never have just one programming language ("because a single language cannot be well-suited to all programming tasks"). She has had a career spanning industry and academia, has been the chair of the top programming language conferences (OOPSLA, ICFP), and has been the chair of the ACM Special Interest Group on Programming Languages. Her paper On the Relationship between Classes, Objects and Data Abstraction tells you everything you ever needed to know about objects. She knows about programming.
Her recent work has been looking at the problem of how to read in data files when the data is in some ad-hoc, non-standard representation (i.e. not XML with a schema, or CSV, or anything obvious). So we all end up writing parsers and data structures to fit these data files, in a language such as Perl. She says "The program itself is often unreadable by anyone other than the original authors (and usually not even them in a month or two)". I've been there and done that.
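In Haskell, one of these throwaway parsers might look like the sketch below, using the ReadP combinators from base. The pipe-and-dash record format and the field names are my own invention, purely for illustration.

    -- A minimal sketch of the kind of one-off parser Fisher describes,
    -- for a made-up ad-hoc line format such as "BRCA2|chr13|32315474-32400266".
    import Data.Char (isDigit)
    import Text.ParserCombinators.ReadP

    data Record = Record
      { name  :: String   -- e.g. a gene name
      , chrom :: String   -- e.g. a chromosome
      , from  :: Int      -- start coordinate
      , to    :: Int      -- end coordinate
      } deriving Show

    number :: ReadP Int
    number = read <$> munch1 isDigit   -- safe: munch1 guarantees digits

    record :: ReadP Record
    record = do
      n <- munch1 (/= '|')
      _ <- char '|'
      c <- munch1 (/= '|')
      _ <- char '|'
      s <- number
      _ <- char '-'
      e <- number
      eof
      return (Record n c s e)

    -- Run the parser, accepting only a complete, unambiguous parse.
    parseLine :: String -> Maybe Record
    parseLine l = case readP_to_S record l of
      [(r, "")] -> Just r
      _         -> Nothing

    main :: IO ()
    main = print (parseLine "BRCA2|chr13|32315474-32400266")

It works, but nothing about it survives contact with the next format that lands on your desk.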
And when we've written another new parser like this for the umpteenth time (especially in the field of bioinformatics) we begin to wonder "How many programs like this will I have to write?" and "Are they all the same in the end?", and Kathleen's paper The Next 700 Data Description Languages looks at just that question. What is the family of languages for processing data, and what properties do they have? I love the title of this paper because it instantly intrigues with its homage to the classic 1966 paper The Next 700 Programming Languages by Peter Landin. (To see why Peter Landin's work on programming languages was so important, the in memoriam speech for Peter Landin given at ICFP 2009 is well worth a listen; the last three minutes of Rod Burstall's speech discuss this paper in particular.)
So perhaps, in computer science, we're doomed to keep inventing new programming languages, which wax and wane in popularity over time: Lisp, Fortran, C, C++, Java, Ruby, Haskell, F#, JavaScript and so on. But as computer scientists we should be able to understand why this happens. They're all necessary, all useful, and all part of a bigger theoretical picture. We need more than one.