Introduction and content.  This course covers distributed systems, both small and large: how to understand, design, and program them.  Understanding their fundamental principles matters because the Internet as a whole is made of a great many distributed systems, including clouds, multicore computers, local and medium area networks, and the many applications built on top of these infrastructures.  The first half of the course covers small and static distributed systems, together with the fundamental algorithms they use:

  1. Formal models, time, and causality
  2. Specifications, events, and failures
  3. Failure detectors
  4. Reliable broadcast
  5. Causal broadcast
  6. Shared state
  7. Consensus and its applications
  8. Group membership

The second half of the course will cover large and dynamic distributed systems of various kinds.  The actual topics will change from year to year.  Some of the possible topics are:

  1. Structured and unstructured peer-to-peer networks
  2. Super-peer architectures and Skype
  3. Gossip algorithms
  4. Large-scale distributed systems at high load
  5. Content distribution and BitTorrent
  6. Synchronization-free sharing
  7. Cloud hardware and software

We will not have time to cover all of the above topics; I will make a selection during the course depending on opportunities and your interests.  Sometimes I will choose topics related to my own research.

Organization.  The teaching assistant for the course will be Mathieu Pigaglio.  The course will have a small number of lab sessions (around one session every two weeks), an optional midterm exam (in week 7), and a mandatory project (during the last third of the course).  The midterm exam counts for 5 points and corresponds to the first question on the final exam; I will take the maximum of the two scores.  The project counts for 5 points and must be done during the quadrimester.  The final exam counts for 15 points (10 if you use the points of the midterm).
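
As a rough illustration of the grading arithmetic described above, here is a minimal Python sketch.  It rests on one reading of the rules, which is an assumption rather than an official formula: the first question of the final exam is worth 5 points, the rest of the final exam is worth 10 points, and the total is out of 20.  The function name and parameter names are hypothetical and were chosen only for this example.

```python
from typing import Optional


def course_grade(project: float, final_q1: float, final_rest: float,
                 midterm: Optional[float] = None) -> float:
    """Total out of 20 under the assumed reading of the grading rules.

    All names here are hypothetical, chosen for this illustration.
    """
    # Check each component against its assumed maximum.
    limits = {"project": (project, 5), "final_q1": (final_q1, 5),
              "final_rest": (final_rest, 10)}
    if midterm is not None:
        limits["midterm"] = (midterm, 5)
    for name, (score, maximum) in limits.items():
        if not 0 <= score <= maximum:
            raise ValueError(f"{name} must be between 0 and {maximum}")

    # The midterm is optional: it replaces the score of the first final-exam
    # question only when it is higher (the maximum of the two is kept).
    first_part = final_q1 if midterm is None else max(midterm, final_q1)
    return first_part + final_rest + project


print(course_grade(project=4, final_q1=3, final_rest=8, midterm=4.5))  # 16.5
print(course_grade(project=4, final_q1=3, final_rest=8))               # 15
```

Under this reading, taking the midterm can only help: a midterm score lower than the first question of the final exam is simply ignored.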

There will be an in-person lecture every Tuesday 16h15-18h15 in auditorium Mercator 12.  I strongly recommend you attend this lecture.  I will give many intuitions that will help you understand all the formal concepts.  There is a weekly timeslot for a lab session on Wednesday 16h15-18h15 in BARB 20.  There will *not* be a lab every week; usually we will do one lab every two weeks.

Course project.  There will be a project to give you experience in distributed systems.  It is very likely that the project will be done using the Erlang/OTP platform.  Erlang is an excellent language for building resilient distributed systems.  Writing code in Erlang is very easy for students following this course, because Erlang is very similar to the event-based pseudocode that we will use to write the distributed algorithms.  It is by far the easiest and most powerful way to write distributed systems, with very little "boilerplate" to get in your way.  We will introduce Erlang in the lab sessions and bring you up to speed for the project.


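Course material.  The course book for the first half exists in two editions; either one can be used:
    •    First edition: Introduction to Reliable Distributed Programming, by Rachid Guerraoui and Luis Rodrigues
    •    Second edition: Introduction to Reliable and Secure Distributed Programming, by Christian Cachin, Rachid Guerraoui and Luis Rodrigues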
There is no fixed course book for the second half; I will distribute slides and papers depending on the topics that we will cover.