The three of us recently visited the Scandinavian Developer Conference in Gothenburg. The venue (Svenska Mässan) is huge, and we actually got lost on the first day trying to find the conference! However, having a large restaurant and being served at the table during lunch was nice. Besides the good food, this also created a good opportunity to talk to other participants. Here are some of the highlights of the conference:
Big Data – Niklas Gustavsson
Niklas Gustavsson talked about how Spotify are handling massive amounts of data in his talk “Spotify services, the whole is greater than the sum of the parts”. The main points:
- Create small services that are simple to reason about
- Use shallow queues, thread pools and connection pools. Don’t be afraid of dropping requests, as it is better to handle some of the load than none of it. They use ZeroMQ with a queue limit of 32. This creates pushback to the clients and prevents the server from being overloaded.
- They are really happy with PostgreSQL, but because of the problems with having a single master for writes they are using Cassandra in some places
- Use boring technology such as DNS SRV records for service discovery
- Monitor as much as you can and graph your important metrics. Strive for graphs with a latency of seconds. When you have a production problem, what happened 15 minutes ago is not that interesting. Spotify uses Metrics for Java metrics and a heavily extended derivative of Munin for graphs.
- Log what’s important. Use a structured format. Use syslog on Linux. Collect your logs in a central place (e.g. Kafka, rsync). Store your logs and make them analyzable (e.g. in HDFS, the Hadoop Distributed File System).
- Automate deployment and configuration. Spotify uses Debian packages, Puppet and stores all instance configuration in Git.
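The shallow-queue point above can be illustrated with a small sketch. This is not Spotify's code and it does not use ZeroMQ: it only assumes a bounded in-process queue that rejects new requests when it is full, so that callers get pushback instead of the server piling up work. The limit of 32 comes from the talk; the class and function names are made up for illustration.

```python
import queue

QUEUE_LIMIT = 32  # the limit mentioned in the talk; everything else is illustrative


class ShallowQueue:
    """Bounded request queue that rejects new work when full instead of blocking."""

    def __init__(self, limit=QUEUE_LIMIT):
        self._queue = queue.Queue(maxsize=limit)
        self.dropped = 0

    def offer(self, request):
        """Try to enqueue a request; return False (pushback) if the queue is full."""
        try:
            self._queue.put_nowait(request)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def take(self):
        """Remove and return the oldest queued request."""
        return self._queue.get_nowait()


def simulate_burst(n_requests, limit=QUEUE_LIMIT):
    """Offer n_requests at once with no consumer running; return (accepted, dropped)."""
    q = ShallowQueue(limit)
    accepted = sum(1 for i in range(n_requests) if q.offer(i))
    return accepted, q.dropped
```

With a burst of 100 requests and no consumer, `simulate_burst(100)` accepts the first 32 and drops the rest, which is the "handle some of the load rather than none of it" behaviour described in the talk.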
Process – Dan North
- Quality should not be uniform! Consider what you are optimizing for. Agile is typically optimizing for predictability.
- Coupling is the dual of DRY. Violating DRY leads to less coupling.
Functional – Bodil Stokke
A humorous look at functional programming called “What Every Hipster Should Know About Functional Programming”. We learned about functors, monads, higher-order functions and, last but not least, Kleisli triples.
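To give a flavour of these concepts, here is a minimal sketch of our own (not taken from the talk) showing a higher-order function that composes two functions which may fail, in the style of Kleisli composition. It assumes `None` stands in for "no result", which is a simplification compared to a real Maybe type; all names are illustrative.

```python
def bind(value, f):
    """Monadic bind for optional values: apply f only when a value is present."""
    return None if value is None else f(value)


def kleisli(f, g):
    """Higher-order function: compose two functions that each may return None."""
    return lambda x: bind(f(x), g)


def parse_int(text):
    """Return the integer in text, or None if it cannot be parsed."""
    try:
        return int(text)
    except ValueError:
        return None


def reciprocal(n):
    """Return 1/n, or None when n is zero."""
    return None if n == 0 else 1 / n


# The composed function fails (returns None) if either step fails.
parse_then_invert = kleisli(parse_int, reciprocal)
```

Calling `parse_then_invert("4")` gives `0.25`, while `"0"` and `"abc"` both give `None` without any explicit error handling at the call site.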
Big Data – Jonathan Ellis
Jonathan Ellis talked about Massively Scalable NoSQL with Apache Cassandra:
- Cassandra enables its impressive write performance by using append-only writes
- It doesn’t update data or indices in place on disk, so there are no intensive synchronous disk operations to block the write
- MongoDB was significantly slower in the benchmarks presented, but to be honest I wonder if MongoDB wouldn’t have turned out better if it had been using capped collections without B-tree indexing.
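The append-only idea from the list above can be sketched with a toy in-memory store. This is only an illustration of the principle, not how Cassandra is implemented (its real storage engine, with a commit log, memtables and SSTables, is far more involved). The class and method names are hypothetical.

```python
class AppendOnlyStore:
    """Toy key-value store whose writes only ever append to a log."""

    def __init__(self):
        self._log = []  # list of (key, value) records, oldest first

    def write(self, key, value):
        # A write never searches for or modifies an earlier record.
        self._log.append((key, value))

    def read(self, key):
        # The newest record for a key wins, so scan from the end.
        for k, v in reversed(self._log):
            if k == key:
                return v
        return None

    def compact(self):
        """Rewrite the log, keeping only the newest record per key."""
        latest = {}
        for k, v in self._log:
            latest[k] = v
        self._log = list(latest.items())

    def log_length(self):
        return len(self._log)
```

Writes stay cheap because they never touch existing records. The cost moves to reads and to a separate compaction step that cleans up superseded records later.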
Big Data – Jeremy Hinegardner
Jeremy Hinegardner had a good definition of Big Data: “Any amount of data that you feel uncomfortable processing”. He also recommended Avro as a persistence format, since it has built-in support for checksums when serializing data to files.
Web – Magnus Thor
Magnus Thor presented his work on a WebRTC project called xsockets. WebRTC seems really cool and allows browser-to-browser communication without going through the server. It can be used for audio and video conferences (like Skype) and it seems pretty simple to use. Of course, Microsoft has its own take on WebRTC (CU-RTC-Web) that doesn’t follow the standard…
Process – Janice Fraser
Janice Fraser’s keynote about Lean Startup had a useful analogy for finding the minimal product. Instead of creating an extremely elaborate wedding cake (think American wedding cake), you create a cupcake. It is not that big, but it is still something you can eat and it may have an interesting topping. So when building something new you should think “what is the cupcake version of this?”.