Friday, May 25, 2012


SIGMOD 2012 just wrapped up, and it was an excellent year! Since SIGMOD accepted fewer than 50 papers this year, the program was much easier to navigate, and I could make it to most of the talks that I thought were interesting. Here are some highlights and (mostly systems-focused) papers that I think are promising and should be fun reads:

Research Papers
  • "Calvin: Fast Distributed Transactions for Partitioned Database Systems" from Yale: Calvin builds on previous work from Yale on exploiting determinism to get high performance in scale-out transaction processing. Interesting and controversial ideas.
  • "bLSM: A General Purpose Log Structured Merge Tree" from Yahoo: A careful implementation of LSM trees that does a much better job of managing memory and avoiding spiky latencies for reads and writes by carefully scheduling merges.
  • "Towards a Unified Architecture for in-RDBMS Analytics", from Wisconsin: Brings together techniques from convex programming, incremental gradient descent, machine learning, and database systems to support a large class of machine learning algorithms efficiently in the RDBMS. A great combination of well-known, powerful ideas.
Industrial Systems
  • "F1-The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business", from Google: Describes the transactional layer built on top of Spanner to support ACID transactions on a globally scalable datastore.
  • "TAO: How Facebook Serves the Social Graph", from Facebook: A large-scale data store built to serve all those petabytes of content.

Pat Hanrahan talked about the importance of serving the "Data Enthusiast" -- which I interpreted as making database technology much easier to consume, as well as adding new technologies that enable non-programmers and non-DBAs to access data. He focused on Tableau-style visual exploration and querying.

Amin Vahdat's keynote on networking was one of the most interesting keynotes I've been to in a while. He talked about the current revolution in networking, with Software Defined Networking (SDN) rapidly becoming a reality at Google. He also described (perhaps for the first time in public) some of the details of Google's networking infrastructure, both inside the datacenter and across datacenters. His argument was that networking needs data management to do well in the SDN era, and that large-scale data management needs "scale-out" networking to really grow to datacenter scale. He pitched it as a perfect opportunity for collaboration across the data management and networking communities!
