Scaling Riak to 25 million ops/day at Kiip

István Soós, September 25, 2012

Riak is an open source, highly scalable, fault-tolerant distributed database.

Armon Dadgar and Mitchell Hashimoto discuss how and why they use Riak in production at Kiip, the problems they were facing (growing demand, 25M+ ops/day) and their solution: migrating from MongoDB to Riak. The talk was recorded at the San Francisco Riak Meetup in May 2012.

You can read the outline of the talk here.

Kiip infrastructure

Kiip is a mobile reward (mobile ad) network: it connects brands and companies (advertisers) with consumers through virtual achievements in their apps.

Kiip started their infrastructure with MySQL, but switched to MongoDB before they had any real traffic. As they were using it, and as their traffic and data grew, they experienced the following problems:

Eventually, after finding that their API response times were tied to MongoDB, they started to research other databases. Their findings and opinions were the following:

Data migration to Riak

Fast-growing data was the natural candidate to move to Riak first: they started with session and device data, as these grew at an exponential rate.

Sessions have the advantage of being key-value by nature, a good fit for Riak. They updated their data access layer; beyond that, no application-level change was required. The Python client had some problems in its protobuf implementation and some other errors (e.g. the keepalive header) that have since been fixed.
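To illustrate why only the data access layer had to change, here is a minimal sketch of such a layer. It is not Kiip's code: the class names, the method names and the JSON serialization are assumptions made for illustration, and an in-memory dictionary stands in for the Riak bucket so the example runs on its own.

```python
import json
from typing import Optional


class InMemoryBackend:
    """Stand-in for a key-value store such as a Riak bucket (hypothetical interface)."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


class SessionStore:
    """Data access layer: application code works with sessions, not with the database."""

    def __init__(self, backend) -> None:
        self._backend = backend

    def save(self, session_id: str, session: dict) -> None:
        # The session ID is the key; the whole session is stored as one JSON value.
        self._backend.put(session_id, json.dumps(session).encode("utf-8"))

    def load(self, session_id: str) -> Optional[dict]:
        raw = self._backend.get(session_id)
        return None if raw is None else json.loads(raw.decode("utf-8"))

    def remove(self, session_id: str) -> None:
        self._backend.delete(session_id)


store = SessionStore(InMemoryBackend())
store.save("session-123", {"device": "abc", "started_at": 1337000000})
print(store.load("session-123"))
```

Because the application only calls `save`, `load` and `remove`, swapping the backend for one that talks to a different database would not touch the rest of the code.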

The switch was simple and without downtime:

The device data was a bit different; eventually they settled on a key generated by a hash function. Their other attempts (e.g. using secondary indexes, map/reduce or ID indirection) were too slow, with unusable latency (0.2-2 seconds).
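The idea of a hash-generated key can be sketched as follows. The summary does not say which hash function or which device attributes Kiip used, so SHA-1 and the `platform` and `device_id` fields below are assumptions chosen only for illustration.

```python
import hashlib


def device_key(platform: str, device_id: str) -> str:
    """Derive a deterministic storage key from device attributes (hypothetical fields)."""
    # Normalize the platform name and separate the fields with a NUL byte,
    # so that ("ab", "c") and ("a", "bc") cannot produce the same input.
    material = f"{platform.lower()}\x00{device_id}".encode("utf-8")
    return hashlib.sha1(material).hexdigest()


key = device_key("iOS", "ABC-123")
print(len(key))  # SHA-1 hex digests are always 40 characters long
```

Since the key can be recomputed from data that arrives with each request, finding a device record becomes a single key lookup, with no secondary index query, map/reduce job or extra indirection read.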

Riak in production (over 3 months)

The Kiip team found Riak extremely solid, with only a few pain points, and they provided advice and tips that might help others:

But nothing is a magic bullet, and Riak is no exception. They still use MongoDB for geospatial data, although they plan to migrate it to PostGIS. Kiip uses PostgreSQL for data that is neither key-value nor fast-growing.

Scaling is hard, but for horizontally scalable K/V data, Riak seems to be the right choice.

updated: August 29, 2014
István Soós
software engineer, business advisor
Advocates for the maker-movement, self-directed learning and agile methods. His regular topics include: machine intelligence, data and risk analysis, distributed systems and knowledge management.