Redis memory optimization – a real-life use case

In this post I describe how we reduced memory usage by over 40% by changing how our data is represented in Redis.

Redis is very easy to get started with and lightning fast but, as you know, it has the limitation that all your data has to fit in RAM. To help with this, Redis uses special compact encodings for hashes, sets and lists under certain circumstances, typically when they are small.

Use hashes

A good starting point is to make sure that you use hashes instead of strings where possible. For example, the following string (with a key generated by a counter):

set id:101 "my value"

Can be changed into a hash and a field:

hset id:1 01 "my value"

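As you can see, all you need to do is make sure that the generated id starts at 100 (at a minimum) and then split the key so that the last two digits are used as the field and the first part as the hash key. Each hash then holds at most 100 fields, which keeps it small enough for the compact encoding as long as the values are short.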
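The key splitting described above can be sketched in plain Java. This is a minimal illustration, not code from our project: the class name `HashKeySplitter`, the method `split` and the `prefix` parameter are hypothetical names chosen for the example.

```java
// Hypothetical helper, not part of the original project.
final class HashKeySplitter {

    /**
     * Splits a numeric id into a hash key and a field.
     * Returns a two-element array: [0] is the hash key, [1] is the field.
     */
    static String[] split(String prefix, long id) {
        if (id < 100) {
            // Ids below 100 would leave nothing for the hash key
            throw new IllegalArgumentException("id must be at least 100");
        }
        String digits = Long.toString(id);
        int cut = digits.length() - 2;
        String key = prefix + digits.substring(0, cut); // everything but the last two digits
        String field = digits.substring(cut);           // the last two digits
        return new String[] { key, field };
    }
}
```

With this helper, `split("id:", 101)` yields the key `id:1` and the field `01`, which are the arguments used in the hset command above.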

Use integer sets

We had a lot of sets holding country codes, stored as short text strings. Redis stores sets of integers very efficiently (up to a configurable size limit). To get an idea of the difference, you can create two sets:

sadd countries1 SE DK NO FI GB
sadd countries2 11 12 13 14 15

By using redis-rdb-tools you can get a report of how memory is being used. Analyzing the example sets gives the following result:

  • countries1: 562 bytes, hashtable
  • countries2: 101 bytes, intset

This is a massive difference! Note that Redis automatically uses the compact intset representation when possible. Space will of course be needed to store the mapping between integers and text strings, but if you have a limited set of text strings there is still a lot of space to save. We managed to save over 40%, going from almost 700MB to 400MB.
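To make "when possible" a bit more concrete, here is a rough client-side check in plain Java. It is only an approximation of the rule as I understand it: every member must be a plain 64-bit integer and the set must not exceed the configured limit (set-max-intset-entries, assumed here to be at its default of 512). The class `IntsetCheck` and its methods are hypothetical names for this sketch; the authoritative answer always comes from Redis itself.

```java
import java.util.Collection;

// Hypothetical helper for illustration; Redis makes the real decision.
final class IntsetCheck {

    // Assumption: mirrors the set-max-intset-entries setting at its default value
    static final int MAX_INTSET_ENTRIES = 512;

    /** True if s is a 64-bit integer in plain form (no leading zeros, no '+'). */
    static boolean isCanonicalLong(String s) {
        try {
            // Round-tripping rejects forms such as "011" or "+5"
            return Long.toString(Long.parseLong(s)).equals(s);
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Approximates whether a set with these distinct members could use the intset encoding. */
    static boolean fitsIntset(Collection<String> members) {
        if (members.size() > MAX_INTSET_ENTRIES) {
            return false;
        }
        for (String member : members) {
            if (!isCanonicalLong(member)) {
                return false;
            }
        }
        return true;
    }
}
```

For the two example sets, the members 11 to 15 pass this check while SE, DK, NO, FI and GB do not, which matches the intset and hashtable encodings in the report.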

Implementing string mapping

How do we take advantage of intsets? We need to map from integer to text string and back again. New text strings will also have to be added as they appear. To handle this we need:

  • A hash for mapping from id to country, called country:id
  • A hash for mapping from country to id, called country:value
  • A counter for creating new unique country ids, called country:counter
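Before involving Redis, the relationship between these three keys can be modelled with ordinary Java collections. The sketch below is an in-memory stand-in only: `InMemoryCountryMapping` is a hypothetical class, the two maps play the role of the two hashes, and the `long` field plays the role of the counter. It is single-threaded and ignores the concurrency issues that the Redis version has to handle.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the three Redis keys, for illustration only.
final class InMemoryCountryMapping {

    private final Map<String, String> idToCountry = new HashMap<>(); // stands in for country:id
    private final Map<String, String> countryToId = new HashMap<>(); // stands in for country:value
    private long counter = 0;                                        // stands in for country:counter

    /** Returns the id for a country, creating a new mapping if none exists. */
    String toId(String country) {
        String id = countryToId.get(country);
        if (id == null) {
            id = Long.toString(++counter);
            // Both directions must be updated together to stay consistent
            countryToId.put(country, id);
            idToCountry.put(id, country);
        }
        return id;
    }

    /** Returns the country for an id, or null if the id is unknown. */
    String toCountry(String id) {
        return idToCountry.get(id);
    }
}
```

The important invariant is that both maps are always updated together, which is why the Redis version needs a transaction.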

In my project we used Jedis, which is the recommended Redis Java client.

Looking up countries is straightforward:

public Set<String> toCountries(Jedis jedis, String... ids){
    // hmget returns null for any id that has no mapping
    List<String> countriesList = jedis.hmget("country:id", ids);
    return new HashSet<>(countriesList);
}

Getting the id for a country is more complicated since new countries may be added:

public String toId(Jedis jedis, String country){
    String id = jedis.hget("country:value", country);
    // If the country doesn't have a mapping add it
    if(id == null){
        id = String.valueOf(addCountry(jedis, country));
    }
    return id;
}

Long addCountry(Jedis jedis, String country){
    List<Object> result = null;
    // Reserve a new id from the country:counter key
    Long newId = jedis.incr("country:counter");
    while (result == null) {
        // We need to detect if someone else modifies the mapping while we're at it
        jedis.watch("country:id", "country:value");
        String id = jedis.hget("country:value", country);
        if(id != null){
            // Another client has added the same mapping, so stop watching and use its id
            jedis.unwatch();
            return Long.valueOf(id);
        }
        // A transaction is needed to handle the case with multiple clients using the same database.
        Transaction trans = jedis.multi();
        try{
            trans.hset("country:id", String.valueOf(newId), country);
            trans.hset("country:value", country, String.valueOf(newId));
            // If a transaction isn't completed null is returned
            result = trans.exec();
        }
        finally{
            jedis.unwatch();
        }
    }
    return newId;
}

Conclusion

In many cases, large memory savings can be made by optimizing for Redis structures. To find out what optimizations you can make, you need to know what your data looks like. Most optimizations make the code more complicated, so you need to make sure that they are worth it.
