Plenty of data storage cases don't require SQL or tables at all. A simple way to save something somewhere and read it back later is more than enough. Think of user settings or all sorts of flags (e.g. whether a user viewed/liked/commented on a certain piece of content). In most cases you won't need to filter, sort or group by that kind of data.

Long story short: sometimes you work with data you never need to run table selects against. These are the so-called "key-value" cases. So why use tables at all? Well, you can use a NoSQL solution, but are you brave enough to manage multiple databases after that? I'm not. That's why I decided to learn how this can be done using our beloved MySQL server.

InnoDB and clustered indexing

As we've discussed in tuning MySQL, InnoDB uses clustered indexes: row data is stored together with the primary key index, so a lookup by primary key reaches the row directly, without an extra step.

That's why queries like:

SELECT * FROM my_table WHERE id = 1;

are highly efficient if id is a primary key. And that means we can build our own key-value storage on top of InnoDB tables.

Key-value table structure

Let's create a simple structure with two columns, a key and a value:

CREATE TABLE `kv` (
  `key` varchar(32) NOT NULL PRIMARY KEY,
  `val` TEXT
);

We will store values in the val column, encoded as JSON.

Storing data

We'll use an INSERT ... ON DUPLICATE KEY UPDATE query to either insert or update a value by key:

INSERT INTO kv
SET val = '{"name": "val"}', `key` = 'test'
ON DUPLICATE KEY UPDATE val = '{"name": "val"}';

This will automatically create a new row or update the existing row for the specified key.
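
The value has to be written twice in the query above. Here is a minimal sketch of a variant that states it only once, assuming MySQL 8.0.19 or newer for the row alias syntax (the alias name `new` is arbitrary):

```sql
-- Upsert that states the value once; `new` is an alias for the row being inserted.
INSERT INTO kv (`key`, val)
VALUES ('test', '{"name": "val"}') AS new
ON DUPLICATE KEY UPDATE val = new.val;

-- Equivalent form for older MySQL versions (VALUES() is deprecated since 8.0.20):
-- INSERT INTO kv (`key`, val)
-- VALUES ('test', '{"name": "val"}')
-- ON DUPLICATE KEY UPDATE val = VALUES(val);
```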

Reading data

To fetch data by key we use a simple select:

SELECT val FROM kv WHERE `key` = 'test' LIMIT 1;

This will return the JSON-encoded value for the specified key.
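
To check that the lookup really goes through the primary key, EXPLAIN can be run on the same query. For an existing key, the plan should show the PRIMARY key with access type const, though the exact output depends on the MySQL version. The second query is a sketch that assumes MySQL 5.7+ and that val always holds valid JSON; it extracts a single field on the server side:

```sql
-- Inspect the execution plan for a lookup by key.
EXPLAIN SELECT val FROM kv WHERE `key` = 'test';

-- Extract one field from the stored JSON document instead of fetching all of it.
SELECT JSON_UNQUOTE(JSON_EXTRACT(val, '$.name')) AS name
FROM kv
WHERE `key` = 'test';
```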

Optimizing for performance

First of all, having key-value storage on top of MySQL gives you all the benefits of MySQL. This includes tuning for performance, replication for availability, scaling and other goodies. Still, there are multiple things we can improve for our storage.

1. Limit key size

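Try to keep the key size as small as possible: this decreases the index size and fits more data into memory. If you plan to use long keys (more than 40 symbols), a good idea is to apply the sha1() checksum to them in order to convert them to 40-symbol strings. Note that the key column has to be at least 40 characters wide for this (the table above uses varchar(32), which is too short for hashed keys):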

SELECT sha1('this_is_a_vary_long_key_to_store_in_our_key_value_storage');

This gives the following string:

+-------------------------------------------------------------------+
| sha1('this_is_a_vary_long_key_to_store_in_our_key_value_storage') |
+-------------------------------------------------------------------+
| 8693e6100a85a05a4cc2f7f089ee3858933d61ab                          |
+-------------------------------------------------------------------+

Then insert/select by the hashed key value:

SELECT val FROM kv WHERE `key` = sha1('long_key_name...') LIMIT 1;
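
The write side works the same way. A minimal sketch, assuming the key column is declared as varchar(40) so the hash fits:

```sql
-- Write under the hashed key; the key column must be at least 40 characters wide.
INSERT INTO kv (`key`, val)
VALUES (sha1('long_key_name...'), '{"name": "val"}')
ON DUPLICATE KEY UPDATE val = '{"name": "val"}';
```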

2. Use scalar values if possible

If you plan to use simple types only (e.g. boolean flags, integer numbers or short strings), you can skip JSON encoding:

CREATE TABLE `kv` (
  `key` varchar(40) NOT NULL,
  `val` INT UNSIGNED,
  PRIMARY KEY (`key`)
);

This will not only fit more rows into memory, but also make type-specific operations possible. For example, numeric types allow implementing an atomic increment operation:

UPDATE kv SET val = val + 1 WHERE `key` = 'test';
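
The UPDATE above does nothing if the key does not exist yet. A sketch that combines it with the upsert from earlier, so a counter is created on first use (the key name page_views is just an illustration):

```sql
-- Create the counter at 1 on first use, otherwise increment the stored value.
INSERT INTO kv (`key`, val)
VALUES ('page_views', 1)
ON DUPLICATE KEY UPDATE val = val + 1;
```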

3. Use multiple tables

To implement namespaces, you can use multiple tables instead of combining all keys in a single table:

CREATE TABLE `kv_settings` (`key` varchar(40) NOT NULL, `val` TEXT,         PRIMARY KEY (`key`));
CREATE TABLE `kv_flags`    (`key` varchar(16) NOT NULL, `val` INT UNSIGNED, PRIMARY KEY (`key`));

This storage can easily handle billions of keys with the efficiency and reliability of MySQL. This approach is used in the Mysqly key-value storage component. The solution can also be easily extended on top of the powerful MySQL infrastructure. For example, adding expiration functionality would turn it into a caching system.
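
One possible way to add expiration is sketched below. The table name kv_cache, the expires_at column and the one-hour lifetime are all assumptions made for illustration, not part of the original design:

```sql
-- Hypothetical cache table: same key-value layout plus an expiration timestamp.
CREATE TABLE `kv_cache` (
  `key` varchar(40) NOT NULL,
  `val` TEXT,
  `expires_at` DATETIME NOT NULL,
  PRIMARY KEY (`key`),
  KEY `idx_expires_at` (`expires_at`)
);

-- Store a value that stays valid for one hour.
INSERT INTO kv_cache (`key`, val, expires_at)
VALUES ('test', '{"name": "val"}', NOW() + INTERVAL 1 HOUR)
ON DUPLICATE KEY UPDATE
  val = '{"name": "val"}',
  expires_at = NOW() + INTERVAL 1 HOUR;

-- Read only values that have not expired yet.
SELECT val FROM kv_cache WHERE `key` = 'test' AND expires_at > NOW();

-- Periodically remove expired rows (e.g. from a scheduled job).
DELETE FROM kv_cache WHERE expires_at <= NOW();
```

The secondary index on expires_at is there so that the cleanup query does not have to scan the whole table.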

written on Jan, 2022