Johannes Sasongko’s blog

Posts tagged sqlite

Notes on Python’s dbm.sqlite3 module

Python provides a simple key-value database API in its dbm namespace, and since version 3.13 this includes an SQLite backend. As SQLite is neither simple nor merely a key-value database, there are a few quirks to this dbm.sqlite3 library that I’m documenting in this post.

Schema and data types

You can see the SQL schema of a dbm.sqlite3 database at the top of CPython’s Lib/dbm/sqlite3.py. In summary:

The Dict table is created with default settings. In SQLite that means the table has rowid and does not have STRICT typing.

The key and value columns are declared with BLOB affinity, and because the table is non-STRICT, that means in theory they can contain any SQLite data type (null excluded due to the NON NULL declaration). However, dbm.sqlite3 queries always coerce keys and values to BLOBs first; as long as you always edit the database using dbm.sqlite3, all keys and values will only contain BLOBs. This can lead to surprises, for example due to differences in floating-point precision:

>>> db = dbm.sqlite3.open("test.sqlite", "c")
>>> num = 2.000000000000002
>>> num
2.000000000000002
>>> db[num] = num
>>> dict(db)
{b'2.0': b'2.0'}

In practice, you should probably serialise keys and values yourself into Python bytes objects (which round-trip to SQLite BLOBs), because otherwise the database will do it for you in a way that you may not expect. Handling the serialisation yourself also makes you less likely to forget to deserialise the values you get back. If you use Python’s shelve module as a higher-level abstraction to the dbm database, value (de)serialisation is done automatically for you using pickle.

Using the database from multiple threads

Python’s sqlite3 module (which dbm.sqlite3 is built on top of) has a connection-level check_same_thread option that, by default, prevents the connection from being used from multiple threads. This option is not exposed by dbm.sqlite3. In other words, dbm.sqlite3 connections can only be used from the thread it is created on.

If your use case needs check_same_thread disabled, the only solution available at the moment is to copy the dbm.sqlite3 source code to your project and modify it there.

Depending on the value of the sqlite3.threadsafety constant, and especially when check_same_thread is disabled, you may need to regulate access to the database using a mutex. Refer to the sqlite3.threadsafety documentation to see the level of locking you need to perform. In practice, if you’re on a mainstream Linux distribution, SQLite is likely compiled with Serialized threading mode enabled, making it safe to use from multiple threads.

The simplest way to handle the threading issue is to check that sqlite3.threadsafety is 3 (Serialized)—raising an exception otherwise—and hope that none of your users encounter this limitation. But that’s up to you; feel free to handle it according to your needs.