SKS uses Berkeley DB as its storage backend. This embedded database has many knobs to tweak. I’ll try to find settings suitable for SKS.
As of version 11.2.5.2, Berkeley DB has a tool for suggesting “a page size that is likely to deliver optimal operation”. Running it on the KDB/key database gives the following information:
$ /usr/local/BerkeleyDB.5.2/bin/db_tuner -h KDB -d key
For your input database, we recommend page size = 65536 out of 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536 for you.
So I ran it on all of SKS’s databases, and it gives the following recommendations:
database | recommended page size |
---|---|
KDB/key | 65,536 |
KDB/keyid | 32,768 |
KDB/meta | 512 |
KDB/subkeyid | 65,536 |
KDB/time | 65,536 |
KDB/tqueue | 512 |
KDB/word | 512 |
PTree/ptree | 4,096 |
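A minimal sketch of how the table above could be collected in one pass. It reuses the `db_tuner -h <dir> -d <database>` invocation shown earlier; the function name `tune_all` and the `DB_TUNER` override variable are illustrative names of mine, not part of Berkeley DB or SKS.

```shell
#!/bin/sh
# Path taken from the db_tuner invocation shown earlier.
# Override DB_TUNER if your installation lives elsewhere.
DB_TUNER=${DB_TUNER:-/usr/local/BerkeleyDB.5.2/bin/db_tuner}

# Ask db_tuner for a page size recommendation for each database in one
# environment directory.  $1 is the directory (KDB or PTree); the
# remaining arguments are database names.  (tune_all is a hypothetical
# helper name.)
tune_all() {
    home=$1
    shift
    for db in "$@"; do
        printf '%s/%s: ' "$home" "$db"
        "$DB_TUNER" -h "$home" -d "$db"
    done
}
```

From the SKS base directory it would be called as `tune_all KDB key keyid meta subkeyid time tqueue word` followed by `tune_all PTree ptree`.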
Preliminary experiments on PTree/ptree indicate that sks pbuild is mostly CPU bound, so I do not think there is much to gain there just by tuning the database parameters. I do not plan to investigate further in this direction.
We know that performance must be measured. We also know that benchmarks are meaningless. Here is my tribute to nonsense.
Benchmarks were done with:
The only benchmark I have run so far is building the KDB databases with the following command (page size = N * 512):
$ time sks build -cache 100 -ptree_cache 100 -pagesize $N sks-dump-*.pgp
I have not run fastbuild benchmarks because this way of using the underlying database seems wrong. Building with N = 4 (see note 1) resulted in a deadlock, so I did not try values other than those listed in the table below. Note that this only changes the page size for the KDB/key database. The other KDB databases do not have a tuning parameter in SKS.
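Since the `-pagesize` argument is given in units of 512 bytes, a tiny helper makes the mapping explicit. This is only a sketch of the arithmetic stated above (page size = N * 512); `page_size` is a hypothetical helper name.

```shell
#!/bin/sh
# Page size in bytes for a given -pagesize value N (page size = N * 512).
page_size() {
    echo $(( $1 * 512 ))
}

# Print the mapping for the values of N used in the benchmark.
for n in 8 16 32 64 128; do
    printf 'N=%-3s -> %s bytes\n' "$n" "$(page_size "$n")"
done
```

By the same arithmetic, the SKS default of 2048 bytes mentioned in the note at the end corresponds to N = 4.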
Here are the results.
N | page size (bytes) | system (s) | user (s) | CPU (%) | elapsed (h:m:s) | rate (keys/s) |
---|---|---|---|---|---|---|
8 | 4,096 | 406.72 | 1601.73 | 9 | 5:44:40 | 145 |
16 | 8,192 | 354.73 | 1976.17 | 14 | 4:32:49 | 183 |
32 | 16,384 | 342.50 | 1959.27 | 15 | 4:10:46 | 199 |
64 | 32,768 | 332.18 | 1895.25 | 14 | 4:11:15 | 199 |
128 | 65,536 | 327.54 | 1759.74 | 15 | 3:47:28 | 220 |
Not surprisingly, the best performance is achieved with the recommended page size of 65,536, and it generally degrades as the page size shrinks (16,384 and 32,768 are effectively tied). Berkeley DB’s default page size (4,096 with Berkeley DB 5.2.36) performs worst: the build takes about 50% longer than with 65,536.
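To compare two rows of the table, the elapsed times can be converted to seconds and divided. The sketch below does this with the figures from the table; `to_seconds` and `percent_of` are hypothetical helper names.

```shell
#!/bin/sh
# Convert an h:m:s duration, as printed in the "elapsed" column, to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Elapsed time of build A as an integer percentage of build B (truncated).
percent_of() {
    echo $(( $(to_seconds "$1") * 100 / $(to_seconds "$2") ))
}

# N = 8 (4,096 bytes) relative to N = 128 (65,536 bytes).
percent_of 5:44:40 3:47:28
```

This prints 151: the 4,096-byte build took 20,680 seconds against 13,648 seconds for the 65,536-byte build.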
Run similar benchmarks on the KDB/keyid, KDB/meta, KDB/subkeyid, KDB/time, KDB/tqueue and KDB/word databases. As evidenced by Berkeley DB’s diagnostic, there could be some room for improvement: db_stat shows that the page size in use is 4,096.
Given how KDB/meta and KDB/tqueue are used, I suspect there is not much to be gained there.
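A sketch of how the current page sizes could be listed with db_stat. Two assumptions here: that db_stat is installed next to db_tuner, and that its per-database report contains a line mentioning "page size", which is what the filter looks for. `show_page_sizes` and `DB_STAT` are illustrative names of mine.

```shell
#!/bin/sh
# Assumption: db_stat lives in the same directory as db_tuner.
# Override DB_STAT if that is not the case.
DB_STAT=${DB_STAT:-/usr/local/BerkeleyDB.5.2/bin/db_stat}

# Print the page size line of db_stat's report for each named KDB database.
# (show_page_sizes is a hypothetical helper name.)
show_page_sizes() {
    for db in "$@"; do
        printf 'KDB/%s: ' "$db"
        "$DB_STAT" -h KDB -d "$db" | grep -i 'page size' \
            || echo '(no page size line found)'
    done
}
```

It would be called as `show_page_sizes keyid meta subkeyid time tqueue word` from the SKS base directory.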
Jeffrey Johnson suggests:
the Ptree access pattern should be looked at carefully. I've seen a number of failures attempting to tune with minimal locks usually when accessing (or initially loading) the PTree store. The access pattern on the PTree store is more complex than the raw KDB store.
1 The default page size in SKS is 2048, which means I could not build the key database without specifying a custom page size.
Kim Minh Kaplan
Last updated 2011-11-03.