Hi, I’m a bit of a noob at database tuning, but I just wanted to share a huge performance boost I found for Lemmy that sped up my instance a ton.

The synchronous_commit parameter

Postgres has a parameter called synchronous_commit, which tells postgres whether to wait for data to be written to the disk before returning the commit. By default, this is on, which is good for safety, because any data written to the database and then confirmed by postgres is guaranteed not to be lost, even if postgres or the computer crashes.

However, this does have a performance cost, as every time you write to the database, Postgres will wait until your drive fsyncs the data into flash/spinning rust. These fsyncs can account for over a millisecond of latency on every database commit on slow consumer SSDs, and is especially slow for Lemmy, where every federated upvote/downvote leads to a database transaction.

Turning off synchronous_commit can have a huge impact on Lemmy performance, as now whenever your instance receives a federated upvote, it won’t freeze your entire database while it waits for that upvote to be written to disk. With synchronous_commit disabled, Postgres basically runs like Redis, with all the performance benefits involved: Data is stored in memory, and asynchronously written to disk for persistence.

The tradeoff with turning off synchronous_commit is that if your database crashes while postgres is still writing that commit to disk, it will get rolled-back and lost, even though Lemmy was told that it went through. I think this is a reasonable tradeoff for Lemmy servers, as the majority of database operations are simply upvotes/downvotes which aren’t important if they happened to be lost, and there is still no risk for database corruption with this optimization. The worst case scenario is that your database crashes immediately after you submitted a comment, and your comment is lost. But it’s far more likely that your comment will get eaten by an HTTP error than a database crash, anyways.

Here is the postgres documentation about the synchronous_commit parameter: https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT

My extreme-scenario performance impact

Turning off synchronous_commit has made a huge difference for me.

My Lemmy database is stored on a Ceph storage cluster, which makes these disk fsyncs particularly expensive for me. Due to network latency, my slow consumer SSDs, and possibly issues with TRIM on RBD, my Lemmy data has to be fully-written to flash storage twice, on two different computers, before the fsync completes, often taking an entire 15ms. When fsyncs are used, my SSD IOPS drop from 20,000 per-drive, to just 50 cluster-wide. My database was barely keeping up with federation from Lemmy.world.

By turning off synchronous_commit, pgbench showed my database went from 4 TPS to 140 TPS, a 35x speedup! (I also went and disabled postgresql’s fsync parameter, which sped up my database by another 14x to 1.9k TPS. But this can cause db corruption, and I’m only using this because I have continuously replicated backups.)

Disabling synchronous_commit seems to have fixed all of my performance issues on Lemmy.

Applying the optimization and disabling synchronous_commit

First off, you can see the current setting for synchronous_commit by running this in psql:

SHOW synchronous_commit;

You’ll get a result telling you “off” or “on”. (There are some other options too, but these are equivalent to “on”.)

Temporarily disabling synchronous_commit

You can temporarily disable synchronous_commit with this:

SET synchronous_commit=off;

You should notice your instance running a lot faster now that postgres isn’t being bogged down by writing hundreds of upvotes to disk.

You can turn synchronous_commit back on with the following command, or simply restarting the database.

SET synchronous_commit=on;

Making the optimization permanent

To disable synchronous_commit on every start up, you can add this line to your postgresql.conf:

synchronous_commit = off 

If you’re using CNPG on Kubernetes like me, you can add this to your Cluster manifest:

  postgresql:
    parameters:
      synchronous_commit: "off"

Hope this helps anyone struggling with Lemmy database performance like I was!

For further reading, I also found this beehaw post about lemmy database tuning, which includes this synchronous_commit optimization, as well as other optimizations. I’m pretty happy with my database performance now, but I might try these myself in the future.