Obtaining Performance by not Calling Sync

Brian “Krow” Aker:

When I was at Open Source Bridges one of the common themes I heard was “next year we plan to write a new storage engine” by a number of the vendors peddling new databases. This was followed up often by “we attain our performance by not calling sync”. I’ve wondered how many developers have had to stand in front of their bosses and explain “the reason the site will be down for the next day is that we are rebuilding the database because we chose to go with a solution that never actually saved the data”.

One thing I learned some time ago, if you ask anyone who builds databases about how they make sure the data they are storing is not either corrupted or lost, and they say anything other than “we call sync”, then most likely your data has not been saved, and most likely you will lose some data. The song and dance a developer can do around this is pretty funny. I can’t find the link on the OSB site to the MongoDB talk, which is a shame, since there were some good examples found in that talk.

Sync is foundational to data storage integrity, and even sync itself is hard to get right.

So, like Brian, I’m less than impressed with supposedly-persistent data stores that somehow think they transcend the need for sync.
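
To make “calling sync” concrete, here is a minimal Python sketch of what a durable write might look like, assuming a POSIX system. The function name durable_write and the temp-file-then-rename approach are my own illustration, not anything from Brian’s post or from any particular database. It shows why this is easy to get wrong: there are three separate steps (flush the program’s buffer, fsync the file, fsync the directory), and skipping any one of them can leave the data only in memory.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data` and ask the OS to persist it.

    Hypothetical helper for illustration; assumes POSIX semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Write to a temporary file in the same directory, so the final
    # rename stays on one filesystem and readers never see a partial file.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # push Python's userspace buffer to the OS
            os.fsync(f.fileno())  # ask the OS to push its cache to the device
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # The rename lives in the directory entry, so sync the directory too.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Even this is not the whole story: a drive with a volatile write cache can acknowledge the fsync before the data is actually on stable media, which is part of why sync is hard to get right.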

db sync fsync Jun 23 2010