You know that “fun” fsync bug we had with Firefox 3 that primarily hurt Linux? Well, turns out that it also hurts us with applications on solid state drives (like, Portable Firefox), which makes it even more “fun”. For Firefox 3, we checked in a patch that allows for people to change the default synchronous setting that SQLite uses when a database connection is opened. This was really just a stopgap measure. I’ve spent the last month coming up with real solutions to the problem, and I think I’ve finally got one that will greatly help the issue, and not regress our current performance.
The basic idea of this plan is to reduce the number of times we write to the most active tables in places –
moz_history_visits. In order to accomplish this, we need to set up temporary tables that live in memory (to avoid the writes) that shadow these two tables. Every so often (time to be determined), we’ll need to write out the temporary tables to their permanent ones living on disk. In order to prevent the UI from hanging, we’ll have to do this on a background thread. I’ve already got patches up to make mozIStorageConnection threadsafe so we can use a connection object on more than one thread, and to get a background thread setup for places to use.
The solution is a bit more complicated than this, since we’ll also have to keep track of entries that we want to delete and flush those when we write everything out as well. The table that keeps track of entries to delete would also have to live in memory so we don’t write and fsync. The good news is that this only manages to regress performance by about 3.2 x 10-6% in my testing, which is something I think we can take.
There is a cost to this plan though. Any query to the places database is going to get a lot hairier to accomplish to get the most recent data. This is because you have to query against both the temporary and the permanent tables. Essentially, any time you want to select from
moz_places, you have to use a subselect instead of just the table name that looks something like this:
FROM ( SELECT * FROM moz_places_tmp WHERE url IN (:test) UNION All SELECT * FROM moz_places WHERE url IN (:test) AND +id IN (SELECT id FROM moz_places_tmp) ) AS h
That doesn’t even include the bits for deleting an entry, which complicates it a bit more still.
Some of this abstraction can sit behind a view with triggers setup so the code can be simple. The trigger will handle all the details. I’ve put enough thought into this to know that it’s doable, but haven’t figured out the exact triggers that are needed as of yet.
I’m ecstatic that I’ve finally got a viable solution to this issue. I hope to get all the work done and reviewed in time for the next public release of Firefox 3.1 (be it an alpha, or a beta). No bugs have been filed yet (other than the ones previously mentioned) since I haven’t figured out how I want to split the work up yet to make it easy to get reviewed. That will all happen on Monday though, so stay tuned!
5 replies on “More on fsync”
[…] Calendar More on fsync […]
[…] part three in a continuing series about how we are working around the slow fsync issue in Mozilla. Part one can be found here, and part two here. You may find the schema diagram of places to be a bit helpful when reading this […]
[…] part four in a continuing series about how we are working around the slow fsync issue in Mozilla. Part one can be found here, part two here, and part three […]
[…] so that it no longer needs to write to the database every time you visit a page, a project well chronicled on Shawn’s blog. Spurred on by one of Shawn’s performance wins, Ed Lee was […]
[…] we learned in the past with Places, writing to disk and calling fsync can be painfully slow. In session restore code, we are doing […]