Shawn Wilsher – Page 12

More on fsync

You know that “fun” fsync bug we had with Firefox 3 that primarily hurt Linux? Well, it turns out that it also hurts us with applications on solid state drives (like Portable Firefox), which makes it even more “fun”. For Firefox 3, we checked in a patch that lets people change the default synchronous setting that SQLite uses when a database connection is opened. This was really just a stopgap measure. I’ve spent the last month coming up with real solutions to the problem, and I think I’ve finally got one that will greatly help the issue, and not regress our current performance.
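To make the "synchronous setting" concrete, here is a minimal sketch using Python's sqlite3 module rather than the Mozilla storage code. The file name is made up for illustration, and the snippet only shows the SQLite pragma itself, not how Firefox exposes it.

```python
import os
import sqlite3
import tempfile

# Hypothetical file name; any on-disk database behaves the same way.
db_path = os.path.join(tempfile.mkdtemp(), "places_demo.sqlite")
conn = sqlite3.connect(db_path)

# Read the current level: 0 = OFF, 1 = NORMAL, 2 = FULL (newer SQLite adds 3 = EXTRA).
default_level = conn.execute("PRAGMA synchronous").fetchone()[0]

# Relax it for this connection only. The setting is not stored in the file,
# so it has to be applied every time a connection is opened.
conn.execute("PRAGMA synchronous = NORMAL")
relaxed_level = conn.execute("PRAGMA synchronous").fetchone()[0]

print("default:", default_level, "after change:", relaxed_level)
conn.close()
```

Lower levels mean fewer fsync calls at the cost of durability if the machine loses power, which is why this only counts as a stopgap.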

The basic idea of this plan is to reduce the number of times we write to the most active tables in places – moz_places and moz_historyvisits. In order to accomplish this, we need to set up temporary tables that live in memory (to avoid the writes) that shadow these two tables. Every so often (time to be determined), we’ll need to write out the temporary tables to their permanent ones living on disk. In order to prevent the UI from hanging, we’ll have to do this on a background thread. I’ve already got patches up to make mozIStorageConnection threadsafe so we can use a connection object on more than one thread, and to get a background thread set up for places to use.
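A rough sketch of the shadow-table idea, written with Python's sqlite3 module instead of mozIStorageConnection. The column list is trimmed down, and record_visit and flush are hypothetical helper names. The background thread is left out, so in a real setup something like a timer would call flush off the UI thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real places database
conn.executescript("""
    PRAGMA temp_store = MEMORY;  -- keep TEMP tables in RAM, not in a temp file
    CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT, visit_count INTEGER);
    CREATE TEMP TABLE moz_places_tmp (id INTEGER PRIMARY KEY, url TEXT, visit_count INTEGER);
""")

def record_visit(place_id, url, visit_count):
    """Write only to the in-memory shadow table (hypothetical helper)."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO moz_places_tmp (id, url, visit_count) VALUES (?, ?, ?)",
            (place_id, url, visit_count),
        )

def flush():
    """Copy the shadow rows to the permanent table in a single transaction."""
    with conn:
        conn.execute("INSERT OR REPLACE INTO moz_places SELECT * FROM moz_places_tmp")
        conn.execute("DELETE FROM moz_places_tmp")

record_visit(1, "http://example.com/", 1)
record_visit(1, "http://example.com/", 2)   # still no write to the permanent table
flush()                                     # one batch write instead of two
```

The point of the design is that many small writes collapse into one transaction against the on-disk table, so the expensive sync happens once per flush instead of once per visit.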

The solution is a bit more complicated than this, since we’ll also have to keep track of entries that we want to delete and flush those when we write everything out as well. The table that keeps track of entries to delete would also have to live in memory so we don’t write and fsync. The good news is that this only manages to regress performance by about 3.2 x 10^-6% in my testing, which is something I think we can take.
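Deletion tracking could look roughly like this, again as a Python sqlite3 sketch. The table name moz_places_deleted and both helper functions are assumptions made for illustration, not names from any patch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TEMP TABLE moz_places_tmp (id INTEGER PRIMARY KEY, url TEXT);
    -- Hypothetical name: ids that should disappear at the next flush.
    CREATE TEMP TABLE moz_places_deleted (id INTEGER PRIMARY KEY);
    INSERT INTO moz_places VALUES (1, 'http://a.example/');
    INSERT INTO moz_places VALUES (2, 'http://b.example/');
""")

def delete_place(place_id):
    """Remember the deletion in memory instead of touching the disk table."""
    with conn:
        conn.execute("DELETE FROM moz_places_tmp WHERE id = ?", (place_id,))
        conn.execute("INSERT OR IGNORE INTO moz_places_deleted VALUES (?)", (place_id,))

def flush():
    """Apply pending deletes, then pending writes, in one transaction."""
    with conn:
        conn.execute("DELETE FROM moz_places WHERE id IN (SELECT id FROM moz_places_deleted)")
        conn.execute("INSERT OR REPLACE INTO moz_places SELECT * FROM moz_places_tmp")
        conn.execute("DELETE FROM moz_places_tmp")
        conn.execute("DELETE FROM moz_places_deleted")

delete_place(1)
rows_before_flush = conn.execute("SELECT id FROM moz_places ORDER BY id").fetchall()
flush()
rows_after_flush = conn.execute("SELECT id FROM moz_places ORDER BY id").fetchall()
```

Until the flush runs, the permanent table still contains the deleted row, so any read in between has to consult the delete list as well.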

There is a cost to this plan though. Any query against the places database that needs the most recent data is going to get a lot hairier. This is because you have to query against both the temporary and the permanent tables. Essentially, any time you want to select from moz_places, you can’t just use the table name; you have to use a subselect that looks something like this:

FROM (
  SELECT * FROM moz_places_tmp
  WHERE url IN (:test)
  UNION ALL
  SELECT * FROM moz_places
  WHERE url IN (:test)
  AND +id NOT IN (SELECT id FROM moz_places_tmp)
) AS h

That doesn’t even include the bits for deleting an entry, which complicates it a bit more still.
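Here is a small runnable version of that kind of merged read, using Python's sqlite3 module with a made-up title column and sample rows. The permanent-table half uses NOT IN, so a row that is shadowed in the temporary table is returned once, from the temporary table, rather than twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT, title TEXT);
    CREATE TEMP TABLE moz_places_tmp (id INTEGER PRIMARY KEY, url TEXT, title TEXT);
    INSERT INTO moz_places VALUES (1, 'http://a.example/', 'old title');
    INSERT INTO moz_places VALUES (2, 'http://b.example/', 'only on disk');
    INSERT INTO moz_places_tmp VALUES (1, 'http://a.example/', 'new title');
""")

MERGED_QUERY = """
    SELECT h.id, h.title
    FROM (
      SELECT * FROM moz_places_tmp
      WHERE url IN (:test)
      UNION ALL
      SELECT * FROM moz_places
      WHERE url IN (:test)
      AND +id NOT IN (SELECT id FROM moz_places_tmp)
    ) AS h
"""

def lookup(url):
    """Return the freshest row for a URL, preferring the in-memory table."""
    return conn.execute(MERGED_QUERY, {"test": url}).fetchall()

shadowed = lookup("http://a.example/")    # served from moz_places_tmp
disk_only = lookup("http://b.example/")   # falls through to moz_places
```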

Some of this abstraction can sit behind a view with triggers set up so the code can be simple. The triggers will handle all the details. I’ve put enough thought into this to know that it’s doable, but haven’t figured out the exact triggers that are needed as of yet.
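As a very rough illustration of the view-plus-trigger idea (not the actual triggers, which are still undecided), the following Python sqlite3 sketch hides the two tables behind one view. The view name, trigger name, and two-column schema are all hypothetical, and only inserts are handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TEMP TABLE moz_places_tmp (id INTEGER PRIMARY KEY, url TEXT);

    -- Hypothetical view name. It is a TEMP view because it reads a TEMP table.
    CREATE TEMP VIEW moz_places_view AS
      SELECT * FROM moz_places_tmp
      UNION ALL
      SELECT * FROM moz_places
      WHERE id NOT IN (SELECT id FROM moz_places_tmp);

    -- Redirect inserts against the view into the in-memory table.
    CREATE TEMP TRIGGER moz_places_view_insert
    INSTEAD OF INSERT ON moz_places_view
    BEGIN
      INSERT OR REPLACE INTO moz_places_tmp (id, url) VALUES (NEW.id, NEW.url);
    END;

    INSERT INTO moz_places VALUES (1, 'http://a.example/');
""")

# Callers read and write the view and never name the underlying tables.
with conn:
    conn.execute("INSERT INTO moz_places_view (id, url) VALUES (2, 'http://b.example/')")

merged_rows = sorted(conn.execute("SELECT id, url FROM moz_places_view").fetchall())
```

Updates and deletes would need their own INSTEAD OF triggers, and that is where most of the remaining design work sits.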

I’m ecstatic that I’ve finally got a viable solution to this issue. I hope to get all the work done and reviewed in time for the next public release of Firefox 3.1 (be it an alpha, or a beta). No bugs have been filed yet (other than the ones previously mentioned) since I haven’t figured out how I want to split the work up yet to make it easy to get reviewed. That will all happen on Monday though, so stay tuned!

Tags fsync, places

Mozilla

Sheriff Duty – The Details

Post author By Shawn Wilsher
Post date July 16, 2008

I meant to get to this yesterday, but yesterday turned out to be a busy day. I meant to get to this earlier today, but today turned out to be a busy day too…

As I previously announced, I’m going to try an experiment with sheriffing the tree. While I’m the acting sheriff tomorrow (that will be from 9am to 6pm EDT), you’ll have to run checkins through me. To make this as easy as possible, there are going to be a number of ways in which you can get a patch to me (in the preferred order):

1. Send me an e-mail with your hg bundle that includes the correct commit message.
2. Attach your hg bundle to the bug you want pushed, and add that bug to the wiki page.
3. Just post a bug number on the wiki page with the appropriate checkin comment.

I’ll also be going through checkin-needed bugs when things are slow. The general rule here though is that pushes are ordered first-come, first-served. I might take things out of order if the queue is big and I see bugs/patches that won’t interfere with each other.

Hopefully that clears up any confusion about tomorrow. I apologize for not getting this out sooner, but today was crazy. :)

Mozilla

Sheriff Duty

I’m up for sheriff duty this Thursday, and I thought I might try something that people have discussed many times. It’s a bit radical, and I’m not sure if people will like it though. The idea is to only have the sheriff push changesets. This way, the sheriff can push things at a rate he/she is comfortable with, and stay on top of all the performance charts and random oranges. If it happens to be a slow day, I’ll be going through the checkin-needed bugs and pushing those as well.

It makes sheriffing a full time job, but I think it’s worth a try to see how it works out. I also think it will be a valuable data point to determine if we want to do it like this all the time.

Feedback wanted (posting to dev.planning as well)!

Firefox Mozilla RTSE

Asynchronous Storage API

That asynchronous storage API I’ve been working on for a while has finally been pushed to mozilla-central. That means you can now run database queries off the main thread without blocking the UI. This includes both read and write statements.

This may not seem like a big deal, but there is a big benefit to using this API over the existing synchronous API. SQLite performs a file system operation called fsync which pushes the data in the file system’s cache to the disk. This operation is inherently synchronous, and on some file systems (like ext3), can take a substantial amount of time given the right circumstances. If this is run on the main thread, the UI is locked up the whole time. By using this new asynchronous API, you won’t have to worry about that fsync holding up the main thread at all!

Perhaps the best part about this new API is that it doesn’t require many code changes. You still create SQL statements the same way, but instead of calling execute or executeStep on the prepared statement, you just have to call executeAsync. The method takes one parameter – a callback that notifies on completion, error, and results. The callback is optional on the off chance that consumers don’t care if something finishes successfully or not.

Iterating through results is not much different from before either. The only difference is that results may be chunked, so the callback may get notified about results several times (with only the new data). Some good example code can be found in the tests that landed with this new API.

I’d really like people to try it out and see if they have any issues with the API. There are already a few refinements with bugs filed, and a few more up in my head that we might want if the need arises.

Personal

Something to be missed

Post author By Shawn Wilsher
Post date July 6, 2008

One of the things that I’m really going to miss when I move out west is yard work. Yes, that probably seems a bit odd to most people, but I really enjoy being outside and working. That may very well be the reason that I like officiating so much too. I’m gonna have to find something to do out there to fill that void…