Skip navigation

Tag Archives: performance

Sometime soon after the beta 8 code freeze, the Places team will be merging the Places branch into mozilla-central. There are a lot of changes we’ve been working on, the most important of which is some major re-architecting how we store data.

The Benefits

The work on the Places branch brings us a number of benefits. In general, we’ve parallelized work, and made it substantially less likely that we’ll block on the GUI thread. Some of the important fixes we have landed are:

  • Faster Location Bar
    The location bar is faster because other database work no longer blocks us from searching, and the queries are much simpler.
  • Asynchronous Bookmark Notifications
    Indicating if the current page is bookmarked in the location bar (with the star) is now an asynchronous operation that does not block the page load.
  • Faster Bookmarks & History Management/Searches
    Simpler queries and other improvements should make this all work faster.
  • Faster Link Coloring
    Link coloring is now executed on a separate database connection so it cannot block other database work.
  • Expiration Work
    Less work at startup, less work at shutdown, and less work when we run expiration.
  • Less Data Stored
    Embedded pages are now tracked only in memory and never hit the disk.
  • Better Battery Management
    Much less work during idle time, which will improve our power consumption behaviors.
  • Fixes 29 blockers and 18 other issues

A bit of History

Way back in the days leading up to Firefox 3.5, we moved from storing all of our history and location data in disk tables to in-memory tables that we’d flush out to disk every two minutes off of the GUI thread. The benefit of this was two-fold:

  1. No longer performing the vast majority of our disk writes on the GUI thread
  2. No longer performing the vast majority of our fsyncs/Flushes on the GUI thread

More details about how we came up with this solution can be found in a series of blog posts.

The Problem

This solution has worked out pretty well for us for a while, but recently, especially on OS X, it has not been. The short story is that our architecture did not scale well due to lock contention between our GUI thread and our background I/O thread. While the common case access case may be fine, the failure case (when we hit lock contention) is pretty terrible. The problem is so terrible that I once described it like this:

the failure case makes us fall on our faces, skid about 100 feet, and then fall off a cliff without a parachute.

Ultimately, the only way we can avoid this situation is to not do any database work on separate threads with the same database connection. It was not an issue in the past because we just did not do enough work on the I/O thread, but as we have added to the workload of that thread, we increase the likelihood of it holding the lock, which means there is a higher probability that the GUI thread will not be able to instantly acquire the lock and do whatever it needs to do. This essentially leaves us with two options:

  1. Move the rest of our database work off of the GUI thread.
  2. Move database work from the I/O thread back to the GUI thread.

The Solution

The second choice is not actually a viable option. Disk I/O completes in a non-deterministic amount of time, which is why we have been moving it from the GUI thread to an I/O thread since Firefox 3.5. The first choice is not entirely viable either due to schedule constraints either (we have tons of API calls that are not used heavily but still synchronous). A hybrid solution exists, however. We can reduce the amount of work we do on the I/O thread by using additional I/O threads. Additionally, we can move the remaining synchronous operations during browsing to an I/O thread. In the end, Places ends up with one read/write thread, and multiple read-only threads.

This wasn’t really an option back in the Firefox 3.5 days because in SQLite readers and writers blocked each other. However, the SQLite developers recently devised a new journaling method called WAL that lets readers not block writers, and writers not block readers. When the Places branch merges into mozilla-central, we will end up with three read-only I/O threads and our original read-write I/O thread. The three read-only threads are used for location bar searches, visited checks (is a given hyperlink visited), and some bookmark operations. Each I/O thread has its own connection to the database, allowing operations to happen in parallel (SQLite is only threadsafe because it serializes all access on each connection object, which is why we had the lock contention in the first place).

Performance Test Issues

One of the things that made this work especially difficult is seemingly random changes in performance numbers. We often had regressions suddenly appear (according to talos) on changesets that would have zero impact on performance, and then backing out the change would cause an additional regression. Other times, when we would merge mozilla-central into Places, we would suddenly get new regressions when comparing to mozilla-central. This could be indicative of a bad interaction with our code and the changes on mozilla-central, however after looking at the changes on mozilla-central that landed with the merge, that appeared to be highly unlikely.

I’m also quite certain that some of our performance tests do not actually test/measure what we actually want to test/measure. I’ll leave that discussion to a future blog posts, however.

Over a week ago, I collected the data I said I was going to look at last time. I finally had a chance to look at the data (startup times with and without add-ons for two profiles on the latest version of 3.6), and my hypothesis was not verified. That means it is back to the drawing board for me. The graphs are not at all interesting, so I am not going to post them. At this point, I think the goal is officially at risk. With the profiles we got, I am not even seeing slow startups with cold startup. It is hard to diagnose a problem you cannot reproduce, sadly.

Next Step

Next week I am going to sync up with limi and get some contact information from the people that sent these profiles to us. We are going to have to do some remote debugging to see why they see such slow startup times.

News on the Past

Paul is feverishly working on a solution to make session restore not kill us on startup. He even has some test builds which you can download and test, but these are experimental. You should make a copy of your profile as a backup when using this test build in case things go boom.

This week, I spent some time looking at some real life profiles that were sent into us by users seeing startup time in the minutes. The tests were ran just like I ran the test on my profile: all add-ons disabled. The results I got are both good and bad, but first the results!

Results

The first shows the raw test run data (which isn’t terribly interesting). The second compares the reported startup time for each test. You will probably want to click to zoom in.

Conclusions

Like I said, the results are both good and bad. Good in that I now have a pretty good idea on why people have bad startup times. Bad in that we don’t have any way to quickly improve the issues that people are seeing. What I see from this data is that profiles in the wild, with add-ons disabled aren’t much slower than a clean profile. This seems to implicate add-ons being at least part of the problem (which we knew) or possibly all of the problem at this point (for the profiles tested). The good news is that the add-ons team is already working on solutions to this, and you should expect some blog posts from them about this soon.

Next Step

Next week I’m going to spend some time getting numbers with these profiles on the latest release of Firefox 3.6 with and without add-ons disabled to compare. This will pretty much confirm or deny my hypothesis of this week’s results.

News on the Past

In my last post, we looked at my profile with various pieces removed to try and figure out why startup might be slow for people. With those results, I identified two issues that would impact startup the most:

  1. Large cookies.sqlite
  2. Many tabs being restored

I also have good news about both of these issues! The cookies.sqlite issue is now fixed and will be a part of beta 4, and Paul has some good data about session restore and tabs (with more to come).

Over the weekend I spent some serious time with my computer running a bunch of tests with standalone talos in 11 different situations. First, a disclaimer: these tests were only designed to give some insight on the areas we should focus on for the goal. Each of these tests was reproduced at least once before I moved onto the next one in order to make sure the numbers were stable.

The Tests

  • Clean profile. This is just the standard profile that we normally run Ts with on tinderbox. This is basically used a baseline for best possible performance.
  • Dirty profile. This is actually my daily profile, with eight tabs that will open through session restore during startup. Because of how talos works, these tabs don’t all have to load for the number to be generated. Even so, you’ll notice a substantial slowdown. Sadly, I fear I modified the profile I was using in a bad way because I can no longer reproduce the numbers I got (but the numbers recorded were reproduced four times before I moved on to the rest of the tests initially).
  • Bookmarks toolbar disabled. This is a variation on the dirty profile test that just disables the bookmarks toolbar.
  • No places. This is a variation on the dirty profile test that removes places files from the profile.
  • No sessionstore.js. This is a variation on the dirty profile test that removes sessionstore.js from the profile. This has the side effect of also not making the eight tabs load at startup.
  • No urlclassifier. This is a variation on the dirty profile test that removes the urlclassifier related files from the profile.
  • No cookies.sqlite. This is a variation on the dirty profile test that removes cookies.sqlite from the profile.
  • No extensions. This is a variation on the dirty profile test that removes all add-on manager bits in the profile.
  • No formhistory.sqlite. This is a variation on the dirty profile test that removes formhistory.sqlite from the profile.
  • No downloads.sqlite. This is a variation on the dirty profile test that removes downloads.sqlite from the profile.
  • No content-prefs.sqlite. This is a variation on the ditry profile test that removes content-prefs.sqlite from the profile.

Results

I’m going to let some graphs do the talking here. The first shows the raw test run data (which isn’t terribly interesting). The second compares the reported startup time for each test. You will probably want to click to zoom in.

Data of the startup time runs

Startup time of the various tests

Conclusions

It looks like the best wins that we can get are related to fixing session restore to not scale linearly with the number of tabs it is restoring, and reduce the startup time costs of loading places files and cookies.sqlite. It should be noted that this test was not measuring the load time for each tab, so something like BarTab would not help in this case. The other good news is that we already have work underway to make cookies.sqlite load time not hurt us so much during startup.

I just uploaded Bugzilla Helper 0.2.0. This improves on the last release by making making the submission of comments an asynchronous operation. It also uses the activity manager in Thunderbird to track the process of the submission, and retry it if an error occurs.

There are still some apparent issues with the REST API that the add-on is using, and I’ll likely include some workaround in upcoming versions. 0.2.0 is available on addons.mozilla.org and is a recommended upgrade. Current users will have to update since sandboxed add-ons do not automatically update.