I’ve just released DOM Inspector 2.0.1. This contains a number of stability and UI fixes. This is available on AMO, but you should automatically receive the update.
Category: Mozilla
Determining a Ts Regression
For those who have been following the tree status of mozilla-central lately, you probably noticed that I tried to land SQLite once again, but it was backed out due to a nasty Ts regression on Linux. When I ran the patch through the try server, it showed no regression, so I thought it was safe (just like the past three or four other times I’ve tried to land this). Luckily, Johnathan, who was the sheriff when I landed, found a Linux box that reproduced the problem. With a lot of his help, we got standalone talos running just Ts so that we could collect strace logs during startup.
Once I had those logs, I needed some way to parse the files so I could use the data in a reasonable way. I wrote a Python script to parse the strace logs and insert the results into a SQLite database file (26.8 MB) so I could run some interesting queries on the data.
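For illustration, here is a minimal sketch of what such a parsing script could look like. It is not the script described above. It assumes the logs were captured with per-call timings (for example strace's -T option, which appends the elapsed time in angle brackets), and the table name calls and its columns are made up for this example.

```python
import re
import sqlite3

# Assumed line shape (strace -T style): name(args) = result ... <seconds>
# An optional pid or timestamp prefix is skipped because search() finds
# the first "name(" on the line.
LINE_RE = re.compile(
    r"(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<result>-?\w+).*<(?P<secs>\d+\.\d+)>\s*$"
)


def parse_line(line):
    """Return (name, args, result, seconds) or None if the line is not a timed call."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return m.group("name"), m.group("args"), m.group("result"), float(m.group("secs"))


def load_strace(lines, conn):
    """Insert every parsable line into a hypothetical 'calls' table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS calls "
        "(name TEXT, args TEXT, result TEXT, seconds REAL)"
    )
    rows = [row for row in map(parse_line, lines) if row is not None]
    conn.executemany("INSERT INTO calls VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

To load a real log, open the file and pass the file object as lines, with a connection from sqlite3.connect("strace.sqlite") (the file name is also just an example). Lines that do not look like a completed, timed system call are skipped rather than guessed at.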
With that data, I decided to generate some graphs to easily see what was going on. All of these graphs combine the data from the six runs of Firefox that talos ran – the data is all summed up. All the graphs have larger versions available if you click on them.
I figured that the most useful graph for investigating this Ts regression would be execution time:
Note that that is six runs of Firefox, which is why it is as long as it is. Next, I looked at the average execution time for each function call:
And finally, I looked at the number of calls of each of these functions:
We are clearly seeing an increase in the number of fsync calls, and we know that on Linux those can be more painful than they are on other operating systems. My next step is to see if we also see this increase on OS X. If we do, I’m going to assume we see it on Windows as well, and get backtraces of every single fsync call to determine why we’ve doubled the number of calls by upgrading.
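The three measurements above (total execution time, average time per call, and number of calls) can all come out of one aggregate query. The sketch below assumes a hypothetical table named calls with name and seconds columns. The actual schema of the database file is not described here, so treat the names as placeholders.

```python
import sqlite3


def summarize(conn):
    """Per-syscall call count, total time, and average time, slowest total first.

    Assumes a hypothetical table calls(name TEXT, seconds REAL).
    """
    return conn.execute(
        "SELECT name, COUNT(*), SUM(seconds), AVG(seconds) "
        "FROM calls "
        "GROUP BY name "
        "ORDER BY SUM(seconds) DESC"
    ).fetchall()


if __name__ == "__main__":
    # Tiny made-up data set, only to show the shape of the result.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE calls (name TEXT, seconds REAL)")
    demo.executemany(
        "INSERT INTO calls VALUES (?, ?)",
        [("fsync", 0.5), ("fsync", 0.25), ("read", 0.125)],
    )
    for name, count, total, average in summarize(demo):
        print(name, count, total, average)
```

Running the same query against a database from before the upgrade and one from after it would show whether the fsync count really doubled, and how much of the Ts difference that accounts for.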
I’ll make a new post as more data comes in.
New way to run tests
Did you know that you no longer have to do some complicated command line incantation to run mochi-style tests? Ted recently landed support for make mochitest-plain, make mochitest-chrome, make mochitest-a11y, and make mochitest-browser-chrome. Today, I pushed support for specifying the --test-path parameter of old. To use it, just add TEST_PATH=your/path/here to the make command line.
For those of you who are used to running browser tests, you no longer have to worry about adding in the ../browser bits to the path either. The makefile handles it all for you. Hurray!
How to solve the fsync problem
This is part four in a continuing series about how we are working around the slow fsync issue in Mozilla. Part one can be found here, part two here, and part three here.
Boy, some of the initial planning to solve this problem has changed. I found several bugs in my triggers, which made my life not so fun. Now if you google the word fsync, my website is the first result returned (scary!). Never fear, however! I have good news. The places team has put together an experimental build that should help greatly. Before I link to the build though, I have a warning:
Do not use your normal profile – create a copy of it to test this build. This build will modify the places database schema, and if you go back to a normal build, you will experience some strange behavior.
With that out of the way, the builds can be found here and will be available for the next seven days.
Feedback is not only welcomed, but wanted. You can track the overall progress of this task in bug 442967.
Issues Issues…
This is part three in a continuing series about how we are working around the slow fsync issue in Mozilla. Part one can be found here, and part two here. You may find the schema diagram of places to be a bit helpful when reading this post.
timeless found an interesting flaw in my previous proposal. If you were to visit a page, delete it, and then visit another page not yet seen before, you would never see the new page in any of your queries. When you flush out all the “deleted” things, it would go away forever (unless the ordering was a bit different than what I expect, but that’s beside the point). This is because of the id selection algorithm used in the insertion triggers.
I ended up talking to dietrich on IRC about this, and we decided that deleting isn’t a common case, so we should feel free to do the write (with the needed fsyncs) at that point. As a result, our views change, as well as a few of our triggers.
moz_places_view now looks like this:
SELECT *
FROM moz_places_temp
UNION ALL
SELECT *
FROM moz_places
WHERE id NOT IN (SELECT id FROM moz_places_temp)
And moz_historyvisits_view now looks like this:
SELECT *
FROM moz_historyvisits_temp
UNION ALL
SELECT *
FROM moz_historyvisits
WHERE id NOT IN (SELECT id FROM moz_historyvisits_temp)
The trigger for deletion on moz_places_view now looks like this:
CREATE TEMPORARY TRIGGER moz_places_view_delete_trigger
INSTEAD OF DELETE
ON moz_places_view
BEGIN
DELETE FROM moz_places_temp
WHERE id = OLD.id;
DELETE FROM moz_places
WHERE id = OLD.id;
END
Lastly, the trigger for deletion on moz_historyvisits_view now looks like this:
CREATE TEMPORARY TRIGGER moz_historyvisits_view_delete_trigger
INSTEAD OF DELETE
ON moz_historyvisits_view
BEGIN
DELETE FROM moz_historyvisits_temp
WHERE id = OLD.id;
DELETE FROM moz_historyvisits
WHERE id = OLD.id;
UPDATE moz_places_view
SET visit_count = visit_count - 1
WHERE moz_places_view.id = OLD.place_id;
END
The net result is two fewer in-memory tables, and slightly less complicated view queries. There isn’t much of a change with the triggers.
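To try the places view and its delete trigger in isolation, the following sketch builds a cut-down version with Python's sqlite3 module. The three-column schema is an assumption made for the example (the real moz_places table has more columns), and moz_places_temp is modeled as an ordinary TEMP table. The view and trigger bodies are the ones shown above.

```python
import sqlite3


def build_places_db():
    """Create a simplified places schema with the view and INSTEAD OF DELETE trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Simplified, assumed columns; the real tables have more.
        CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT, visit_count INTEGER);
        CREATE TEMP TABLE moz_places_temp (id INTEGER PRIMARY KEY, url TEXT, visit_count INTEGER);

        CREATE TEMP VIEW moz_places_view AS
          SELECT * FROM moz_places_temp
          UNION ALL
          SELECT * FROM moz_places
          WHERE id NOT IN (SELECT id FROM moz_places_temp);

        CREATE TEMPORARY TRIGGER moz_places_view_delete_trigger
        INSTEAD OF DELETE
        ON moz_places_view
        BEGIN
          DELETE FROM moz_places_temp WHERE id = OLD.id;
          DELETE FROM moz_places WHERE id = OLD.id;
        END;
        """
    )
    # Place 1 exists in both tables (the temp copy is newer); place 2 only on disk.
    conn.execute("INSERT INTO moz_places VALUES (1, 'http://example.com/', 3)")
    conn.execute("INSERT INTO moz_places_temp VALUES (1, 'http://example.com/', 4)")
    conn.execute("INSERT INTO moz_places VALUES (2, 'http://example.org/', 1)")
    return conn


def view_rows(conn):
    """Return (id, visit_count) pairs as seen through the view."""
    return conn.execute(
        "SELECT id, visit_count FROM moz_places_view ORDER BY id"
    ).fetchall()


def delete_place(conn, place_id):
    """Delete through the view; the trigger removes the row from both tables."""
    conn.execute("DELETE FROM moz_places_view WHERE id = ?", (place_id,))
```

Before the delete, the view returns place 1 only once, with the visit count from the temp table, because the NOT IN clause hides the permanent copy. After delete_place(conn, 1), the row is gone from both underlying tables, so nothing is left to reappear on a later flush.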
Many thanks to timeless for catching that. If you have any other concerns or spot any issues, please let me know with a comment here, or feel free to send me an e-mail.