Looks Good Works Well: June 2008

Wednesday, June 25, 2008

Velocity Conference '08 Notes

Velocity Conference was great!

Takeaway
I think the greatest takeaway is how seriously the Firefox and IE8 teams are taking client performance. In particular, the work of Steve Souders, the Yahoo! performance team, and others who are writing about Ajax performance is the playlist for what to optimize. As Eric Lawrence of Microsoft said to Steve, "Our mission is to make your book on performance be out of date."

This is a very different picture from a few years back.

I still find it very disappointing that the Safari team is always absent from these discussions. I realize that Safari is kicking butt in performance (I will post examples from our round trip stats to show that) but it really is important that they are approachable by the community. Unfortunately, this seems to be the way of Apple.

Some Great Talks
There were a lot of good talks, but here are the ones that stood out to me (I did miss both mornings).

Building Faster Pages in Firefox and Internet Explorer - This is not the same deck that was presented, and Mozilla's slides are missing; hopefully they will post them.
Even Faster Web Sites - This is the start of Steve's next book on Web Site performance.
High Performance Ajax Applications - Julien's in depth look at tuning the world of DHTML.
Hotmail's Performance Tuning Best Practices - This has some great material. Great lessons learned takeaways. A favorite for me.
IE 8 What's Coming - Takeaway... it's not the Javascript engine that is slowing things down... it is rendering and layout.
Image Optimization: How Many of These 7 Mistakes Are You Making - Crush those PNGs and more.
Lessons Learned in Live Search Moving to and then Away from Ajax - Eric Schurman is a great speaker and full of experience. Great lessons here.

Some Great Announcements/Tools

Keynote Systems Launches KITE - Performance Analysis Tool (Free)
Jiffy: Open Source Performance Measurement and Instrumentation - Scott Ruthfield and the Whitepages team announced Jiffy. See also my Firebug extension.
AOL Page Test - Kind of like yslow. It runs standalone as well as from a URL, and it also generates network graphs, etc. Looks nice.

My talk is also available for download.

Already looking forward to Velocity '09.

Blogged with the Flock Browser

Sunday, June 22, 2008

Velocity Conference: Improving Netflix Performance Presentation

My quick talk on Improving Netflix Performance Experience from June 23 at the Velocity Conference is available here:


Friday, June 20, 2008

Announcing: Jiffy Firebug Extension for Viewing Client Side Performance Data

In the previous article I discussed the full cycle tracing mechanism we built at Netflix.

Firebug Extension: First Attempt
When I was trying to explain the capabilities of this system to others in the organization, I started doodling in Apple Keynote to see if I could create a single diagram that captured all of the timings and what was really happening during a full cycle trace. This is where the diagram I used in the previous article came from. Here is the diagram:

roundtrip-blog-capture-alltimes.png (by billwscott)

This got me thinking. What if I could create a tool that would take the timings from the web page and render them with real time data? The obvious solution was to build this as a Firebug extension. Here is what the tool ended up looking like once it was plugged into Firebug:

The nice thing about this visualization is that it clearly puts the timings in context. You can see the Response overlapping with the Page rendering (which is a good thing). You get a clear picture of how the measurements fit together.

Currently, this is not available outside of Netflix. It needs some code cleanup and an upgrade to FF3/Firebug 1.2, and it is tied to the type of metrics I described previously. If others think it is useful I will try to burn some cycles, make the data source more generic and get it out for public use.

Jiffy Firebug Extension
However, I was showing this to Steve Souders (in relation to the Velocity panel) and he mentioned that one of the other panel members, Scott Ruthfield of Whitepages.com and his team were building an open source metrics and instrumentation tool for Web pages called Jiffy.

In the words of Scott:

Thus we built Jiffy—an end-to-end system for instrumenting your web pages, capturing client-side timings for any event that you determine, and storing and reporting on those timings. You run Jiffy yourself, so you aren’t dependent on the performance characteristics, inflexibility, or costs of third-party hosted services.

Steve suggested adapting what I had done with the current Netflix Firebug extension for use with Jiffy.

I quickly realized that since the focus of Jiffy was on measuring within the client page (as well as providing a way to log measurements), I needed a more generic way to view the data. I decided to model this new measurement panel slightly after the Firebug Net panel.

The result is the Jiffy Firebug Extension.

Today I am making this extension available to the public under a simple Creative Commons License. Simply put, this adds a new panel to Firebug that provides a nice way to view timing measures in either a collapsed or a timeline view. It's also flexible. You can wire it to other libraries besides Jiffy if so desired.

You can learn all about the extension and read more about the Jiffy library at my Jiffy Extension site or go directly to the Jiffy-Web google group.


Measuring User Experience Performance

One of the projects my team at Netflix is busy with is improving the end-user performance experience. Anyone that reads my blog regularly knows that I am a champion for great user experiences. Often the performance angle is not included in discussions of the user experience. But as we all know the best design can be crippled by sluggish execution, waiting a long time for a page to be fully interactive or feedback that does not come in a timely manner.

Timing "from Click to Done" - Full Cycle Experience
User performance experience focuses on the full cycle time: from when the user requests a page or invokes an action until a new page is ready for interaction or the action has completed. Most sites have various ways of tracing parts of the performance puzzle. Often backend instrumentation is inserted to time backend services. But a lot of sites don't have a complete picture of just how much time the user spends from the moment they click until the moment they get the result back (done).

To address this issue at Netflix, one of the first things I initiated after joining Netflix was the creation of full cycle tracers. Kim Trott (on my team) took the idea, ran with it, fleshed it out and turned it into reality.

The idea of tracing the full cycle time is to:

Capture the start time
Capture various points of interest
Capture the end time
Log the results

Capture the Start Time ("from Click")
To get a full cycle time we need to capture the point in time the user makes a request.

At Netflix we use the following stack of technologies: Java, JSP, Struts2, Tiles2, HTML, CSS and Javascript. Within the normal HTTP request/response cycle our requests are handled via Java servlets (eventually routing through to JSP for Web pages).

roundtrip-blog-capture-starttime.png (by billwscott)

Unload Event
The most logical place to measure the start of a request ("from Click") is on the originating page (see A in figure above). The straightforward approach is to add a timing capture to the unload event (or onbeforeunload). More than one technique exists for persisting this measurement, but the most common way is to write the timing information (like URL, user agent, start time, etc.) to a cookie.
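As a rough sketch of the cookie approach (not our production code), the start time could be written and read back like this. The cookie name rt_start, the field layout, and both function names are made up for illustration.

```javascript
// Hypothetical helpers: "rt_start" and the "url|time" layout are assumptions.

// Build the cookie string that carries the click time to the next page.
function buildStartCookie(url, startTime) {
  // encodeURIComponent escapes "|", ";" and "=", so the separators stay unambiguous.
  var value = encodeURIComponent(url) + "|" + startTime;
  return "rt_start=" + value + "; path=/";
}

// Parse a cookie string (such as document.cookie) and pull the start info back out.
function readStartCookie(cookieString) {
  var parts = cookieString.split(/;\s*/);
  for (var i = 0; i < parts.length; i++) {
    var eq = parts[i].indexOf("=");
    if (eq !== -1 && parts[i].substring(0, eq) === "rt_start") {
      var fields = parts[i].substring(eq + 1).split("|");
      return { url: decodeURIComponent(fields[0]), start: Number(fields[1]) };
    }
  }
  return null; // no start time, e.g. the user arrived from another site
}

// Browser wiring; skipped when there is no window object.
if (typeof window !== "undefined") {
  window.onbeforeunload = function () {
    document.cookie = buildStartCookie(location.href, (new Date()).getTime());
  };
}
```

The null return is the case described next: when the previous page was not yours, there is simply nothing to read.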

However, there is a downside to this methodology. If the user navigates to your home page from elsewhere (e.g., from a google search), then there will be no "start time" captured since the unload event never happened on your site. So we need a more consistent "start time".

Start of Servlet Response
We address this by providing an alternate start time. We instrument a time capture at the very earliest point in the servlet that handles the request at the beginning of the response (see B in figure above). This guarantees that we will always have a start time. While it does miss the time it takes to handle the request, it ends up capturing the important part of the round trip time -- from response generation outward.

There are a number of ways to save this information so that it can be passed along through the response cycle to finally be logged. You can write out a server-side cookie. You can generate JSON objects that get embedded in the page. You could even pass along parameters in the URL (though this would not be desirable for a number of reasons). The point is you will need a way to persist the data until it gets out to the generated page for logging.

Note that the absolute time captured here is in server clock time and not client clock time. There is no guarantee these values will be in sync. We will discuss how we handle this later.

Capture Interesting Points Along the Way
After capturing the start point there are a few other standard points of time we would like to capture.

roundtrip-blog-capture-tweentimes.png (by billwscott)

Two points of interest are captured on the servlet side:

When we begin generating the HTML for the page (C)
When we end generating the HTML for the page (E)

Two points of interest are captured on the client side:

When we start rendering the HTML for the page (D)
When we end processing the HTML for the page (F)

C & E deal with page generation. D & F deal with page rendering.

In D, we emit a scriptlet immediately after emitting the <HEAD> tag that saves (new Date()).getTime() into a Javascript object on the page.

In F, we emit a scriptlet immediately after emitting the </BODY> tag that saves (new Date()).getTime() into a Javascript object on the page.

When the page starts rendering the HEAD it will execute D and store the time stamp. When the BODY finishes rendering, it will execute F and store the time stamp.
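To make D and F concrete, here is a minimal sketch of what the two emitted statements might look like. The holder object name pageTimes is an assumption for illustration; the comments indicate where each statement would sit in the generated page.

```javascript
// Hypothetical holder object; the name "pageTimes" is illustrative only.
var pageTimes = {};

// D: this statement would sit in an inline script right after the <HEAD> tag.
pageTimes.D = (new Date()).getTime();

// ... the rest of the page is emitted and parsed here ...

// F: this statement would sit in an inline script right after the </BODY> tag.
pageTimes.F = (new Date()).getTime();
```

Both values are client clock times, so their difference is meaningful even when the client and server clocks disagree.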

Capture End Time ("to Done")
Once F is finished, however, this does not mean that the page is completely rendered. In parallel, the browser has been loading CSS and Javascript, requesting images and rendering the page. But the page is (typically) not ready for the user to interact with until the onload event has finished for the page (see G in diagram below).

roundtrip-blog-capture-endtime.png (by billwscott)

We attempt to insert our instrumentation as the very last item to be handled in the onload event for the page.
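One simple way to get "last in onload" is to wrap whatever handler is already registered and run the timing capture after it. This is only a sketch under that assumption: the helper name appendOnload and the pageTimes object are hypothetical, and a handler registered later by other code would still run after ours.

```javascript
// Hypothetical helper: chain fn so it runs after any existing onload handler.
function appendOnload(target, fn) {
  var previous = target.onload;
  target.onload = function () {
    if (typeof previous === "function") {
      previous.apply(this, arguments); // run the page's own handler first
    }
    fn.apply(this, arguments);         // then capture our end time
  };
}

// Browser wiring; skipped when there is no window object.
if (typeof window !== "undefined") {
  appendOnload(window, function () {
    window.pageTimes = window.pageTimes || {};
    window.pageTimes.G = (new Date()).getTime(); // G: "to Done"
  });
}
```

Calling appendOnload as late in the page as possible gives the best chance that G really is the last thing captured.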

Logging the Captured Times
Here is a diagram that shows the various time points we capture (8) and the measures we derive (5).

The five measurements are:

Elapsed client request time (G-A). Full cycle time. The total time it took to request the page, receive the response and render the page. Because A is captured on the referring page, this measure is not available when the referring page is not a Netflix page.
Elapsed server response (E-B). How much time was spent generating the server response.
Elapsed client render (G-D). How much time it took the client to render the page (and get the server response). Since we stream our response to the client, there's no way to differentiate how much of the client rendering time was spent getting the response from the server vs. rendering the page. The browser will start to render the page as it receives the content from the server.
Elapsed server plus client (C-B) + (G-D). Total time to render; full time to generate the response and render it client-side. This is the key measure we use as it includes the most reliable start time (the one we can capture all the time). It is the summation of the client render time (G-D) and the server time up to the start of page generation (C-B). By taking the deltas on the client and the server separately we can safely add their values together to get a total time (even though the two sets of time stamps come from different clocks, server and client). The caveat is there can be some gap between the time C emits the HEAD tag with the corresponding Javascript time stamp and the time D starts rendering. In practice we have found this time to be sufficiently small on the whole, so we have kept to this simple approach.
Elapsed client render after body (G-F). This gives us an idea of how much additional time is needed to get the page fully loaded after rendering the body.
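The arithmetic for the five measures is simple enough to sketch. The function and property names below are made up for illustration; A, D, F and G are assumed to be client clock times and B, C and E server clock times, so only differences taken on the same clock are ever computed.

```javascript
// Hypothetical helper: derive the five measures from the captured points.
function deriveMeasures(t) {
  // Return end - start, or null when either point was not captured.
  function diff(end, start) {
    return (typeof end === "number" && typeof start === "number") ? end - start : null;
  }
  var serverLead = diff(t.C, t.B);   // server clock only
  var clientRender = diff(t.G, t.D); // client clock only
  return {
    clientRequest: diff(t.G, t.A),   // full cycle; null without a start cookie
    serverResponse: diff(t.E, t.B),
    clientRender: clientRender,
    serverPlusClient: (serverLead === null || clientRender === null)
      ? null
      : serverLead + clientRender,   // safe to add: each delta uses one clock
    renderAfterBody: diff(t.G, t.F)
  };
}

// Example with made-up numbers; note the server and client clocks are far apart.
var sample = deriveMeasures({
  A: 1000, D: 1200, F: 1500, G: 1900, // client clock
  B: 5000, C: 5020, E: 5100           // server clock
});
// sample.serverPlusClient is (5020 - 5000) + (1900 - 1200) = 720
```

Dropping A from the input leaves clientRequest as null while the other four measures are still computed, which matches the case of a visitor arriving from another site.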

At this point we can gather up the five values (persisted so far via server-side and client-side cookies or some other methodology) and write these out to a server log. This could be done via an Ajax call, using beacons, an HTTP request with a NO-CONTENT response or some other mechanism. The back end service caches up results and writes them to a database for later analysis.
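As one example of the beacon option, the measures can be packed into the query string of an image request. This is a sketch only: the endpoint path /perf/log, the parameter names, and the function names are assumptions, not our actual logging interface.

```javascript
// Hypothetical helper: build a beacon URL from a map of measures.
function buildBeaconUrl(base, measures) {
  var pairs = [];
  for (var key in measures) {
    // Skip inherited properties and measures that could not be captured.
    if (Object.prototype.hasOwnProperty.call(measures, key) && measures[key] !== null) {
      pairs.push(encodeURIComponent(key) + "=" + encodeURIComponent(measures[key]));
    }
  }
  return base + "?" + pairs.join("&");
}

// Fire the beacon by requesting it as an image; the response body is ignored.
function fireBeacon(url) {
  if (typeof Image !== "undefined") {
    new Image().src = url;
  }
}

// "/perf/log" is a made-up endpoint used only for this example.
var beaconUrl = buildBeaconUrl("/perf/log", {
  serverPlusClient: 720,
  clientRequest: null,
  renderAfterBody: 400
});
fireBeacon(beaconUrl);
```

An image request needs no response handling on the client, which is why a NO-CONTENT response from the server is enough.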

The Benefit of Measuring
We could have chosen to implement all the yslow performance suggestions without measuring anything. Most likely they would all be wins for us. But by putting a full user experience performance tracing mechanism in place we can be confident that enhancements, features, or even attempts at performance improvement did not negatively affect site performance.

Recently we fielded a different variation of our star ratings widget. While it cut the number of HTTP requests in half for large Queue pages (a good thing) it actually degraded performance. Having real time performance data let us narrow in on the culprit. This feedback loop is an excellent learning tool for performance. With our significant customer base and large number of daily page hits we can get a really reliable read on the performance our users are experiencing. As a side note, the median is the best way to summarize our measurements as it nicely takes care of the outliers (think of the widely varying bandwidths and different browser performance profiles that can all affect measurements).
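To see why the median holds up against outliers, here is a small sketch with made-up timings. The function names and the sample numbers are for illustration only.

```javascript
// Median of a list of numbers; returns null for an empty list.
function median(values) {
  if (values.length === 0) {
    return null;
  }
  // Copy before sorting so the caller's array is left untouched.
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  var mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Arithmetic mean, shown only for comparison.
function mean(values) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) {
    sum += values[i];
  }
  return sum / values.length;
}

// Made-up page times in milliseconds; one user is on a very slow connection.
var timings = [700, 720, 690, 15000, 710];
var typical = median(timings); // 710
var skewed = mean(timings);    // 3564
```

A single slow sample drags the mean to several times the typical value, while the median stays with the bulk of the users.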

In future articles

Discuss what has worked and not worked.
Discuss an internal firebug extension I wrote to visualize this data for our developers.
Compare Safari vs Firefox vs IE -- the results are really interesting.
Announce a new Firebug Extension that helps visualize performance page data.

Hear More About this at the Velocity Conference
As I mentioned in my previous article I will be giving a 15 minute talk on Improving Netflix Performance at the Velocity Conference. I will cover some of this material as well as other information in that talk.

Credit
Thanks to my excellent UI Engineering team at Netflix, in particular the hard work of Kim Trott. Thanks also to the yslow team for giving us some concrete ideas on improving site performance.


Velocity Conference 2008

My friend Steve Souders (Mr. yslow and author of High Performance Web Sites) is chairing the upcoming Velocity Conference in Burlingame, CA. It's happening on Mon-Tues, June 23-24.

I will be on the Measuring Performance panel with Steve Souders, Ernest Mueller (National Instruments), Ryan Breen (Gomez) and Scott Ruthfield (Whitepages). We'll be talking about what it means to measure performance for Web pages, why it is important, how to do it, various methods of measuring and the unique challenges around measuring Web 2.0 applications.

I will also be doing a quick talk in the afternoon on Improving Netflix Performance. The talk is only 15 minutes but I will be sharing our experience with implementing yslow, building a round trip tracer tool and some other cool stats & tools that we will be sharing. Look for a followup blog article on some of this information.

Hope to see you there.
