August 2005


I’m working on a project that involves the use of a feature-rich content management system.

While the system has the features needed, each additional feature comes at the price of performance. With a web application, you can have an unknown number of visitors (especially during traffic spikes), and a poorly performing site can result in error messages, visitors going elsewhere, or customer dissatisfaction.

One metric that is useful is pages per second. A benchmark tool requests a set number of pages (e.g. 100 or 1000) and records when each request was sent and how quickly the page came back. The greater the number of pages per second, the more pages can be served and the more visitors can access the site (assuming all else remains constant).
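
To make the metric concrete, here is a minimal Python sketch of what a benchmark tool does at its core: fire off a fixed number of requests with a fixed concurrency and divide by the elapsed time. The names pages_per_second and fake_fetch are made up for illustration, and fake_fetch just sleeps instead of requesting a real page.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def pages_per_second(fetch, total_requests, concurrency):
    """Call fetch() total_requests times using `concurrency` worker threads
    and return completed requests divided by elapsed wall-clock seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() makes us wait until every request has finished.
        list(pool.map(lambda _: fetch(), range(total_requests)))
    elapsed = time.perf_counter() - start
    return total_requests / elapsed


def fake_fetch():
    # Hypothetical stand-in for requesting a page; sleeps for 10 ms.
    time.sleep(0.01)


if __name__ == "__main__":
    rate = pages_per_second(fake_fetch, total_requests=20, concurrency=2)
    print(f"{rate:.1f} pages per second")
```

A real tool also tracks per-request timings (mean, min, max), which this sketch leaves out.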

A side note before we get started: This exercise assumes that you have ample bandwidth to accommodate your visitors. You can create a site that can deliver hundreds or thousands of pages per second, but if you're connected with a modem or a slow DSL line, you can easily saturate the available bandwidth. Perhaps a follow-up article will discuss calculating bandwidth requirements for a site.

To begin, our out-of-the-box site needs to be benchmarked to determine our starting point. For this, the Apache webserver comes with the “ab” command line tool. This is a very simple tool to use:

ab -c 2 -n 100 http://site.tld/

The -c 2 simulates 2 concurrent users (two requests at a time) and the -n 100 tells ab to request http://site.tld/ a total of 100 times.

ab will return a wide variety of information including the mean time per request, max time, minimum time, etc. As we are looking for pages per second, we are interested in the line that says “Requests per second”:

Requests per second: 0.63 [#/sec] (mean)

Yikes! According to this, we are currently not able to achieve even 1 page per second! If we have a very infrequently accessed site, perhaps this is ok, but with any type of load, we will get sunk very quickly. Assuming requests arrive at a steady rate around the clock, we could serve a maximum of 54,432 pages per day. While that may seem like a lot, most requests will occur during the day (about a 10 hour span), so the realistic maximum is only 22,680 pages. And if you get small spikes (a few concurrent users), a page will take several seconds or more to display, which may result in visitors leaving the site.
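
The arithmetic behind those figures is just rate times seconds. A small Python sketch (the helper name max_pages is mine, and it assumes the benchmarked rate could be sustained for the whole window):

```python
def max_pages(requests_per_second, hours):
    """Upper bound on pages served if the rate were sustained for `hours`."""
    return round(requests_per_second * hours * 3600)


print(max_pages(0.63, 24))  # full day at 0.63 requests per second
print(max_pages(0.63, 10))  # 10 hour daytime window
```

The same helper reproduces the 10 hour figures quoted later in this article for the faster configurations.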

Cache the Script!

As we are using the web scripting language PHP for all the programming, one possibility is to use a PHP accelerator such as Zend Optimizer or the open source mmCache. These tools compile the PHP source code and keep the compiled version in a cache. As a result, when the page is requested, the compiled version of the script can be pulled from the cache and run right away instead of having to re-read the file, parse the code, and compile it into bytecode again. The mmCache site makes a strong argument that their product is as fast as or faster than Zend Optimizer and, best of all, free! So the choice was simple.
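
To illustrate the idea only (this is a Python analogy, not how mmCache works internally), a compile cache can be sketched as a dictionary keyed by file path that recompiles a script only when its modification time changes. The name load_compiled is hypothetical.

```python
import os

_code_cache = {}  # path -> (mtime, compiled code object)


def load_compiled(path):
    """Return a compiled code object for `path`, recompiling only when
    the file's modification time differs from the cached one."""
    mtime = os.path.getmtime(path)
    cached = _code_cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]  # cache hit: skip reading, parsing and compiling
    with open(path, encoding="utf-8") as source_file:
        code = compile(source_file.read(), path, "exec")
    _code_cache[path] = (mtime, code)
    return code
```

Each request would then run the cached object with exec(load_compiled(path), namespace) and pay the parse and compile cost only once per file change.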

Installation was simple. I followed the instructions on their site and within a few minutes, was experiencing much faster access times:

Requests per second: 3.51 [#/sec] (mean)

Without too much effort, I was able to achieve more than a 5x performance increase (3.51 / 0.63 is about 5.6)! With this improvement, I can serve a maximum of 126,360 pages in that critical 10 hour time window, with no additional hardware needed.

Cache The Content!

Not too bad, but I think we could do better. While mmCache will store pre-compiled scripts, it doesn’t do anything for our content. Every time someone requests a page, the same database queries and the same code that puts the page together are run again. One option is to start caching the results of these computationally expensive processes. Basically we want to cache elements on the page for as long as they are still valid (the content was not updated in the database, or the visitor should see the same content as a previous visitor for a given element). Fortunately many content management systems have some type of content caching built in, and in our case we can simply enable the caching system. As we are caching content, it is VERY important to understand what is being cached and when that cached element will be used. If this is not understood, it is possible that information that should not be cached (e.g. someone’s shopping cart or, worse, someone’s credit card info) could be cached and presented to another visitor on the site.
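
As a rough illustration of what such a content cache does (a sketch in Python, not the code of any particular content management system), rendered elements can be stored under a key with an expiry time and dropped early when the underlying content changes. The class and method names here are made up.

```python
import time


class FragmentCache:
    """Cache rendered page fragments for a limited time (TTL, in seconds)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, rendered value)

    def get_or_render(self, key, render):
        """Return the cached fragment for `key`, calling render() on a miss."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still valid: skip the expensive work
        value = render()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value

    def invalidate(self, key):
        """Drop an entry, e.g. after the content is edited in the database."""
        self._entries.pop(key, None)
```

The key is where the shopping cart problem shows up: anything that makes the output differ between visitors (such as a user id) has to be part of the key, or the element should not be cached at all.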

As this type of caching eliminates a lot of expensive computational processing, I expected to see a large jump in our performance, and we did get one:

Requests per second: 14.83 [#/sec] (mean)

Now we can serve a maximum of 533,880 pages in that important 10 hour window.

Not too bad of a performance increase for just enabling some caches. :)

Cache the entire page!

For certain parts of the site, it might be possible to cache the entire page, since it contains the same information no matter who is visiting. As long as the entire page is static (i.e. no dynamic elements and no elements that change based on who is visiting the site), it would be possible to use the end user’s browser cache or a reverse proxy cache (e.g. Squid or mod_proxy) to store the rendered pages. This requires our content management system to support cache control headers. A cache control header sends cache management information in the header (the non-visible part) of the page. For example, we might state that a page should be stored in a cache (either the end user’s or the reverse proxy’s) for 6 hours. After the page is rendered the first time, it will then be served from the cache for 6 hours before it is requested from the content management system again.
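
For the 6 hour example, the header in question is the standard HTTP Cache-Control header with a max-age given in seconds. Here is a small Python sketch of building it; the helper name cache_headers is hypothetical, and how you actually attach the header depends on your content management system.

```python
def cache_headers(max_age_hours, shared=True):
    """Build a Cache-Control header for a page that is identical for every visitor.

    shared=True marks the page "public" (browsers and reverse proxies may store it);
    shared=False marks it "private" (only the visitor's own browser may store it).
    """
    max_age_seconds = int(max_age_hours * 3600)
    scope = "public" if shared else "private"
    return {"Cache-Control": f"{scope}, max-age={max_age_seconds}"}


print(cache_headers(6))  # {'Cache-Control': 'public, max-age=21600'}
```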

As this is very similar to serving static pages, performance should be very good, and as you can see, for the pages that can be configured this way, the performance is excellent:

Requests per second: 595.84 [#/sec] (mean)

This is a maximum of 21.4 million pages in that 10 hour window. Great for pages that do not change frequently and do not have dynamic content elements associated with them.

Adding Hardware

At this point in time, for static pages on our site, we can get 595.84 pages per second and for complex dynamic pages we get 14.83 pages per second. This is off a low-end Celeron 2.4 GHz desktop computer. To improve this further, we will need to look at hardware. Based on performance monitoring tools for this particular computer, I found that the processor and, to a lesser extent, the RAM were bottlenecks. As web page servers are highly threaded, they are a great application for a multi-processor computer. As a result, getting a dual 3.0 GHz Pentium IV machine and more RAM should easily get our dynamic pages number to 35-40 pages per second, or up to about 1.4 million pages in that 10 hour window.

Specialize the Hardware: Database Server

As of right now, one server is handling it all: the database, the application, the web server, the files and more. The next step would be to specialize the hardware. With different servers focusing on different aspects of the site, we can customize each server for a particular role and, in theory, optimize it. A good place to start is to push out the database server to its own system. Databases can use as much RAM as you provide them. Most database servers (including MySQL) have advanced memory management to optimize the time a query takes. As a result, it is not all that uncommon to see database servers with 4GB or more memory and a fast disk subsystem. The faster a database can read data, the faster the query gets performed and the faster the response. Of course, executing a poorly optimized query is not a good use of resources, so the correct usage of indexes and evaluation of complex and frequently used queries is essential. It is not uncommon to have several different ways of performing a similar task with a database, with one way being much more efficient than the others.
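
To show what "correct usage of indexes" looks like in practice, here is a small Python sketch. It uses SQLite (bundled with Python) rather than MySQL purely so it is self-contained, and the articles table and index name are invented for the example. The same query goes from scanning the whole table to using the index once one exists.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's EXPLAIN QUERY PLAN detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, author TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO articles (author, title) VALUES (?, ?)",
    [(f"author{i % 50}", f"title {i}") for i in range(1000)],
)

sql = "SELECT title FROM articles WHERE author = ?"
plan_before = query_plan(conn, sql, ("author7",))  # full table scan
conn.execute("CREATE INDEX idx_articles_author ON articles (author)")
plan_after = query_plan(conn, sql, ("author7",))   # lookup via the new index

print(plan_before)
print(plan_after)
```

MySQL has its own EXPLAIN statement for the same purpose, although its output is laid out differently.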

Add more web servers!

With our database server separated from our web server, it becomes possible to add another web server quite easily. Simply configuring a second server identical to our first web server should (in theory) double our performance, assuming the web server is the bottleneck and not the database server. If each web server handles 35-40 pages per second, we can deliver up to about 2.8 million pages in that ten hour window. With additional web servers in our server farm, we need a hardware or software based load balancer to act as the gatekeeper. One big advantage of multiple web servers is that if one went offline, a properly configured load balancer would recognize this and automatically reroute traffic to the other web server, increasing overall availability as a side benefit!
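
The gatekeeper logic can be sketched in a few lines of Python. This is only an illustration of round-robin selection with failover, not a real load balancer; the class name and the server names web1 and web2 are made up, and a real balancer would run its own health checks instead of being told when a server is down.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in turn, skipping any that are marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Look at each server at most once per call.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy web servers available")


balancer = RoundRobinBalancer(["web1", "web2"])
print([balancer.next_server() for _ in range(4)])  # alternates web1, web2
balancer.mark_down("web1")
print([balancer.next_server() for _ in range(2)])  # only web2 while web1 is offline
```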

Conclusion

We went from 0.63 pages per second to a maximum of over 1000 pages per second (with two web servers) for “static” content and ~80 pages per second for dynamic pages.

As you can see, there are LOTS of levels that can be tapped into to optimize a website’s performance. Depending on your application or content management system, other options may be available. By maintaining consistency throughout the server farm (identically configured systems, centralized updating process), additional administrative overhead can be minimized.

According to the Microsoft Internet Explorer weblog, the new version, Internet Explorer 7, will address many of the shortcomings of Internet Explorer 6.0 with relation to web standards — particularly XHTML and CSS support.

This includes fixes for a huge range of CSS positioning bugs, full support for the :hover pseudo-class, alpha-channel PNG files, and CSS 2.1 selector support.

I’m keeping my fingers crossed on this. It seems like a significant amount of time (30-40% or more) in web development is spent adjusting a standards compliant design to work with Internet Explorer (via a huge range of hacks and ugly reworks). Unfortunately, Internet Explorer 7 will not be released for Windows 2000 or Windows 98, which still command a significant market share, so Internet Explorer 6 will be with us for quite some time (if only people would wise up and install Firefox. ;)