Once upon a time…

It doesn’t seem long ago that I struggled to convince corporate IT that Apache on Linux was not only a viable alternative to IIS, but the superior choice. I’m not a big fan of Windows for web servers, but that’s largely a personal preference. IIS is actually a very capable web server; I’ve seen it stand up to quite a pounding. Of course, you need to patch it regularly, but the bottom line is it works perfectly fine. That is, until you try to configure it.

The biggest weakness in IIS is configurability.

Point and clicky configuration really is one of the stupidest ideas to come out of Redmond, particularly when that configuration can’t be edited or diffed as a simple text file. Have you ever tried replicating IIS settings to a DR machine? (Thank goodness for VMware.)

Configuration: Simple, but not stupid!

Apache really wins out in the configuration and configurability department. Apache is far more than just a web server; it’s the undisputed heavyweight champ of HTTP. If you can’t do something in Apache, either you’re doing something wrong or you can write a module to do it.

The King is here

In short, Apache is brilliant - it has set the benchmark as the practical, reliable utilitarian web server - it is probably responsible for more Linux server deployments than anything else.

In fact, I’d dare to say that Apache is the killer app for Linux.

I’ve deployed Apache in all shapes and sizes, in many configurations, as web servers, as reverse proxies, behind different load balancers and SSL accelerators - always with success. In short, this is why I don’t believe banking and finance will be in a hurry to move on from Apache and embrace the new darling of the RoR crowd, nginx.

The next great HTTPD

For all of the reasons above, I’ve never had cause to look beyond Apache. In fact, I didn’t know there was anything beyond Apache until recently, when I read that YouTube uses Lighttpd to serve over 100 million crap videos, every day!

As anyone who is following this blog will know, I am currently using Django to develop a new web application (and I’m liking it a lot). Django is Python, and to use Python on Apache you are supposed to use mod_python. Now mod_python is an excellent piece of software; it’s very fast - and it exposes Apache’s internal API. So, anything you can do by writing an Apache module in C, you can do in mod_python using Python - and it’s pretty bloody fast.

The drawback is memory; mod_python has a bad rep for memory (although if you read Graham’s comments on this post, you’ll see there are two sides to this story).

Regardless, Apache is a multi-process (or multithreaded) beast - every concurrent request ties up a worker, and so under moderate load you will start to require memory.

Memory Poor Computing

I’m building a web application to operate in a memory-confined VPS, so even with optimisations, mod_python on Apache really isn’t ideal for me. Luckily, Django will also run under FastCGI, and while I haven’t done any serious load testing yet, early indications are that this is reliable.

Continuing along the low memory meme - I have done a lot of testing with Apache over time, and what I have found is that Apache scales very well, so long as memory allows - and then it goes splat. In other words, to make Apache work under greater load, you have two options:

  1. Reduce Apache’s memory footprint (you can actually get it quite small)
  2. Add more memory

Ostensibly, the number of concurrent requests Apache can handle is:

max requests = (total physical mem - RAM used by all other processes) / size of an Apache process
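As a rough worked example, read the estimate as the memory left over after everything else, divided by the size of one Apache process. The sketch below just does that arithmetic; the function name and the figures (96MB for other processes, 20MB per Apache process) are made up for illustration, not measurements.

```python
def max_apache_requests(total_mb, other_mb, per_process_mb):
    """Estimate how many Apache processes fit in the RAM left by everything else."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    free_mb = total_mb - other_mb
    # Round down: a partial process is no use, and a negative result means none fit.
    return max(0, int(free_mb // per_process_mb))


if __name__ == "__main__":
    # Hypothetical figures for a 256MB VPS: (256 - 96) / 20
    print(max_apache_requests(256, 96, 20))
```

With those assumed numbers the ceiling is 8 concurrent requests, which is the built-in limit the estimate is warning about.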

Anyway, with a 256MB VPS - you don’t even want to go there. No matter how much you optimise, you still end up with a built in scalability limit. So, I took a look at Lighttpd and Nginx.

Web Servers - The New Breed

These two web servers come from very different backgrounds, yet are actually rather similar. Both are so-called asynchronous web servers, as opposed to the request-per-process model of Apache.

Lighttpd (pron: Lighty) is a single-threaded web server which was famously developed to solve the C10K problem; that is, to serve 10,000 simultaneous connections.

NginX (pron: Engine-X) originates from Russia; it was written for rambler.ru, which is Russia’s second busiest web site. It runs with a single master process which delegates work to a small number of worker processes; so not quite single-threaded - but light-footed, shall we say.
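To make the asynchronous model a little more concrete, here is a minimal sketch using Python’s asyncio. It is not how NginX or Lighttpd are implemented (both are written in C); it only illustrates the idea of one thread juggling many connections through an event loop instead of dedicating a process to each. The handler, the 0.1 second delay and the client count are all invented for the example.

```python
import asyncio


async def handle(reader, writer):
    # Read the whole request header, then pretend the work takes a while.
    await reader.readuntil(b"\r\n\r\n")
    await asyncio.sleep(0.1)  # while this connection waits, others are served
    writer.write(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def fetch(port):
    # A tiny client: send one request and read until the server closes.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"GET / HTTP/1.0\r\n\r\n")
    await writer.drain()
    reply = await reader.read()
    writer.close()
    await writer.wait_closed()
    return reply


async def demo(clients):
    # Port 0 asks the OS for any free port.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        return await asyncio.gather(*(fetch(port) for _ in range(clients)))


def run_demo(clients=20):
    return asyncio.run(demo(clients))


if __name__ == "__main__":
    replies = run_demo()
    print(len(replies), "replies served from a single thread")
```

Every connection here is held open by the same process, so each extra client costs a small amount of bookkeeping rather than a whole worker process.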

My (limited) experiences with NginX and Lighttpd.

First, I installed Lighttpd - to be fair, I didn’t really give it much of a chance - I read several detailed accounts of memory leaks, enough to require scheduled restarts of the HTTPD. Now, I don’t really want to have to include scheduled restarts in my production servers (reminds me of running IIS2), nor do I want to have to reserve memory to cater for a leaking process, so I immediately uninstalled Lighttpd and instead took a look at NginX.

The Contender from the East

NginX is relatively new to the English-speaking world, and the documentation was only recently translated from Russian to English, although the code is more mature than you might at first think.

I have to say that the NginX English wiki is excellent; the configuration is very similar to Apache - but with a slight bias towards C syntax (i.e. it’s easier to read with curly brackets and indentation). So, if you are familiar with configuring Apache, NginX is easy.

Familiarity breeds comfort

I use Ubuntu, where you can install NginX from Aptitude. Basically, this gives you a root conf file in /etc/nginx/nginx.conf - this contains the server wide settings, and it includes all files in the familiar sites-enabled (/etc/nginx/sites-enabled/) folder where you can add and remove http “server” definitions, or virtual servers in Apache parlance.

For the past few weeks, I’ve been using a single NginX instance with sites-enabled for: a static web server, a reverse proxy (for the Django dev server) and a FastCGI server (for the Django pre-production server) all at the same time, all without a problem.

Whilst I haven’t delved much deeper into the configuration potential, there is already such a wealth of native modules available for NginX that I strongly suspect it will be as functional as Apache for many use cases.

If I find myself experimenting with any of these (I intend to experiment with pre-compiled caches of gzipped content), I’ll be sure to talk about them here in detail.
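For the gzipped content idea, one way to build such a cache is to compress the static files ahead of time and leave a .gz copy beside each original. The script below is only a sketch of that step: the function name and the list of suffixes are my own assumptions, and it also assumes the web server is set up to hand out the .gz sibling when the client accepts gzip (NginX has a gzip_static module intended for this, if it is compiled in).

```python
import gzip
import shutil
from pathlib import Path


def precompress(root, suffixes=(".css", ".js", ".html")):
    """Write a .gz sibling next to each matching file under root; return the new paths."""
    written = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        target = path.with_name(path.name + ".gz")
        # Skip files whose compressed copy is already at least as new as the source.
        if target.exists() and target.stat().st_mtime >= path.stat().st_mtime:
            continue
        with path.open("rb") as src, gzip.open(target, "wb", compresslevel=9) as dst:
            shutil.copyfileobj(src, dst)
        written.append(target)
    return written
```

Running it again only recompresses files that have changed since their .gz copy was written, so it is cheap to call from a deploy script.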

Summary

In short - while NginX remains to be proven to the masses, it clearly is working well for some, myself included. I intend to do some serious load testing on NginX on RHEL when I get a chance; I’m curious to see how it behaves under stress, particularly with the SSL module.

If you’re starting to look beyond Apache, for whatever reason (I can’t think of any other than memory and massive scalability), definitely consider giving NginX a closer look.