So, over at the Virtualmin forums I get a lot of questions about which operating system to use. Partly, they’re asking, “What does Virtualmin support?”, but they’re also looking for guidance on what will be most productive and least problematic for the tasks they want to perform with the system. Given that I’m building server management software, and the majority of the systems I deal with are web servers, the following advice pretty much applies exclusively to servers. Based on my past ten or so years of heavy Linux usage on servers, I’m comfortable calling myself an expert on OS selection and deployment, so disregard the advice at your own peril.
The good news is that I’m not going to tell you that any particular Linux distribution sucks for servers. Some of them do, but I won’t spend a lot of time naming names, except where there is a clear and objective measure that clearly makes some distros better than others. I’d much rather cover what I’ve found makes an OS great for server usage and what makes one painful.
Which brings us to the very first and single most important factor in your OS decision.
Support Life Cycle
Servers remain in service far longer than desktop machines. Migrating from one to another is painful and time-consuming (even with tools like Virtualmin providing a really clean migration path, you still have the DNS and temporary loss of service issues to contend with), so you want to do it as rarely as possible. But, you cannot safely run a server without regular security updates. You just can’t. Even if you carefully choose historically secure packages for your services (Postfix or qmail for MTA, OpenSSH for file transfer and shell access, Webmin for web-based administration, Dovecot for POP/IMAP, etc.) and carefully shut down every service you won’t be needing or firewall it down to just the hosts that need to talk to them, some security problem will pop up in the future. If you can’t easily upgrade that service within hours of the vulnerability being discovered, your system will be compromised, and you’ll be forced into a rapid migration to a new system, or trying to find a security consultant who can clean up the problem and paying the very high price that said consultants charge.
So, pick an OS with a long life cycle, and choose the most recent version of that OS. Debian has a very long life cycle. RHEL and CentOS have a very long life cycle. Mandriva and SUSE have a long life cycle (but I believe you have to get the paid versions to get the long-term support). Ubuntu LTS also has a long life cycle, but non-LTS releases do not, so you’d want to avoid any version other than LTS. Fedora, now that Fedora Legacy is gone, has a very short life cycle, and should be avoided for servers. I’m unable to find the life cycle of Gentoo.
I’m not kidding about life cycle. If you pay no attention to anything else in this article, pay attention to this advice. I’ve managed hundreds of servers in my life, and the single biggest pain point is updates and upgrade management and migration when updates and upgrades are no longer feasible. Avoid that pain as much as possible!
Package Management
Packages are great. They take away several major pain points in managing servers. So, choose an OS that offers good package management. The definition of “good” in this context is that it, roughly in order of importance: handles dependencies automatically, allows alternate software repositories so you can manage local packages or make use of third party repositories for specialized software, offers access to a wide array of “official” or at least nicely built third-party packaged software, allows you to make your own packages or customize existing ones, allows unattended operation and operation from scripts.
apt-get+dpkg and yum+RPM and urpmi+RPM are the best package managers available today, meeting all of those qualifications. After spending a couple of weeks porting our application and all of its dependencies to Debian 4.0 and CentOS/RHEL 5, I can say that Debian wins on the package management front, due to the sheer number of packages available in the universe repository. I like yum and RPM better from a technical perspective, and Fedora Extras makes it comparable to Debian on the package management front, but we’ve already disqualified Fedora due to stupidly short life cycle.
Ease of Management
Every Linux distribution has quirks that make it harder to use than it should be. In some cases, better ways are known and well understood, but historical behavior sticks around due to inertia and remaining compatible with existing users expectations. That said, there is value in predictability and good documentation. I find CentOS and RHEL to be the most capable out-of-the-box, and with quite sane management methods. Debian is also strong, but has a few historical stupidities, such as lingering reliance on START=yes crap in configuration files for some services. In a world where chkconfig has been around for years, this is foolishness. This is merely an example, and is a minor nuisance in any single instance, but by the time you’ve had to edit a dozen files just to get your services running (even if no configuration is required), it becomes painful.
SUSE has made a habit of changing major components of the system in every single revision. Package management on SUSE has changed in major ways three times in the past three revisions, and it’s still weaker than yum or apt-get. Configuration files move from one version to the next, and many can only be sanely managed with YaST. In fact, SUSE has a horrible habit of designing the underlying system for the convenience of the YaST developers. A specific example, so folks won’t think I’m just being bitchy because we build a management tool that overlaps with YaST in some areas: SUSE uses a firewall save format different from every other distro (which uses the standard iptables save file format). The SUSE format is less powerful, less flexible, more complicated to edit, and incompatible with all other firewall management tools. There’s no reason for this, except that it’s easier for YaST to modify a few variables in a shell script than to actually parse the iptables save file format and deal with it cleanly.
Webmin and Virtualmin nearly nullify the management problem, by abstracting it out to a nearly identical interface across all Operating Systems. But even with Webmin and Virtualmin, there are still differences that can increase the friction of going from zero to “in service”. I also like to know that when I hit the command line, the configuration files and mechanisms are sane and well-documented.
Documentation
Having good documentation is pretty darned important. And by documentation, I refer to every scrap of information that Google can dig up for you on a particular problem, not just the official documentation for the distribution. In this regard, the more popular an operating system for the tasks you’re performing, the more likely you are to find good documentation on the web.
The Debian official documentation is particularly verbose and particularly useless, and Ubuntu only marginally corrects that. The Debian documentation suffers from the same disease as man and info pages in the GNU system: No examples. Both are quite popular on servers, though, so one can often find good examples via Google.
The Red Hat, CentOS and Fedora documentation is much stronger, despite being less voluminous. The Red Hat based systems, are also the most popular, and so examples and discussions on the web abound. I’ve found them to be the easiest to find solutions for. If you’re new to Linux going with the most popular distro might be wise.
Gentoo has shockingly good official documentation with fantastic examples. The distro itself falls down in numerous other categories for server usage (pretty much all of the above, as far as I can tell), but the documentation is simply awesome. Whatever process they’ve used for their documentation ought to be replicated by the other distributions, because despite the youth of the Gentoo platform, it has far better official documentation than any other distribution. Hell, I need to spend some time studying how the Gentoo documentation project is run, so I can replicate it for Virtualmin and Webmin.
SUSE is far more popular on desktops than servers, and so server-related documentation is slimmer than we’d like. It is also less popular than Debian/Ubuntu and the Red Hat based systems, and so finding stuff on the web is less likely. The official documentation is of good quality, but again, there is simply less topic coverage.
Mandriva also has reasonably good official documentation, though it also is more popular on the desktop, and so much of the official and unofficial documentation is focused on things that we, as server administrators, don’t care about.
Your Use Case
This one is more difficult to give advice on, because it depends on what you’re doing with your system. If you’re a .net developer, you might look to SUSE, because Mono is a core part of that system. If you like Python, you might consider Red Hat or CentOS, because Python is an important part of their system and thus the packages are well-cared for in the official repository. Perl is well-supported by all of the distributions (though Mandriva and Debian have the most official packages for various CPAN modules), and Ruby is equally poorly supported by all of them (though Debian 4.0 seems to have improved remarkably on this front). Debian and Ubuntu are particularly strong on PHP, as they provide official packages for both PHP4 and PHP5, and they can co-exist on the same system (it took me weeks to replicate this capability on the Red Hat based systems).
So, what are you doing with your server? Search the web before making your OS decision to see if others are using one particular OS more often for tasks like the ones you’ll be tackling. This is a variation of the old “it’s the applications, stupid” rule. If all of your necessary applications are being used heavily on a particular OS, like maybe by the developers themselves, you’d be well-advised to pay attention to that.
In our case, we are heavily invested in Perl development, so our options are pretty wide open. We also use Joomla for our new in-development website, so we need reliable PHP support. Another, particularly difficult, piece of our puzzle is that we need a virtualized server to run our demo on, so native support for virtualization would be nice. We currently reside on a Red Hat Enterprise Linux 4 server, but if I were choosing today, I’d probably select Debian 4.0 on the strength of the universe repository. But, that’s just me.
Y’all pick your favorites, and let me know what criteria you use, but try to avoid “I like it because it’s easier to X”, because that intangible quality is almost always a function of what you’re familiar with…I’ve tried to steer clear of familiarity as a basis for selection, since there are far more folks who don’t have familiarity with any particular Linux distribution than there are that do.