Apache HTTP Server Usage Survey Results

This post ran originally on blogs.apache.org/httpd.

Wouldn’t it be nice if you had an idea of how people use the software that you write? I wanted to know how the Apache HTTP Server is being used, and which features users consider important. So, I set up a short online survey of eight questions and sent a link to it to the HTTP Server project user and developer mailing lists. Over the next week and a half, I got 134 responses. Here are the survey results in shiny pie charts with witty interpretation.

Somebody, Turn off That Tap!

I recently attended a keynote address by the CTO of a leading anti-virus firm. His company is fighting the good fight. Having recognized that signature-based malware detection no longer suffices, they have turned to a combination of detection and prevention to find and weed out bad actors. Big Data is crunched in The Cloud to find the malware which is then manually investigated to find out what it does. Once identified, sites serving malware are blacklisted for the benefit of this firm’s customers. The CTO proceeded to show an example of a piece of malware that changed the Windows hosts file to point a list of banking URLs to a single IP address, where one presumes the unsuspecting user would find a rogue copy of their banking website intent on stealing the user’s credentials or worse.

Now, this is only one example of the forty-thousand-odd unique malware infestations spotted in a depressingly short time, but my question is thus: why was a piece of malicious software running (inadvertently one assumes) on behalf of a user allowed to change a system-wide file like hosts? Shouldn’t there be a sandbox for code downloaded from the network that, if it needs to be run at all, prevents it from damaging the underlying operating system?

This situation paints for me the following picture: a tap is running, malware flowing like water into a sieve and onto the floor. The security industry is frantically mopping the floor, trying to stem the flow of malware. They are paid well for their trouble, but meanwhile the expensive rug that represents your business is getting awfully wet. It would be nice if someone could turn off the tap, or design an operating system that doesn’t leak like a sieve.

Goodbye, Quicken

In the early aughts, I purchased a copy of the game Civilization III for my Mac. I have played it ever since, especially after I learned that its copy protection code would mistake a mounted disk image of its CD for the real thing, so I could run it without a CD in the drive (no funny business here: I still have the CD and in fact recently came across it). A great casual game, suitable for mindless clicking, it was my regular diversion on the bus home from work. Another regular commuter once accosted me, saying “You’ve been playing that same game for years! Haven’t you ever thought of getting a different game?” I still occasionally play it even though I have several versions of its successor, Civilization IV, because III is easier on the battery and the improved copy protection in IV doesn’t fall for the disk image trick. Now, its long tenure is coming to an end. Apple is releasing OS X Lion and retiring the PowerPC compatibility layer. Goodbye Civilization III, you will be missed.

However, this post is not about Civilization III. It’s about the only other application I use that requires PowerPC compatibility: Quicken 2007. I have now used it for over ten years to manage my finances, track my investments, schedule and pay my bills, and forecast savings. A couple of weeks ago, Intuit sent out a notice to the effect that Quicken 2007 would not be compatible with Lion, and support for it (such as it was) would end. Customers were advised to migrate to Quicken for Windows (ha!) or Quicken Essentials, their long-awaited ground-up rewrite that does take advantage of current SDKs and runs natively on Intel Macs.

Unfortunately, Quicken Essentials has significant feature gaps compared with the older product. It has no bill pay feature. It also can’t track investments: the web site suggests that you manually enter stock and fund prices, which seems to me a slightly less fun proposition than drying untreated wooden plates and spoons with a tea towel. Finally, Intuit states that “we are evaluating options for Quicken Essentials for Mac”, which to me sounds like “it’s dead, but we won’t tell you yet because we want to get some more revenue out of it” and is not a confidence builder.

Here’s what I would like my next financial management app to do:

  • Run natively on my Mac, without having to run a VM
  • Ingest bank statement data through OFX files from multiple financial institutions
  • Ideally, pull said OFX files directly from the respective financial institutions’ websites (dream, dream)
  • Track inter-account transfers. Ideally, instigate inter-account transfers but I’m not holding my breath
  • Pay bills, with a settable future payment date. Quicken 2007 lost this capability when Wells Fargo dropped support for that version; WF’s web interface is nice, but I now have to enter payments in two different places, which is not ideal
  • Track loans: balance, interest, and impound
  • Break down my paycheck into various taxes and withholdings (a welcome new feature in Quicken 2007)
  • Report on spending by category, tax table, comparison with previous years etc.
  • Track investments, keeping track of security prices, cost basis, dividends, etc. for various investment accounts at multiple financial institutions

As far as I can see, I have the following alternatives:

  • Drop $50 (or, temporarily, $25) on Quicken Essentials, see if I can live with the reduced feature set, and hope they don’t put me in the same position in the near future
  • Switch to GnuCash, an open source finance tracker that seems to have a fairly horrid user interface at first glance, but with the major advantage that no company can unilaterally pull the plug on it
  • Buy and Install Quicken for Windows on a VM and use that. Not a viable option as far as I’m concerned
  • Buy iBank from the Apple App Store for $60 and see what it’s like. It’s getting some good recent reviews from people clearly in the same boat as I am
  • Start using mint.com, which is now also owned by Intuit and has never struck me as the financial management app I need

Dear LazyWeb, what are your experiences with the above? Any alternatives I missed?

Lessons on Rails

Spent a not-very-fun day today playing around with Rails, Cucumber and their friends. I hope I learned something, because otherwise my output of today is decidedly minimal. These are some things I picked up, in the hope that they prove useful to someone else.

EC2 is Not a Web Hosting Company

The entire universe is abuzz and atwitter about the big Amazon EC2 outage this past week. A cascading series of glitches in their Elastic Block Storage (EBS) system took down several high profile websites hosted in their Eastern Region data centers. The AWS Status Dashboard has a considerable write-up on the outage as it progressed over the latter half of last week.

Responses to the outage were mixed. As question-and-answer service Quora posted on their outage page: “we’d point fingers, but we wouldn’t be where we are today without EC2.” This is true: Amazon and its ilk provide relatively affordable and scalable hosting for applications, and relieve the current wave of startups of the burden of having to invest in and operate their own hosting. However, when you host your application on Amazon, you still have a single point of failure unless you very specifically engineer it to be resilient under failures. EC2 offers many features that can take you beyond a single host deployment. Customers who have adapted their deployment to take advantage of these features withstood last week’s outage with little or no customer-visible impact. Without such adaptations, your web application is no better off than if it were hosted on a conventional web hosting platform.

Amazon operates multiple Availability Zones that are supposed to isolate failures… which did not work too well last week because the issues cascaded across availability zones until the entire Eastern Region was affected. Resilience across geographic regions is not straightforward because the CAP Theorem kicks in: Consistency, Availability, Partition Tolerance, pick two. You can’t have all three at the same time. Engineering an application to withstand outage by distributing it across different availability zones, across regions, or even across different providers is a considerable and costly undertaking, which is not lightly embarked upon by a cash-strapped startup trying to get swiftly to market. Whether to spend this time and money, or whether to tolerate and respond to the occasional outage is a determination that every company will have to make for themselves.

ApacheCon Meetup: Whither HTTPD?

ApacheCon North America 2010

You can now suggest Meetup topics for the evenings of ApacheCon. I’m not sure what a Meetup is in this context: perhaps it’s a little like a BOF. Anyway, I went ahead and registered a Meetup with the following topic:

HTTP Server 3.0: Who Needs It? Who Wants It? Who will Write It?
Whither httpd? Does our User Community need a quantum shift that would require a major new version number? Does our Developer Community have this need, and would/could they start major new development on the project? Will 2.x serve us until the end of time?

This topic is partially inspired by the Keynote session Roy Fielding presented in Amsterdam in 2008 on Apache 3.0: two-and-a-half years later seems like a good time to take stock. If you want to talk about this, come to ApacheCon and join the Meetup. Did I mention that rates go up after Friday, October 8?

Playing With Rails

I need to prepare for my upcoming speaking engagement, so I’m playing around with Ruby on Rails today. Excellent opportunity to learn a new web technology. No, the speaking gig has nothing to do with RoR: this is pure procrastination.

Learned a couple of interesting things:

  • When you run gem outdated on a stock Snow Leopard system, it pulls information from an outdated source, which makes gem fail to run the next time. Only successfully updating RubyGems itself solves this issue.
  • Nobody ever tells you that after sudo gem update rubygems-update, you have to run sudo /usr/bin/update_rubygems. Otherwise, it will keep using the old version and a) can’t update sqlite3-ruby which needs the newer RubyGems and b) will try to keep accessing the outdated source.
  • When you want to use Aptana Studio with Eclipse 3.6 (Helios), make sure to install the plugin in the Eclipse installation itself, not under your own user account. This seems to be a bug in Eclipse itself that affects all plugins: if installed under a user account (for instance because the application installation directory is not writable by the user), the plugins don’t show up in the IDE and can’t be used.
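The RubyGems part of the above boils down to the following sequence. This is a sketch of what worked on a stock Snow Leopard system; the path to update_rubygems may differ on other setups.

```shell
# update the RubyGems updater gem, then actually apply the update;
# skipping the second step leaves the old RubyGems version active,
# still pointed at the outdated source
sudo gem update rubygems-update
sudo /usr/bin/update_rubygems

# sanity check: confirm the new version is the one in use
gem --version

# this install needed the newer RubyGems and failed before the update
sudo gem install sqlite3-ruby
```

If gem --version still shows the old version after this, the update_rubygems step was the missing piece, not the gem update itself.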

There is no better way to procrastinate than to go learn something, and there is no better way to put off learning something than to mess around with tools.

Speaking at SofTECH

I will be speaking next Wednesday at the monthly meeting of SofTECH. The topic will be Security and Open Source Software:

Many software choices are available to professionals who need to run applications in their business. Some of these will be delivered by conventional vendors who have full control over the product and its development. However, over the past decade many Open Source applications have emerged as viable alternatives, developed using an open process by volunteers from many different companies.

Speaking from his experience as an Open Source Software developer, Sander will compare some security aspects of Open Source and Closed Source software, likely debunking some myths along the way. We will examine the security vulnerability mitigation process used by the Apache Software Foundation and discuss how an open development process can provide enhanced security.

See the meeting page for details. An RSVP link is at the bottom of the page.

File System Permissions for Apache

I don’t spend a lot of time on the Apache HTTP Server users mailing list, but a discussion sprang up there this week from which I think I should share my response. The issue was why the server in question did not have permission to serve a particular file. The initial response was “just chown your document root to the Apache user” and, when it was pointed out that this introduces security issues, the reply was:

Oh man an experienced sys admin told me to do it that way.
Please tell me what is wrong in this and where is this documented on Apache 
docs.
I want to read.

Here is my response reproduced: read on.

The Apache HTTP Server needs read access to its configuration files and the files it serves. In and of itself, the server does not need write access anywhere on the system: even its log files are opened for writing while the server still runs as root, and the open file descriptors are passed to the child processes, which then change their user id to a less privileged user.

Read access only. The web server user should not own, or be able to write to, its configuration files or content.

Content, other than CGI scripts, generally does not need execute permissions. Even PHP files that are interpreted by the server do not need to be executable.

Certain applications, especially publishing platforms and Content Management Systems that you manage and populate through the web server itself using a browser, require that certain directories on the system be made writable by the web server user. You can do this by changing the owner of the directory to that user (usually www, but ymmv), or by making the directory group-writable and changing its group to the one Apache runs as.
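As a concrete sketch of that layout (all paths here are hypothetical, and on a real system you would chown/chgrp to root and the Apache user or group rather than work in a temp directory):

```shell
# sketch: a document root the web server can read but not write,
# plus one upload directory the CMS is allowed to write to
set -e
docroot=$(mktemp -d)
mkdir -p "$docroot/htdocs" "$docroot/htdocs/uploads"

# content: owner-writable only; the web server just needs read access
# (on a real system, also: chown root "$docroot/htdocs")
chmod 755 "$docroot/htdocs"

# upload area: group-writable; the group would be the one Apache
# runs as (on a real system: chgrp www "$docroot/htdocs/uploads")
chmod 775 "$docroot/htdocs/uploads"

stat -c '%a' "$docroot/htdocs"          # prints 755
stat -c '%a' "$docroot/htdocs/uploads"  # prints 775
```

Everything except the one upload directory stays read-only to the server: the narrowest grant that still lets the CMS do its job.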

Making directories writable by the web server should be done only with care and consideration. The usual threat model is that someone manages to upload (for instance) a PHP script of their own making into the document root, and simply executes that by accessing it through a browser. Now someone is executing code on your machine. Google for ‘r57’ for an example of what such code can do.

If a web app needs writable directories, it’s often better to have those outside the DocumentRoot: that way the uploads can’t be accessed from the outside through a direct URL. Some applications (WordPress for instance) support this, others do not.
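For illustration, a fragment of httpd.conf along those lines (paths hypothetical, 2.2-era syntax):

```apache
# only the DocumentRoot tree is mapped to URLs
DocumentRoot "/var/www/htdocs"

# the web app writes its uploads here; because this directory sits
# outside the DocumentRoot and is not Alias'ed, no request URL can
# reach the uploaded files directly
<Directory "/var/www/app-data">
    Order deny,allow
    Deny from all
</Directory>
```

The Directory block is belt and braces: an unmapped directory can’t be reached via a URL anyway, but denying access explicitly guards against a later Alias mistake exposing it.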

In many cases, writable directories are not strictly necessary even though the web app might like them: rather than upload plugins (which contain code that gets executed or interpreted, yech!) through the web browser, upload them through ssh and manually unpack them on the server. The CMS Joomla! likes to write its configuration file to the Document Root on initial install (where it promptly becomes a popular attack target), but if it can’t write to the Document Root, it will output the config to the browser so the user can manually upload it.

The Apache Documentation will merely tell you to make the server installation root-owned. The HTTP Server Documentation does not cover third party applications like WordPress or Joomla!, so it will not discuss their need to have some directories writable. I hope the above makes the picture a little more complete.