I’m not sure when building a scalable web app became optional. But Feedster, Technorati, Delicious, Google Analytics (and numerous other Google apps of late), BlogPulse and many of the other “big apps” have “suddenly” been hit by scalability issues.

And so has FeedLounge:

As previously discussed in some depth, we’ve run into all sorts of trouble with scalability. When we began this project, I was working on the interaction design and saw a web based feed reader as “just another web application” – I was dead wrong (Scott already had a better idea about the scale we were looking at, but even he was surprised by what we found). The feed refreshing is a significantly higher load on the system than the actual users are – more like a search engine than a webmail system. We quickly found this out, and it fundamentally changed the way we had to approach the project.

Yeah. Here’s their process:

1. Start with a handful of users. This is too much for a shared box.
2. Move to dedicated server.
3. Add a few more users until they’re at 100. This is too much for one box.
4. Add more hardware. It’s obvious this isn’t enough.
5. Recode.

Erm… Hello? Shouldn’t the recoding have happened after step 1? I mean, if you draw a graph of “okay, if we use 10% of a CPU with 10 users, then with 100,000 users we’ll need 1,000 CPUs” … Something’s wrong.
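If you want to see that graph as numbers, here’s a rough sketch of the back-of-the-envelope sum. It assumes load grows linearly with users, which is my simplification (real apps rarely behave that nicely), and the function name and defaults are made up for illustration.

```python
import math
from fractions import Fraction


def cpus_needed(users, measured_users=10, measured_cpu_percent=10):
    """Naive linear extrapolation of CPU count from one measurement.

    Assumes every user costs the same slice of CPU no matter how many
    users there are. That is an assumption, not a measurement.
    """
    # Fraction keeps the arithmetic exact (no floating-point rounding).
    cpu_per_user = Fraction(measured_cpu_percent, 100 * measured_users)
    return math.ceil(users * cpu_per_user)


for users in (10, 100, 10_000, 100_000):
    print(f"{users:>7} users -> {cpus_needed(users)} CPU(s)")
```

Run it before you buy the second box, not after. If the number at your target user count looks silly, the code needs work, not the hardware budget.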

Maybe I’m just spoiled, having worked on high-performance, high-availability apps before, but it constantly astounds me what some folk consider “scalable” and “available” applications. I’ve spent about 10 hours this month working with really, really high-profile Web 2.0-ish companies, nearly yelling at them about their lack of true infrastructure.

I won’t even get into their code.

It’s funny, because you’d think companies would have learned this lesson years ago. I remember back in the middle of the dot-com boom I spent 2 days with Amazon optimizing their front-end code (HTML, JS, CSS, etc.). Over those 2 days, we trimmed about 50 KB of weight from it. Nothing major, just smart optimization. It saved them the need for 20 new servers and a major bandwidth upgrade.

Listen up. If your company relies on the web to stay alive, you’d damn well better be using at least some of the following “ladder to high availability”:

Backups, Redundant, Failover, Cluster, Distributed, Grid and finally Mesh

Each step up is a massive increase in cost, but it’s also a massive increase in uptime. I hate it when companies say they want 99.9% uptime (or, even worse, five 9s of uptime) without thinking about what that’ll cost them.
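To put those percentages in perspective, here’s a small sketch that turns an uptime target into a yearly downtime budget. It assumes a 365-day year and counts every minute of outage against the target, planned or not; the function name is just for illustration.

```python
from fractions import Fraction

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, assuming a 365-day year


def downtime_minutes_per_year(uptime_percent):
    """Minutes of downtime per year allowed by an uptime target.

    Pass the target as a string (e.g. "99.9") so it is parsed exactly
    instead of picking up floating-point error.
    """
    allowed_fraction = (100 - Fraction(uptime_percent)) / 100
    return float(allowed_fraction * MINUTES_PER_YEAR)


for target in ("99", "99.9", "99.99", "99.999"):
    minutes = downtime_minutes_per_year(target)
    print(f"{target}% uptime -> {minutes:.1f} minutes of downtime per year")
```

Three 9s leaves you a bit under nine hours a year. Five 9s leaves you a little over five minutes, which is less time than it takes most people to notice the pager going off.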

If your business depends on your website being up, look at your code, look at your infrastructure and, for your users’ sake, figure out what you actually need and build the damn thing properly!

Thank you ;-)