Monday, August 3, 2009

Squid docs

I've started writing up some of my notes from Squid consulting into something (mostly) fit for public consumption.

This is partly to aid myself and partly to try and save others from having to find and fix the same mistakes.

The fledgling documentation dump is here. I'll be adding more to it as I type up more notes and complete more work!

Wednesday, July 15, 2009

Installing Proxy Cache Servers for Fun and Profit

One of my current contracts involves setting up a web cache farm for an ISP on the end of a whole lot of full duplex satellite IP. They initially specced out 5 rather large servers (at least $10,000 each); I think they had a minor heart attack when I reduced that to one server. But then, the cost of the hardware (and Xenion's contracting/support rates!) is very minor compared to the bandwidth savings in the long run.

In any case, it has been a resounding success. I'll summarise how things look at the moment; I'll do up a proper press release sometime later next month.

There are about 15,000 users sitting behind the single proxy cache server, with around 100mbit or so aggregate satellite IP bandwidth. The service uses a slightly modified FreeBSD-7 setup to support fully transparent HTTP interception (both client and server-side IP address spoofing) with a Cisco 3750 providing the WCCPv2 interception.

Tuning the FreeBSD stack (and Linux too, for those Linux people out there!) to effectively scale for satellite IP is no easy feat. It took a bit of time but I have quite a bit of experience in this area so the tuning was quite successful. The issue here is finding the right balance between throughput, scaling and link efficiency. A little bit of first year college mathematics helped me predict some decent settings and they work as expected.
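As an illustration of the kind of arithmetic involved (not necessarily the exact calculation used for this install), the usual starting point for long, fat links is the bandwidth-delay product. The sketch below assumes a 600 ms round trip time, which is a typical ballpark for geostationary satellite and not a figure measured on this link; the function names are made up for the example.

```python
# Rough bandwidth-delay product (BDP) arithmetic for a satellite link.
# ASSUMED_RTT is an assumption (typical geostationary round trip),
# not a number taken from this deployment.

def bdp_bytes(bandwidth_bits_per_sec: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep the whole link full."""
    return bandwidth_bits_per_sec * rtt_seconds / 8


def window_limited_throughput(window_bytes: float, rtt_seconds: float) -> float:
    """Best-case throughput (bits/sec) of one TCP connection with a fixed window."""
    return window_bytes * 8 / rtt_seconds


LINK_BPS = 100e6     # ~100mbit aggregate, as quoted in the post
ASSUMED_RTT = 0.6    # seconds; assumption, see above

print(f"BDP for the whole link: {bdp_bytes(LINK_BPS, ASSUMED_RTT) / 1e6:.1f} MB")
print(
    "One connection with a 64KB window: "
    f"{window_limited_throughput(65535, ASSUMED_RTT) / 1e6:.2f} Mbit/s"
)
```

Under those assumptions the link needs about 7.5 megabytes in flight to stay full, while a single connection stuck with a 64KB window tops out at under 1 Mbit/s. Bigger windows help throughput but cost memory on every one of tens of thousands of connections, which is where the balancing act comes from.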

The software is Lusca-HEAD (the very recent version as of this post) - this gives me all the useful Squid-2.7 features, stability and performance with my extras (twiddles for satellite IP stuff, TPROXY support, etc.)

The box itself is a dual dual-core AMD Opteron 270 at 2GHz; 16GB RAM; Intel PRO/1000 NIC; 3ware 9000 series SATA controller with 12 x 500GB 7200rpm disks of some sort. The disks are all mounted individually - no RAID at all. 10 disks are for storage; 1 for OS and 1 for logging.

The box pushes around 80 to 120mbit at peak with a byte hit rate between 20 and 40%. The request rate sits between 300 and 600 requests a second, sometimes peaking to 800 or more. This translates to traffic savings (saving a whole lot of money - satellite transponder space is expensive!) and much improved performance for clients.
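To put rough numbers on the savings: the byte hit rate is the fraction of bytes served from cache instead of being fetched over the satellite link. The sketch below assumes the 80 to 120mbit figure is client-side traffic (the post doesn't say which side it was measured on), and the function names are just for illustration.

```python
# Back-of-envelope traffic savings from a byte hit rate.
# Assumption: the throughput figure is what the cache serves to clients.

def saved_bandwidth(client_side_bps: float, byte_hit_ratio: float) -> float:
    """Bits/sec served from cache, i.e. never fetched over the satellite link."""
    return client_side_bps * byte_hit_ratio


def upstream_bandwidth(client_side_bps: float, byte_hit_ratio: float) -> float:
    """Bits/sec that still have to come over the satellite link."""
    return client_side_bps * (1.0 - byte_hit_ratio)


for hit_ratio in (0.20, 0.30, 0.40):
    saved = saved_bandwidth(100e6, hit_ratio) / 1e6
    upstream = upstream_bandwidth(100e6, hit_ratio) / 1e6
    print(f"byte hit {hit_ratio:.0%}: {saved:.0f} Mbit/s saved, {upstream:.0f} Mbit/s upstream")
```

At 100mbit of client traffic, that works out to somewhere between 20 and 40 Mbit/s of satellite capacity that doesn't have to be bought.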

It also handles between 10,000 and 20,000 concurrent connections with peaks over 40,000. Yes. 40,000 concurrent connections. I'm not making this up.

The cache size at the moment is around 2TB and 20,000,000 objects. I'm absolutely, positively not filling the disks to capacity for a whole lot of very good reasons. (Hint - don't do it.) I'll be happier to increase the storage to 4TB and beyond once I've deployed COSS for the small objects and tidied up some of the memory usage. The Lusca process is around 4 gigabytes at the present time and 75% of that is the storage index and related bits.

Just for interest's sake - out of the 20,000,000 objects, around 300,000 of them are larger than 256 kilobytes. The rest are small objects. It is quite scary actually how much of the cache directory is small objects.
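The cache figures quoted above can be turned into per-object numbers with some quick arithmetic. This is only a sketch: it treats the sizes as decimal units, and the 4TB projection assumes the mean object size and the per-object index overhead both stay the same, which COSS and the planned memory tidy-ups are meant to change.

```python
# Per-object arithmetic from the figures quoted in the post.
CACHE_BYTES = 2e12          # ~2TB of cached data
OBJECTS = 20_000_000        # objects in the cache
LARGE_OBJECTS = 300_000     # objects larger than 256 kilobytes
PROCESS_BYTES = 4e9         # Lusca process size, ~4 gigabytes
INDEX_FRACTION = 0.75       # share of the process used by the index

mean_object_bytes = CACHE_BYTES / OBJECTS
large_fraction = LARGE_OBJECTS / OBJECTS
index_bytes_per_object = PROCESS_BYTES * INDEX_FRACTION / OBJECTS


def projected_index_bytes(cache_bytes: float) -> float:
    """Index memory for a bigger cache, assuming nothing else changes."""
    return cache_bytes / mean_object_bytes * index_bytes_per_object


print(f"mean object size: {mean_object_bytes / 1e3:.0f} KB")
print(f"objects over 256KB: {large_fraction:.1%}")
print(f"index memory per object: {index_bytes_per_object:.0f} bytes")
print(f"index at 4TB: {projected_index_bytes(4e12) / 1e9:.1f} GB")
```

So only about 1.5% of objects are over 256 kilobytes, the index costs roughly 150 bytes of RAM per object, and doubling the storage as-is would push the index alone to around 6 gigabytes.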

I've included some preliminary Windows Update caching which is providing a 100% hit rate for the update files themselves. It's actually quite scary how simple it was to implement. Shame on you Microsoft for -almost- but not quite getting HTTP caching "right" in the Windows updates.

All in all, the client in question is extremely happy about the support, installation and performance of the cache. There's a shortlist of items to do including Lusca improvements and reporting tools so the client can provide further information to his boss about how effective this all is.

Monday, June 15, 2009

Why is this blog suddenly a spam blog?!

So apparently updating all of your labels to be consistent is enough to trigger the spambot logic. I apologize to anyone reading this blog and thinking it's spam - honest, it's not. Really! Honest!

Grrr!

Saturday, June 13, 2009

New replacement hosting service - hosting-5

G'day,

I've just deployed a new hosting server (hosting-5). It's running on the new network setup, running CentOS 5.3 32-bit, and generally seems quite well-behaved.

I'm going to migrate everyone on the old Fedora Core 6 hosting server over to this over the next week and then finally retire it.

EDIT: I've migrated a couple of customer VMs onto it (with their permission, of course!) and it has fixed their stability. Even Ubuntu VMs, which have traditionally been very unstable, are now stable once again. Success!

Thursday, June 4, 2009

Hosting-4 downtime

One of the VM servers, hosting-4, needed a spontaneous reboot this afternoon. It seems the Xen management software got very upset after only 360 days of uptime.



On the downside, it means those who are hosted on hosting-4 will suffer a 10-20 minute downtime. It should have been 5 minutes except that the box and VMs have been up so long that fsck is enforcing file system checks.

For example:

/dev/sda1 has gone 361 days without being checked, check forced.

So to make things run smoother, I'm manually starting each VM once the previous one has finished fscking.

On the upside, the server is now running the latest CentOS 5.3 Xen packages which have fixed a fair few bugs.

I apologise for the downtime. I have to say though, 360 day uptime is pretty good. I'll just have to make sure that further downtime is scheduled in advance.

Monday, June 1, 2009

Outage - interstate and international traffic

There's some issue with my upstream's upstream's interstate link provider. It's affecting interstate and international traffic for my upstream, my upstream's upstream and potentially other providers.

I'll post an update when the problem is resolved.

EDIT: I really should point out that I'll be posting updates on Twitter: http://twitter.com/#search?q=%23xenion

EDIT: The service has been restored at 1:10am. I'll keep an eye on things for a bit longer.

Friday, May 22, 2009

FreeBSD 6.3 and FreeBSD-7 Xen hosting

I've been playing around with the FreeBSD-7.x and FreeBSD-6.x Xen DomU support (thanks to Kip Macy) and documenting all of the strange bits needed to make a fully working environment.

I've managed to figure out all the right incantations to build the DomU, run the DomU and for the most part, keep the DomU up and running.

I may offer FreeBSD DomU support to Xenion customers, with part of the proceeds being donated back to the FreeBSD project.

Let me know if you're at all interested in this!