Tuesday, August 14, 2007

My new disk has just eaten planet earth!

Here are some things that trouble me
-- Disks are really no more reliable than they were 3-5 years ago
-- Manufacturers are chasing capacity, with 3x or 4x growth over the same period of time
-- The failure rate per MB stored has grown at the same rate

I conclude that my data is being stored less reliably than it was 3-5 years ago. That really worries me. Sure, RAID-4, 5 and 6 will help. But my data is much more vulnerable whilst the RAID group is being rebuilt, since the time it takes to rebuild a RAID group with 1TB disks is proportionally longer than with 73GB disks. I'm now in a much wider window to suffer a double or triple disk failure, because the disks are no more reliable. Yikes! Oh, and a double disk failure? Well, that's not the worst of my problems. Read errors whilst rebuilding a RAID group. Yup, they happen all the time.
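To put rough numbers on that rebuild window, here is a back-of-the-envelope sketch in Python. The 50 MB/s sustained rebuild rate, the one-unrecoverable-read-error-per-1e14-bits figure and the 8-disk group (7 survivors) are all my assumptions for illustration, not measurements from any particular array. Plug in your own values.

```python
import math

def rebuild_hours(disk_gb, rebuild_mb_per_s=50.0):
    """Hours to rewrite one whole disk at a sustained rate.
    The default rate is an assumed figure, not a measurement."""
    return disk_gb * 1000.0 / rebuild_mb_per_s / 3600.0

def p_read_error(disk_gb, surviving_disks, bit_error_rate=1e-14):
    """Chance of at least one unrecoverable read error while reading every
    surviving disk end to end, assuming independent errors per bit.
    The default bit error rate is an assumed, spec-sheet-style figure."""
    bits_read = disk_gb * 1e9 * 8 * surviving_disks
    # 1 - (1 - rate)^bits, computed in a numerically stable way
    return -math.expm1(bits_read * math.log1p(-bit_error_rate))

# Compare a 73GB disk with a 1TB disk in a hypothetical 8-disk group
for size_gb in (73, 1000):
    print(size_gb, "GB:",
          round(rebuild_hours(size_gb), 1), "hours,",
          round(p_read_error(size_gb, 7), 3), "chance of a read error")
```

Whatever rate you assume, the rebuild time scales directly with disk size, so the 1TB disk sits in the danger window roughly 14 times longer than the 73GB disk, and the chance of hitting a read error along the way climbs with it.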

Given the price of disks, I am seriously considering RAID-1 everywhere. Why? Well, it helps scale my I/O (see my other posts on supporting very large data warehouses), and it gets me away from RAID group reconstruction. I simply need to resilver the disk; I don't need to recompute parity, which kills my performance to the unaffected data.

So before you flame me, just think about why we had the other RAID schemes. It was to increase the reliability of unreliable disks (Redundant Array of Inexpensive Disks). RAID-4 and 5 saved capacity because they stored parity, not a physical copy of the data. But I am faced with an excess of capacity, because I need to size my system for I/O, not capacity. In the old days you sized it for capacity, and as a function of that you got an excess (well, normally at least) of I/O. What used to require a whole shelf of disks can now be stored on a single disk. Great for capacity and density, horrible for I/O: I now have 1/12th or 1/14th of the I/O, because the I/O a single disk can deliver has essentially remained unchanged since the introduction of 15k RPM disks.
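The sizing argument can be sketched in a few lines of Python. The workload (10TB, 5,000 random IOPS) and the 180 IOPS per 15k RPM disk are hypothetical numbers I picked for illustration, and the sketch assumes per-disk I/O stays flat as capacity grows.

```python
import math

def disks_needed(capacity_tb, iops, disk_tb, iops_per_disk):
    """Return (disks for capacity, disks for I/O, disks to buy).
    You have to buy whichever count is larger."""
    for_capacity = math.ceil(capacity_tb / disk_tb)
    for_io = math.ceil(iops / iops_per_disk)
    return for_capacity, for_io, max(for_capacity, for_io)

# Hypothetical workload: 10TB of data, 5,000 random IOPS,
# and an assumed 180 IOPS per disk regardless of disk size.
print(disks_needed(10, 5000, 0.073, 180))  # 73GB disks
print(disks_needed(10, 5000, 1.0, 180))    # 1TB disks
```

With the small disks, capacity decides how many you buy and the I/O comes along for free. With the big disks, I/O decides, and most of the capacity sits idle. Under these assumptions a full mirror of the data would still need fewer big disks than the I/O already demands, which is why RAID-1 starts to look attractive.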

RAID-1. Here we go.

Oh dear, you are out of capacity sir

I have to run very large (and I mean close to 250TB) Data Warehouses. There are a huge number of challenges with this sort of environment: getting the data loaded, keeping the performance scalable and stable, backing up and restoring... it's a much longer list!

So I have used Oracle, Sybase IQ, DB2 and MySQL. I was recently introduced to Greenplum (I don't work for them), who provide an appliance (oh, how trendy) that combines a Sun "Thumper" (X4500 I think), which has 4 x AMD processors and 48 x 500GB disks, with a clustered version of PostgreSQL. I was intrigued. But then it made me think. I have 24TB of storage in 4U of rack space. That is really dense, but I also have a large number of disks. Typically I am disk bound, so more densely packed disks make a great deal of sense. I do end up wasting space (i.e. capacity) since I don't always need the storage, BUT I do need the I/O. The push for greater capacity disks is not really helping me, so I really wish the disk manufacturers would stop that race. It's just like the camera manufacturers putting more mega-pixels into your point-and-shoot. Above 6MP, who cares, right? Over 500GB on a disk, who cares, if I ever want to do a mixed random read and write workload (i.e. a database workload).

So why am I mad? All they are giving me is a higher rate of failure per MB of stored data, since the disks are no more reliable than they were, like, in 1999.

Here are some things that trouble me
-- Disks are really no more reliable than they were 3-5 years ago
-- The capacity has grown dramatically
-- The failure rate per MB stored is the same
-- Therefore the disks have a higher failure rate per MB stored

So Mr. Disk manufacturer, please learn that I need I/O and I need reliability. I don't need any more stinking MB per platter.

Wednesday, August 8, 2007

Crying over spilt beer

So, some blogs, and I stress the "some" part here, have turned into a "my last employer sucked because..." diatribe. I guess people need to vent, but I really don't care, if I am honest. There have been several recently on NetApp versus EMC. Well, I have some interest here. I need to store things reliably, performance does matter to my application users, and yes, cost is always an issue. Sure, even if a supplier gives it to me at no cost (that will be the day, huh?) there is still the cost of managing the thing.

I guess the thing that gets my goat here is that all these blogs read (and I read them all) like "your technology sucks because of X, Y or sometimes even Z". I guess that is the difference between somebody who cares about the business and somebody who cares about the technology. The technology guy will say his technology is best because of some great feature that his or her competitor does not have. Well, I don't give a monkey's adam, to be perfectly clear. What I really care about is the business value. I don't care that EMC's BCVs are technically superior to NetApp FlexClones. Sure, the business value is that I can clone my application in an instant. If I need to make a change to it, for example to change its SAP ID or Oracle Database SID, then I change a block. Well, for BCVs, which are basically a copy-on-write, darn it, I have a complete physical copy after I make the first byte of change. It's cool that NetApp FlexClones only store the change, so if I had used that technology then I would have saved storage, right?

Wrong. I never (read: never) store my copies of Dev and Test on the same piece of storage as production. That would be dumb. Not just for performance, but do you think I would let developers see a real copy of actual data? Our current financial data? Your HR records? You are kidding, right? So, no matter what, I have to make a physical copy, simply to ensure the operational integrity of my system and the privacy of the data. Now, NetApp does provide me business value in making the secondary copies basically cost free. The technology is cool, but that is real business value to me.

Do I expect that both can make a consistent point-in-time copy? Sure. Both have consistency group features that allow that.

So, vendors, don't blind me with science. Tell me something that I care about. Tell me how you are different. Don't slag off your competitors. If you talk about them *that* much, then sure, I'm going to get a trial unit. And sure, I'm going to love it too. Remember to speak about the business value and how it solves my problems, not the technology. It's your money and quarterly figures to throw away.

The cult of the 12" PowerBook

So I sit here on my aging but much loved 12" PowerBook after a coast-to-coast trip in the USA. I still smile like the Cheshire cat when I see people with their jumbo 15" or 17" machines trying to open up their laptops in economy class. The 12" PowerBook, now discontinued, is probably the best form factor I have ever used if you spend your life in 9B (substitute your own favorite seat). I was happy to see that on my last trip I counted four other people who still hang on grimly to their beloved machines, waiting for that day when Apple thinks small is beautiful again. We can only hope.

So what would I like? Well
- Similar form factor, thinner would be nice
- Proper keyboard, i.e. a MacPro not a MacBook keyboard
- Faster processors, my 1.33GHz is not coping well these days
- 2GB Memory
- Better battery life (hell this is a wish-list)
- 2 x USB 2.0
- 1 x Firewire
- iSight built in
- DVD burner
- Line in
- Headphone

Things I could live without
- Modem
- Firewire 800
- If I really had to I could lose the DVD burner; if I have USB then I can plug in an external burner

It's still great to see so many of these machines in use, but Leopard is coming and Apache + MySQL + Eclipse just about brings my machine to its knees.

Coming up next time: "Stop crying over spilt beer"