It's time to debunk the myth that Desktop RAID 0 really isn't 'worth it'. A number of tech sites have done hardware reviews claiming that Desktop RAID 0 really doesn't provide the typical end-user with much in the way of performance benefits - and is more likely to result in data loss. Those 'findings' and opinions have been promulgated around the web so much that they've gained mythical standing - to the point where even smart people seriously believe that RAID-0 isn't worth it on a desktop machine.
Arguments Against RAID-0
Arguments against RAID-0 typically stem from the following:
1) The benefits of RAID-0 just aren't substantial enough for the 'average desktop user'.
2) Using RAID-0 (which requires multiple disks) increases your chance of encountering a failure - and thereby losing all of your data.
The rationale for the arguments against RAID-0, therefore, typically goes something like: "It's easy to lose your data, and doesn't buy you much - so stay away from it."
And plenty of sites/experts tend to echo these sentiments. Here's a table of links (along with my sarcastic summaries of each opinion).
Dev Hardware (RAID 0 is scary - trust us, we read about it on teh intarweb...)
AnandTech (RAID 0 is faster on every single benchmark we tested, but, uhhh, we don't think that means it would actually BE faster in reality - even though we make our living by finding systems with the highest benchmark numbers ... )
Tech News Blog (Some people are too stupid to use RAID-0, so you shouldn't use it. (Actually, this is a fairly level-headed post and only highlights some of the issues with RAID 0 and data protection.))
CodingHorror (See, a smart person who is promulgating the myth... The benefits aren't that compelling??? How about some tests/numbers?)
All the anti RAID-0 folks are WRONG
In this post I'm going to prove all of the anti RAID-0 'nay sayers' wrong - at least for power users. (I couldn't care less about the 'average' desktop user - anyone who just uses their computer for looking at 'pictures', browsing the web, and sending email wouldn't be reading this post ANYHOW.)
RAID-0 Performance - Explained
RAID-0 performance is just plain spiffy - no matter how you slice it. And ironically, a clear and concise description of how it works is found in that AnandTech.com article that I so despise.
The premise behind striping is simple. Data being written to a drive is split into "stripes", generally 16 - 256KB in size, with each stripe being written to a different drive in the array. For example, say we were dealing with a 2-drive RAID-0 array with a stripe size of 128KB and we wanted to write 256KB of data; drive 0 would get the first 128KB of data written to it, and drive 1 would get the remaining 128KB. Here, you can see that the write performance of RAID-0 can be almost double that of a single drive, since twice as much data gets written at the same time. The higher write performance is obtained at the expense of some controller overhead, since the RAID controller has to handle splitting up data into stripes before sending it to the drives themselves - but with modern day microprocessors being as fast as they are, the overhead is usually thought of as negligible.
Reading works the exact same way, but in reverse. Say that we want to read that same 256KB of data back; we pull one stripe from drive 0 and the other stripe from drive 1. The read is now completed in half the time, theoretically doubling performance.
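To make the striping description concrete, here's a rough Python sketch of how a request could be carved up across drives. It isn't how any particular controller does it; the function name `stripe_layout` and its parameters are made up for illustration, and it assumes the simplest possible layout where stripes just rotate across the drives in order.

```python
def stripe_layout(offset, length, stripe_size, num_drives):
    """Split a logical request into (drive, drive_offset, chunk_length) pieces.

    Hypothetical helper: assumes stripes rotate across drives in order,
    which is the plain RAID-0 layout described in the quoted text.
    """
    chunks = []
    end = offset + length
    while offset < end:
        stripe_index = offset // stripe_size   # which stripe of the array, overall
        within = offset % stripe_size          # position inside that stripe
        drive = stripe_index % num_drives      # stripes rotate across the drives
        row = stripe_index // num_drives       # stripes already stored on that drive
        chunk = min(stripe_size - within, end - offset)
        chunks.append((drive, row * stripe_size + within, chunk))
        offset += chunk
    return chunks


KB = 1024
# The example from the quote: 256KB written to a 2-drive array with 128KB stripes.
print(stripe_layout(0, 256 * KB, 128 * KB, 2))
# [(0, 0, 131072), (1, 0, 131072)]
```

Each drive ends up with one 128KB piece, so both can work on the request at the same time. A request smaller than one stripe lands on a single drive, which is why tiny files don't see the same boost.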
Their explanation is spot on. The problem, of course, is their idiotic injection of the term "theoretical" into the equation. RAID-0 performance is not just theoretical, it's tangible: RAID-0 performance is multiples quicker. Them saying that they don't typically work their systems enough to SEE this benefit does NOT render the performance benefits of RAID-0 theoretical.
My own RAID-0 Experience
Me, I was just sure that RAID-0 would net me some benefit. So when I got my new box, I set it up with a RAID-0 configuration. The box was new, and powerful, and obviously EVERYTHING screamed. Then I thought that a possible firmware issue between my RAID controllers (on the Northbridge) might be causing interference with my 'spensive' gaming mouse. So I destroyed my RAID-0 and repaved my system without RAID-0. Things still screamed, everything was still fast. Only... Outlook 2007 would take 20 - 25 seconds to fire up when I first booted my machine (and I've only got a 600MB .pst). What was worse, Outlook 2007 still took like 5 - 12 seconds to restart any time that I closed it (played games, or whatever) and then re-opened it.
So, after finally solving my wireless mouse interference issue (it was a radio-frequency problem - and had nothing to do with my controllers/north-bridge), I decided to re-RAID-0 my box. Once again everything screamed. Only Outlook 2007 was back to opening up in < 10 seconds after a fresh reboot. And it was taking < 3 - 5 seconds to re-open.
That one piece of anecdotal evidence was enough to force me to look into trying out some benchmarks.
My own RAID-0 Benchmarks
Knowing what I know about RAID (that knowledge helps pay the bills around here as a SQL Server consultant), I was pretty sure that RAID-0 would be faster in just about every scenario involving 'large-ish' files. The only question was what constituted a 'large-ish' file, and would the threshold for 'large-ish' be low enough to make RAID-0 worth the added effort of installation/configuration. In other words: Given that I do a LOT of video editing and moving/copying of Virtual Machine files, I figured that the benefits for just those two activities would be worth it. But what about 'piddly stuff'?
So I grabbed a benchmarking tool that writes/reads an arbitrary amount of data (i.e. a specified file size), then just saturates your pipe to see how much throughput it can cram through within a specified amount of time. In my test I used my current box doing tests against my RAID-0 'drive' and against my non-RAID 'backup' drive as well. Both 'drives' are SATA and have very similar specifications (i.e. 7200 RPM, SATA 2 - but my 'backup drive' has a 32MB cache whereas each of the RAID drives only has a 16MB cache).
The results:
- 2 GB Files: My RAID-0 gave me over a 2x performance improvement (I can only suspect saturation/queuing against a single drive where such isn't the case against 2 drives). That said, this is always something that I suspected as moving VMs around on my RAID-0 drives in the past has been bloody fast compared to just a single drive.
- 30 MB files: Another huge performance improvement - still OVER 2x what I was seeing with my non-RAID drive. This too isn't that surprising, with video editing I've 'felt' some big improvements with my RAID configuration, and installs from DVD go tons faster.
As a bit of anecdotal evidence, Carson and I needed to re-install BF 2142 the other night after playing around with a beta version/patch. We both started the install at roughly the same time and I completed the install (roughly 2 GB when it's all said and done) WELL before his install completed - and he's got a 10K Raptor - I've just got 7.2K disks in my RAID-0. That wasn't an empirical test by ANY stretch of the imagination, but none-the-less my RAID-0 provided a tangible boost to throughput.
- 1 MB files: No distinguishable difference.
- 100K files: No distinguishable difference. (Note though, the RAID-0 isn't any SLOWER here... it's just the same speed for all intents and purposes.)
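If you want to try this kind of test yourself, something along these lines would do as a starting point. To be clear, this is NOT the tool I used; it's a minimal Python sketch, and `measure_throughput` and its parameters are hypothetical names. It also assumes you point it at a directory on the drive you want to test, and the read number will usually be inflated because the OS serves a freshly written file from its cache.

```python
import os
import tempfile
import time


def measure_throughput(directory, file_size, block_size=1024 * 1024):
    """Write, then read back, file_size bytes in directory.

    Returns (write_MB_per_sec, read_MB_per_sec). Hypothetical sketch only.
    """
    block = os.urandom(block_size)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            remaining = file_size
            while remaining > 0:
                n = min(block_size, remaining)
                f.write(block[:n])
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # push the data to the device, not just the OS cache
        write_secs = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(block_size):
                pass  # reads are likely served from the OS cache (see note above)
        read_secs = time.perf_counter() - start
    finally:
        os.remove(path)

    megabytes = file_size / (1024 * 1024)
    # Guard against a zero-length interval on very small files.
    return megabytes / max(write_secs, 1e-9), megabytes / max(read_secs, 1e-9)
```

Run it once per drive with the same `file_size` (say 100 KB, 1 MB, 30 MB, and 2 GB) and compare the write figures; repeat each run a few times, since a single pass is noisy.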
So, let's translate those performance 'benefits' into something that everyone (even grandmas) can understand:
- Using Outlook 2007: Not much faster once Outlook is OPEN, but expect it to OPEN noticeably faster - especially if you've got large .pst files lying around. (The big thing: Outlook will NOT be slower.)
- Writing Emails: Nope, not faster. But, again, not slower either.
- Watching silly videos of cats on YouTube.com: Not appreciably faster - but not slower either... (although... if the cat videos are REALLY large, you'll be able to write them to local cache at almost 2x the speed... so even stupid cat videos SHOULD be faster... as long as they're 'large-ish'...)
- Searching your computer for some files or other crap: Not faster - but not slower either (starting to see a trend?)
- Moving around big VPC/VM files and .ISO files: Considerably faster. In fact, almost twice as fast - sometimes even faster than 2x under the right scenarios.
- Video Editing: Depends on the type of editing and what software you're using, but definitely NOT slower, and for more IO-intensive operations it will likely be much faster.
- Games: Not really going to be much faster in terms of playability - but loading levels and maps will be tons faster in most cases. But again: definitely not slower.
- Installing software: This one is a huge win - expect an improvement. Think about it: copy from a CD/DVD to your HD. It's a straight question of throughput for larger files, and you can see big performance improvements. This is especially the case when installing 'big' applications like Visual Studio, SQL Server, and some of today's 'honking big' games.
- Backups and Imaging: If you're using imaging software (Acronis/Ghost/etc) expect some truly obscene performance benefits (I can image/copy 20-30 GBs of crap on my primary partition using Acronis in well under 3 minutes). Backups won't be any slower than with a single disk, and you'll only see appreciable perf benefits if you're writing lots of larger files. But the thing to remember: The only thing you stand to see here is improvement - no negatives.
The Performance Myth debunked
The Anandtech.com article on RAID-0 just kills me. In every perf test that they executed the RAID-0 configuration was either just as fast, or FASTER than a non-RAID-0 counterpart. Sometimes the benefit was big - as in a 20% (or greater) benefit. Normally a 'gaming' site like Anandtech.com would be all OVER those kinds of benefits, especially if they come with no performance negatives. Yet at the end of the article, they fall prey to the same mind-set: "benefits just aren't there, and you'll shoot your eye out kid..."
Granted, the gaming benefits aren't there. So AnandTech.com is probably right in their diagnosis for GAMERS. But if you use your machine for something OTHER than just games, the benefits are seriously tangible. Come on, a 20% improvement for 'business' type operations? That, and I highly suspect that any of those 'business' benchmark applications test 'real' things like heavily pegging your disk by moving around large files. And yeah, I don't do that ALL day long - but when I do move around my files why the hell wouldn't I want that to be faster?
In other words, there's no performance penalty for running with RAID-0. And it CAN make a noticeable difference for things like Outlook and Video Editing, as well as even some minor aspects of game-play. More importantly: RAID-0 is not just theoretically faster, it really IS faster (typically by a factor of as many disks as you have involved). The only caveat is that you have to be moving/reading/writing large files for that benefit to be realized. But when disk is the single-biggest bottleneck on your PC, why wouldn't you go for any option that lets you improve overall speed? Unless, of course, you're worried about RAID-0 myth #2: Data Loss.
You'll shoot your eye out - or why is backup so damned hard?
I really get irked when so called industry luminaries shoot down RAID-0 due to possible issues with drive failure. It's a ridiculous argument.
Because only a NINNY would be put out by drive failure.
If you're a power user, you've got a religiously implemented and trusted backup routine. One that makes nightly local backups, and at least weekly off-site backups. (I back up remotely every night.) A hard-drive failure WOULD be a miserable setback. But if it put you out of business then you don't deserve your clients/job.
I think of it this way: A hard-drive failure with EITHER a RAID-0 or NON-RAID-0 configuration/system would be a pain in the ass. But neither would put me out of business. I'd just re-image (quicker on a RAID-0 system), re-install recent apps and other crud (also quicker on a RAID-0 system), and then do a restore of my user data, files, and all other files/data/crap (also quicker on a RAID-0 system). The point is: I'd have to do all of those steps regardless of my underlying disk configuration.
RAID-0 Disk Failure - De-FUD-ed
I also take huge exception to the whole "you're 3x more likely to encounter disk failure running a RAID-0" argument. The way the argument works is that you're using more moveable parts and components, so therefore you increase your risk of failure.
Obviously there's truth to that argument. But I could also argue that using anything greater than a 4400 RPM drive is a "baaaaaaad idea" because all of that "excess spinning and heat" raises your risk of failure. Hell, with that logic a 10K disk should fail at > 2x the rate of a 4400 RPM drive and therefore anyone using 10K (or HEAVEN FORBID 15K) disks is just WAITING for a disaster.
Yes, HDs do turf-it and die. So too (I guess) do controllers. But if a SATA RAID Controller can barf (the mysterious '3rd component' in the RAID-0 recipe for disaster), then why can't a non-RAID SATA controller barf? (i.e., if a RAID-0 has a so-called '3x' chance for failure, then a non-RAID system has a 2x (drive and controller) chance of failure - the point is that you're not 3 TIMES as likely to encounter a drive failure using a RAID system as you would be with a simple non-RAID system.)
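For anyone who wants to put rough numbers on that, here's a back-of-the-envelope Python sketch. It assumes each disk fails independently with the same probability over some period, which real drives don't strictly do, and the 3% figure is a made-up number purely for illustration. The function name is hypothetical, and controllers are left out entirely.

```python
def array_failure_probability(p_disk, num_disks):
    """Chance that at least one of num_disks disks fails.

    Assumes independent failures with identical probability p_disk;
    a RAID-0 array is lost as soon as any one member disk fails.
    """
    return 1 - (1 - p_disk) ** num_disks


p = 0.03  # hypothetical per-disk failure probability, for illustration only
single = array_failure_probability(p, 1)
two_disk = array_failure_probability(p, 2)
print(f"single disk: {single:.4f}, 2-disk RAID-0: {two_disk:.4f}, "
      f"ratio: {two_disk / single:.2f}x")
```

Under those assumptions a 2-disk stripe comes out at a bit under twice the single-disk risk, not three times it. The risk is real, but it scales with the number of disks you add.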
HD Failure Gedanken
Let's look at this in a different light - one that actually MATTERS and transcends idiotic 2x and 3x failure rate arguments ("the Hulk could TOOO take on Darth Vader!!!111"). Suppose that we have two HDs: Disk A and Disk B. Both are rated to ... 20 years of continual use by their manufacturer. In reality, one of them (let's say Disk B) will turf-it after 18 weeks of real-use and die (let's just say it's a dud sitting on the shelf waiting for some poor unsuspecting consumer). Let's also say that Disk A will last for 12 years of activity with no issues what-so-ever.
Now, in a theoretical scenario a RAID-0 sceptic goes out and buys a new HD for a new system. If he/she gets Disk A, then there's no problem, and the system runs 'forever'.
The skeptical argument is that if I go out and buy two disks for my RAID-0, and get Disks A and B, I'm doomed for a failure in 18 weeks. NO ARGUMENTS there; 18 weeks later, Disk B turfs-it, and my RAID-0 is TOAST. There's no way to get pictures of my cat, expense reports, or any of my software off of my system partitions. If that happened, it would really be a pain in the butt. But I'd RECOVER because I religiously back up data that I care about.
Now suppose that our hypothetical RAID-0 sceptic purchased Disk B instead of Disk A. In that case, they too would encounter a drive failure within 18 weeks. Gone too would be their expense reports, software, and pictures of their cat.
Question: Are they religiously backing up their data?
If so, then the drive crash is a pain in the butt, but they recover from it and move on - just like the RAID-0 user who is backing up their data.
If they're not (which none of these pansy RAID-0 sceptics seem to be able to manage), then they're just plain screwed. The moral of the story, though, is that their state of 'screwed-ness' has absolutely NOTHING to do with whether or not they decided to run a RAID-0 - it's all just a symptom of their own incompetence. In other words: If you're too scared to run RAID-0 because you don't back up your data and you're worried about a greater chance of failure then don't run RAID-0 as you're just too stupid to pull it off. In fact, you might want to consider trading your computer in for an XBox or web appliance...
COMMENTARY: And no, I'm not saying that everyone who uses a computer should be a savant/genius. But I am saying that for GEEKS trying to squeeze perf out of their systems, arranging a dependable backup topology shouldn't be an impediment to your use of TECHNOLOGY. In fact, it should be something you've already addressed - and therefore should NOT be something that discourages you from looking for a perfectly logical perf boost...
In summary
So yeah, the potential for a disk crash is, admittedly, higher for a RAID-0 setup, blah blah blah... But if you're habitually backing up (and testing your backups regularly), then this is really a completely MOOT point. As such, if you CAN set up a system to use RAID-0 you'll ONLY see performance benefits. And some of those benefits could even be substantial depending upon your workloads/habits/needs. The question then is: Why wouldn't you want the performance boost where you can get it? Unless, of course, you've been snowed by the anti RAID-0 Myth...