Troubleshooting a REALLY Slow RAID 5
Can anyone help me troubleshoot my RAID 5 Array? I'm getting write speeds of 5 MB/sec and UNDER. The platform isn't that old, so it doesn't seem like it should be going that slow. I've tried what I know to check, but I've got lots of data on the array right now, so I haven't used some of the benchmarks that wipe data. Everything looks good. So I'm really at a loss.
Some details:
The server runs Ubuntu 10.04 with an AMD Athlon 3000 and 4 GB RAM. I added a PCI expansion card to get 4 extra SATA ports for all the hard drives. I think all the RAID drives are connected to that card, and I'm sure that can, technically, limit speed by quite a bit, but I still can't see it bringing writes down to less than 5 MB/s. Sometimes less than 1 MB/s.
There are two 500 GB drives arranged as "Just a Bunch of Disks" for a total of about 1TB.
Then that JBOD and two other 1 TB drives are set up as the RAID 5. The partition is ext3 and shared via NFS to another desktop. If I navigate there via Nautilus on the other desktop and start copying files, that's where I get the low write speed.
Where should I start looking for problems?
Edit: I should have also noted that the OS is on a separate 20GB IDE drive. Reads and writes on that are appropriate for the drive I'm using. And if I SSH into the server and start creating either large or small files on the array, I see the same speed problems. So it's not a data transfer problem with the network. It's a Gigabit network and I can watch full 1080p content from the server via NFS without any problems.
Re: Troubleshooting a REALLY Slow RAID 5
Sorry for the followup, but I should probably list some of the tests I've done:
- Used dd to write about 2 GB to both the OS disk and the RAID array. OS disk finishes in about 20 seconds, RAID 5 array finishes in like 20 min. Can't remember the exact time, but it was LESS THAN 1 MB/s.
- During those writes and while I copy stuff across the network, top never reports any process above 6% CPU.
- hdparm -tT (I know this isn't the best) reports really high read speeds.
Chunk size is 64K
Filesystem block size is 4096
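For reference, the chunk size and block size above determine the ext3 stride and stripe width that would match this array. This is only a rough sketch: it assumes md1 has three members (md0, sdd, sde), so two of them hold data in each stripe, and that the filesystem sits directly on /dev/md1. A mismatch here usually costs some performance but may not explain writes this slow on its own.

```shell
#!/bin/sh
# Values from the post: 64 KiB md chunk, 4096-byte ext3 blocks.
chunk_kib=64
block_bytes=4096
# Assumption: three-member RAID 5 (md0, sdd, sde), so two members carry data per stripe.
data_disks=2

# stride = filesystem blocks per chunk; stripe width = blocks per full data stripe.
stride=$(( chunk_kib * 1024 / block_bytes ))
stripe_width=$(( stride * data_disks ))
echo "stride=$stride stripe_width=$stripe_width"

# Possible follow-up, NOT run here (needs root; device name is an assumption):
#   tune2fs -E stride=$stride,stripe_width=$stripe_width /dev/md1
```

The current values can be compared against the "RAID stride" line of tune2fs -l, if the filesystem records one.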
I appreciate any help anyone has.
Re: Troubleshooting a REALLY Slow RAID 5
What do you get with hdparm -tT when testing every partition one by one?
Re: Troubleshooting a REALLY Slow RAID 5
Returns the following for each drive.
Code:
/dev/sda:
Timing cached reads: 1328 MB in 2.00 seconds = 663.82 MB/sec
Timing buffered disk reads: 56 MB in 3.07 seconds = 18.25 MB/sec
/dev/sdb:
Timing cached reads: 1380 MB in 2.00 seconds = 689.88 MB/sec
Timing buffered disk reads: 256 MB in 3.01 seconds = 85.12 MB/sec
/dev/sdc:
Timing cached reads: 1356 MB in 2.00 seconds = 678.17 MB/sec
Timing buffered disk reads: 270 MB in 3.02 seconds = 89.53 MB/sec
/dev/sdd:
Timing cached reads: 1372 MB in 2.00 seconds = 686.11 MB/sec
Timing buffered disk reads: 196 MB in 3.01 seconds = 65.22 MB/sec
/dev/sde:
Timing cached reads: 1366 MB in 2.00 seconds = 682.58 MB/sec
Timing buffered disk reads: 230 MB in 3.00 seconds = 76.64 MB/sec
/dev/sda has the OS and is an old IDE laptop drive (slow). The rest are in the array and are SATA.
Doing the same command for the RAID devices returns:
Code:
/dev/md0:
Timing cached reads: 1380 MB in 2.00 seconds = 689.58 MB/sec
Timing buffered disk reads: 196 MB in 3.01 seconds = 65.07 MB/sec
/dev/md1:
Timing cached reads: 1366 MB in 2.00 seconds = 682.90 MB/sec
Timing buffered disk reads: 316 MB in 3.00 seconds = 105.31 MB/sec
/dev/md0 is the JBOD over sdb and sdc, and /dev/md1 is md0 with sdd and sde in the RAID 5 configuration.
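Since md1 is layered on top of md0, it may be worth ruling out a degraded array or a background resync, either of which can drag writes down badly. A minimal sketch that scans /proc/mdstat for those two conditions; the function name check_mdstat is made up for illustration.

```shell
#!/bin/sh
# Scan an mdstat-style file for a missing member or a running rebuild.
# check_mdstat is a hypothetical helper name; the default path is the real kernel file.
check_mdstat() {
    file="${1:-/proc/mdstat}"
    # A "_" inside the [UU_] status field marks a missing or failed member.
    if grep -Eq '\[[U_]*_[U_]*\]' "$file"; then
        echo "WARNING: degraded array"
    fi
    # md prints a progress line such as "recovery = 12.6%" while syncing.
    if grep -Eq '(resync|recovery|reshape|check) *=' "$file"; then
        echo "NOTE: background sync running"
    fi
    return 0
}

# Usage on the server:
#   check_mdstat
#   mdadm --detail /dev/md1    # fuller report, needs root
```

No output from check_mdstat means neither condition was found.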
Re: Troubleshooting a REALLY Slow RAID 5
I just realized I ran the command on the device, not the partition (/dev/sdb rather than /dev/sdb1). Does that make a difference?
Also, I only posted single results for all of them, but there's not much variation, and reading data isn't really a problem in practice. Actual results from copying data FROM the array over the network are in the range reported above. It's WRITING data to the array when things get ridiculously slow. Are there any write tests I can do that DON'T destroy data that's already on the array?
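One write test that leaves existing data alone is to write a scratch file on the mounted filesystem and delete it afterwards, rather than writing to the raw device. A minimal sketch, assuming GNU dd; the function name write_test and the mount point in the usage line are placeholders.

```shell
#!/bin/sh
# Write a scratch file of <mb> MiB into <dir>, report dd's timing line, then clean up.
# write_test is a hypothetical helper; it never touches the raw device.
write_test() {
    dir="$1"
    mb="${2:-1024}"
    f="$dir/ddtest.$$"
    # conv=fdatasync makes dd flush the file to disk before it reports,
    # so the page cache does not inflate the speed figure.
    dd if=/dev/zero of="$f" bs=1M count="$mb" conv=fdatasync 2>&1 | tail -n 1
    rm -f "$f"
}

# Usage (mount point is an assumption):
#   write_test /mnt/array 1024
```

Running it once against the OS disk and once against the array gives two figures that are directly comparable.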
Re: Troubleshooting a REALLY Slow RAID 5
Partitions sit at specific locations on the disk. If one is at the end of the disk, it will be slower, but that is not the point here, so no problem.
Separate speeds look fine. The combined figure for /dev/md1 is a bit of a disappointment. 133 MB/s should be the max for PCI, as you probably know. But not a problem either.
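The 133 MB/s figure comes from the bus width and clock of conventional 32-bit, 33 MHz PCI. A quick sketch of the arithmetic, assuming the expansion card sits in a slot of that type:

```shell
#!/bin/sh
# Conventional PCI: 32-bit (4-byte) bus at 33.33 MHz. Slot type is an assumption.
bus_width_bytes=4
clock_hz=33333333
peak_mb_s=$(( bus_width_bytes * clock_hz / 1000000 ))
echo "PCI theoretical peak: ${peak_mb_s} MB/s, shared by every device on the bus"
```

That ceiling is shared by all four SATA ports on the card, and real throughput is lower than the theoretical peak.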
I don't understand why the write speeds are that low.
Did you verify the actual network connection speed?
Re: Troubleshooting a REALLY Slow RAID 5
Quote: Originally Posted by Joshua
Can anyone help me troubleshoot my RAID 5 Array? I'm getting write speeds of 5 MB/sec and UNDER. The platform isn't that old, so it doesn't seem like it should be going that slow. I've tried what I know to check, but I've got lots of data on the array right now, so I haven't used some of the benchmarks that wipe data. Everything looks good. So I'm really at a loss.
.........
This might be worth trying:
http://ubuntuforums.org/showthread.php?t=1625233
Re: Troubleshooting a REALLY Slow RAID 5
SaturnusDJ:
I was getting ready to reply that I didn't think it was the network, but now I'm getting some strange results from tests that I already did. First, is there a good way to verify the network speed?
The confusing test I did was running
Code:
dd count=1k bs=1M if=/dev/zero of=/foo/test.img
to write data on the OS drive and then on the array. It returned something similar to the following for BOTH:
Code:
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 71.0087 s, 15.1 MB/s
The last time I did that test, it returned about 1 MB/s for the array. Not as fast as I want but still better than it was before. But then I went into Nautilus to transfer some files via NFS from my desktop to the server, and it was back down below 1 MB/s.
dcstar:
I'm not familiar with that patch. Is that supposed to address general speed issues or is that specific to RAID? I don't think I have general issues with the OS since CPU utilization is never above about 5% when I'm reading or writing data.
Re: Troubleshooting a REALLY Slow RAID 5
Use 'ethtool' to see if you really are running Gbit speeds. (Also check this on the other computer.)
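A small sketch of that check: ethtool prints a "Speed:" line for the interface, and the helper below just pulls it out. The function name link_speed and the interface name eth0 are assumptions.

```shell
#!/bin/sh
# Read `ethtool <iface>` output on stdin and print the negotiated link speed.
# link_speed is a hypothetical helper name.
link_speed() {
    awk -F': *' '/^[[:space:]]*Speed:/ { print $2 }'
}

# Usage (interface name is an assumption; may need root):
#   ethtool eth0 | link_speed
```

A gigabit link should report 1000Mb/s; 100Mb/s would point to a cable, switch, or negotiation problem.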
Maybe you can try Samba. I am no expert, and anyway it is hard to see what is wrong here, but if Samba works nicely for you, we can say something is wrong with NFS.
A good write test might be to create a dir and mount it in your computer's memory (a RAM disk such as tmpfs). Put a file into the dir and then write from that dir to your array.
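A rough sketch of that RAM-to-array test. Instead of mounting a new tmpfs (which needs root), it assumes /dev/shm is already RAM-backed, as it is on most Linux systems. The function name ram_copy_test, the RAMDIR override, and the mount point in the usage line are all made up for illustration.

```shell
#!/bin/sh
# Build a file in RAM, copy it to <target>, and report the elapsed time.
# ram_copy_test and RAMDIR are hypothetical names.
ram_copy_test() {
    target="$1"
    mb="${2:-512}"
    src="${RAMDIR:-/dev/shm}/ramtest.$$"
    # Source file lives in RAM, so reading it cannot be the bottleneck.
    dd if=/dev/urandom of="$src" bs=1M count="$mb" 2>/dev/null
    start=$(date +%s)
    # sync flushes the page cache so the timing includes the actual disk writes.
    cp "$src" "$target/ramtest.$$" && sync
    end=$(date +%s)
    echo "copied ${mb} MB in $(( end - start )) s"
    rm -f "$src" "$target/ramtest.$$"
}

# Usage (mount point is an assumption):
#   ram_copy_test /mnt/array 512
```

Keep the size well under free RAM, since the source file occupies memory until it is removed.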
Re: Troubleshooting a REALLY Slow RAID 5
Quote: Originally Posted by Joshua
..........
dcstar:
I'm not familiar with that patch. Is that supposed to address general speed issues or is that specific to RAID? I don't think I have general issues with the OS since CPU utilization is never above about 5% when I'm reading or writing data.
It makes a massive difference to systems that grind to a halt when hit with big disk I/O.