this post was submitted on 19 Apr 2024

I built a 5x 16TB RAIDz2, filled it with data, then I discovered the following.

Sequentially reading a single file from the file system gave me around 40MB/s. Reading multiple files in parallel brought the total throughput into the hundreds of megabytes per second, which is where I'd expect it. This is really weird. The 5 disks show 100% utilization during single-file reads. Writes are supremely fast, whether single-threaded or parallel. Reading directly from each disk gives >200MB/s.
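For anyone wanting to reproduce the single-file number, here is a minimal sketch of a single-threaded sequential read timer. The function name `sequential_read_mbps` and the 1 MiB chunk size are my own choices, not something from the post. The test file should be much larger than RAM, otherwise the figure mostly reflects cache rather than the disks.

```python
import time


def sequential_read_mbps(path, chunk_size=1024 * 1024):
    """Read `path` front to back in one thread.

    Returns (bytes_read, throughput in MB/s), with 1 MB = 10**6 bytes.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffer; OS and ZFS caches still apply.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    mbps = total / 1e6 / elapsed if elapsed > 0 else float("inf")
    return total, mbps


# Hypothetical usage (the path is a placeholder, not from the post):
# print(sequential_read_mbps("/tank/some/large/file"))
```

Running several of these at once on different files would approximate the parallel case.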

Splitting the RAIDz2 into two RAIDz1s, or into one RAIDz1 and a mirror, improved reads to 100-and-something MB/s. Better, but still not where it should be.

I have an existing RAIDz1 made of 4x 8TB disks on the same machine. That one reads at 250-350MB/s. I made an equivalent 4x 16TB RAIDz1 from the new drives and that read at about 100MB/s. Much slower.

All of this was done with ashift=12 and default recordsize. The disks' datasheets say their block size is 4096.
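For reference, ashift is the base-2 logarithm of the sector size ZFS assumes for the vdev, so ashift=12 matches the 4096-byte blocks from the datasheet. A tiny sketch of the arithmetic (the helper names are mine):

```python
def sector_size(ashift):
    """Sector size in bytes implied by an ashift value."""
    return 1 << ashift


def ashift_for(block_size):
    """ashift matching a power-of-two block size in bytes."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a positive power of two")
    return block_size.bit_length() - 1


print(sector_size(12), sector_size(13), ashift_for(4096))
```

So ashift=13 makes ZFS treat the disks as if they had 8K sectors, twice what they report.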

I decided to try RAIDz2 with ashift=13 even though the disks really say they've got a 4K physical block size. Lo and behold, the single file reads went to over 150MB/s. 🤔

Following from there, I got full throughput when I increased the recordsize to 1M. This produces full throughput even with ashift=12. My existing 4x 8TB RAIDz1 pools with ashift=12 and recordsize=128K read single files fast.
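To put rough numbers on the recordsize change, here is a back-of-the-envelope sketch. It assumes uncompressed data, a file much larger than one record, and 3 data disks (5-wide RAIDz2), and it ignores parity and padding. The function name is made up for illustration.

```python
def records_and_chunk(file_bytes, recordsize, data_disks):
    """Return (number of records, average data bytes per disk per record)."""
    records = -(-file_bytes // recordsize)  # ceiling division
    per_disk = recordsize / data_disks
    return records, per_disk


one_gib = 1 << 30
for recordsize in (128 * 1024, 1024 * 1024):
    records, per_disk = records_and_chunk(one_gib, recordsize, data_disks=3)
    print(f"recordsize={recordsize // 1024}K: {records} records, "
          f"~{per_disk / 1024:.1f} KiB per disk per record")
```

Under those assumptions, going from 128K to 1M means each disk is asked for roughly 8x larger pieces and there are 8x fewer records to fetch per GiB. Whether that alone explains the speedup, I can't say.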

Here's a diff of the queue dumps of the old and new drives. The left side is a WD 8TB from the existing RAIDz1, the right side is one of the new HC550 16TB drives.

< max_hw_sectors_kb: 1024
---
> max_hw_sectors_kb: 512
20c20
< max_sectors_kb: 1024
---
> max_sectors_kb: 512
25c25
< nr_requests: 2
---
> nr_requests: 60
36c36
< write_cache: write through
---
> write_cache: write back
38c38
< write_zeroes_max_bytes: 0
---
> write_zeroes_max_bytes: 33550336

Could the max_*_sectors_kb being half on the new drives have something to do with it?
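In case someone wants to compare their own drives the same way, here is a small sketch that reads the block queue attributes from sysfs and reports the differences. I'm assuming the dump above came from `/sys/block/<dev>/queue/`. The function names and the device names in the usage comment are placeholders.

```python
import os


def read_queue_settings(queue_dir):
    """Return {attribute: value} for every readable file in queue_dir."""
    settings = {}
    for name in sorted(os.listdir(queue_dir)):
        path = os.path.join(queue_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        try:
            with open(path) as f:
                settings[name] = f.read().strip()
        except (OSError, UnicodeDecodeError):
            continue  # some attributes cannot be read as text
    return settings


def diff_settings(left, right):
    """Return {attribute: (left_value, right_value)} where the two differ."""
    keys = sorted(set(left) | set(right))
    return {k: (left.get(k), right.get(k))
            for k in keys if left.get(k) != right.get(k)}


# Hypothetical usage (sda and sdb are placeholders):
# old = read_queue_settings("/sys/block/sda/queue")
# new = read_queue_settings("/sys/block/sdb/queue")
# for key, (a, b) in diff_settings(old, new).items():
#     print(f"{key}: {a} -> {b}")
```

Note that `max_sectors_kb` can be lowered at runtime but cannot be raised above `max_hw_sectors_kb`, so the new drives can't simply be set to 1024 to test the theory. The old ones could be lowered to 512 instead.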


Can anyone make any sense of any of this?

top 4 comments
[–] [email protected] 2 points 6 months ago

OK, I think it may have to do with the odd number of data drives. If I create a raidz2 with 4 of the 5 disks, even with ashift=12 and recordsize=128K, sequential single-threaded read performance is stellar. What's not clear is why this doesn't affect the 4x 8TB-drive raidz1, or at least not as much.
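A simplified model of how one record's sectors land on the data drives, to make the "odd number" idea concrete. It assumes an uncompressed full-size record and that leftover sectors go to the first data columns. The function name is mine, and this is a sketch of the layout, not ZFS code.

```python
def sectors_per_data_disk(recordsize, ashift, data_disks):
    """Split one record's sectors across the data disks of a RAIDZ vdev."""
    sectors = -(-recordsize // (1 << ashift))  # ceiling division
    base, extra = divmod(sectors, data_disks)
    # The first `extra` disks carry one sector more than the rest.
    return [base + 1 if i < extra else base for i in range(data_disks)]


K = 1024
print(sectors_per_data_disk(128 * K, 12, 3))   # 5-wide raidz2 or 4-wide raidz1
print(sectors_per_data_disk(128 * K, 12, 2))   # 4-wide raidz2
print(sectors_per_data_disk(1024 * K, 12, 3))  # recordsize=1M
```

With 3 data drives a 128K record splits as 11/11/10 sectors, and with 2 it splits evenly as 16/16. The 4x 8TB raidz1 also has 3 data drives and gets the same uneven split, so by this model the split alone doesn't explain the difference between old and new drives.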

[–] [email protected] -2 points 6 months ago (1 children)

Would you use zfs and raid-z when there is only 1 file on your disk?

Would you build 4 ticket counters when your concert hall has only 1 seat? Would you build a 4 lane highway when there is only 1 car in your country?

:-)

[–] [email protected] 2 points 6 months ago (1 children)

Yes, yes I would use ZFS if I had only one file on my disk.

[–] [email protected] 1 points 6 months ago

Ok :-)

Then you probably shouldn't optimize it for the use of many files (which is the default, of course).