Options for performance improvements on very big Filesystems and high IOWAIT

I have an Ubuntu 16.04 backup server with 8x10TB HDDs on a SATA 3.0 backplane. The 8 hard disks are assembled into a RAID6 with an EXT4 filesystem on top. This filesystem stores a huge number of small files, with very many SEEK operations but low IO throughput. In fact there are many small files from different servers which get snapshotted via rsnapshot every day (multiple INODES direct to the same files). Performance has been very poor since the filesystem (60TB net) exceeded 50% usage. At the moment, the usage is at 75% and a

du -sch /backup-root/

takes several days(!). The machine has 8 Cores and 16G of RAM. The RAM is totally utilized by the OS Filesystem Cache, 7 of 8 cores always idle because of IOWAIT.

Filesystem volume name:   <none>
Last mounted on:          /
Filesystem UUID:          5af205b0-d622-41dd-990e-b4d660c12bd9
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              912203776
Block count:              14595257856
Reserved block count:     0
Free blocks:              4916228709
Free inodes:              793935052
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         2048
Inode blocks per group:   128
RAID stride:              128
RAID stripe width:        768
Flex block group size:    16
Filesystem created:       Wed May 31 21:47:22 2017
Last mount time:          Sat Apr 14 18:48:25 2018
Last write time:          Sat Apr 14 18:48:18 2018
Mount count:              9
Maximum mount count:      -1
Last checked:             Wed May 31 21:47:22 2017
Check interval:           0 (<none>)
Lifetime writes:          152 TB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
First orphan inode:       513933330
Default directory hash:   half_md4
Directory Hash Seed:      5e822939-cb86-40b2-85bf-bf5844f82922
Journal backup:           inode blocks
Journal features:         journal_incompat_revoke journal_64bit
Journal size:             128M
Journal length:           32768
Journal sequence:         0x00c0b9d5
Journal start:            30179

I'm lacking experience with this kind of filesystem usage. What options do I have to tune this? What filesystem would perform better in this scenario? Are there any options to use RAM for caching other than the OS built-in one?

How do you handle very large amounts of small files on large RAID assemblies?

Thanks,
Sebastian

asked Jan 18 at 22:09

t2m

513

2

Faster disks, preferably SSD. As much RAM as possible for read caching. 16GiB isn't even on the same planet as enough RAM. Get LOTS of it, even 512GiB or more. And of course don't use RAID 6.

– Michael Hampton♦
Jan 18 at 22:15

Thanks for your reply. I'm aware of the SSD option, but this makes the difference between a 7000$ Server or a 70000$ Server for backing up data. The RAM hint is a good one, but I fear that I will only get a virgin-like filesystem performance if I totally avoid DISK IO for SEEK operations which means at 60TB net. capacity a 60TB RAM cache, doesn't it? I avoided other Filesystems than EXT2/3/4 in the past, but now I am totally open for options in this direction, if they will help. :)

– t2m
Jan 18 at 22:41

What's your recommendation for a RAID6 replacement at this disk configuration?

– t2m
Jan 18 at 22:50

1

"In fact there are many small files from different servers which get snappshotted via rsnapshot every day (multiple INODES direct to the same files." - I think you mean multiple links/names to the same inodes. When hard-linking a file, there's only one inode, but two (or more) links/names.

– marcelm
Jan 19 at 11:52

1

Dude, if that is a 7000 USD server then STOP GETTING RIPPED OFF. And adding 1000 USD of PCIe SSD to the server will not magically make it a 70k SSD server.

– TomTom
Jan 19 at 15:41

ubuntu-16.04 ext4 performance-tuning


4 Answers

I have a similar (albeit smaller) setup, with 12x 2TB disks in a RAID6 array, used for the very same purpose (rsnapshot backup server).

First, it is perfectly normal for du -hs to take so much time on such a large and heavily used filesystem. Moreover, du accounts for hardlinks, which causes considerable and bursty CPU load in addition to the obvious IO load.

Your slowness is due to the filesystem metadata being located in very distant (in LBA terms) blocks, causing many seeks. As a normal 7.2K RPM disk provides about 100 IOPS, you can see how hours, if not days, are needed to load all metadata.
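The estimate above can be sketched as back-of-the-envelope shell arithmetic. This is only a rough illustration: it takes the inode counts from the tune2fs output in the question and assumes, pessimistically, one seek per used inode at the ~100 IOPS of a single 7.2K RPM disk. Real runs will differ, since neighbouring inodes share blocks and the array has eight spindles.

```shell
#!/bin/sh
# Rough worst-case estimate of a full metadata walk.
# Assumption (not from the original post): one random seek per used inode.
inode_count=912203776      # "Inode count" from the question
free_inodes=793935052      # "Free inodes" from the question
iops=100                   # assumed random IOPS of one 7.2K RPM disk

used_inodes=$((inode_count - free_inodes))
seconds=$((used_inodes / iops))
days=$((seconds / 86400))

echo "used inodes: $used_inodes"
echo "estimated walk time: ${seconds}s (about ${days} days)"
```

Even if the real figure is several times better than this worst case, it is consistent with a du run measured in days rather than minutes.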

Something you can try to (non-destructively) ameliorate the situation:

make sure mlocate/slocate is not indexing your /backup-root/ (you can use the prunefs facility to avoid that), or metadata cache thrashing will severely impair your backup time;

for the same reason, avoid running du on /backup-root/. If needed, run du only on the specific subfolder of interest;

lower vfs_cache_pressure from the default value (100) to a more conservative one (10 or 20). This will instruct the kernel to prefer metadata caching, rather than data caching; this should, in turn, speed up the rsnapshot/rsync discovery phase;

you can try adding a writethrough metadata caching device, for example via lvmcache or bcache. This metadata device should obviously be an SSD;

increase your available RAM.

as you are using ext4, be aware of inode allocation issues (read here for an example). This is not directly correlated to performance, but it is an important factor when having so many files on an ext-based filesystem.
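As a minimal sketch of the vfs_cache_pressure and updatedb suggestions above: the script below only reads the current value and writes a sysctl fragment to a scratch location, so it is safe to run unprivileged. The fragment file name is hypothetical, and the value 10 is just the lower of the two values suggested above.

```shell
#!/bin/sh
# Show the current value (readable without root on Linux).
if [ -r /proc/sys/vm/vfs_cache_pressure ]; then
    echo "current vfs_cache_pressure: $(cat /proc/sys/vm/vfs_cache_pressure)"
fi

# Write a sysctl fragment to a scratch path; copying it into
# /etc/sysctl.d/ is left to the administrator.
fragment="${TMPDIR:-/tmp}/99-backup-cache.conf"   # hypothetical file name
pressure=10                                       # suggested above: 10 or 20
printf 'vm.vfs_cache_pressure = %s\n' "$pressure" > "$fragment"
cat "$fragment"

# To apply immediately (needs root):
#   sysctl -w vm.vfs_cache_pressure=10
# To keep updatedb away from the backup tree, add the path to the
# PRUNEPATHS line in /etc/updatedb.conf, e.g.:
#   PRUNEPATHS="... /backup-root"
```

Setting the value with sysctl -w takes effect at once but is lost on reboot, which makes it convenient for trying a value before persisting it.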

Other things you can try - but these are destructive operations:

use XFS with both the ftype and finobt options set (mkfs.xfs -n ftype=1 -m finobt=1);

use ZFS on Linux (ZoL) with compressed ARC and primarycache=metadata setting (and, maybe, an L2ARC for read-only cache).
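For orientation, the two destructive options might translate into commands along these lines. This is a dry-run sketch: it only prints the commands and formats nothing. The device, pool and SSD names are hypothetical placeholders, and the commands would need adapting to the actual layout.

```shell
#!/bin/sh
# Dry run: builds and prints the commands only; nothing is formatted.
dev=/dev/md0          # hypothetical: the array device
pool=tank             # hypothetical: ZFS pool name
ssd=/dev/nvme0n1      # hypothetical: SSD used as L2ARC

# XFS with file-type-in-dirent and the free inode btree enabled.
xfs_cmd="mkfs.xfs -n ftype=1 -m finobt=1 $dev"
# ZFS: keep only metadata in ARC, then add an L2ARC device.
zfs_cmd="zfs set primarycache=metadata $pool"
l2arc_cmd="zpool add $pool cache $ssd"

printf '%s\n' "$xfs_cmd" "$zfs_cmd" "$l2arc_cmd"
```

Both paths mean recreating the filesystem and restoring the backups, so they are only realistic when there is somewhere to stage the existing data.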

edited Jan 28 at 17:27

answered Jan 18 at 23:05

shodanshok

25.9k34287

Thank you very much for this reply. As you might have expected, I've got something to read now. The vfs_cache_pressure option is very interesting. I've played around with the caches for some minutes now and I think the system became a bit more responsive (directory listings, autocomplete, etc.). I'll check the other points as well and give feedback. Thanks again.

– t2m
Jan 18 at 23:50

"primarycache=metadata setting (and, maybe, an L2ARC for read-only cache)." ZFS can't do both, I had a write up on its most prominent down sides: medium.com/p/zfs-is-raid5-of-2010s-eefaeeea2396

– poige
Jan 19 at 15:17

@poige due to the low RAM amount, I was speaking about metadata caching in L2ARC (in addition to what is already cached in ARC). After all, data caching should not make any big difference for an rsnapshot backup server.

– shodanshok
Jan 19 at 16:26

1

I clarified that the only thing in L2ARC would be metadata no matter what then. :) As to RAM amount, 16 GB is no RAM at all for that HDD overall volume. Reasonable minimum would be around 128 GB, hence if it's upgrading anyways, you're no longer limited to 16 GB

– poige
Jan 19 at 16:42

@marcelm you are right: I confused -h with a completely different thing (-H for rsync...). I updated my answer.

– shodanshok
Jan 19 at 21:09


This Filesystem stores a huge amount of small files with very many SEEK operations but low IO throughput.

🎉

This is the thing that catches lots of people nowadays. Alas, conventional FSes do not scale well here. I can probably give you just a few pieces of advice for the set-up you already have, EXT4 over RAID-6 on HDDs:

Lower vm.vfs_cache_pressure, say to 1. It changes the caching bias towards preserving more metadata (inode, dentry) instead of the data itself, and it should have a positive effect by reducing the number of seeks

Add more RAM. Although it might look strange for a server that doesn't run any piggy apps, remember: the only way to reduce seeks is to keep more metadata in faster storage. Given that you have only 16 GB, it should be relatively easy to increase the RAM amount

As I've said, EXT4 isn't a good choice for your use case, but you can still use some of the features it offers to soothe the pain:
- an external journal is supported, so you can try adding an SSD (better mirrored) and placing the journal there. Check out "ext4: external journal caveats"
- Try switching the journal mode to journal all data by mounting with data=journal

Try moving files outside of a single FS scope. E.g., if you have LVM-2 here you can create volumes of a smaller size and use one for the time being; when it gets full, create another one, and so on.
- If you don't have LVM-2 you can try doing that with /dev/loop, but it's not that convenient and probably less performant
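The external-journal suggestion above could look roughly like the following sequence. It is a dry-run sketch that only prints the commands; the device names and the mount point are hypothetical, the filesystem has to be cleanly unmounted first, and the "external journal caveats" mentioned above should be read before trying it for real.

```shell
#!/bin/sh
# Dry run: builds and prints the commands only; nothing is modified.
fs_dev=/dev/md0          # hypothetical: RAID-6 array holding the ext4 filesystem
journal_dev=/dev/md1     # hypothetical: mirrored SSD pair for the journal
mnt=/backup-root         # assumed mount point, taken from the du example

step1="umount $mnt"
# The journal device block size must match the filesystem (4096 here).
step2="mke2fs -O journal_dev -b 4096 $journal_dev"
# Drop the internal journal, then attach the external one.
step3="tune2fs -O ^has_journal $fs_dev"
step4="tune2fs -J device=$journal_dev $fs_dev"
step5="mount -o data=journal $fs_dev $mnt"

printf '%s\n' "$step1" "$step2" "$step3" "$step4" "$step5"
```

With the journal on a separate device, losing that device puts the filesystem at risk, which is why mirroring the SSDs is suggested.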

UPD.: since it's turned out to be Linux Software RAID (LSR) RAID-6, here goes additional item:

LSR has own tuning options that many people seem to overlook
- Stripe cache, which can be set to its maximum like this: echo 32768 | sudo tee /sys/devices/virtual/block/md*/md/stripe_cache_size. But do this with care (use a lower value if needed), since the cache takes roughly page size × number of member disks × stripe_cache_size bytes of RAM
- External journal, which can also live on those mirrored SSDs (but currently an MD device created w/o a journal can't be converted to use one).
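To get a feel for the RAM cost of the stripe cache before raising it, the kernel's md documentation gives the memory consumed as page size × number of disks × stripe_cache_size. A small sketch of that arithmetic for the array in the question, assuming a 4 KiB page size:

```shell
#!/bin/sh
# Memory used by the md stripe cache, per the formula in the kernel md docs:
#   page_size * nr_disks * stripe_cache_size
page_size=4096            # assumed; check with: getconf PAGESIZE
nr_disks=8                # member disks in the array from the question
stripe_cache_size=32768   # the maximum value suggested above

bytes=$((page_size * nr_disks * stripe_cache_size))
mib=$((bytes / 1024 / 1024))
echo "stripe cache would use about ${mib} MiB of RAM"
```

On a 16 GB machine that is a noticeable slice of memory competing with the metadata cache, so an intermediate value may be the better trade-off.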

That's probably most of what can be improved w/o a from-scratch re-design.

I have a very poor performance since the file system (60TB net) exceeded 50% usage. At the moment, the usage is at 75%

That's a very serious issue, because such a high disk space occupancy level only worsens fragmentation, and more fragmentation means more seeks. Wonder no longer why it gave more-or-less acceptable performance before reaching 50%. Lots of manuals clearly recommend not letting FSes grow beyond 75-80%.

edited Jan 19 at 15:41

answered Jan 19 at 4:24

poige

7,03411437

You're clearly hinting that ext4 on raid-6 is not the way you'd go. Would you mind outlining the setup you would recommend?

– marcelm
Jan 19 at 11:58

2

That's too complex task even to outline it, actually. For some cases it would be ok to choose conventional FS even if one has lots of files, for other (cases) it's no way in the beginning. You can take a look at a good intro on why CEPH abandoned POSIX FS at all and switched to DB. BTW, when they used FS they preferred XFS. I'd probably do same. As to RAID-6, it's major IOPS multiplier — for every write it has to update parity on 2 other devices. So, probably some kind of RAID-x0 approach. With on-fly compression support it might have sense to use even RAID-10. Of course there're ways …

– poige
Jan 19 at 12:43

1

… to speed up it further with SSD cacheing (bcache, dm-cache, ZFS's in-house ZIL+L2ARC) but practice might have some of its own constraints effectively disabling ways-around. So this is why I've said "too complex". One needs to know requirements and resources that would be available to achieve the goal.

– poige
Jan 19 at 12:47

1

I understand it's asking too much to come up with a complete solution, but even the braindump you put in the comments above can be a good starting point for further research to anyone facing similar problems; thanks :)

– marcelm
Jan 19 at 18:07


RAID6 does not help you much in this case; something like ZFS might enable much faster metadata and directory access while keeping speeds about the same.

answered Jan 19 at 3:15

John Keates

63149


RAID-6 stripes drives, therefore all IO goes to all drives. That's pretty inefficient with many small files. However this probably isn't your main problem which is...

Ext4 isn't well suited for big filesystems with millions of files. Use XFS. I have XFS filesystems running as big as 1.2 PB and with as many as 1 billion files, no problem. Simply use XFS.

answered Jan 24 at 21:24

wazoox

4,81132249

add a comment |

Your Answer

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "2"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fserverfault.com%2fquestions%2f949808%2foptions-for-performance-improvements-on-very-big-filesystems-and-high-iowait%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

4 Answers
4

active

oldest

votes

4 Answers
4

active

oldest

votes

I have a similar (albeit smaller) setup, with 12x 2TB disks in a RAID6 array, used for the very same purpose (rsnapshot backup server).

Something you can try to (non-destructively) ameliorate the situation:

be sure to not having mlocate/slocate indexing your /backup-root/ (you can use the prunefs facility to avoid that), or metadata cache trashing will severly impair your backup time;

for the same reason, avoid running du on /backup-root/. If needed, run du only on the specific subfolder interested;

lower vfs_cache_pressure from the default value (100) to a more conservative one (10 or 20). This will instruct the kernel to prefer metadata caching, rather than data caching; this should, in turn, speed up the rsnapshot/rsync discovery phase;

you can try adding a writethrough metadata caching device, for example via lvmcache or bcache. This metadata device should obviously be an SSD;

increase your available RAM.

as you are using ext4, be aware of inode allocation issues (read here for an example). This is not directly correlated to performance, but it is an important factor when having so many files on an ext-based filesystem.

Other things you can try - but these are destructive operations:

use XFS with both -ftype and -finobt option set;

use ZFS on Linux (ZoL) with compressed ARC and primarycache=metadata setting (and, maybe, an L2ARC for read-only cache).

edited Jan 28 at 17:27

answered Jan 18 at 23:05

shodanshok

25.9k34287

Thank you very much for this reply. As you've might have expected, I've got something to read now. The vfs_cache_pressure option is very interesting. I've played around with the caches for some minutes now and I think, the System became a bit more responsive (directory listings, autocomplete, etc..). I'll check the other points as well and give a feedback. Thanks again.

– t2m
Jan 18 at 23:50

"primarycache=metadata setting (and, maybe, an L2ARC for read-only cache)." ZFS can't do both, I had a write up on its most prominent down sides: medium.com/p/zfs-is-raid5-of-2010s-eefaeeea2396

– poige
Jan 19 at 15:17

@poige due to the low RAM amount, I was speaking about metadata caching in L2ARC (in addition on what already cached in ARC). After all, data caching should not made any big difference for a rsnapshot backup server.

– shodanshok
Jan 19 at 16:26

1

I clarified that the only thing in L2ARC would be metadata no matter what then. :) As to RAM amount, 16 GB is no RAM at all for that HDD overall volume. Reasonable minimum would be around 128 GB, hence if it's upgrading anyways, you're no longer limited to 16 GB

– poige
Jan 19 at 16:42

@marcelm you are right: I confused -h for a completely different things (-H for rsync...). I updated my answer.

– shodanshok
Jan 19 at 21:09

add a comment |

I have a similar (albeit smaller) setup, with 12x 2TB disks in a RAID6 array, used for the very same purpose (rsnapshot backup server).

Something you can try to (non-destructively) ameliorate the situation:

be sure to not having mlocate/slocate indexing your /backup-root/ (you can use the prunefs facility to avoid that), or metadata cache trashing will severly impair your backup time;

for the same reason, avoid running du on /backup-root/. If needed, run du only on the specific subfolder interested;

lower vfs_cache_pressure from the default value (100) to a more conservative one (10 or 20). This will instruct the kernel to prefer metadata caching, rather than data caching; this should, in turn, speed up the rsnapshot/rsync discovery phase;

you can try adding a writethrough metadata caching device, for example via lvmcache or bcache. This metadata device should obviously be an SSD;

increase your available RAM.

as you are using ext4, be aware of inode allocation issues (read here for an example). This is not directly correlated to performance, but it is an important factor when having so many files on an ext-based filesystem.

Other things you can try - but these are destructive operations:

use XFS with both -ftype and -finobt option set;

use ZFS on Linux (ZoL) with compressed ARC and primarycache=metadata setting (and, maybe, an L2ARC for read-only cache).

edited Jan 28 at 17:27

answered Jan 18 at 23:05

shodanshok

25.9k34287

Thank you very much for this reply. As you've might have expected, I've got something to read now. The vfs_cache_pressure option is very interesting. I've played around with the caches for some minutes now and I think, the System became a bit more responsive (directory listings, autocomplete, etc..). I'll check the other points as well and give a feedback. Thanks again.

– t2m
Jan 18 at 23:50

"primarycache=metadata setting (and, maybe, an L2ARC for read-only cache)." ZFS can't do both, I had a write up on its most prominent down sides: medium.com/p/zfs-is-raid5-of-2010s-eefaeeea2396

– poige
Jan 19 at 15:17

@poige due to the low RAM amount, I was speaking about metadata caching in L2ARC (in addition on what already cached in ARC). After all, data caching should not made any big difference for a rsnapshot backup server.

– shodanshok
Jan 19 at 16:26

I clarified that the only thing in L2ARC would then be metadata, no matter what. :) As to the RAM amount, 16 GB is no RAM at all for that overall HDD volume. A reasonable minimum would be around 128 GB, so if you're upgrading anyway, you're no longer limited to 16 GB.

– poige
Jan 19 at 16:42

@marcelm you are right: I confused -h with a completely different thing (-H for rsync...). I updated my answer.

– shodanshok
Jan 19 at 21:09


This Filesystem stores a huge amount of small files with very many SEEK operations but low IO throughput.

🎉

Lower vm.vfs_cache_pressure, say to 1. This changes the caching bias towards preserving more metadata (inodes, dentries) instead of the data itself, and it should have a positive effect in reducing the number of seeks.

Add more RAM. Although it might look strange for a server that doesn't run any piggy apps, remember: the only way to reduce seeks is to keep more metadata in faster storage. Given that you have only 16 GB, it should be relatively easy to increase the amount of RAM.

As I've said, EXT4 isn't a good choice for your use case, but you can still use some of its features to ease the pain:
- An external journal is supported, so you can try adding an SSD (better mirrored) and placing the journal there. Check out "ext4: external journal caveats".
- Try switching the journal mode so that all data is journaled, by mounting with data=journal.

Try moving files outside of a single FS scope. For example, if you have LVM-2 here, you can create smaller volumes and use one for the time being; when it gets full, create another one, and so on.
- If you don't have LVM-2 you can try doing that with /dev/loop, but it's less convenient and probably less performant.

UPD.: since it has turned out to be Linux Software RAID (LSR) RAID-6, here is an additional item:

LSR has its own tuning options that many people seem to overlook:
- Stripe cache, which can be set to its maximum with: echo 32768 | sudo tee /sys/devices/virtual/block/md*/md/stripe_cache_size. Do this with care (use a smaller value if needed): the value counts cache entries, each holding one page per member device, so the RAM it takes grows with the page size, the number of disks and the value you set.
- External journal, which can also be placed on those mirrored SSDs (but currently an MD device created without a journal can't be converted to use one).
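
To gauge the RAM cost before raising stripe_cache_size, the kernel's md documentation gives the memory consumed as page size × number of member devices × stripe_cache_size. A small sketch of that arithmetic follows; the helper name stripe_cache_bytes is hypothetical, and the 4096-byte default page size is an assumption (check it with getconf PAGESIZE).

```shell
#!/bin/sh
# Estimate the RAM used by the md stripe cache:
#   page_size * nr_disks * stripe_cache_size
# stripe_cache_bytes is a hypothetical helper name.
stripe_cache_bytes() {
    entries="$1"             # value written to stripe_cache_size
    disks="$2"               # number of member devices in the array
    page_size="${3:-4096}"   # assumed 4 KiB pages unless overridden
    echo $(( entries * disks * page_size ))
}

# 8-disk array at the maximum setting of 32768 entries.
stripe_cache_bytes 32768 8   # prints 1073741824 (1 GiB)
```

On the 16 GB machine from the question, the maximum setting would therefore claim about 1 GiB, which competes with the metadata cache the other suggestions try to protect.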

That's probably most of what can be improved without a from-scratch redesign.

I have a very poor performance since the file system (60TB net) exceeded 50% usage. At the moment, the usage is at 75%

edited Jan 19 at 15:41

answered Jan 19 at 4:24

poige


You're clearly hinting that ext4 on raid-6 is not the way you'd go. Would you mind outlining the setup you would recommend?

– marcelm
Jan 19 at 11:58

That's too complex a task even to outline, actually. In some cases it would be OK to choose a conventional FS even with lots of files; in other cases it's a non-starter from the beginning. You can take a look at a good intro on why CEPH abandoned POSIX FS altogether and switched to a DB. BTW, when they used an FS they preferred XFS. I'd probably do the same. As to RAID-6, it's a major IOPS multiplier: for every write it has to update parity on 2 other devices. So, probably some kind of RAID-x0 approach. With on-the-fly compression support it might even make sense to use RAID-10. Of course there are ways …

– poige
Jan 19 at 12:43

… to speed it up further with SSD caching (bcache, dm-cache, ZFS's in-house ZIL+L2ARC), but practice might impose constraints of its own that effectively rule out such workarounds. So this is why I've said "too complex". One needs to know the requirements and the resources that would be available to achieve the goal.

– poige
Jan 19 at 12:47

I understand it's asking too much to come up with a complete solution, but even the braindump you put in the comments above can be a good starting point for further research to anyone facing similar problems; thanks :)

– marcelm
Jan 19 at 18:07


RAID6 does not help you much in this case; something like ZFS might enable much faster metadata and directory access while keeping speeds about the same.

answered Jan 19 at 3:15

John Keates



RAID-6 stripes drives, therefore all IO goes to all drives. That's pretty inefficient with many small files. However, this probably isn't your main problem, which is...

Ext4 isn't well suited for big filesystems with millions of files. Use XFS. I have XFS filesystems running as big as 1.2 PB and with as many as 1 billion files, no problem. Simply use XFS.

answered Jan 24 at 21:24

wazoox

