Posts filed under 'sysadmin'

Admin: Linux file server performance boost (ext4 version)

LinuxIn the previous article, I showed how to improve the performance of an existing file server by tweaking ext3 and mount settings.
We can also take advantage of the availability of the now stable ext4 file system to further improve our file server performance.

Some distribution, in particular RedHat/CentOS 5, do not allow us to select ext4 as a formatting option during setup of the machine, so you will initially have to use ext3 as file system (on top of LVM preferably for easy extensibility).

A small digression on partitioning

Remember to create separate partitions for your file data: do not mix OS files with data files, they should live on different partitions. In an enterprise environment, a minimal partition configuration for a file server could look like:


  • 2x 160GB HDD for the OS
  • 4x 2TB HDD for the data

The 160GB drives could be used as such:

  • 200MB RAID1 partition over the 2 drives for /boot
  • 2GB RAID1 partition over the 2 drives for swap
  • all remaining space as a RAID1 partition over the 2 drives for /
    Note though that it is generally recommended to create additional partitions to further contain /tmp and /var.

The 2TB drives could be used like this:

  • all space as RAID6 over all drives (gives us 4TB of usable space) for /data
  • alternatively, all space as RAID5 over all drives (gives us 6TB of usable space) The point of using RAID6 is that it gives better redundancy than RAID5, so you can safely add more drives later without increasing the risk of failure of the whole array (which is not true of RAID5).

Moving to ext4

If you are upgrading an existing system, backup first!

Let’s say that your /data partition is an LVM volume under /dev/VolGroup01/LogVol00. First, make sure we have the ext4 tools installed on our machine, then unmount the partition to upgrade:

    # yum -y install e4fsprogs
    # umount /dev/VolGroup01/LogVol00

For a new system, create a large partition on the disk, then format the volume (this will destroy all data on that volume!).

    # mkfs -t ext4 -E stride=32 -m 0 -O extents,uninit_bg,dir_index,filetype,has_journal,sparse_super /dev/VolGroup01/LogVol00
    # tune4fs -o journal_data_writeback /dev/VolGroup01/LogVol00

Note: on a RAID array, use the appropriate -E stride,stripe-width options, for instance, on a RAID5 array using 4 disks and 4k blocks, it could be: -E stride=16,stripe-width=48

For an existing system, upgrading from ext3 to ext4 without damaging existing data is barely more complicated:

    # fsck.ext3 -pf  /dev/VolGroup01/LogVol00
    # tune2fs -O extents,uninit_bg,dir_index,filetype,has_journal,sparse_super /dev/VolGroup01/LogVol00
    # fsck.ext4 -yfD /dev/VolGroup01/LogVol00

We can optionally give our volume a new label to easily reference it later:

    # e4label /dev/VolGroup01/LogVol00 data

Then we need to persist the mount options in /etc/fstab:

    /dev/VolGroup01/LogVol00    /data    ext4    noatime,data=writeback,barrier=0,nobh,errors=remount-ro    0 0

And now we can remount our volume:

    # mount /data

If you upgraded an existing filesystem from etx3, you may want to run the following to ensure the existing files are using extents for file attributes:

    # find /data -xdev -type f -print0 | xargs -0 chattr +e
    # find /data -xdev -type d -print0 | xargs -0 chattr +e

Important notes

The mounting options we use are somewhat a bit risky if your system is not adequately protected by a UPS.
If your system crashes due to a power failure, you are more likely to lose data using these options than using the safer defaults.
At any rate, you must have a proper backup strategy in place to safeguard data, regardless of what could damage them (hardware failure or user error).

  • The barrier=0 option disables Write barriers that enforce proper on-disk ordering of journal commits.
  • The data=writeback and nobh go hand in hand and allow the system to write data even after it has been committed to the journal.
  • The noatime ensures that the access time is not updated when we’re reading data as this is a big performance killer (this one is safe to use in any case).


1 comment October 3rd, 2010

Previous Posts

Most Recent Posts



Posts by Month