Writing sparse files to an NTFS volume is 4-10 times slower 

Joined: Wed Aug 27, 2008 23:13
Posts: 2
Writing a 4GB (8GB apparent) sparse file to an NTFS volume takes ~3.75 times longer than writing the same data to a non-sparse file.

The file consists of 0.5-2.5MB blocks of random data separated by 0.5-2.5MB gaps (see below).

For a 0.4GB (0.8GB apparent) file, writing the sparse and non-sparse versions takes approximately the same time, while writing a ~30GB (50GB apparent) sparse disk image (created by ntfsclone) takes at least 10x longer than writing it in non-sparse mode.

Watching the CPU usage of the mount.ntfs process while writing the files, I see it oscillate for the first 30-60 seconds while gradually increasing, until it hits 95-100%. This is consistent with the problem not showing up for smaller files.

This is all on a vanilla Fedora 9 system, running ntfs-3g 1.2712.


Steps to reproduce are as follows:

1. Create a sparse file using the following Perl script (saved as create_sparse.pl)

Code:
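use strict;
use warnings;

srand(1);  # fixed seed so the file layout is reproducible
open my $f, ">", "sparse.txt" or die "Cannot open sparse.txt: $!";
binmode $f;

for my $i (1..2500)
{
    # Gap and data sizes are each 0.5-2.5MB, in 1KB steps
    my $offset=int(rand(2048)+512)*1024;     # gap to skip, in bytes
    my $length=int(rand(2048)+512)*1024/16;  # number of 16-byte records

    my $str="";
    for my $j (1..$length)
    {
        $str.=pack('NNNN',int(rand(2**32)),int(rand(2**32)),
                   int(rand(2**32)),int(rand(2**32)));
    }

    seek $f, $offset, 1;  # skip forward from the current position, leaving a hole
    print $f $str;
}

close $f or die "Cannot close sparse.txt: $!";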


Which produces the following file:

Code:
[user@localhost ~]$ perl create_sparse.pl
[user@localhost ~]$ du -s --block-size=1  sparse.txt
3926851584      sparse.txt
[user@localhost ~]$ du -s --block-size=1  --apparent-size sparse.txt
7872768000      sparse.txt
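
For anyone wanting to check the same two figures programmatically, here is a minimal Python sketch. It is not part of the original report; the helper names file_sizes and allocated_fraction are made up for illustration, and it assumes Linux, where st_blocks is counted in 512-byte units. It compares the allocated size with the apparent size, like the two du calls above.

```python
import os
import tempfile


def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    # Assumption: Linux semantics, where st_blocks counts 512-byte units
    # regardless of the filesystem block size.
    return st.st_size, st.st_blocks * 512


def allocated_fraction(apparent, allocated):
    """Fraction of the apparent size that is actually allocated on disk."""
    return allocated / apparent


# Figures taken from the du output above.
fraction = allocated_fraction(7872768000, 3926851584)
print(f"{fraction:.1%} of the apparent size is allocated")

# Small local demonstration: a 1 MiB hole followed by one byte of data.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.bin")
    with open(demo, "wb") as f:
        f.seek(1024 * 1024)  # skip forward without writing, leaving a hole
        f.write(b"x")
    demo_apparent, demo_allocated = file_sizes(demo)

print("demo apparent:", demo_apparent, "allocated:", demo_allocated)
```

How much of the demo file is really allocated depends on the filesystem holding the temporary directory, so only the apparent size is predictable.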


2. Mount the filesystem

Code:
[root@localhost user]# mount /dev/sda8 /mnt/shared
[root@localhost user]# mount | grep '/mnt/shared'
/dev/sda8 on /mnt/shared type fuseblk (rw,allow_other,blksize=4096)


3. Measure the time taken to copy the file in sparse and non-sparse modes

Code:
[user@localhost ~]$ time cp sparse.txt /mnt/shared/

real    14m47.526s
user    0m0.761s
sys     0m10.359s
[user@localhost ~]$ time cp --sparse=never sparse.txt /mnt/shared/

real    4m6.052s
user    0m0.406s
sys     0m13.955s
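
To turn these two timings into a slowdown factor, a short Python sketch follows. It is an illustration only: parse_time is a hypothetical helper, and the input strings are the "real" values printed above.

```python
import re


def parse_time(text):
    """Convert a bash `time` value such as '14m47.526s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


sparse_s = parse_time("14m47.526s")  # cp sparse.txt /mnt/shared/
dense_s = parse_time("4m6.052s")     # cp --sparse=never sparse.txt /mnt/shared/

slowdown = sparse_s / dense_s
print(f"sparse copy took {slowdown:.2f}x as long as the non-sparse copy")
```

For this particular pair of runs the factor comes out at roughly 3.6.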


Thu Aug 28, 2008 01:37
Tuxera CTO

Joined: Tue Nov 21, 2006 23:15
Posts: 1648
Nice bug report, thank you.

I tried both cases with the current development version and they give very similar results, around 2.5 minutes, which means that the bottleneck is not the CPU but the disk bandwidth in my case:

Code:
root@dhcppc1:/tmp # ls -lh sparse.txt
-rw-r--r-- 1 root root 7.4G Aug 27 19:27 sparse.txt
root@dhcppc1:/tmp # du -h sparse.txt
3.7G    sparse.txt
root@dhcppc1:/tmp # mount | grep /mnt/test
/dev/sdb1 on /mnt/test type fuseblk (rw,allow_other,blksize=4096)
root@dhcppc1:/tmp # time cp sparse.txt /mnt/test
  0.86s  usr,  15.32s sys,  152.15s real,  10% CPU
root@dhcppc1:/tmp # time cp --sparse=never sparse.txt /mnt/test
  0.03s  usr,  20.18s sys,  154.85s real,  13% CPU
root@dhcppc1:/tmp # time cp --sparse=never sparse.txt /mnt/test
  0.01s  usr,  20.42s sys,  155.56s real,  13% CPU
root@dhcppc1:/tmp # time cp sparse.txt /mnt/test         
  0.80s  usr,  15.72s sys,  153.42s real,  10% CPU

The sparse case uses at most 30% CPU, the non-sparse case 12%.

The reason for the higher CPU usage is explained here: http://forum.ntfs-3g.org/viewtopic.php?p=1333#1333

Higher CPU usage with sparse files is rarely reported, so it's not a high-priority work item at the moment.

You didn't mention your CPU ... My test was done using a Core 2 Duo T9300 @ 2.5 GHz. Based on the results, it's about 20 times faster than your CPU.


Thu Aug 28, 2008 05:21

Joined: Wed Aug 27, 2008 23:13
Posts: 2
Thanks for your response. If you do get around to working on sparse file issues in the future, feel free to PM me for further info or testing.

I'm not entirely sure I understand how your linked post explains the higher CPU usage for sparse files - is it that the allocator has to work harder to handle all the fragments generated in the sparse file case?

Quote:
You didn't mention your CPU ... My test was done using a Core 2 Duo T9300 @ 2.5 GHz. Based on the results, it's about 20 times faster than your CPU.


It's a Core 2 Duo E8500 @ 3.16 GHz, so I don't think the CPU can be the culprit. (The CPU was otherwise idle when I did the copy, and I repeated the test twice with similar results).

Are there any changes since 1.2712 that could have fixed this issue? It's not completely trivial for me to upgrade to test, as 1.2712 is the latest Fedora rpm build.


Another thing I didn't specify is that I was copying between partitions on the same physical disk. Was this the case for your test also?


Fri Aug 29, 2008 03:52
Tuxera CTO

Joined: Tue Nov 21, 2006 23:15
Posts: 1648
On Fri, 29 Aug 2008, mrkh wrote:

> I'm not entirely sure I understand how your linked post explains the
> higher CPU usage for sparse files - is it that the allocator has to work
> harder to handle all the fragments generated in the sparse file case?

Not really harder. It's how NTFS was designed: the extent list is
compressed. Sparse files can have a huge extent list, while the driver
minimizes it for non-sparse files. Compressing thousands of entries
obviously takes much more time than compressing only a few.

This was a Microsoft file system driver design problem.
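
To make the size difference concrete, here is a rough Python sketch of a compressed extent list for the two cases. It is not ntfs-3g code. The encoding follows the commonly documented NTFS mapping-pairs layout as I understand it: a header byte, a variable-width length, a variable-width signed offset delta, and an offset width of 0 for a hole. Everything else is an assumption made for illustration: the names signed_len, encode_runs and reencode_work, the 4096-byte cluster size, the starting cluster 1000, the uniform 384-cluster (~1.5MB) runs, and above all the idea that the whole list is re-encoded each time a run is appended.

```python
def signed_len(n):
    """Smallest number of bytes that holds n as a signed integer."""
    size = 1
    while not -(1 << (8 * size - 1)) <= n < (1 << (8 * size - 1)):
        size += 1
    return size


def encode_runs(runs):
    """Encode runs given as (length_in_clusters, start_cluster or None for a hole)."""
    out = bytearray()
    prev = 0
    for length, start in runs:
        len_size = signed_len(length)
        if start is None:
            out.append(len_size)  # offset width 0 marks a sparse run
            out += length.to_bytes(len_size, "little", signed=True)
        else:
            delta = start - prev  # offsets are stored relative to the previous run
            off_size = signed_len(delta)
            out.append((off_size << 4) | len_size)
            out += length.to_bytes(len_size, "little", signed=True)
            out += delta.to_bytes(off_size, "little", signed=True)
            prev = start
    out.append(0)  # terminator
    return bytes(out)


CLUSTERS = 384  # ~1.5MB at an assumed 4096-byte cluster size

# Non-sparse copy: assume the driver manages a single contiguous extent.
contiguous = [(2500 * CLUSTERS, 1000)]

# Sparse copy: 2500 holes alternating with 2500 data runs, as in the test file.
sparse = []
lcn = 1000
for _ in range(2500):
    sparse.append((CLUSTERS, None))  # hole
    sparse.append((CLUSTERS, lcn))   # data, assumed to be laid out back to back
    lcn += CLUSTERS


def reencode_work(n_runs):
    """Entries processed if the full list were re-encoded after every appended run."""
    return n_runs * (n_runs + 1) // 2


print("contiguous:", len(contiguous), "run,", len(encode_runs(contiguous)), "bytes")
print("sparse:", len(sparse), "runs,", len(encode_runs(sparse)), "bytes")
print("entries processed under the re-encode assumption:", reencode_work(len(sparse)))
```

If the re-encode assumption holds even approximately, the work grows with the square of the number of runs, which would fit the observation that small sparse files show no slowdown while large ones show a big one.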

> It's a Core 2 Duo E8500 @ 3.16 GHz, so I don't think the CPU can be the
> culprit. (The CPU was otherwise idle when I did the copy, and I repeated
> the test twice with similar results).

Ok, so the current driver in development is indeed much faster :-)

> Are there any changes since 1.2712 that could have fixed this issue? It's
> not completely trivial for me to upgrade to test, as 1.2712 is the latest
> Fedora rpm build.

No.

> Another thing I didn't specify is that I was copying between partitions
> on the same physical disk. Was this the case for your test also?

No. I used two disks.


Wed Sep 03, 2008 22:06
NTFS-3G Lead Developer

Joined: Tue Sep 04, 2007 17:22
Posts: 1286
Re: Writing sparse files to an NTFS volume is 4-10 times slower
Hi

Can you retry with the new advanced version http://pagesperso-orange.fr/b.andre/ntf ... .1AR.4.tgz ? It contains what is hopefully a significant improvement for heavily fragmented files.

Regards

Jean-Pierre


Tue Mar 03, 2009 10:34