Request for future enhancements 

Joined: Sun Apr 18, 2010 17:09
Posts: 6
Post Request for future enhancements
Hi, I would firstly like to thank you for your work in creating an excellent NTFS driver. Without it, I simply wouldn't be able to use Linux because I have to share a lot of files between XP and Linux, and FAT32 has too many limitations.

I've got two NTFS-formatted drives: one is used for storing my data (which is shared between XP and Linux), and the other is where I back up my data. However, the problem is that when I do my backups, I need to preserve all of the NTFS timestamps, including the creation date.

I understand from the ntfs-3g manual that the Windows timestamps are mapped to the system.ntfs_times extended attribute. So theoretically, if I use a program like cp or rsync and select the “preserve extended attributes” option, the Windows timestamps should also be preserved.

However, although all the standard Linux/Unix utilities such as cp and rsync appear to preserve the user extended attributes, they don't appear to be able to recognise (and therefore preserve) the system extended attributes.

getfattr and setfattr also appear to exhibit similar behaviour. You can read and write to a specific system attribute. But it's not possible to list the system attributes that are available.

I'm not sure whether this restriction is standard and deliberate Linux behaviour, or whether it's just the way that the ntfs-3g driver has been written. But whatever the reason, it's extremely annoying, because it renders all the standard Linux utilities completely useless for my purposes.

What I've managed to do to get round the problem is to use getfattr to dump all the system.ntfs_times attributes into a text file (fortunately getfattr allows this to be done recursively over a range of files). Once the file copying has been done, I'm then able to use setfattr to restore the attributes from the text file to the copied files. It works surprisingly well but it's clumsy. It's fine for archiving but I really don't want to have to go through that hassle every time I copy a single file from A to B.
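The dump-and-restore workaround above can also be sketched per file in Python. This is only an illustration, not part of ntfs-3g: the function name copy_ntfs_times and its getx/setx parameters are hypothetical, and it assumes a Linux system where os.getxattr and os.setxattr are available and both files sit on ntfs-3g mounts.

```python
import os

# Attribute name as given in the ntfs-3g manual.
NTFS_TIMES = "system.ntfs_times"


def copy_ntfs_times(src, dst, getx=None, setx=None):
    """Copy the raw system.ntfs_times value from src to dst.

    getx/setx default to os.getxattr/os.setxattr (Linux only). They are
    parameters purely so the function can be tried without an NTFS mount.
    """
    getx = getx or os.getxattr
    setx = setx or os.setxattr
    raw = getx(src, NTFS_TIMES)   # read the attribute by its explicit name
    setx(dst, NTFS_TIMES, raw)    # write the same bytes to the copy
    return raw
```

Because the attribute is read by name, this does not rely on the system attributes being listed, which is the part that cp and rsync depend on.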

I was wondering whether this problem could be addressed in a future release.

I can think of at least three possible solutions.

1. Make the system.ntfs_times attribute visible to programs like cp and rsync (assuming this is possible).

2. Convert the ntfs_times attribute from a system into a user attribute.

3. Allow the system.ntfs_times attribute to be mapped on to a second attribute in user space. So for example, system.ntfs_times and user.ntfs_times could point to the same data (i.e. if you alter one of the attributes, you also automatically alter the other).

I prefer solution three (which could be implemented as a mount option) because it preserves backwards compatibility.

Also, it says in the ntfs-3g manual that the byte order in which the time stamps are stored depends upon the endianness of the processor used. This concerns me a bit because it could cause compatibility issues in the future. Could the order be standardised, and made independent of the processor in a future release?

Thanks for your help.


Sun Apr 18, 2010 17:55
NTFS-3G Lead Developer

Joined: Tue Sep 04, 2007 17:22
Posts: 1286
Post Re: Request for future enhancements
Hi,

Quote:
I understand from the ntfs-3g manual that the Windows timestamps are mapped to the system.ntfs_times extended attribute. So theoretically, if I use a program like cp or rsync and select the “preserve extended attributes” option, the Windows timestamps should also be preserved.

Yes, ... but the creation time has no legal existence in Unix-type OSes...
Quote:
getfattr and setfattr also appear to exhibit similar behaviour. You can read and write to a specific system attribute. But it's not possible to list the system attributes that are available.
I'm not sure whether this restriction is standard and deliberate Linux behaviour, or whether it's just the way that the ntfs-3g driver has been written.

This is standard behavior on Linux. I imagine there is some security concern in showing non-user data.
Quote:
What I've managed to do to get round the problem is to use getfattr to dump all the system.ntfs_times attributes into a text file (fortunately getfattr allows this to be done recursively over a range of files).
I was wondering whether this problem could be addressed in a future release.

This is the job of a backup program, not of a file system driver. I have put on http://pagesperso-orange.fr/b.andre/tools.zip a sample user program, ntfscp.c, which does a recursive directory copy preserving all ntfs attributes, provided both source and target are on ntfs. If you want to back up to another file system type, you have to define a specific backup format. fsarchiver is a backup program which does that for several file systems, including ntfs. The backup program has to be ntfs-aware because of several conflicts: Posix ACLs vs NTFS ACLs, Linux symlinks vs NTFS reparse points, and to a lesser extent Unix times vs NTFS times.
Quote:
Make the system.ntfs_times attribute visible to programs like cp and rsync (assuming this is possible).

They are visible... but cp and rsync are not ntfs aware.
Quote:
Convert the ntfs_times attribute from a system into a user attribute.
Allow the system.ntfs_times attribute to be mapped on to a second attribute in user space.

A file system driver should not pollute the user name space... designed to protect user attributes from system pollution, but you can accept that on your own and change the attribute names when backing up to an ext3 file system.
Quote:
Also, it says in the ntfs-3g manual that the byte order in which the time stamps are stored depends upon the endianness of the processor used. This concerns me a bit because it could cause compatibility issues in the future. Could the order be standardised, and made independent of the processor in a future release?

It is standardized, for use in getxattr(2), as an array of 64-bit integers. In ntfscp.c you can see this is fully transparent, and I use it on both little-endian and big-endian machines (also see how the values for the attrib are defined irrespective of the endianness). If you put the stamps or the attrib in a text file, you should store them as numbers, not character strings.
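To illustrate reading the value as an array of 64-bit integers rather than as a character string, here is a minimal Python sketch. It is not taken from ntfscp.c. It assumes the usual NTFS convention that each integer counts 100 ns intervals since 1601-01-01 UTC, and the helper name decode_ntfs_times is hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

# Assumption: NTFS timestamps count 100 ns intervals from this epoch.
NTFS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def decode_ntfs_times(raw, byteorder="="):
    """Decode a raw ntfs_times value into a list of UTC datetimes.

    byteorder "=" means the machine's native order, which is what a
    program gets back from getxattr(2) on the machine it runs on.
    """
    count = len(raw) // 8
    ticks = struct.unpack(f"{byteorder}{count}q", raw[:count * 8])
    # datetime only resolves microseconds, so the last decimal digit
    # of the 100 ns count is dropped here.
    return [NTFS_EPOCH + timedelta(microseconds=t // 10) for t in ticks]
```

On a mounted NTFS volume the raw bytes would come from something like os.getxattr(path, "system.ntfs_times"). Since the integers are unpacked in native order, the same code gives the same dates on either kind of machine.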

Regards

Jean-Pierre


Sun Apr 18, 2010 22:36

Joined: Sun Apr 18, 2010 17:09
Posts: 6
Post Re: Request for future enhancements
Quote:
This is the job of a backup program, not of a file system driver. I have put on http://pagesperso-orange.fr/b.andre/tools.zip a sample user program, ntfscp.c, which does a recursive directory copy preserving all ntfs attributes, provided both source and target are on ntfs. If you want to back up to another file system type, you have to define a specific backup format. fsarchiver is a backup program which does that for several file systems, including ntfs. The backup program has to be ntfs-aware because of several conflicts: Posix ACLs vs NTFS ACLs, Linux symlinks vs NTFS reparse points, and to a lesser extent Unix times vs NTFS times.


Thanks for your reply.

I think perhaps you've slightly misunderstood me. I'm not suggesting that all system attributes should be made visible. However, I think you can make a strong case for making at least some of them visible in the user name space.

For several of the non-Linux filesystems (including NTFS) there is a subset of the system attributes that are not used by Linux, but which are safe to copy and which the user often wishes to preserve. One such attribute is the NTFS file creation date but I'm sure there are others as well.

The trouble is that Linux currently has no standard way to deal with these attributes. It's not just backing up that's a problem, it's general housekeeping. I frequently have to switch between Windows and Linux, and for that reason, I keep all my data on a separate NTFS partition that is readable by both operating systems. However, when I'm in Linux I run the constant risk of a file's creation date being silently overwritten when I do something routine like moving a file from A to B.

Unfortunately, it's issues like this that give Linux a bad name. You can come up with all sorts of perfectly reasonable technical reasons why the ntfs_times attribute should not be visible, or copied. But that's of absolutely no use whatsoever to the end user. Unfortunately, the file creation date is sometimes used in the Windows world, and Linux's inability to consistently and reliably preserve that attribute will be a deal breaker for many people.

I can get round the problem by writing shell scripts, modifying source code, etc. But most people haven't got the knowledge, time, or inclination to do any of those things. And why should they? Life's just too short. These days most users, not unreasonably, expect simple things like copying files around to just work without running the risk of data loss.

You say it's the job of a backup program to preserve non-Linux attributes. But I don't see it as being that clear cut. Ideally, a backup program shouldn't need to concern itself with what filesystems it's being asked to copy files to and from. All file system drivers should ideally provide a uniform interface to the programs that use them.

I would suggest the remapping could be activated with a mount option. For example the fstab entry could look something like this:

/dev/sda5 /home/user/mnt ntfs-3g ro,uid=1000,ntfs_times=user.my_times 0 0

The exact syntax is unimportant but the point I'm trying to make is that the remapping could be made optional (and presumably turned off by default). And to minimise the possibility of name clashes, the system administrator would be able to choose what the attribute's new name was.

The nice thing about the remapping approach is that it would still work even if a file was copied from an NTFS filesystem to a non-NTFS filesystem as long as the second filesystem also supported extended attributes. For example, a file could be copied from an NTFS partition to an ext4 partition and then back to another NTFS partition and the creation date would still be preserved. The copying program wouldn't need to know anything about the ntfs-3g driver or the system.ntfs_times attribute. It would just have to blindly copy the file and all its user space extended attributes and everything would just work. To me that seems like a simple and elegant solution to the problem.

Sure it's non-standard, and a bit of a kludge. But the important thing is that it would instantly allow dozens of existing Linux copying and archiving programs to preserve the NTFS file creation date without any additional work needing to be done. And because it would be optional, it could be deprecated when (hopefully) a standard approach to preserving non-Linux attributes is adopted by the Linux community.

The only other solution to preserving the creation date that I can think of is for dozens of archiving, backup, and copying programs to be rewritten so that they understand and act upon the system.ntfs_times attribute. But this solution is just as non-standard as the one that I'm proposing and a lot more work. And furthermore, it's a moving target. For example, at some point, someone else might write another NTFS driver that deals with the timestamps in a completely different way.


Sun May 02, 2010 22:55
NTFS-3G Lead Developer

Joined: Tue Sep 04, 2007 17:22
Posts: 1286
Post Re: Request for future enhancements
Hi,

Quote:
However, when I'm in Linux I run the constant risk of a file's creation date being silently overwritten when I do something routine like moving a file from A to B.

Can you be more precise about the circumstances in which the creation date of a file is overwritten?

Regards

Jean-Pierre


Mon May 10, 2010 10:10
NTFS-3G Lead Developer

Joined: Tue Sep 04, 2007 17:22
Posts: 1286
Post Re: Request for future enhancements
Hi,

Quote:
The exact syntax is unimportant but the point I'm trying to make is that the remapping could be made optional (and presumably turned off by default). And to minimise the possibility of name clashes, the system administrator would be able to choose what the attribute's new name was.

Can you try http://pagesperso-orange.fr/b.andre/ntf ... .6AA.7.tgz
This is an experimental version which supports system extended attributes mapping to user space. The mapping is normally defined in the file .NTFS-3G/XattrMapping (in the hidden directory .NTFS-3G of the ntfs partition), but can be defined elsewhere through the option "xattrmapping=<actual-file>" in the mount command or /etc/fstab (same rules as for user mapping, see http://pagesperso-orange.fr/b.andre/per ... sermapping)
Each mapping is defined in a line with two fields separated by a colon, for example :
Code:
system.ntfs_attrib:user.ntfs_attrib
# this is a comment line
system.ntfs_times:user.ntfs_times

Neither the system name nor the user name can be duplicated.
Currently all system extended attributes can be mapped (caveat emptor !)
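The mapping file format described above (one system:user pair per line, # for comments, no duplicated names) can be illustrated with a small Python parser. This is only a sketch of the format as stated in the post. The function name parse_xattr_mapping is hypothetical, and how the driver itself treats whitespace or malformed lines is not specified here, so those details are assumptions.

```python
def parse_xattr_mapping(text):
    """Parse 'system.name:user.name' lines into a dict, skipping comments."""
    mapping = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment line
        system_name, sep, user_name = line.partition(":")
        if not sep or not system_name or not user_name:
            raise ValueError(f"line {lineno}: expected 'system:user'")
        # Neither the system name nor the user name may be duplicated.
        if system_name in mapping:
            raise ValueError(f"line {lineno}: duplicate {system_name}")
        if user_name in mapping.values():
            raise ValueError(f"line {lineno}: duplicate {user_name}")
        mapping[system_name] = user_name
    return mapping
```

Fed the three example lines from the post, it returns a dict with two entries, mapping system.ntfs_attrib to user.ntfs_attrib and system.ntfs_times to user.ntfs_times.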

Quote:
Also, it says in the ntfs-3g manual that the byte order in which the time stamps are stored depends upon the endianness of the processor used. This concerns me a bit because it could cause compatibility issues in the future. Could the order be standardised, and made independent of the processor in a future release?

In the above version, I have added two extended attributes which return big-endian results :
Quote:
system.ntfs_attrib_be
system.ntfs_times_be

Big-endian was chosen to avoid having to explain why a dump showing 0x01000000 means read-only, whose code is 0x00000001.
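The point about 0x01000000 versus 0x00000001 can be reproduced with a few lines of Python. This is a generic illustration of byte order, not ntfs-3g code, and the helper name hex_dump is hypothetical.

```python
import struct

READ_ONLY = 0x00000001  # read-only code quoted in the post


def hex_dump(value, byteorder):
    """Return the hex string of a 32-bit value stored in the given order."""
    fmt = {"little": "<I", "big": ">I"}[byteorder]
    return struct.pack(fmt, value).hex()


print(hex_dump(READ_ONLY, "little"))  # prints 01000000
print(hex_dump(READ_ONLY, "big"))     # prints 00000001
```

A tool that prints the raw bytes in storage order shows the first form on a little-endian machine, while the big-endian form reads the same way the number is normally written.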

Regards

Jean-Pierre


Thu May 13, 2010 16:54

Joined: Sun Apr 18, 2010 17:09
Posts: 6
Post Re: Request for future enhancements
jpa wrote:
Hi,

Quote:
However, when I'm in Linux I run the constant risk of a file's creation date being silently overwritten when I do something routine like moving a file from A to B.

Can you be more precise about the circumstances in which the creation date of a file is overwritten?

Regards

Jean-Pierre


As far as I'm aware, almost any file copy operation will cause the creation date to be lost because, as you pointed out earlier, the concept of a file creation date simply doesn't exist in Unix type operating systems.

The only way to copy the creation date is to do so indirectly by copying the system.ntfs_times attribute. But unfortunately all of the file copying utilities that I've come across that are capable of copying extended attributes, deliberately exclude the system extended attributes.

For example, if I'm in a directory on an NTFS partition and I run the following command:

cp --preserve=xattr,timestamps file1 file2

file2 will end up with a different creation date to file1.

Also, if a file is moved from one NTFS partition to another (which is basically the same as a copy followed by a delete), the file's creation date will be changed.


Sat May 22, 2010 16:30

Joined: Sun Apr 18, 2010 17:09
Posts: 6
Post Re: Request for future enhancements
jpa wrote:
In the above version, I have added two extended attributes which return big-endian results :
Quote:
system.ntfs_attrib_be
system.ntfs_times_be

Big-endian was chosen to avoid having to explain why a dump showing 0x01000000 means read-only, whose code is 0x00000001.

Regards

Jean-Pierre


Thanks for doing that, but I'm still a bit confused about the endian issue. Could you answer a hypothetical question for me?

Let's suppose I have an external USB hard disk with an NTFS partition on it, and on the partition is a file called test_file.txt.

I plug the disk into a big-endian computer (running Linux and NTFS-3G) and run the following command:

getfattr -n system.ntfs_times test_file.txt

I now plug the same disk into a little-endian computer (again running Linux and NTFS-3G) and run exactly the same command. Am I right in assuming that the command's output would look different on the two computers?

Now if I run the following command on both computers:

getfattr -n system.ntfs_times_be test_file.txt

Am I right in assuming that this time the command's output would look the same on the two computers?

Thanks


Sat May 22, 2010 16:35

Joined: Sun Apr 18, 2010 17:09
Posts: 6
Post Re: Request for future enhancements
jpa wrote:
Can you try http://pagesperso-orange.fr/b.andre/ntf ... .6AA.7.tgz
This is an experimental version which supports system extended attributes mapping to user space. The mapping is normally defined in the file .NTFS-3G/XattrMapping (in the hidden directory .NTFS-3G of the ntfs partition), but can be defined elsewhere through the option "xattrmapping=<actual-file>" in the mount command or /etc/fstab (same rules as for user mapping, see http://pagesperso-orange.fr/b.andre/per ... sermapping)


Thanks for doing that. I've run a few quick tests and it seems to work fine.

However, I've come across what could be a potential problem.

Let's assume that I've mapped system.ntfs_times to user.ntfs_times.

If you execute a command like cp --preserve=timestamps,xattr file1 file2 the last modification time will be copied twice - once by copying user.ntfs_times, and once through the normal Linux/Unix system calls.

However the two copies are possibly going to be carried out with different levels of precision. For example user.ntfs_times will always copy the timestamps with 100 nanosecond accuracy, whereas the system calls might copy the timestamps with a microsecond accuracy (the accuracy will depend upon the system call used).

So the modification time that the new file ends up with might depend upon the order in which the two copying operations were carried out. The trouble is that with cp (and most other copying utilities) this order is undefined. So different utilities, and potentially different versions of the same utility, could produce slightly different results.
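The order dependence described above can be modelled with a toy Python simulation. It is a sketch under stated assumptions, not a description of what cp actually does: it assumes one copy path writes the full 100 ns value through the extended attribute, the other writes a value truncated to microseconds, and the last write wins. The function names are hypothetical.

```python
def truncate_to_microseconds(ticks_100ns):
    """Drop the sub-microsecond part of a timestamp counted in 100 ns units."""
    return ticks_100ns - ticks_100ns % 10


def final_mtime(original_ticks, order):
    """Apply the two writers in the given order; the last write wins."""
    writers = {
        "xattr": lambda t: t,                    # full 100 ns value
        "syscall_us": truncate_to_microseconds,  # assumed microsecond call
    }
    result = None
    for name in order:
        result = writers[name](original_ticks)
    return result


print(final_mtime(1234567, ["xattr", "syscall_us"]))  # prints 1234560
print(final_mtime(1234567, ["syscall_us", "xattr"]))  # prints 1234567
```

The two results differ only in the last digit, which is the slight difference between utilities mentioned above. If the system call used also carries 100 ns precision, both orders give the same value.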

I don't really see this double copying of the same attribute as being a problem if you're just copying (or moving) files from one NTFS directory to another (which is all I'll be doing). But I have thought of an (admittedly fairly contrived) scenario where it could cause difficulties.

Let's suppose you've got a text file originally created in Windows saved on an NTFS formatted partition.

You're now accessing the partition using Linux and want to preserve the NTFS file creation times. Once again I'll assume that the system.ntfs_times attribute is mapped to the user.ntfs_times attribute.

You decide to temporarily copy (or move) the file to an ext4 partition whilst preserving the creation time. So you copy the file using cp --preserve=timestamps,xattr.

The ext4 filesystem supports extended attributes so the user.ntfs_times attribute gets converted from a pseudo user space attribute into a genuine user space attribute, thus preserving the creation time.

The file is then edited and this causes the modified time to change. I'll assume you want to preserve this new modification time.

You now decide to copy the file back to an NTFS partition.

At this point you've got a problem. If you do the copying using cp --preserve=timestamps then the new modification time will be preserved but the creation time will be overwritten.

However, if you do the copying using cp --preserve=xattr the creation time will be preserved but the new modification time will be overwritten with its old value.

And if you do the copying using cp --preserve=timestamps,xattr the result will be undefined.

I think the answer to this (admittedly unlikely) scenario would be to separate the four timestamps into their own individual extended attributes. The user could then choose to make just the creation time attribute visible in user space. I don't think there is any real need to make the modification time visible because it can already be preserved through the normal Linux system calls (albeit possibly at a slightly different level of accuracy). I haven't considered the other two timestamps as I've never used them.


Sat May 22, 2010 18:10

Joined: Sun Apr 18, 2010 17:09
Posts: 6
Post Re: Request for future enhancements
I recently came across another thread that discusses the possibility of mapping the file creation time to a user space extended attribute. They're talking about Samba and ext4. However, I think the discussion is also relevant to NTFS-3G.

Link


Sat May 22, 2010 18:26
NTFS-3G Lead Developer

Joined: Tue Sep 04, 2007 17:22
Posts: 1286
Post Re: Request for future enhancements
Hi,

Thank you for your replies. I have grouped the answers I can provide:

Quote:
As far as I'm aware, almost any file copy operation will cause the creation date to be lost because, as you pointed out earlier, the concept of a file creation date simply doesn't exist in Unix type operating systems.

Well, when copying a file, you are creating a new file. The creation date is relative to the container, not to the contents. You also get this behavior when using Windows Explorer, but of course specific programs may behave differently.

Quote:
I plug the disk into a big-endian computer (running Linux and NTFS-3G) and run the following command:
getfattr -n system.ntfs_times test_file.txt
I now plug the same disk into a little-endian computer (again running Linux and NTFS-3G) and run exactly the same command. Am I right in assuming that the command's output would look different on the two computers?

Yes.

Quote:
Now if I run the following command on both computers:
getfattr -n system.ntfs_times_be test_file.txt
Am I right in assuming that this time the command's output would look the same on the two computers?

Yes.
Rule of thumb : use system.ntfs_times with system functions like getxattr(2) and system.ntfs_times_be with commands like getfattr(1).

Quote:
If you execute a command like cp --preserve=timestamps,xattr file1 file2 the last modification time will be copied twice - once by copying user.ntfs_times, and once through the normal Linux/Unix system calls.

However the two copies are possibly going to be carried out with different levels of precision. For example user.ntfs_times will always copy the timestamps with 100 nanosecond accuracy, whereas the system calls might copy the timestamps with a microsecond accuracy (the accuracy will depend upon the system call used).

And what did your tests show? (ntfs-3g supports utimensat(2), providing 100ns accuracy, but this depends on the OS and system call used by the copy program.)
Quote:
So the modification time that the new file ends up with might depend upon the order in which the two copying operations were carried out. The trouble is that with cp (and most other copying utilities) this order is undefined. So different utilities, and potentially different versions of the same utility, could produce slightly different results.

True. This is why a mapping to an extended attribute in user space should only be made on user request.
Quote:
At this point you've got a problem. If you do the copying using cp --preserve=timestamps then the new modification time will be preserved but the creation time will be overwritten.

However, if you do the copying using cp --preserve=xattr the creation time will be preserved but the new modification time will be overwritten with its old value.

And if you do the copying using cp --preserve=timestamps,xattr the result will be undefined.

You are probably right. Trying to maintain a creation date over ext4 using standard copy commands is likely to cause problems.
Quote:
I think the answer to this (admittedly unlikely) scenario would be to separate the four timestamps into their own individual extended attributes.

Yes, this would solve your specific problem. My feeling is all this can be more easily and safely solved by using an appropriate copy program.
Quote:
I recently came across another thread that discusses the possibility of mapping the file creation time to a user space extended attribute. They're talking about Samba and ext4. However, I think the discussion is also relevant to NTFS-3G.

Thanks for the pointer. If some general agreement emerges on how to deal with ntfs internal data over non-ntfs file systems, I will adapt.

In the meantime, I am not pushing the proposed changes to a stable version until there is more agreement on what should be done. (I will however promote ntfs_times_be and ntfs_attrib_be).

Regards

Jean-Pierre


Mon May 24, 2010 18:56