ntfsresize silently destroys highly fragmented filesystems 
Joined: Sun Oct 25, 2015 12:49
Posts: 2
Post ntfsresize silently destroys highly fragmented filesystems
An apparently successful run of ntfsresize on a heavily fragmented partition leaves most of the MFT missing, and a subsequent chkdsk then causes massive data loss.

I managed to investigate exactly what happened and manually fix the filesystem and recover everything. The gory details are here, which should also serve as a detailed bug report in a way:
https://marcan.st/2015/10/rescuing-a-br ... ilesystem/

The short version is that ntfsresize can apparently end up trying to fragment a relocated MFT chunk into more extents than fit in a single runlist in the $MFT entry, and then truncates the runlist. I was able to rewrite the runlist to point back at the original extent, and everything seemed to come back. The original extent was still intact because I hadn't overwritten the space freed up beyond the end of the resized volume, and because I had run chkdsk on a copy of the partition rather than on the original.
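
To give a feel for why a fragmented runlist can outgrow a single MFT entry, here is a minimal sketch of the on-disk mapping-pairs encoding (a header byte, then the run length, then the LCN delta from the previous run, with a zero terminator). It is not ntfsresize code: the function names are made up for illustration, and the 1024-byte record size is an assumption (the mkntfs default). The real space available for the runlist is smaller still, since the record also holds headers and other attributes.

```python
MFT_RECORD_SIZE = 1024  # assumption: default MFT record size


def signed_size(value: int) -> int:
    """Smallest byte count that holds `value` as a signed little-endian integer."""
    n = 1
    while not -(1 << (8 * n - 1)) <= value < (1 << (8 * n - 1)):
        n += 1
    return n


def encode_runlist(runs):
    """Encode [(lcn, length_in_clusters), ...] as NTFS mapping pairs."""
    out = bytearray()
    prev_lcn = 0
    for lcn, length in runs:
        delta = lcn - prev_lcn          # LCNs are stored relative to the previous run
        len_size = signed_size(length)
        off_size = signed_size(delta)
        out.append((off_size << 4) | len_size)   # header: sizes of the two fields
        out += length.to_bytes(len_size, "little", signed=True)
        out += delta.to_bytes(off_size, "little", signed=True)
        prev_lcn = lcn
    out.append(0)                       # terminator
    return bytes(out)


# One contiguous 2000-cluster extent versus the same 2000 clusters
# scattered as single-cluster runs with a one-cluster gap between them.
contiguous = encode_runlist([(1000, 2000)])
fragmented = encode_runlist([(1000 + 2 * i, 1) for i in range(2000)])

print(len(contiguous), "bytes for the contiguous runlist")
print(len(fragmented), "bytes for the fragmented runlist")
print("fits in one record:", len(fragmented) < MFT_RECORD_SIZE)
```

Even in the best case for the fragmented layout (three bytes per run), a few hundred runs already use up a whole record, so the remainder has to spill into extent entries.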

Here are some repro steps. They aren't 100% faithful to the original filesystem that demonstrated the problem, since here the MFT ends up very fragmented even before ntfsresize runs (which sounds like something that should be improved in ntfs-3g), but the filesystem survives that until ntfsresize runs.

Code:
# cd /tmp/
# mkdir mnt
# dd if=/dev/zero of=ntfs.bin bs=1M count=100
# losetup /dev/loop0 ntfs.bin
# mkfs.ntfs /dev/loop0
The partition start sector was not specified for /dev/loop0 and it could not be obtained automatically.  It has been set to 0.
The number of sectors per track was not specified for /dev/loop0 and it could not be obtained automatically.  It has been set to 0.
The number of heads was not specified for /dev/loop0 and it could not be obtained automatically.  It has been set to 0.
Cluster size has been automatically set to 4096 bytes.
To boot from a device, Windows needs the 'partition start sector', the 'sectors per track' and the 'number of heads' to be set.
Windows will not be able to boot from this device.
Initializing device with zeroes: 100% - Done.
Creating NTFS volume structures.
mkntfs completed successfully. Have a nice day.
# mount /dev/loop0 mnt
# dd if=/dev/urandom of=junk.bin bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.000289423 s, 14.2 MB/s
# for i in $(seq -w 000001 999999); do cp junk.bin mnt/$i || break; done
cp: error writing ‘mnt/019201’: No space left on device
# umount mnt
# losetup -d /dev/loop0
# dd if=/dev/zero of=ntfs.bin bs=1M count=100 seek=100 conv=notrunc
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.0549189 s, 1.9 GB/s
# losetup /dev/loop0 ntfs.bin
# ntfsresize /dev/loop0     
ntfsresize v2015.3.14 (libntfs-3g)
Device name        : /dev/loop0
NTFS volume version: 3.1
Cluster size       : 4096 bytes
Current volume size: 104854016 bytes (105 MB)
Current device size: 209715200 bytes (210 MB)
New volume size    : 209711616 bytes (210 MB)
Checking filesystem consistency ...
100.00 percent completed
Accounting clusters ...
Space in use       : 105 MB (100.0%)
Collecting resizing constraints ...
WARNING: Every sanity check passed and only the dangerous operations left.
Make sure that important data has been backed up! Power outage or computer
crash may result major data loss!
Are you sure you want to proceed (y/[n])? y
Schedule chkdsk for NTFS consistency check at Windows boot time ...
Resetting $LogFile ... (this might take a while)
Updating $BadClust file ...
Updating $Bitmap file ...
Updating Boot record ...
Syncing device ...
Successfully resized NTFS on device '/dev/loop0'.
# mount /dev/loop0 mnt
# for i in $(seq -w 1000001 1999999); do cp junk.bin mnt/$i || break; done
cp: error writing ‘mnt/1019689’: No space left on device
# umount mnt
# mount /dev/loop0 mnt
# rm mnt/*[12346789]
# umount mnt
# ntfsresize -s 100M -f /dev/loop0
ntfsresize v2015.3.14 (libntfs-3g)
Device name        : /dev/loop0
NTFS volume version: 3.1
Cluster size       : 4096 bytes
Current volume size: 209711616 bytes (210 MB)
Current device size: 209715200 bytes (210 MB)
New volume size    : 99996160 bytes (100 MB)
Checking filesystem consistency ...
100.00 percent completed
Accounting clusters ...
Space in use       : 83 MB (39.2%)
Collecting resizing constraints ...
Needed relocations : 10319 (43 MB)
WARNING: Every sanity check passed and only the dangerous operations left.
Make sure that important data has been backed up! Power outage or computer
crash may result major data loss!
Are you sure you want to proceed (y/[n])? y
Schedule chkdsk for NTFS consistency check at Windows boot time ...
Resetting $LogFile ... (this might take a while)
Relocating needed data ...
100.00 percent completed
Updating $BadClust file ...
Updating $Bitmap file ...
allocated extent inode 18
allocated extent inode 19
Updating Boot record ...
Syncing device ...
Successfully resized NTFS on device '/dev/loop0'.
You can go on to shrink the device for example with Linux fdisk.
IMPORTANT: When recreating the partition, make sure that you
  1)  create it at the same disk sector (use sector as the unit!)
  2)  create it with the same partition type (usually 7, HPFS/NTFS)
  3)  do not make it smaller than the new NTFS filesystem size
  4)  set the bootable flag for the partition if it existed before
Otherwise you won't be able to access NTFS or can't boot from the disk!
If you make a mistake and don't have a partition table backup then you
can recover the partition table by TestDisk or Parted's rescue mode.
# ntfsfix -n /dev/loop0
Mounting volume... Failed to load runlist for $MFT/$DATA.
highest_vcn = 0x1f14, last_vcn - 1 = 0x260d
Failed to load $MFT: Input/output error
FAILED
Attempting to correct errors... Failed to load runlist for $MFT/$DATA.
highest_vcn = 0x1f14, last_vcn - 1 = 0x260d
Failed to load $MFT: Input/output error
FAILED
Failed to startup volume: Input/output error
Checking for self-located MFT segment... OK
The startup data can be fixed, but no change was requested
Volume is corrupt. You should run chkdsk.
No change made
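
The two VCN values in the ntfsfix output give a rough idea of how much of the MFT the truncated runlist no longer maps. The arithmetic below is only a sketch: it assumes that highest_vcn is the last VCN covered by the loaded runlist, that last_vcn - 1 is the last VCN the $DATA attribute claims to have, and that MFT records are 1024 bytes (the mkntfs default, not shown in the output above).

```python
CLUSTER_SIZE = 4096       # from the mkfs.ntfs output
MFT_RECORD_SIZE = 1024    # assumption: mkntfs default record size

highest_vcn = 0x1F14      # last VCN the runlist still maps (as read from ntfsfix)
last_vcn = 0x260D + 1     # ntfsfix prints last_vcn - 1

unmapped_clusters = (last_vcn - 1) - highest_vcn
records_per_cluster = CLUSTER_SIZE // MFT_RECORD_SIZE
unmapped_records = unmapped_clusters * records_per_cluster
total_records = last_vcn * records_per_cluster

print(f"{unmapped_clusters} MFT clusters are no longer mapped")
print(f"{unmapped_records} of {total_records} MFT records are unreachable "
      f"({unmapped_records / total_records:.1%})")
```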


Sun Oct 25, 2015 19:20
NTFS-3G Lead Developer

Joined: Tue Sep 04, 2007 17:22
Posts: 1286
Post Re: ntfsresize silently destroys highly fragmented filesystems
Hi,

Thanks for providing a script that makes the issue easy to reproduce! This script is now added to my test suite.

And congratulations on having manually repaired your file system.

The root cause is a "chicken-and-egg" situation when extents have to be added to the MFT while the MFT is being relocated. This was not fully taken into account, leading to the MFT being updated in its old location instead of the new one.

It is probably time to defragment your file system...

Attached is a proposed patch to fix the issue.

Regards

Jean-Pierre


Attachments:
resize-mft-runlists.patch.gz [2.21 KiB]
Wed Oct 28, 2015 22:21

Joined: Sun Oct 25, 2015 12:49
Posts: 2
Post Re: ntfsresize silently destroys highly fragmented filesystems
Ah, I think the code explains some of what I saw. The MFT was processed last (which was lucky for me) because the MFT runlist update, once its entry became full (which is where I saw the truncation), was queued in the delayed-updates linked list, and later entries were then prepended to that list (since prepending is fast for a linked list). Hence the MFT, which comes first, got updated last.
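
The ordering effect is easy to see in a toy model. This is only an illustrative sketch, not the actual ntfsresize data structure: the `Node` class and the entry names are made up. Prepending to a singly linked list is O(1), but walking the list afterwards visits entries in the reverse of the order they were queued.

```python
class Node:
    """One queued update in a singly linked list (hypothetical stand-in)."""

    def __init__(self, name, next_node=None):
        self.name = name
        self.next = next_node


def prepend(head, name):
    """O(1) insertion at the front; returns the new head."""
    return Node(name, head)


def walk(head):
    """Return queued names in the order a head-to-tail traversal processes them."""
    names = []
    while head is not None:
        names.append(head.name)
        head = head.next
    return names


# The MFT is queued first, ordinary files afterwards.
head = None
for queued in ["$MFT", "file A", "file B"]:
    head = prepend(head, queued)

processing_order = walk(head)
print(processing_order)  # the first entry queued is the last one processed
```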

As for the filesystem, even after the repair I wasn't going to trust it (for one, the MFT entry was full of junk padding that I didn't bother to clean up, but also, as you mention, the fragmentation). I created a new filesystem and did a metadata-aware, file-based copy instead, so the new filesystem should be fragmentation-free.

One question: how does ntfsresize handle adding MFT entries to extend the MFT? Clearly there is an ultimate chicken-and-egg situation here: if the extension MFT entries lie beyond the portion of the MFT described by its first entry's runlist, then they cannot be located at mount time. I don't know what constraints MS's filesystem implementation requires (e.g. maybe they have to be located within the first extent, not just within the extents in the first runlist? Something worth checking I guess). I'm a bit concerned about what other corner cases can result from MFT fragmentation beyond one runlist, especially since it seems that with my test case ntfs-3g also creates lots of MFT fragments even before ntfsresize runs.

E.g. in my test case, it seems entries 18 and 19 were allocated. I see that ntfs-3g reserves entries 16 to 24 for this purpose (and Windows doesn't quite reserve them but uses them last). What would happen if all of those wind up filled up, and there are no other free MFT entries, so the MFT would have to be extended, which by necessity means the free entries would be in the portion of the MFT that they are about to describe? Lots of strange corner cases that can happen here... I think ideally ntfsresize (and ntfs-3g) should be designed to at least error out and guarantee to leave the volume in a consistent state (e.g. in the case of ntfsresize, possibly with data partially relocated already, but at least with no consistency issues or data loss).


Fri Oct 30, 2015 02:12
NTFS-3G Lead Developer

Joined: Tue Sep 04, 2007 17:22
Posts: 1286
Post Re: ntfsresize silently destroys highly fragmented filesystems
Hi,

Quote:
One question: how does ntfsresize handle adding MFT entries to extend the MFT? Clearly there is an ultimate chicken-and-egg situation here: if the extension MFT entries lie beyond the portion of the MFT described by its first entry's runlist, then they cannot be located at mount time.

The first run of the MFT has to cover at least the first 16 entries, so that the second runlist can be described in entry 15 (which is reserved for that purpose). Even with a lot of fragmentation, there is space in entry 15 to hold hundreds of runs, each of them describing at least one entry. So entries 16 to 24 can always be accessed.
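
A small sketch of that bootstrap, under stated assumptions: 1024-byte MFT records and 4096-byte clusters (so the first 16 entries occupy 4 clusters), and made-up LCN values. The `locate_record` helper is hypothetical and only shows how an entry number is resolved through whatever runs are known so far.

```python
def locate_record(runs, record_no, record_size=1024, cluster_size=4096):
    """Return the byte offset on the volume of MFT record `record_no`,
    or None if the given runs [(lcn, length_in_clusters), ...] do not map it."""
    vcn, offset = divmod(record_no * record_size, cluster_size)
    for lcn, length in runs:
        if vcn < length:
            return (lcn + vcn) * cluster_size + offset
        vcn -= length                   # skip past this run
    return None


# Runs known from the base $MFT entry alone: one run of 4 clusters = entries 0..15.
base_runs = [(4, 4)]
# Further runs as they might be described in entry 15 (illustrative values).
extra_runs = [(100, 3)]

entry_15 = locate_record(base_runs, 15)               # reachable with the first run alone
entry_16_early = locate_record(base_runs, 16)         # not reachable yet
entry_16 = locate_record(base_runs + extra_runs, 16)  # reachable once entry 15 is read

print(entry_15, entry_16_early, entry_16)
```

With only the first run loaded, entry 15 resolves but entry 16 does not; once the runs from entry 15 are appended, entries 16 and up become reachable.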

ntfsresize reserves the first extent of the MFT at the beginning in order to get a greater chance of getting a long run (and avoid long runlists).
Quote:
E.g. in my test case, it seems entries 18 and 19 were allocated. I see that ntfs-3g reserves entries 16 to 24 for this purpose (and Windows doesn't quite reserve them but uses them last). What would happen if all of those wind up filled up, and there are no other free MFT entries, so the MFT would have to be extended, which by necessity means the free entries would be in the portion of the MFT that they are about to describe?

When entries 15 to 24 (or 23?) are exhausted, any entry is used; hopefully the full partition is accessible at this stage. However, Windows has a limit on the size of runlists, and will probably end in a BSOD before this happens (ntfs-3g has no such limit).
Quote:
I think ideally ntfsresize (and ntfs-3g) should be designed to at least error out and guarantee to leave the volume in a consistent state (e.g. in the case of ntfsresize, possibly with data partially relocated already, but at least with no consistency issues or data loss)

I agree. This is why the old data is never overwritten and the boot sector is written last, so if ntfsresize errors out, the partition is still in a safe state. Unfortunately, this does not cover bugs where a bad change is made unknowingly.
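
The "copy first, switch the pointer last" ordering can be sketched with a toy volume. This is a simplified model, not ntfsresize's implementation: the `disk` list, the `boot` dictionary and all function names are hypothetical, and the boot sector is reduced to a single MFT pointer.

```python
class SimulatedCrash(Exception):
    """Stands in for a power cut or an error exit part-way through."""


def relocate_then_commit(disk, boot, new_lcn, crash_before_commit=False):
    """Copy the MFT to free space, then switch the boot pointer as the last step."""
    old_lcn, length = boot["mft_lcn"], boot["mft_len"]
    # Step 1: write the copy; the old clusters are left untouched.
    disk[new_lcn:new_lcn + length] = disk[old_lcn:old_lcn + length]
    if crash_before_commit:
        raise SimulatedCrash
    # Step 2: the commit point, written last.
    boot["mft_lcn"] = new_lcn


def read_mft(disk, boot):
    """Read the MFT through whatever the boot pointer currently says."""
    return disk[boot["mft_lcn"]:boot["mft_lcn"] + boot["mft_len"]]


disk = ["free"] * 8
disk[5:7] = ["records 0-3", "records 4-7"]
boot = {"mft_lcn": 5, "mft_len": 2}

try:
    relocate_then_commit(disk, boot, new_lcn=1, crash_before_commit=True)
except SimulatedCrash:
    pass
after_crash = read_mft(disk, boot)      # still read from the old location

relocate_then_commit(disk, boot, new_lcn=1)
after_commit = read_mft(disk, boot)     # now read from the new location

print(boot, after_crash == after_commit)
```

The model also shows the limit mentioned above: if step 1 writes wrong data without noticing, the commit still goes through and the pointer ends up referring to bad metadata.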

Regards

Jean-Pierre


Fri Oct 30, 2015 10:06