Dual Hibernation, almost! How 2 force unmount/close handles? 

Joined: Fri Jul 17, 2009 19:12
Posts: 12

Hello again!
I have started this new topic because the previous thread (HIBERFILE.SYS) contained a lot of incorrect information from me. If you are Jean-Pierre you can just skip to the Problem: and Question: sections at the end. For others, here is a recap so far...

Goal:
* To hibernate both Windows and Ubuntu. Switch between them - DONE
* To share my D: Data drive between Windows and Ubuntu - Almost ? (with some caveats)

My setup:
Platforms: Windows 8.1, Ubuntu 14.10
Shared D:\ "Data" drive in the middle - The D: drive is formatted to NTFS filesystem

So far we have discovered that, before hibernating on the Windows side, the D:\ data NTFS volume must be fully dismounted and taken offline with the command:

mountvol.exe D: /P

/P is necessary because the /D switch only removes the drive letter / directory mappings. /D switch did not actually dismount the volume (for an internal disk).

mountvol.exe /?
/P Removes the volume mount point from the specified directory,
dismounts the volume, and makes the volume not mountable.
You can make the volume mountable again by creating a volume
mount point.

We also found that after using /P the volume *IS* mountable again afterwards, contrary to what the help text says. It re-mounts very easily in the normal way. No issues.

Now for some limitations and caveats on the Windows side.

Giving enough time to dismount:

In Windows, we can only listen for WM_POWERBROADCAST. That is the only notification (I am aware of) which says we are about to hibernate (OR sleep). On Windows 8 and 8.1 we are told that there are only 2.0 seconds from receiving this message until the system goes into hibernation. I have tested this and can say that it is very true, although in practice it can sometimes be a little more than 2.0 seconds, but not more than 2.5 seconds. IN THEORY we might use SetThreadExecutionState to delay hibernation further until we have truly finished dismounting our NTFS volumes. Like this:

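; Flag values from the Windows API (AutoHotkey does not predefine them)
ES_CONTINUOUS := 0x80000000, ES_SYSTEM_REQUIRED := 0x1
ES_DISPLAY_REQUIRED := 0x2, ES_AWAYMODE_REQUIRED := 0x40
DllCall("SetThreadExecutionState","UInt",ES_CONTINUOUS | ES_SYSTEM_REQUIRED | ES_DISPLAY_REQUIRED | ES_AWAYMODE_REQUIRED)
MsgBox, Press OK to allow sleep
DllCall("SetThreadExecutionState","UInt",ES_CONTINUOUS)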

However I have tried this and it does not work for our needs, because Windows has already started sending applications the WM_POWERBROADCAST notification. It is too late. We can block hibernation only before the hibernate action is initiated, and of course we do not know ahead of time when that is going to happen. So we must ensure that all the NTFS volumes we need are dismounted within 2 seconds. Otherwise the filesystem will be left in a bad state when the memory is frozen (still unmounting).

Recommendations:

For this reason we recommend NOT using a slow disk for this (for example over a USB interface), and using only ONE big NTFS data disk shared between the 2 OSes, rather than trying to share or unmount 2+ NTFS disks with this method. In the mountvol.exe /? help Microsoft says it will take care of forcing closed all file handles held open by applications, and guarantees closing (though not within a specific time). In my early tests the /P (unmount + take offline) happens fast enough and seems to work OK. However I have not tested with a heavily loaded system. Anyway it is best to wait for idle times before choosing to hibernate Windows. If there is some delay unmounting, it should still be safe, as on Linux ntfs-3g will probably detect that the disk was not cleanly dismounted.

Windows - Distinguishing between hibernation and sleep:

Unfortunately this is not possible either. I have spent 1 whole day just trying to do this. What we find is that a Kernel-Power info message is written to the System event log, which can be read using the command-line utility 'wevtutil.exe'. For example the following command will tell us whether such an event was written to the Windows event log within the last 15 seconds:

# targetstate: hibernate=5 sleep=4
wevtutil.exe qe System "/q:*[System[Provider[@Name='Microsoft-Windows-Kernel-Power'] and (Level=4) and TimeCreated[timediff(@SystemTime) <= 15000]]] and *[EventData[Data[@Name='TargetState']=5]]"

The result is that if we run the above command immediately after receiving WM_POWERBROADCAST, then unfortunately there is nothing in the event log yet. Only after Windows has resumed from its hibernation do we see that the log message has been written. By that time we must already have dismounted the volume. So the implication is that it is possible to distinguish between resuming from hibernation and resuming from sleep, but not between entering hibernation and entering sleep. The conclusion is that we must dismount / mount our NTFS volume regardless, for both of these power states. Ideally we would avoid this on sleep because it is only needed for hibernation. Bah ^*&$ M%£&Soft. Luckily if we do not sleep very often, then it does not matter. And the added unmounting / remounting does not negatively impact the sleeping times too much since the system waits 2 seconds anyway.

===
In Windows, install the AutoHotkey program, version 1.16+. Then open the Task Scheduler program and add a system task with full admin privileges to open and run this .AHK AutoHotkey file (NOT .exe). The task can be started at Windows startup and run as the 'system' account, even when no user is logged in. This ensures that the Data drive will be dismounted even when hibernating from the login screen.

YOU MUST REPLACE THE ntfs VolumeName UUID and Drive letter of your own shared NTFS Data drive (cannot be C drive!). use mountvol.exe /? to list them.

hibernate_hook.ahk:
; hibernation hook

#SingleInstance, force

; Listen to the Windows power event "WM_POWERBROADCAST" (ID: 0x218):
OnMessage(0x218, "func_WM_POWERBROADCAST",10)
Return

; on sleep - 4, on wake 18, 7
; on hibernate - 4, on resume 18, 7

func_WM_POWERBROADCAST(wParam, lParam)
{
    If (lParam = 0) {

        If (wParam = 4) ; PBT_APMSUSPEND
        {
            objShell := ComObjCreate("WScript.Shell")
            objExec := objShell.Exec("mountvol.exe D: /P")
            Return
        }

        Else If (wParam = 7) ; PBT_APMRESUMESUSPEND
        {
            objShell := ComObjCreate("WScript.Shell")
            objExec := objShell.Exec("mountvol.exe D: \\?\Volume{b03e36d5-51f3-11e4-8263-001eecd73a1f}\")
            Return
        }

        ; Else If (wParam = 18) ; PBT_APMRESUMEAUTOMATIC
        ; {
        ;     objShell := ComObjCreate("WScript.Shell")
        ;     objExec := objShell.Exec("mountvol.exe D: \\?\Volume{b03e36d5-51f3-11e4-8263-001eecd73a1f}\")
        ; }

    }
    Return
}

And that takes care of the Windows hibernation side. Now for the Linux side, in Ubuntu.


LINUX SIDE:

We also need a counterpart script for linux hibernation. This still has an issue and might not be correct yet. Here is a temporary link:

https://gist.github.com/anonymous/d12357711ad679fb4d06

Here "/Data" is the mount point written in the fstab file for /dev/sda9, the same NTFS volume that appears as the D: drive on Windows.

If no file handles are open on the disk at the time of linux hibernation, then the first command "umount /Data" will call ntfs-3g driver and unmount the ntfs disk cleanly. No problems whatsoever. I can hibernate between Linux and Windows many times and all Data is preserved.
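For readers who cannot open the gist, here is a rough sketch of what such a hook could look like. It is NOT the script from the gist. It assumes a pm-utils style hook that receives "hibernate" or "thaw" as its first argument, and it uses the /Data mount point from the fstab entry. The function names run and data_hook are made up for illustration, and DRY_RUN defaults to 1 so the commands are only printed, not executed.

```sh
#!/bin/sh
# Sketch of a pm-utils style hook body (hypothetical, not the gist script).

# run CMD...: print the command while DRY_RUN=1 (the default), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# data_hook ACTION: ACTION is the first argument passed to the hook.
data_hook() {
    case "$1" in
        hibernate) run umount /Data ;;  # clean unmount before the image is written
        thaw)      run mount /Data ;;   # remount from the fstab entry after resume
        *)         : ;;                 # other actions: nothing to do in this sketch
    esac
}

data_hook hibernate
data_hook thaw
```

With DRY_RUN left at its default this only prints the two commands. Running it for real would need DRY_RUN=0 and root.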

Problem:

However the entire point of hibernation is to be able to resume running applications and keep files open in the workspace. So what is really needed is for the Linux NTFS driver to do for me (as 'root' user) the same sort of thing that mountvol.exe /P does on Windows: the unmount command must force-close all currently open file handles / file descriptors and invalidate them. Flush current writes, but make any new handles impossible to create, and make further disk writes from the previous handles impossible / invalid / ignored.

Unfortunately, in this situation with open file handles (see 'lsof /dev/sda9'), the normal 'umount /Data' command refuses to unmount. And with the --lazy option, re-mounting afterwards seems not to work (some bizarre error message).

Searching stackoverflow for that bizarre error message (caused by --lazy), I have then found this interesting command instead:

fusermount -uz "/Data"

My Question:

I don't understand at all what this 'fusermount' command is doing, but it seems a bad idea. The disk unmounts and the mountpoint disappears from the 'df' output. However if I have an open Bash window, it still seems to have a handle on the stale directory, and ls, mkdir etc. can continue to write data on to the NTFS disk. This is very bad and results in corruption of the NTFS disk (after resuming the session from hibernation, when the bash window or whatever application continues to run). This looks like some limitation in Ubuntu's NTFS driver, unless there is some way around the problem?

In an ideal world, I would like to be able to run some command as root that force-closes all open handles + unmounts the NTFS volume immediately. Making it non-writable. Otherwise such a busy application may immediately re-open a new handle to write to. Is there any way on Linux I can do such things? (reliably!)

Or if the function does not exist today, can it ever be added in the future to such FUSE user-land drivers ?

Many thanks for any input / comments.


Fri Oct 17, 2014 11:12
Profile

Joined: Fri Jul 17, 2009 19:12
Posts: 12
Post Re: Dual Hibernation, almost! How 2 force unmount/close handles?
Reference: http://stackoverflow.com/questions/7878 ... 59#7878759

Ok, perhaps we may have found a solution now. These commands:
0. sudo fuser -m /dev/sda9 # show the processes using the disk
1. sudo umount --lazy "/Data"
1.b sleep 1 # can give a small delay to let applications finish writing to their current descriptors
2. sudo fuser -m /dev/sda9 # or sudo lsof /dev/sda9; show the processes again
3. sudo fuser -m /dev/sda9 -k

What happens is this:

The 1st command (a --lazy unmount) does not close the descriptor for the bash shell. Bash still has an open handle when its CWD is /Data... the handle remains open indefinitely. But the lazy unmount does stop all future requests. Between commands 0. and 2. the output no longer shows bash as having any open file descriptors on /dev/sda9 *even though that is not true*. And this discrepancy allows us to run the 3rd command without killing any userland applications. So bash is still left running but its descriptor is forced closed, because in the 3rd command (as root / sudo) we are actually killing the process known as 'mount.ntf' owned by the 'root' user (or it may not be root if it was run in userland). So by killing only the 'mount.ntf' process we preserve the state of running applications without killing them before hibernation.

Actually we only need to run commands 1. and 3. Maybe 1.b also if we want to be kind to give some delay for the running processes to stop writing.
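As a sketch only, commands 1., 1.b and 3. could be wrapped like this. The function names run and lazy_unmount_then_kill are made up for illustration, and DRY_RUN defaults to 1 so that nothing is actually unmounted or killed unless you change it. Note that fuser -k sends SIGKILL by default.

```sh
#!/bin/sh
# Sketch: lazy unmount, short grace period, then kill whatever still uses the device.

# run CMD...: print the command while DRY_RUN=1 (the default), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# lazy_unmount_then_kill MOUNTPOINT DEVICE [DELAY_SECONDS]
lazy_unmount_then_kill() {
    run umount --lazy "$1"     # command 1: detach the mount point
    run sleep "${3:-1}"        # command 1.b: let current writes finish
    run fuser -k -m "$2"       # command 3: kill the remaining user of the device
}

lazy_unmount_then_kill /Data /dev/sda9
```

In dry-run mode this prints the three commands in order. To execute them, run it as root with DRY_RUN=0.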


New question:

Does the process 'mount.ntf' exist for only one per NTFS disk, or 1 process for all of the mounted NTFS disks? Because if we kill it and have other disks still mounted, that would not be very desirable behaviour. For example a USB thumb drive happens to be inserted etc.

Jean-Pierre - do you agree / disagree ?
Many thanks for any help here.


Fri Oct 17, 2014 20:16
NTFS-3G Lead Developer

Joined: Tue Sep 04, 2007 17:22
Posts: 1286
Post Re: Dual Hibernation, almost! How 2 force unmount/close handles?
Hi,

You have done many more tests than I have ever done on this matter, and I will probably not be of much help.

Quote:
In an ideal world, I would like to be able to run some command as root that force-closes all open handles + unmounts the NTFS volume immediately. Making it non-writable. Otherwise such a busy application may immediately re-open a new handle to write to. Is there any way on Linux I can do such things? (reliably!)

Or if the function does not exist today, can it ever be added in the future to such FUSE user-land drivers ?

You would probably have to send some signal to fuse to put it in a pre-hibernation state. Better ask on the fuse mailing list.

Quote:
The un-mount command must force-close all currently open file handles / file descriptiors and invalidate them.

Will you not need some cooperation from the applications holding these handles ? What is to happen when the system is awakened ? For instance you have an open spreadsheet and the file descriptor is forced closed for hibernation, will the spreadsheet reopen it on awakening (assuming the file was not updated by the other system) ? How does Windows behave on open files when the partition holding them is unmounted/remounted the way you explained ?

Requiring to have no open file when hibernating is probably an acceptable constraint (after all, you have to close your files when logging off).
Quote:
Does the process 'mount.ntf' exist for only one per NTFS disk, or 1 process for all of the mounted NTFS disks? Because if we kill it and have other disks still mounted, that would not be very desirable behaviour. For example a USB thumb drive happens to be inserted etc.

There is one independent process per mounted partition. If you force kill one of them, there is no impact on the other ones, but the fuse thread serving the killed process will be left in a bad state until you unmount.

Regards

Jean-Pierre


Fri Oct 17, 2014 21:59

Joined: Fri Jul 17, 2009 19:12
Posts: 12
Post Re: Dual Hibernation, almost! How 2 force unmount/close handles?
Thanks. I am still looking into this further. I did not quite realise this yesterday, but by default fuser -k sends a SIGKILL, which may kill off the ntfs-3g executable without letting it first cleanly unmount the NTFS volume it controls. At least the general concept is demonstrated even if it is not actually implemented yet.

Probably we need some more appropriate signal instead, like SIGHUP or SIGQUIT, to exit cleanly, so that there is a graceful shutdown and the NTFS volume is properly unmounted. Unfortunately there is no mention of any POSIX signal handling in the ntfs-3g manpage. So either the feature isn't implemented, or it is not documented. I will need to look into this further and examine the ntfs-3g source code (and possibly the libfuse.so library also).

jpa wrote:
Will you not need some cooperation from the applications holding these handles ? What is to happen when the system is awaken ?


Ah yes. Good questions.

In general:

We do not wish to hibernate unless the computer seems to be approximately 'idle'. The problem unfortunately is that some persistent or background programs or tasks always hold their file handles open. The same programs can be painful to shut down and re-open every time. And this is exactly why we want to use hibernation: to keep the programs open.

Specifically:

A few good programs will respond nicely to SIGINT (to interrupt) and SIGCONT (to continue / reload). For those programs it is not a problem to tell them from the same script with 'kill -SIGCONT <process>' to continue or reload themselves. Of course some applications do not have this feature but instead use SIGUSR1 and SIGUSR2 for that same purpose (e.g. to restart the application). Programs such as nginx or djb-dns will respect this. They can be added (manually, per application) to a simple text file, and that list of programs can be read at hibernate and thaw time.

Then we have another group of programs: popular and often used, but safe to cut off, and which do not seem to require any such notification messages sent to them.

One such program we have already identified is the bash process, which just sits there doing nothing with a forever-open handle to the directory it sits in (CWD=/Data/*). The process is completely idle and safe to cut off.

Then P2P programs, which like to download or cache their data to the disk a lot. But their data is all checksummed, so corruption does not matter.

Example: Spotify music player. It has up to 5GB of music cache downloaded p2p from other Spotify clients. It does not matter if that audio data becomes locally corrupted. It is all checksummed and can easily be re-downloaded. I don't ever want to close and re-open Spotify, because when it opens it may update itself, and it goes to the advertising screen. So Spotify is less annoying if it is just always left open. It should recover OK and resume by itself.

Another program: Bit-torrent, which can tolerate invalid data, which is just re-downloaded again. We often don't wish to close that kind of program either because it can take a long time to shut down or start up again.

Another one: BtSync. Personally I plan to use Bit-torrent sync in *read-only* mode, to back up and also keep a copy of my iTunes library synced up on Linux (again, just read-only). So then the situation is pretty much the same as Bit-torrent. We are assuming these programs have their data on the NTFS volume, but a separate config folder in ~/. which is not on the NTFS drive, and therefore the program settings are kept safe from any possible corruption.

Now of course there will be other programs (like you say, e.g. the spreadsheet) where it may not be entirely safe to cut them off from the disk when they have open handles. So what I am intending to do is have the hibernate script compare against a 'safe list', which is just a simple text file of process names. If 'fuser' or 'lsof' shows any unknown processes which are not cleared on the user-configurable list (so if the process name is not 'bash', 'spotify', 'bittorrent', or 'btsync'), then by default hibernation will abort. This lets the user save their spreadsheet, or re-evaluate their running programs safely after hibernation was cancelled. Or if they are really not sure, they have the opportunity to shut down fully (instead of hibernating).

Other times the user may be temporarily running a new program (or wish to test a new program out). For that they can override the safe mode and set a temporary flag to force hibernation anyway. e.g. running 'pm-hibernate', or hibernating from the GUI menu, is always safe, but running 'pm-hibernate-force' will force hibernation.

We can also create a 2nd list for 'unsafe programs' that are known not to work after their descriptors are forced closed. For those processes that we know for sure are going to mess up, it would not be safe to invalidate their open descriptors. So even with the force flag set, if an unsafe program is found it is still respected (and hibernation is always aborted).

All of that is possible from inside a simple shell script in the "/etc/pm/sleep.d/" configuration folder. The only thing I need from the ntfs-3g executable (and possibly libfuse.so) is for it to respect a POSIX signal such as SIGHUP and force a dismount / cleanly unmount the NTFS volume.
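To make the safe list / unsafe list / force flag rules concrete, here is a minimal sketch of the decision only. The function names in_list and decide_hibernate are made up, and the lists are passed as space-separated strings rather than read from text files. Getting the process names out of 'fuser' or 'lsof' is left out.

```sh
#!/bin/sh
# Sketch of the hibernate decision: unsafe list always wins, force only
# overrides "unknown" processes, everything on the safe list is ignored.

# in_list NAME "space separated list": succeeds if NAME is in the list.
in_list() {
    for _item in $2; do
        [ "$_item" = "$1" ] && return 0
    done
    return 1
}

# decide_hibernate HOLDERS SAFE UNSAFE [FORCE]
# HOLDERS: names of the processes holding files open on the NTFS volume.
decide_hibernate() {
    _holders=$1 _safe=$2 _unsafe=$3 _force=${4:-0}
    # Pass 1: an unsafe program aborts hibernation even when forced.
    for _p in $_holders; do
        if in_list "$_p" "$_unsafe"; then
            echo "abort: unsafe process $_p"
            return 0
        fi
    done
    # Pass 2: an unknown program aborts unless the force flag is set.
    for _p in $_holders; do
        if ! in_list "$_p" "$_safe" && [ "$_force" != 1 ]; then
            echo "abort: unknown process $_p"
            return 0
        fi
    done
    echo "proceed"
}

decide_hibernate "bash spotify" "bash spotify bittorrent btsync" "" 0
```

The last line prints "proceed" because both holders are on the safe list. A holder that is on neither list gives "abort: unknown process ..." unless the 4th argument is 1.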

Quote:
There is one independent process per mounted partition. If you force kill one of them, there is no impact on the other ones, but the fuse thread serving the killed process will be left in a bad state until you unmount.


Great! Good news then. I shall subscribe to the FUSE mailing list too. Many thanks.


Sat Oct 18, 2014 12:51
NTFS-3G Lead Developer

Joined: Tue Sep 04, 2007 17:22
Posts: 1286
Post Re: Dual Hibernation, almost! How 2 force unmount/close handles?
Hi,

Quote:
Probably we need some more appropriate signal instead like SIGHUP or SIGQUIT to exit cleanly. To there is a graceful shutdown and to properly un-mount the ntfs volume. Unfortunately there is no mention of any POSIX signal handling in the ntfs-3g manpage. So either the feature isn't implemented, or it is not documented. I will need to look into further and examine inside the ntfs-3g source code (and possibly the libfuse.so library also).

There is no signal handling in ntfs-3g, but there is some in the fuse library (among them SIGINT, which might interfere). The normal unmount procedure starts from the fuse library detecting the unmount request, and sending a destroy callback to ntfs-3g. Upon this destroy callback ntfs-3g syncs the data to partition, frees its own allocated memory, and returns to fuse which in turn does its own housekeeping, and finally returns from the initial call from ntfs-3g, which exits.

So, you might have a SIGINT handler variant in the fuse library, which simulates an unmount request in order to trigger the above procedure. To restart you would have to repeat the initial mount with the same arguments. This would probably lead to problems with open descriptors referencing closed files.

I do not know any procedure for ntfs-3g to tell fuse it wants to leave, but it must be possible for ntfs-3g on some signal to drop any metadata copy, and return an error for each of the subsequent fuse callbacks, until awakening and rebuilding the in-memory metadata from the partition (actually the ntfs-3g part of a standard mount). This would probably be nicer to open descriptors, but more complex to implement.

Regards

Jean-Pierre


Sat Oct 18, 2014 14:32

Joined: Fri Jul 17, 2009 19:12
Posts: 12
Post Re: Dual Hibernation, almost! How 2 force unmount/close handles?
Ah OK. Perhaps I was mistaken about the meaning of the SIGINT signal. Yes. This all seems to require some deeper thought and planning. I am not certain which exact signals it is best to use. Any combination of signals can be used to trigger the forced unmount mode. Or even a different mechanism which is not signal handling.

If keeping the same ntfs-3g driver process alive and open, it may be valid to consider SIGTSTP (or SIGSTOP*) and SIGCONT (not SIGINT).

http://major.io/2009/06/15/two-great-si ... d-sigcont/

However those signals do not necessarily tell the ntfs-3g driver that it should free or release a shared resource (e.g. to force it unmounted and available for Windows), or that the resource has just become unavailable.

SIGTSTP and SIGCONT just mean that the process should pause and expect later on to be continued (or perhaps terminated). One alternative use for that same pause-continue pair of signals is if the disk must go to sleep or lose power for a while, to return later. For example a power-saving state where other functions continue to work (like Intel's new Haswell-Y CPU sleep states).

So in that scenario I think you guys might need that to NOT mean "force unmount the disk for windows". Because there may be no need to dismount it. Just only to pause or make unavailable for a while.


There are other possible signals: SIGUSR1, SIGUSR2 and SIGABRT. Which currently are not being handled by the FUSE library *at all*.

In fuse, it listens only on these ones:

sigaddset(&newset, SIGTERM);
sigaddset(&newset, SIGINT);
sigaddset(&newset, SIGHUP);
sigaddset(&newset, SIGQUIT);

They all go to the same exit_handler() function, which calls fuse_session_exit(fuse_instance). So there is only 1 mechanism or way to notify the ntfs-3g driver. I do not know if we can ask the FUSE library to make new (separate) handlers for more / new signals. It is a good idea to at least tell them if we intend to use any new signal, to make sure they do not ever want to officially support it in future. But even better if they add an official handler for such new signals. So great - I understand why we must ask on the FUSE mailing list.
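Since fuse handles those four signals, one thing to try is sending one of them instead of the default SIGKILL. fuser accepts a -SIGNAL option together with -k. The sketch below only builds the command line and refuses signals outside that list. The function names are made up, and I have not verified that this gives a clean unmount after a lazy unmount.

```sh
#!/bin/sh
# Sketch: build a fuser command that sends a signal fuse is known to handle.

# fuse_exit_signal NAME: print NAME if it is TERM, INT, HUP or QUIT, else fail.
fuse_exit_signal() {
    case "$1" in
        TERM|INT|HUP|QUIT) echo "$1" ;;
        *) echo "unsupported signal: $1" >&2; return 1 ;;
    esac
}

# signal_driver_cmd DEVICE [SIGNAL]: print the fuser command (default SIGTERM).
signal_driver_cmd() {
    _sig=$(fuse_exit_signal "${2:-TERM}") || return 1
    echo "fuser -k -$_sig -m $1"
}

signal_driver_cmd /dev/sda9
```

Be careful that 'fuser -m' matches every process using the filesystem, so this is only sensible at the point where the driver process is the only one still listed.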

Another possibility is to receive a combination of multiple signals. This can be useful.

For example, if we receive SIGABRT first and are then sent a SIGINT (or SIGHUP etc), that will tell us to force unmount instead of refusing to unmount.

Or if we first send a SIGTSTP, then on the SIGINT we force it to unmount. Otherwise we ignore SIGTSTP and SIGCONT for the time being unless they need to be implemented for some future circumstance (like Haswell-Y power saving states). e.g. Like this:

SIGTSTP or SIGSTOP   -> disk unavailable (powered down)
                     -> stop servicing requests for the time being
SIGCONT              -> resume servicing requests (never unmounted)
or
SIGINT/HUP/QUIT/TERM -> forced unmount (because we were previously stopped)
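The same table written as a tiny state machine, purely to illustrate the idea. Nothing like this exists in fuse or ntfs-3g today. The state names, action names and the function next_state are made up, and only SIGTSTP is modelled because SIGSTOP cannot be caught by a process.

```sh
#!/bin/sh
# Sketch of the proposed signal protocol as a state transition function.

# next_state STATE SIGNAL: print "NEWSTATE ACTION".
next_state() {
    case "$1:$2" in
        running:TSTP)
            echo "stopped pause-requests" ;;
        stopped:CONT)
            echo "running resume-requests" ;;
        stopped:INT|stopped:HUP|stopped:QUIT|stopped:TERM)
            echo "exited forced-unmount" ;;   # exit signal while stopped
        running:INT|running:HUP|running:QUIT|running:TERM)
            echo "exited normal-unmount" ;;   # today's behaviour
        *)
            echo "$1 ignore" ;;               # anything else changes nothing
    esac
}

next_state running TSTP
next_state stopped TERM
```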

If either of those suggestions is not good please say so, before we decide which signals to ask for on the FUSE mailing list.

USR1 and USR2 are also possible. So is SIGURG (for 'urgent').

* Please NOTE: there are actually 2 different stop signals, 'SIGSTOP' and 'SIGTSTP'. They are 2 different signals: SIGSTOP cannot be caught or ignored by the process, whereas SIGTSTP can be handled.


Sat Oct 18, 2014 20:32

Joined: Fri Jul 17, 2009 19:12
Posts: 12
Post Re: Dual Hibernation, almost! How 2 force unmount/close handles?
Maybe that suggestion was not best. A SIGSTOP followed by a INT/TERM/HUP could be considered to be the same thing as just a INT/TERM/HUP alone. No different to current functionality - which is to always play safe and back out if there are open files.

Maybe the best choice (for a forced un-mount) is to use SIGURG followed by the existing INT/TERM/HUP. Because SIGURG represents the closest match for 'do it urgently' / Which is pretty much exactly what that situation is supposed to be. And URG signal is not very likely to conflict or be confused with any other possible kinds of usage or future features that are not implemented yet.
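A small sketch of how that "SIGURG first, then INT/TERM/HUP" rule could be interpreted. This is only the proposal written down as code, not anything fuse does. The function name unmount_mode and its output words are made up.

```sh
#!/bin/sh
# Sketch: classify a sequence of received signals under the SIGURG proposal.

# unmount_mode SIGNAL...: print "forced", "normal" or "none".
unmount_mode() {
    _armed=0
    for _sig in "$@"; do
        case "$_sig" in
            URG)
                _armed=1 ;;                  # remember the "do it urgently" hint
            INT|TERM|HUP)
                if [ "$_armed" = 1 ]; then
                    echo forced              # URG was seen earlier
                else
                    echo normal              # existing safe behaviour
                fi
                return 0 ;;
        esac
    done
    echo none                                # no exit signal arrived
}

unmount_mode URG TERM
unmount_mode TERM
```

The first call prints "forced" and the second prints "normal".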

Of course that would also mean we would a) quit after unmounting, which isn't necessarily best for hibernation if we could recover later on to re-service the same handles. Perhaps what you envisage is more complex, where b) we *pretend* to stay mounted (at the disk level we have unmounted for a while, but we don't tell the applications that).

In terms of FUSE and other libraries - we can imagine that SIGURG then INT/TERM/HUP also makes a good sense for a few other kinds of file systems. For example in Mac OS X if we want to dismount some removable media that is still in use, there is a "Force Eject" button that will dismount your ISO or portable USB drive immediately. That is the same sort of idea. Or for filesystems where the backing resource is not always guaranteed (Perhaps a network-backed store like SSHFS or NFS). We may want to force a networked filesystem closed if the network link goes down (again, even if there were open file handles).

Can't think of any better suggestions right now. And the other parts I have not thought about yet. As you say, how the driver should handle such a situation is also very important. I guess firstly it helps to decide which feature to implement: either a) or b) or both of them.

We have identified a reasonable combination of POSIX signals for feature a) but not for feature b), and have not yet raised this with anybody on the FUSE mailing list.


Sun Oct 19, 2014 00:09

Joined: Fri Jul 17, 2009 19:12
Posts: 12
Post Re: Dual Hibernation, almost! How 2 force unmount/close handles?
Oh, and here is a relevant linux kernel mailing list thread from *10 years ago*

http://www.gossamer-threads.com/lists/l ... nel/480857


Sun Oct 19, 2014 12:32