mirror of
https://github.com/ewwhite/zfs-ha.git
synced 2026-05-15 22:05:04 -06:00
[GH-ISSUE #16] PCS cannot unmount file system during failover event #13
Originally created by @intentions on GitHub (Oct 19, 2017).
Original GitHub issue: https://github.com/ewwhite/zfs-ha/issues/16
While migrating data onto my new ZFS system I attempted a failover to do some work on one of the heads. The process failed, with pcs being unable to unmount the ZFS file system. I tried unmounting by hand and got:

```
root@scifs1701:~] zpool export -f expphyvol
umount: /expphyvol/hallc: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
cannot unmount '/expphyvol/hallc': umount failed
root@scifs1701:~] umount -f /expphyvol/hallc/
umount: /expphyvol/hallc: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
```

Looking at pcs, this was after the IP had been shut down, so I don't know how new writes could be coming to the device.
@ewwhite commented on GitHub (Oct 31, 2017):
Did you check the output of `lsof /expphyvol/hallc`?

@intentions commented on GitHub (Oct 31, 2017):
The lsof returns nothing.
I asked about this on the ZFS mailing list and got one response of "yeah, it happens sometimes", so I'm guessing it isn't a problem with the Pacemaker setup.
@colttt commented on GitHub (Nov 13, 2017):
Hello,
I have the same issue. Stop the nfs-server before you export the ZFS pool, and start it before you import it. I was wondering why this doesn't happen in this how-to.
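The suggestion above can be sketched as a manual failover sequence. This is an illustrative sketch, not part of the how-to: the pool and dataset names come from this thread, and the `nfs-server` systemd unit name is an assumption that varies by distribution.

```shell
# Hypothetical manual failover, per the suggestion above.
# Pool name (expphyvol) is from this thread; the nfs-server
# unit name is an assumption (differs between distributions).

# On the node giving up the pool:
systemctl stop nfs-server    # drop NFS threads holding the mountpoint
zpool export expphyvol       # should now unmount cleanly

# On the node taking over:
zpool import expphyvol
systemctl start nfs-server   # resume serving clients
```

In a Pacemaker setup the same ordering would be expressed as resource ordering/colocation constraints rather than run by hand.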
@intentions commented on GitHub (Nov 13, 2017):
Thanks, though the last time I restarted NFS all the clients yelled about stale file handlers and I had to reboot the head anyway.
I'm closing this because it now seems to be more of an issue with ZFS then what PCS is doing.
@colttt commented on GitHub (Nov 13, 2017):
That's not an issue with ZFS! It's an issue with NFS: the TIME_WAIT connections aren't stopped (they aren't if the interface is down), and it waits ca. 2-4 minutes before they go away. You can decrease these parameters, but I don't remember which parameters, sorry.
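One way to check whether lingering TCP connections are what is holding the mountpoint busy is to look at sockets on the NFS port before attempting the export. A minimal sketch (the mountpoint is from this thread; 2049 is the standard NFS port):

```shell
# List TCP sockets in TIME_WAIT on the NFS port (2049):
ss -tan state time-wait '( sport = :2049 )'

# fuser can sometimes name what is touching the filesystem
# when lsof comes back empty:
fuser -vm /expphyvol/hallc
```

If TIME_WAIT sockets show up here right after a failed export, that points at the NFS/TCP teardown rather than ZFS itself.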
@ewwhite commented on GitHub (Nov 13, 2017):
Are you using NFSv3 or NFSv4? For NFSv3, I find that it's good enough to keep the NFS daemon enabled and running on both hosts. The zpool export handles client notification, unexporting of the NFS share and the re-export all in one action.
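For `zpool export`/`import` to manage the share as described above, the dataset's `sharenfs` property has to be set so that ZFS (rather than `/etc/exports`) owns the export. A sketch, with the dataset name taken from this thread and the export options an illustrative assumption:

```shell
# Let ZFS manage the NFS export for this dataset
# (the client-network option shown is an example assumption):
zfs set sharenfs='rw=@192.168.0.0/24' expphyvol/hallc

# Verify the property is set:
zfs get sharenfs expphyvol/hallc
```

With `sharenfs` set, the export is torn down as part of `zpool export` and re-created on `zpool import`, which is what makes the keep-nfsd-running-on-both-hosts approach work.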
@intentions commented on GitHub (Nov 13, 2017):
nfs v3
During my initial testing (10 odd clients) I didn't see any problems, but once the system entered production use (~900 clients) I started seeing this problem.
@colttt commented on GitHub (Nov 14, 2017):
We use NFS v4.2 (TCP). If you use NFS v3 and 10G you have a high risk of data loss (because it's UDP).
@intentions commented on GitHub (Nov 14, 2017):
Data is going out over 56G FDR (but the clients are all on 40G QDR), and we are using TCP.
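Whether a v3 mount is actually on UDP or TCP can be confirmed from the client side. A quick sketch (output format varies by distribution and kernel):

```shell
# Show per-mount NFS options, including vers= and proto=:
nfsstat -m

# Or read the live mount options directly:
grep ' nfs ' /proc/mounts
```

Look for `proto=tcp,vers=3` in the options; that settles the UDP concern raised above for a given mount.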