Details
- Bug
- Status: Closed
- Medium
- Resolution: Fixed
- CernVM-FS 2.7.4
- None
- ANY
Description
We run GC regularly like this:
/usr/bin/cvmfs_server gc -a -l -f
It ran normally until:
Preserving Revision 4264 (17 Nov 2020 11:03:17 / added @ 17 Nov 2020 11:03:17)
├─ c8d2d9b86403219ddd398a75de0a425ad223d01c-shake128 /
failed to load catalog 98d744d30060a2a6bfe21f6ed2c85c1adc65171a-shake128C (3 - network failure)
garbage collection failed
Segmentation fault
Fail (6)!
umount: /cvmfs/soft.computecanada.ca: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
I am not sure why there was a network failure; it may have been a momentary blip. Even so, when a network failure does occur, the CVMFS server should handle it and fail gracefully. Instead, there was a segmentation fault and the GC processes hung indefinitely.
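As a stopgap on the operator side, the scheduled GC can be wrapped in a lock and a time limit so that a hung run is eventually terminated and later runs do not stack up behind it. This is only a minimal sketch, not part of the CernVM-FS tooling: the function name `run_guarded`, the lock file path, and the 6-hour limit are assumptions, and it relies on `flock` (util-linux) and `timeout` (GNU coreutils) being installed.

```shell
#!/bin/sh
# Sketch: run a command under an exclusive lock and a time limit.
# run_guarded, the lock path, and the limits are illustrative assumptions.

run_guarded() {
    lock_file=$1   # file used as the lock (hypothetical path)
    limit=$2       # duration understood by timeout(1), e.g. 6h
    shift 2

    # flock -n: give up immediately (exit status 1) if another run
    #           still holds the lock, instead of queueing behind it.
    # timeout:  send TERM when the limit expires, then KILL 60s later
    #           if the command is still running.
    flock -n "$lock_file" timeout -k 60 "$limit" "$@"
}

# Example scheduled usage (GC command taken from this report):
# run_guarded /var/lock/cvmfs_gc.lock 6h /usr/bin/cvmfs_server gc -a -l -f
```

With GNU `timeout`, an exit status of 124 indicates that the limit was hit, which a cron wrapper could use to raise an alert. This does not address the crash itself; it only bounds how long a stuck run can linger.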
Many abort processes had also piled up (from users trying to clear the state of the repo):
root 9692 1 0 05:51 ? 00:00:00 /bin/sh /usr/bin/cvmfs_server abort -f soft.computecanada.ca
root 9835 9692 0 05:51 ? 00:00:00 /bin/mount /var/spool/cvmfs/soft.computecanada.ca/rdonly
root 9836 9835 0 05:51 ? 00:00:00 cvmfs2 soft.computecanada.ca /var/spool/cvmfs/soft.computecanada.ca/rdonly -o rw,nodev,allow_other,config=/etc/cvmfs/repositories.d/soft.computecanada.ca/client.conf:/var/spool/cvmfs/soft.computecanada.ca/client.local,cvmfs_suid,suid
root 20371 1 0 Nov29 ? 00:00:00 cvmfs2 restricted.computecanada.ca /var/spool/cvmfs/restricted.computecanada.ca/rdonly -o rw,nodev,allow_other,config=/etc/cvmfs/repositories.d/restricted.computecanada.ca/client.conf:/var/spool/cvmfs/restricted.computecanada.ca/client.local,cvmfs_suid,suid
root 20376 1 0 Nov29 ? 00:00:00 cvmfs2 restricted.computecanada.ca /var/spool/cvmfs/restricted.computecanada.ca/rdonly -o rw,nodev,allow_other,config=/etc/cvmfs/repositories.d/restricted.computecanada.ca/client.conf:/var/spool/cvmfs/restricted.computecanada.ca/client.local,cvmfs_suid,suid
root 20967 20962 0 02:00 ? 00:00:00 /bin/sh /usr/bin/cvmfs_server gc -a -l -f
root 21128 20967 0 02:00 ? 00:00:00 /bin/sh /usr/bin/cvmfs_server gc -a -l -f
root 21131 21128 0 02:00 ? 00:00:00 /bin/sh /usr/bin/cvmfs_server gc -a -l -f
When I ran abort, I saw it hung on this log message:
2020-11-30T10:51:17.588-08:00 local@harpsponge.comp.uvic.ca user.notice cvmfs2: (soft.computecanada.ca) another process holds ./lock_cachedb, waiting.
|
After killing the hung gc processes (and the aborts), it recovered.
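To spot such leftovers before killing them, the process list can be filtered by age. The sketch below is illustrative only: `stale_cvmfs_procs` is a hypothetical helper, the one-hour threshold is an assumption, and the `etimes` column (elapsed time in seconds) needs a reasonably recent procps-ng `ps`.

```shell
#!/bin/sh
# Sketch: print cvmfs_server gc/abort processes older than a threshold.
# Reads lines of the form "PID ELAPSED_SECONDS COMMAND..." on stdin.
# stale_cvmfs_procs and the default threshold are illustrative assumptions.

stale_cvmfs_procs() {
    min_age=${1:-3600}   # minimum age in seconds (default: one hour)
    awk -v min="$min_age" '
        # Field 2 is the elapsed time; keep only old gc/abort invocations.
        $2 + 0 >= min && /cvmfs_server (gc|abort)/ { print }
    '
}

# Usage: list candidates, review them, then kill the PIDs by hand.
# ps -e -o pid= -o etimes= -o args= | stale_cvmfs_procs 3600
```

The helper only lists candidates; deciding which ones are actually hung, and killing them, is left as a manual step.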
Issue Links
- relates to CVM-1967 GC crash on file IO error (Closed)