  1. CernVM
  2. CVM-1200

Restarting autofs on SL7 breaks CVMFS repositories


Details

    • Improvement
    • Resolution: Fixed
    • High
    • CernVM-FS 2.3.2, CernVM-FS 2.3.3
    • CVMFS
    • None
    • SL7

    • x86_64-slc6-gcc48-opt

    Description

      On SL6 machines with the CVMFS client installed (e.g. worker nodes), autofs can be restarted without any apparent effect on CVMFS. However, on SL7, if a repository is in use while autofs is restarted, any subsequent operation on it fails with a "Transport endpoint is not connected" error, e.g.

      [root@vm4 ~]# ls /cvmfs/cms.cern.ch/
      ls: cannot open directory /cvmfs/cms.cern.ch/: Transport endpoint is not connected

      I can reproduce this with 2.3.2 and 2.3.3 just by running this in one terminal:

      cd /cvmfs/cms.cern.ch/SITECONF

      and then running "systemctl restart autofs" in another terminal. The repository only becomes usable again after running:

      umount -l /cvmfs/cms.cern.ch
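
For nodes with several affected repositories, the manual recovery above could be scripted roughly as follows. This is only a sketch under assumptions: the variable names (MOUNTS_FILE, CVMFS_ROOT, DRY_RUN) and function names are hypothetical, and it assumes CVMFS mounts appear in /proc/mounts with a filesystem type starting with "fuse". It matches on the exact error text quoted in this report and runs the same "umount -l" used by hand.

```shell
#!/bin/sh
# Sketch only: detect CVMFS mounts left disconnected after an autofs restart
# and lazily unmount them, as done by hand in the report.

# Assumptions (not from the report): these variable names and defaults.
MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"
CVMFS_ROOT="${CVMFS_ROOT:-/cvmfs}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print what would be unmounted

# Print every FUSE mount point below $CVMFS_ROOT, one per line.
# Assumption: CVMFS mounts have a filesystem type starting with "fuse".
list_cvmfs_mounts() {
    awk -v root="$CVMFS_ROOT/" \
        'index($2, root) == 1 && $3 ~ /^fuse/ { print $2 }' "$MOUNTS_FILE"
}

# Succeed if the given text contains the error quoted in the report.
is_stale_error() {
    case "$1" in
        *"Transport endpoint is not connected"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Probe each mount; lazily unmount the ones that report the stale error.
recover_stale_mounts() {
    list_cvmfs_mounts | while IFS= read -r mp; do
        # Capture stderr only; LC_ALL=C keeps the message in English.
        err=$(LC_ALL=C ls "$mp" 2>&1 >/dev/null)
        if is_stale_error "$err"; then
            if [ "$DRY_RUN" = "1" ]; then
                echo "would run: umount -l $mp"
            else
                umount -l "$mp"
            fi
        fi
    done
}
```

Sourcing the file and calling recover_stale_mounts with the default DRY_RUN=1 only prints the umount commands it would run; setting DRY_RUN=0 (as root) performs the lazy unmounts. Healthy mounts are left alone because only the "Transport endpoint is not connected" error triggers an unmount.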

      Obviously we don't generally restart autofs on worker nodes running jobs, but occasionally an update applied by our configuration management system causes systemd to restart autofs.

      Here's an example from /var/log/messages of what happens on a worker node running jobs at the time autofs was restarted. In this case CVMFS seems to crash:

      2017-02-28T16:55:23.026438+00:00 lcg1662 systemd: Stopping Automounts filesystems on demand...
      2017-02-28T16:55:23.046421+00:00 lcg1662 cvmfs2: (lhcb.cern.ch) cache manager disappeared, aborting
      2017-02-28T16:55:23.046701+00:00 lcg1662 cvmfs2: (cms.cern.ch) cache manager disappeared, aborting
      2017-02-28T16:55:23.046953+00:00 lcg1662 cvmfs2: (alice.cern.ch) cache manager disappeared, aborting
      2017-02-28T16:55:23.171872+00:00 lcg1662 cvmfs2: (grid.cern.ch) cache manager disappeared, aborting
      2017-02-28T16:55:23.195837+00:00 lcg1662 cvmfs2: (atlas.cern.ch) cache manager disappeared, aborting
      2017-02-28T16:55:23.227994+00:00 lcg1662 cvmfs2: (sft.cern.ch) cache manager disappeared, aborting
      2017-02-28T16:55:28.810096+00:00 lcg1662 automount[1799]: umount_autofs_indirect: ask umount returned busy /cvmfs
      2017-02-28T16:55:53.068270+00:00 lcg1662 cvmfs2: (lhcb.cern.ch) stack trace generation failed
      2017-02-28T16:55:53.068583+00:00 lcg1662 cvmfs2: (alice.cern.ch) stack trace generation failed
      2017-02-28T16:55:53.068834+00:00 lcg1662 cvmfs2: (lhcb.cern.ch) Signal 6, errno 0
      2017-02-28T16:55:53.069071+00:00 lcg1662 cvmfs2: (alice.cern.ch) Signal 6, errno 0
      2017-02-28T16:55:53.072047+00:00 lcg1662 cvmfs2: (cms.cern.ch) stack trace generation failed
      2017-02-28T16:55:53.072282+00:00 lcg1662 cvmfs2: (cms.cern.ch) Signal 6, errno 0
      2017-02-28T16:55:53.157243+00:00 lcg1662 cvmfs2: (grid.cern.ch) stack trace generation failed
      2017-02-28T16:55:53.157518+00:00 lcg1662 cvmfs2: (grid.cern.ch) Signal 6, errno 0
      2017-02-28T16:55:53.209107+00:00 lcg1662 cvmfs2: (grid.cern.ch) Backtrace (10 symbols):#012/lib64/libcvmfs_fuse.so(+0xb54b3) [0x7f5086ecc4b3]#012/lib64/libpthread.so.0(+0xf370) [0x7f5089609370]#012/lib64/libc.so.6(gsignal+0x37) [0x7f5088a4d1d7]#012/lib64/libc.so.6(abort+0x148) [0x7f5088a4e8c8]#012/lib64/libc.so.6(+0x2e146) [0x7f5088a46146]#012/lib64/libc.so.6(+0x2e1f2) [0x7f5088a461f2]#012/lib64/libcvmfs_fuse.so(+0xd12df) [0x7f5086ee82df]#012/lib64/libcvmfs_fuse.so(+0x1268e3) [0x7f5086f3d8e3]#012/lib64/libpthread.so.0(+0x7dc5) [0x7f5089601dc5]#012/lib64/libc.so.6(clone+0x6d) [0x7f5088b0f73d]
      2017-02-28T16:55:53.209716+00:00 lcg1662 cvmfs2: (cms.cern.ch) Backtrace (10 symbols):#012/lib64/libcvmfs_fuse.so(+0xb54b3) [0x7f9cc2b524b3]#012/lib64/libpthread.so.0(+0xf370) [0x7f9cc528f370]#012/lib64/libc.so.6(gsignal+0x37) [0x7f9cc46d31d7]#012/lib64/libc.so.6(abort+0x148) [0x7f9cc46d48c8]#012/lib64/libc.so.6(+0x2e146) [0x7f9cc46cc146]#012/lib64/libc.so.6(+0x2e1f2) [0x7f9cc46cc1f2]#012/lib64/libcvmfs_fuse.so(+0xd12df) [0x7f9cc2b6e2df]#012/lib64/libcvmfs_fuse.so(+0x1268e3) [0x7f9cc2bc38e3]#012/lib64/libpthread.so.0(+0x7dc5) [0x7f9cc5287dc5]#012/lib64/libc.so.6(clone+0x6d) [0x7f9cc479573d]
      2017-02-28T16:55:53.210123+00:00 lcg1662 cvmfs2: (grid.cern.ch) address of g_cvmfs_exports: 0x7f50873e0580
      2017-02-28T16:55:53.210328+00:00 lcg1662 cvmfs2: (cms.cern.ch) address of g_cvmfs_exports: 0x7f9cc3066580
      2017-02-28T16:55:53.210554+00:00 lcg1662 cvmfs2: (lhcb.cern.ch) Backtrace (10 symbols):#012/lib64/libcvmfs_fuse.so(+0xb54b3) [0x7fa2cae3a4b3]#012/lib64/libpthread.so.0(+0xf370) [0x7fa2cd577370]#012/lib64/libc.so.6(gsignal+0x37) [0x7fa2cc9bb1d7]#012/lib64/libc.so.6(abort+0x148) [0x7fa2cc9bc8c8]#012/lib64/libc.so.6(+0x2e146) [0x7fa2cc9b4146]#012/lib64/libc.so.6(+0x2e1f2) [0x7fa2cc9b41f2]#012/lib64/libcvmfs_fuse.so(+0xd12df) [0x7fa2cae562df]#012/lib64/libcvmfs_fuse.so(+0x1268e3) [0x7fa2caeab8e3]#012/lib64/libpthread.so.0(+0x7dc5) [0x7fa2cd56fdc5]#012/lib64/libc.so.6(clone+0x6d) [0x7fa2cca7d73d]
      2017-02-28T16:55:53.210966+00:00 lcg1662 cvmfs2: (lhcb.cern.ch) address of g_cvmfs_exports: 0x7fa2cb34e580
      2017-02-28T16:55:53.211185+00:00 lcg1662 cvmfs2: (alice.cern.ch) Backtrace (10 symbols):#012/lib64/libcvmfs_fuse.so(+0xb54b3) [0x7ff26fb6a4b3]#012/lib64/libpthread.so.0(+0xf370) [0x7ff2722a7370]#012/lib64/libc.so.6(gsignal+0x37) [0x7ff2716eb1d7]#012/lib64/libc.so.6(abort+0x148) [0x7ff2716ec8c8]#012/lib64/libc.so.6(+0x2e146) [0x7ff2716e4146]#012/lib64/libc.so.6(+0x2e1f2) [0x7ff2716e41f2]#012/lib64/libcvmfs_fuse.so(+0xd12df) [0x7ff26fb862df]#012/lib64/libcvmfs_fuse.so(+0x1268e3) [0x7ff26fbdb8e3]#012/lib64/libpthread.so.0(+0x7dc5) [0x7ff27229fdc5]#012/lib64/libc.so.6(clone+0x6d) [0x7ff2717ad73d]
      2017-02-28T16:55:53.211606+00:00 lcg1662 cvmfs2: (alice.cern.ch) address of g_cvmfs_exports: 0x7ff27007e580
      2017-02-28T16:55:56.209270+00:00 lcg1662 systemd: Starting Automounts filesystems on demand...
      2017-02-28T16:55:56.249835+00:00 lcg1662 systemd: Started Automounts filesystems on demand.
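
To see which repositories were hit on a given node, the abort messages in a log like the one above can be reduced to a list of repository names. A minimal sketch, assuming the message format shown in this excerpt; LOG_FILE and aborted_repos are hypothetical names, and the default path is the one mentioned in this report.

```shell
#!/bin/sh
# Sketch: list repositories whose cvmfs2 process logged the abort message
# seen in the excerpt ("cache manager disappeared, aborting").

# Assumption: hypothetical variable; default is the log named in the report.
LOG_FILE="${LOG_FILE:-/var/log/messages}"

# Print each affected repository name once, sorted.
aborted_repos() {
    sed -n 's/.*cvmfs2: (\([^)]*\)) cache manager disappeared, aborting.*/\1/p' \
        "$LOG_FILE" | LC_ALL=C sort -u
}
```

Calling aborted_repos after sourcing the file prints one repository name per line; lines that do not contain the abort message are ignored.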

          People

            jblomer Jakob Blomer
            alahiff Andrew David Lahiff (Inactive)