CernVM / CVM-1515

Catalog updates are not working when HTTP access is not used

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Medium
    • Resolution: Fixed
    • Affects Version/s: CernVM-FS 2.4.4
    • Fix Version/s: CernVM-FS 2.5.1
    • Component/s: CVMFS
    • Labels:
      None
    • Operating System:
      GNU/Linux
    • Platforms:
      x86_64-slc6-gcc48-opt

      Description

      Hello,

      We have a major computing site that uses a preloaded alien cache, and HTTP access is disabled in the manner described in CVM-1507. We are finding that nodes are not updating to the latest repository revision available in the alien cache in a regular or consistent manner. In some cases, even after 3-4 days a node still has an old catalog revision. The issue is widespread and affects most nodes in the cluster.

      Looking at the output of cvmfs_config stat -v, we see, for example:
      File Catalog Revision: 1153 (expires in 1 minutes)
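
      For monitoring purposes, the status line above can be picked apart with a small script. The following is a minimal sketch that assumes the line has exactly the format quoted in this report; the function name parse_catalog_revision is hypothetical and not part of CernVM-FS, and other output variants are not handled.

```python
import re

# Matches the status line as quoted in this report, e.g.
# "File Catalog Revision: 1153 (expires in 1 minutes)".
# Assumption: any other wording of the expiry text is simply not matched.
_REVISION_LINE = re.compile(
    r"File Catalog Revision:\s*(\d+)\s*\(expires in (\d+) minutes?\)"
)


def parse_catalog_revision(stat_output):
    """Return (revision, minutes_until_expiry), or None if no such line is found."""
    match = _REVISION_LINE.search(stat_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

      Feeding it the captured text of cvmfs_config stat -v on each node would yield the revision that node is currently serving, which can then be compared across the cluster.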

      After the timer reaches 0 minutes, these messages are logged:

      Apr 20 15:23:21 nia-login01 cvmfs2: (soft.computecanada.ca) failed to resolve IP addresses for NULL (4 - unknown host name)
      Apr 20 15:23:22 nia-login01 cvmfs2: (soft.computecanada.ca) failed to download repository manifest (6 - proxy connection problem)

      We speculate that once the attempt to download the manifest via HTTP fails, no further action is taken, whereas the client should check the cvmfschecksum files in the alien cache to find the latest available root catalog hash and start using it.

      We tried adjusting CVMFS_MAX_TTL, but it did not help.
      We tried changing CVMFS_HTTP_PROXY to DIRECT and found that this caused the node to load the latest catalog revision almost immediately. However, we then run into the problem that files in the latest revision are not yet available in the preloaded alien cache, and the client does not have a writable cache in which to store any downloaded files.

       

      This is a large MPI cluster, so it is important for the contents of CVMFS to be consistent across different nodes in the cluster, which is one of the reasons we decided on this configuration of alien cache. How can we get the nodes to stay up to date with the latest version in the alien cache? Is there a bug or misconfiguration?

      Thanks!
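
      The consistency requirement described above could be checked with something like the following minimal sketch. It assumes the per-node catalog revisions have already been collected by some other means; the function name find_stale_nodes, the node names other than nia-login01, and the revision 1160 are hypothetical and serve only as illustration.

```python
def find_stale_nodes(revisions_by_node, expected_revision=None):
    """Return {node: revision} for nodes serving an older catalog revision.

    If expected_revision is None, the highest revision seen on any node
    is used as the target (an assumption; the alien cache may be newer still).
    """
    if not revisions_by_node:
        return {}
    if expected_revision is None:
        target = max(revisions_by_node.values())
    else:
        target = expected_revision
    return {
        node: revision
        for node, revision in sorted(revisions_by_node.items())
        if revision < target
    }


# Illustrative input only: node-a, node-b and revision 1160 are made up.
stale = find_stale_nodes({"nia-login01": 1153, "node-a": 1160, "node-b": 1160})
```

      Passing expected_revision explicitly is the safer choice when the revision published in the alien cache is known, since every node may be equally out of date.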

       


            People

            • Assignee: Jakob Blomer (jblomer)
            • Reporter: Ryan Taylor (rptaylor)
