CernVM / CVM-1480

Higher accumulated CPU time for cvmfs2 process when network bandwidth is limited

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Medium
    • Resolution: Fixed
    • Affects Version/s: CernVM-FS 2.4.1
    • Fix Version/s: CernVM-FS 2.5
    • Component/s: CVMFS
    • Labels: None
    • Operating System: GNU/Linux

      Description

      Hi,

      I was recently testing the effect of artificial network bandwidth limits on a machine, measuring how the CPU time of some experiment workloads varied. I saw that a cvmfs2 process used much more CPU time with the limit in place than without it. The machine was an otherwise idle CentOS 7 host running CernVM-FS 2.4.1 (mounted with autofs), with no virtualisation.

      I hope this will reproduce the effect for you:

      # ls -al /cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc49-opt/20.7.5/sw/IntelSoftware/linux/x86_64/xe2013/composer_xe_2013.3.163/compiler/lib/intel64/libintlc.so.5
       
      -rwxr-xr-x. 1 cvmfs cvmfs 351578 Mar 15  2013 /cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc49-opt/20.7.5/sw/IntelSoftware/linux/x86_64/xe2013/composer_xe_2013.3.163/compiler/lib/intel64/libintlc.so.5
       
      # cvmfs_config wipecache
      # systemctl restart autofs
      # sync
      # echo 3 > /proc/sys/vm/drop_caches
      # iptables -I INPUT 1 -p tcp -m hashlimit -m tcp ! -s <mylaptop.cern.ch> --tcp-flags SYN,FIN NONE --hashlimit-name hl1 --hashlimit-above 100kb/s -j DROP
      # time dd if=/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc49-opt/20.7.5/sw/IntelSoftware/linux/x86_64/xe2013/composer_xe_2013.3.163/compiler/lib/intel64/libintlc.so.5 of=/dev/null  bs=131072
      2+1 records in
      2+1 records out
      351578 bytes (352 kB) copied, 0.00193543 s, 182 MB/s
       
      real	8m53.300s
      user	0m0.002s
      sys	0m0.005s
      #  ps auxgww | grep cvmfs2
      # [...]
      cvmfs    16338 98.6  0.0 706944 31456 ?        Sl   08:52   7:55 /usr/bin/cvmfs2 -o rw,fsname=cvmfs2,allow_other,grab_mountpoint,uid=994,gid=991 atlas.cern.ch /cvmfs/atlas.cern.ch
      # iptables -D INPUT 1
      

      With the stock 2.4.1 build, the cvmfs2 process accumulated 475 CPU seconds and the copy took 533 wall seconds. I did some investigation and wrote a trial patch (see attachment). With the patch, the same test used 11 CPU seconds (534 wall seconds).
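      For reference, the 475 CPU seconds quoted above is the 7:55 shown in the TIME column of the ps output. A minimal sketch of that conversion, and of the CPU-to-wall ratio, follows. The function names cputime_to_seconds and cpu_percent are made up for illustration and are not part of cvmfs or procps.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Convert a ps TIME value ([[DD-]HH:]MM:SS) to whole seconds.
cputime_to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        # Weights for the fields read right to left:
        # seconds, minutes, hours, days.
        split("1 60 3600 86400", w, " ")
        s = 0
        for (i = NF; i >= 1; i--) s += $i * w[NF - i + 1]
        print s
    }'
}

# CPU time as a rounded percentage of wall time.
# $1 = CPU seconds, $2 = wall seconds.
cpu_percent() {
    awk -v c="$1" -v w="$2" 'BEGIN { printf "%.0f\n", 100 * c / w }'
}

# Example with the numbers from this report:
#   cputime_to_seconds 7:55   # -> 475
#   cpu_percent 475 533       # -> 89  (stock 2.4.1)
#   cpu_percent 11 534        # -> 2   (trial patch)
```

      On a live system the TIME value could be read with something like `ps -o time= -C cvmfs2` and passed to cputime_to_seconds. In this test the stock build kept one core busy for about 89% of the wait, against about 2% with the patch.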

      Without the network limit, the timings were: stock build, ~1 s CPU and 2.9 s wall (-0.4/+0.2); patched build, ~1 s CPU and 2.4 s wall (-0.3/+0.1).

      I don't know how much this potentially large CPU usage matters in a realistic deployment, since my network limit is extreme and artificial. It still seems worth investigating. I attach the patch I tried so that you can see the area I was looking at. (The patch is only a suggestion, to show that a reduction in CPU usage may be possible.)

      David

        Attachments

        1. after.txt (4 kB)
        2. before.txt (4 kB)
        3. cvmfs.patch (0.8 kB)

              People

              • Assignee: Radu Popescu (rpopescu)
              • Reporter: David Smith (dhsmith)
