ROOT-9369

A couple issues involving SetMustClean


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 6.12/06
    • Fix Version/s: 6.12/08, 6.14/00, 6.16/00
    • Component/s: Core Libraries
    • Labels:
      None
    • Environment:

      root 6.12.06 on lxplus (ATLAS master/Athena builds)

      Description

      hi -

      When we switched from root 6.10.06 to root 6.12.06, we observed some
      issues related to the use of SetMustClean. The first i think is
      relatively straightforward, and can be reproduced like this:

      $ python
      >>> import ROOT
      >>> ROOT.gROOT.SetMustClean(False)
      >>> t=ROOT.TTree('z','z')
      Error in <ROOT::Internal::TCheckHashRecursiveRemoveConsistency::CheckRecursiveRemove>: The class TNamed overrides TObject::Hash but does not call TROOT::RecursiveRemove in its destructor (seen while checking TTree).
      

      It appears that the checks done in TCheckHashRecursiveRemoveConsistency,
      which are new in this version of root, don't work correctly if MustClean
      is off.

      However, we've also been seeing crashes related to the same bit of code.
      Here is an example stack trace from such a crash:

      #0 0x000000000a563b60 in ?? ()
      #1 0x00007f96690e6e22 in TList::FindObject (this=0x1f7bc70, obj=0xa563b60)
       at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Debug/COMPILER/gcc62binutils/LABEL/slc6/build/projects/ROOT-6.12.06/src/ROOT/6.12.06/core/cont/src/TList.cxx:614
      #2 0x00007f96690e316e in THashTable::FindObject (this=0x1f79950, 
       obj=0xa563b60)
       at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Debug/COMPILER/gcc62binutils/LABEL/slc6/build/projects/ROOT-6.12.06/src/ROOT/6.12.06/core/cont/src/THashTable.cxx:245
      #3 0x00007f96690e1c12 in THashList::Remove (this=0x1f798d0, obj=0xa563b60)
       at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Debug/COMPILER/gcc62binutils/LABEL/slc6/build/projects/ROOT-6.12.06/src/ROOT/6.12.06/core/cont/src/THashList.cxx:381
      #4 0x00007f966a6c56fd in TTree::~TTree (this=0xba1c5f0, 
       __in_chrg=<optimized out>)
       at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Debug/COMPILER/gcc62binutils/LABEL/slc6/build/projects/ROOT-6.12.06/src/ROOT/6.12.06/tree/tree/src/TTree.cxx:927
      #5 0x00007f966a6c5a2a in TTree::~TTree (this=0xba1c5f0, 
       __in_chrg=<optimized out>)
       at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Debug/COMPILER/gcc62binutils/LABEL/slc6/build/projects/ROOT-6.12.06/src/ROOT/6.12.06/tree/tree/src/TTree.cxx:958
      #6 0x00007f96592185e3 in std::auto_ptr<TObject>::~auto_ptr (
       this=0x7ffe2ee433f0, __in_chrg=<optimized out>)
       at /cvmfs/atlas-nightlies.cern.ch/repo/sw/master/sw/lcg/releases/gcc/6.2binutils/x86_64-slc6-gcc62-opt/include/c++/6.2.0/backward/auto_ptr.h:170
       
      #7 0x00007f96591e96e7 in dqutils::MonitoringFile::mergeDirectory (
       this=0x7bf3830, outputDir=0xb9e0bc0, inputFiles=..., 
       has_multiple_runs=false, prefixes=0x7ffe2ee44bc0)
       at /afs/cern.ch/user/s/ssnyder/atlas-work3g/DataQuality/DataQualityUtils/src/MonitoringFile.cxx:592
      

      The crash, however, does not always happen, even when rerunning the exact same code.

      I think what's going on is the same issue as illustrated by this example:

      $ python
      >>> import ROOT
      >>> ROOT.gROOT.SetMustClean(False)
      >>> t=ROOT.TTree('z','z')
      Error in <ROOT::Internal::TCheckHashRecursiveRemoveConsistency::CheckRecursiveRemove>: The class TNamed overrides TObject::Hash but does not call TROOT::RecursiveRemove in its destructor (seen while checking TTree).
      >>> c=t.CloneTree()
      >>> del c
      >>> del t
       *** Break *** segmentation violation
      

      When the tree is cloned, the clone gets added to gROOT->fCleanups.
      But if MustClean is off when the clone is deleted, it doesn't get
      removed from the list, leaving a dangling pointer to a deleted object,
      which we hit when t is deleted.
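
      The bookkeeping described above can be sketched as a toy model in plain
      Python (no ROOT needed). The names Registry, Node and simulate are made up
      for illustration and are not ROOT APIs; this assumes the description above
      is right and only models a cleanup list whose entries are removed on
      deletion when a must_clean flag is set.

```python
class Node:
    """Hypothetical stand-in for an object tracked by a cleanup list."""

    def __init__(self, name):
        self.name = name
        self.alive = True


class Registry:
    """Toy cleanup list (an assumption for illustration, not ROOT's code)."""

    def __init__(self, must_clean=True):
        self.must_clean = must_clean
        self.entries = []

    def register(self, obj):
        # The object is recorded when it is created or cloned.
        self.entries.append(obj)

    def on_delete(self, obj):
        # The object goes away; its entry is dropped only if must_clean is on.
        obj.alive = False
        if self.must_clean:
            self.entries = [e for e in self.entries if e is not obj]

    def stale_entries(self):
        # Entries still listed although their object is gone: the analogue
        # of a dangling pointer in the real C++ list.
        return [e.name for e in self.entries if not e.alive]


def simulate(must_clean):
    registry = Registry(must_clean=must_clean)
    clone = Node("clone")
    registry.register(clone)
    registry.on_delete(clone)
    return registry.stale_entries()
```

      With the flag on, simulate(True) leaves no stale entry; with it off,
      simulate(False) leaves the deleted clone in the list, which is the state
      in which a later traversal of the list would touch freed memory in C++.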

      Now, i don't completely understand this: as i mentioned, the crash in our
      code only happens some of the time. Further, our code worked ok in 6.10.06,
      even though the example above also crashes in 6.10.06. My guess is that the
      change in root versions affected the order in which the trees get deleted,
      but i haven't proven that.

      It may well be that the answer to these issues, especially the crash,
      is `don't do that then'. In fact, if i remove the SetMustClean call,
      then both problems go away. The original author of the code says that
      turning off SetMustClean was done originally to address performance issues.
      We should check on our side if that's still an issue.

       


              People

              Assignee:
              pcanal Philippe Canal
              Reporter:
              ssnyder Scott Snyder