ROOT-10524

Performance anomaly when running TChain::AddFriend()

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 6.18/04
    • Fix Version/s: 6.22/00
    • Component/s: TTree
    • Labels:
      None
    • Environment:

      CentOS 7, Ubuntu 18.04

      Description

      The following reproducer:

      #include <string>
      #include <vector>

      #include "TChain.h"

      void tchain_macro_reproducer() {
          // Files holding the main tree (branches x and y).
          std::vector<std::string> main_files_names = {"filexy.root"};
          // One inner vector per friend chain (branch z, then branch z2).
          std::vector<std::vector<std::string>> friend_tag_names = {{"filez.root"}, {"filez2.root"}};

          // Main chain over the TTree named "myTree".
          TChain* chain = new TChain("myTree");
          for (const auto& name : main_files_names) {
              chain->Add(name.c_str());
          }

          // Build one friend chain per group of files and attach it to the main chain.
          for (const auto& group : friend_tag_names) {
              TChain* fchain = new TChain("myTree");
              for (const auto& name : group) {
                  fchain->Add(name.c_str());
              }
              chain->AddFriend(fchain);
          }

          // Read only branch x of the main chain; the friend branches are never accessed.
          double p;
          chain->SetBranchAddress("x", &p);
          for (Long64_t i = 0; i < chain->GetEntries(); i++) {
              chain->GetEntry(i);
          }
      }

      int main() {
          tchain_macro_reproducer();
          return 0;
      }
      

      shows an unexpected and drastic drop in performance.

      The idea is simple: we have a main TChain, built from the TTree "myTree" stored in filexy.root, which has two branches (x and y) and 1800000 events.

      Then we create two other TChains out of the TTrees "myTree" saved in filez.root and filez2.root, each one containing only one branch (z in the case of filez.root and z2 in the case of filez2.root). 

      Note that we put the names of the files in vectors because in principle we want to add all the files in the vector to the appropriate TChain.

       

      The point is that the loop never touches branches belonging to the friend TChains, yet adding them still seems to affect the performance of the program. You can easily check the difference between the two configurations by commenting out the addition of the friend TChains.

      I attach the following:

      • FlameGraph of the script, where a weird (in my opinion) call to TTree::RemoveFriend() can be seen (I say weird because we don't explicitly remove friends in any part of the code);
      • the scripts used to create the TFiles mentioned in the script;
      • a copy of the script itself. 

       

      To compile and produce the FlameGraph I did the following:

      • clone the FlameGraph repository and move inside it;
      • copy there the reproducer and the root files produced with the Python scripts attached;
      • after compiling the reproducer and getting an executable called exe, run the following: 

        sudo perf record -a -g ./exe
        sudo perf script | ./stackcollapse-perf.pl > out.perf-folded
        ./flamegraph.pl out.perf-folded > perf-kernel.svg

        Attachments

        1. fill_tree_xy.py (0.4 kB)
        2. fill_tree_z.py (0.3 kB)
        3. fill_tree_z2.py (0.3 kB)
        4. perf-kernel_fixed.svg (729 kB)
        5. perf-kernel.svg (721 kB)
        6. RDataFrame.C (0.7 kB)
        7. RDataFrame.svg (762 kB)
        8. tchain_macro_reproducer.C (0.8 kB)

            People

            • Assignee: Philippe Canal (pcanal)
            • Reporter: Massimiliano Galli (gallim)
            • Votes: 0
            • Watchers: 4
