  1. ROOT
  2. ROOT-9476 [DF] Upgrade of Data Frame for 6.16
  3. ROOT-9491

[DF] Add common base class to all node types


Details

    • Sub-task
    • Status: Closed
    • High
    • Resolution: Fixed
    • None
    • 6.16/00
    • None
    • None

    Description

      There are a number of trivial operations that users often want to perform on dataframes that are surprisingly hard to get right, for example adding several `Define`s in a loop or conditionally adding a `Filter` depending on a runtime boolean (both use cases are challenging in C++ and trivial in Python).

      Difficulties boil down to the fact that different dataframe nodes have different types (because their types incorporate e.g. the type of the callable passed to a `Filter` and the type of their parent node in the computation graph).
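
      To make the typing problem concrete, here is a minimal standalone sketch. `ToySource`, `ToyFilter` and `MakeFilter` are hypothetical stand-ins invented for illustration, not ROOT classes; they only mimic the fact that a node's type records both its parent type and its callable type.

```cpp
#include <type_traits>
#include <utility>

// Toy stand-ins, NOT ROOT classes: the node type records both parent and callable.
struct ToySource {};

template <typename Parent, typename F>
struct ToyFilter {
   Parent fParent;
   F fPredicate;
};

template <typename Parent, typename F>
ToyFilter<Parent, F> MakeFilter(Parent parent, F predicate)
{
   return ToyFilter<Parent, F>{std::move(parent), std::move(predicate)};
}

// Two filters with textually identical lambdas still have different types,
// because every lambda expression has its own unique closure type.
auto filterA = MakeFilter(ToySource{}, [](int x) { return x > 0; });
auto filterB = MakeFilter(ToySource{}, [](int x) { return x > 0; });
static_assert(!std::is_same<decltype(filterA), decltype(filterB)>::value,
              "each lambda yields a distinct node type");

// Chaining changes the type again, since the parent type is part of it.
auto chained = MakeFilter(filterA, [](int x) { return x < 10; });
static_assert(!std::is_same<decltype(chained), decltype(filterA)>::value,
              "the parent type is encoded in the child type");
```

      In this toy model, a function that returns `chained` on one branch and `filterA` on another has no single return type to declare, which is the situation described above.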

      I propose to add a common base class `ROOT::Detail::RDF::RNodeBase` to all nodes of the graph (except leaves, a.k.a. results, which have a completely different interface), so that users can, for example:

      • take any dataframe node by reference in non-template functions as `RNode&`
      • `emplace_back` dataframe nodes in ~`std::vector<RNode>`~ `vector<RInterface<RNode>>`
      • have non-const pointers to dataframe nodes

      and so on.
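
      The idea behind the list above can be sketched with a toy model. This is not ROOT's implementation: `NodeBase`, `SourceNode`, `FilterNode`, `Node`, `AddFilter`, `CountAccepted`, `MaybeEvenOnly` and `AddThresholds` are all hypothetical names, and the type-erased handle here is simply assumed to be a `std::shared_ptr` to the base class.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy model only: these are NOT ROOT's classes.
// Common base class: the one type every non-leaf node can be viewed as.
struct NodeBase {
   virtual ~NodeBase() = default;
   // True if the entry passes this node and all of its ancestors.
   virtual bool Accepts(int entry) const = 0;
};

// Root of the toy graph: accepts every entry.
struct SourceNode final : NodeBase {
   bool Accepts(int) const override { return true; }
};

// Type-erased handle (an assumption of this sketch).
using Node = std::shared_ptr<NodeBase>;

// A filter node whose concrete type still depends on the callable type F.
template <typename F>
struct FilterNode final : NodeBase {
   FilterNode(Node parent, F f) : fParent(std::move(parent)), fPredicate(std::move(f)) {}
   bool Accepts(int entry) const override { return fParent->Accepts(entry) && fPredicate(entry); }
   Node fParent;
   F fPredicate;
};

// Builds a filter but returns it through the common handle type.
template <typename F>
Node AddFilter(Node parent, F f)
{
   return std::make_shared<FilterNode<F>>(std::move(parent), std::move(f));
}

// Non-template function that works with any node.
int CountAccepted(const Node &node, int nEntries)
{
   int n = 0;
   for (int i = 0; i < nEntries; ++i)
      if (node->Accepts(i))
         ++n;
   return n;
}

// Conditionally adding a filter: both branches have the same type.
Node MaybeEvenOnly(Node node, bool evenOnly)
{
   return evenOnly ? AddFilter(node, [](int e) { return e % 2 == 0; }) : node;
}

// Adding nodes in a loop: the variable can be reassigned at each step.
Node AddThresholds(Node node, const std::vector<int> &thresholds)
{
   for (int t : thresholds)
      node = AddFilter(node, [t](int e) { return e >= t; });
   return node;
}
```

      The price of the uniform handle in this sketch is one virtual call per node per entry, which mirrors the trade-off discussed at the end of the description.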

      For example, conditionally adding a `Range` to a dataframe now looks like this:

      // The explicit RNode return type gives both branches a single common type
      auto maybe_ranged = [&df, mustAddRange]() -> ROOT::RDF::RNode {
            return mustAddRange ? df.Range(1) : df;
      }();

      while before this change one would have to add fake `Filter("true")` filters to normalize the return type of the lambda, involving the interpreter for no reason.

      Internal `RDataFrame` code is also simplified by the introduction of this common base class.
      The only downside I can think of is that, if this mechanism is abused, users might end up with extra, unnecessary virtual calls in their event loop. On the other hand, this mechanism should only be needed in situations that previously required either complex template magic or dirty and slow tricks.

          People

            eguiraud Enrico Guiraud