ROOT / ROOT-8858 Wishes for the TDataFrame for 6.14 / ROOT-9371

[TDF] Allow users to define and execute custom actions



    • Type: Sub-task
    • Resolution: Fixed
    • Priority: High
    • 6.14/00


      We cannot satisfy the needs of each individual user with a finite number of TDF actions.
      Users who need something more specific than the default actions must be able to plug their own custom actions into the TDF machinery while retaining the benefits of lazy execution, implicit parallelization and so forth.

      The idea is to allow the following syntax:

      auto customResult = tdf.Book(CustomHelper("x"));

      where CustomHelper is a user-defined type that must satisfy a minimal set of requirements:

      • Helper(Helper &&): a move-constructor is required. Copy-constructors are discouraged.
      • ColumnTypes_t: alias for a ROOT::TypeTraits::TypeList instantiation that specifies the types of the columns to be passed to this action helper.
      • Result_t: alias for the type of the result of this action helper. Must be default-constructible.
      • ROOT::Detail::TDF::ColumnNames_t GetColumnNames() const: return the names of the columns processed by this action. The number of names must be equal to the size of ColumnTypes_t.
      • void Exec(unsigned int slot, ColumnTypes... columnValues): each worker thread calls this method during the event loop, possibly concurrently with other threads. No two threads will ever call Exec with the same 'slot' value: this parameter is there to facilitate writing thread-safe helpers. The other arguments are the values of the requested columns for the particular entry being processed.
      • void InitSlot(TTreeReader *, unsigned int slot): each worker thread calls this method during the event loop, before processing a batch of entries. It can be used e.g. to prepare the helper to process a batch of entries in a given thread. Can be a no-op.
      • void Initialize(): this method is called once before the event loop starts. Useful for setup operations. Can be a no-op.
      • void Finalize(): this method is called at the end of the event loop. Commonly used to finalize the contents of the result.
      • Result_t &PartialUpdate(unsigned int slot): this method is optional, i.e. can be omitted. If present, it should return the value of the partial result of this action for the given 'slot'. Different threads might call this method concurrently, but will always pass different 'slot' numbers.
      • std::shared_ptr<Result_t> GetResultPtr() const: return a shared_ptr to the result of this action (of type Result_t). The TResultPtr returned by Book will point to this object.
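
The requirements listed above can be illustrated with a minimal sketch of a helper that sums one column of doubles. Everything below is an assumption made for illustration. SumHelper and RunSketch are hypothetical names. TypeList, ColumnNames_t and TTreeReader are local stand-ins for the ROOT types, so that the sketch compiles without ROOT. Passing the number of slots to the constructor is also an assumption, since this issue does not say how a helper learns it. The stand-in loop runs the slots one after another, whereas the real event loop may run them concurrently.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-ins so the sketch builds without ROOT. Real code would use
// ROOT::TypeTraits::TypeList, ROOT::Detail::TDF::ColumnNames_t and TTreeReader.
template <typename... Ts>
struct TypeList {
   static constexpr std::size_t size = sizeof...(Ts);
};
using ColumnNames_t = std::vector<std::string>;
class TTreeReader; // only used through a pointer, never dereferenced here

// Hypothetical helper: sums the values of one column of doubles.
class SumHelper {
public:
   using ColumnTypes_t = TypeList<double>; // one column, of type double
   using Result_t = double;                // default-constructible, as required

   // Assumption: the number of slots is handed to the helper at construction.
   SumHelper(std::string column, unsigned int nSlots)
      : fColumn(std::move(column)), fPartialSums(nSlots, 0.), fResult(std::make_shared<Result_t>(0.))
   {
   }
   SumHelper(SumHelper &&) = default;      // move constructor is required
   SumHelper(const SumHelper &) = delete;  // copies are discouraged

   ColumnNames_t GetColumnNames() const { return {fColumn}; }
   void Initialize() {}                       // no-op
   void InitSlot(TTreeReader *, unsigned int) {} // no-op
   // Each slot writes only to its own partial sum, so no locking is needed.
   void Exec(unsigned int slot, double value) { fPartialSums[slot] += value; }
   // Merge the per-slot partial sums into the final result.
   void Finalize()
   {
      *fResult = 0.;
      for (double partial : fPartialSums)
         *fResult += partial;
   }
   Result_t &PartialUpdate(unsigned int slot) { return fPartialSums[slot]; } // optional
   std::shared_ptr<Result_t> GetResultPtr() const { return fResult; }

private:
   std::string fColumn;
   std::vector<double> fPartialSums; // one entry per slot
   std::shared_ptr<Result_t> fResult;
};

// Hypothetical stand-in for the event loop: batches[slot] holds the column
// values processed by that slot. Slots are run sequentially for simplicity.
template <typename Helper>
typename Helper::Result_t RunSketch(Helper helper, const std::vector<std::vector<double>> &batches)
{
   // The number of column names must match the size of ColumnTypes_t.
   assert(helper.GetColumnNames().size() == Helper::ColumnTypes_t::size);
   auto result = helper.GetResultPtr();
   helper.Initialize();
   for (unsigned int slot = 0; slot < batches.size(); ++slot) {
      helper.InitSlot(nullptr, slot);
      for (double value : batches[slot])
         helper.Exec(slot, value);
   }
   helper.Finalize();
   return *result;
}
```

With the syntax proposed in this issue, such a helper would presumably be booked as tdf.Book(SumHelper("x", nSlots)). In the sketch, RunSketch(SumHelper("x", 2), {{1., 2.}, {3., 4.}}) drives the same sequence of calls by hand.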




            eguiraud Enrico Guiraud