Details
Sub-task
Resolution: Fixed
High
master, 6.14/00
None
None
any
Description
An ALICE user on the forum has built an RDataFrame application that writes 243 branches to disk. On my workstation it takes around 18 seconds to just-in-time compile a Snapshot invocation with that many template parameters (compiling it ahead of time takes slightly longer, but that cost is not paid at runtime). At the same time, it is unreasonable to expect a user to type out 243 branch types explicitly as template parameters.
More recently, ATLAS users porting an ntuple-to-ntuple conversion tool to RDataFrame encountered long startup times for a similar reason: they make 30 different calls to Snapshot, each of which jits the same 43 template arguments. Again, it is probably unreasonable to expect users to spell out those 43 types explicitly as template arguments.
The following little snippet checks how long it takes to just-in-time compile a Snapshot call with 40 column parameters:
#include <ROOT/RDataFrame.hxx>
#include <chrono>
#include <iostream>
#include <string>
#include <vector>

int main()
{
   auto d = ROOT::RDataFrame(1);
   ROOT::RDF::RSnapshotOptions opts;
   opts.fLazy = true; // book the Snapshot without running the event loop

   // First call: jit a Snapshot with 40 column parameters
   auto start1 = std::chrono::system_clock::now();
   const std::vector<std::string> columnList(40, "tdfentry_");
   auto l = d.Snapshot("t", "foo.root", columnList, opts);
   auto end1 = std::chrono::system_clock::now();
   // duration<double> yields seconds regardless of the clock's tick period
   std::cout << std::chrono::duration<double>(end1 - start1).count() << " s" << std::endl;

   // Second call: same column list, timed separately
   auto start2 = std::chrono::system_clock::now();
   d.Snapshot("t", "foo.root", columnList, opts);
   auto end2 = std::chrono::system_clock::now();
   std::cout << std::chrono::duration<double>(end2 - start2).count() << " s" << std::endl;

   *l; // work around ROOT-9466
   return 0;
}
On my workstation, with BUILD_TYPE=Release, this takes 1.6 s for the first Snapshot and 1.17 s for the second (which, I guess, re-uses some memoized template instantiations from the first).
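For anyone who wants to repeat this kind of measurement on other calls, here is a minimal sketch of a reusable timing helper. It uses only the standard library. The name `time_seconds` is hypothetical and not part of ROOT. It uses `std::chrono::steady_clock` so that the result cannot be affected by system clock adjustments, and `std::chrono::duration<double>` so that the result is in seconds whatever the clock's tick period is.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not part of ROOT): invokes `f` exactly once and
// returns the elapsed wall-clock time in seconds.
template <typename F>
double time_seconds(F &&f)
{
   const auto start = std::chrono::steady_clock::now();
   std::forward<F>(f)();
   const auto stop = std::chrono::steady_clock::now();
   // duration<double> has a period of one second, so count() is in seconds
   return std::chrono::duration<double>(stop - start).count();
}
```

With this in place, a Snapshot call could be timed as `time_seconds([&] { d.Snapshot("t", "foo.root", columnList, opts); })`, assuming `d`, `columnList` and `opts` are defined as in the snippet above.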
Snapshot is particularly susceptible to this problem because it's the one action that is often applied to a large number of columns, while other operations are either not just-in-time compiled or are applied to few columns at a time.