As promised in my presentation, here is the bug report for the (unnecessary) slowness of TMethodCall::Execute.
I think it's mostly clear from the callgrind information in the attached image (taken from the kcachegrind disassembly view).
Sadly, line number information is somehow missing from the disassembly, but I am pretty sure it is happening in "core/meta/src/TClingCallFun
This is also present in some of the other branches of "TClingCallFunc::exec_with_valref_return", actually.
It triggers cling::Value construction, move, and destruction.
Most time in this cling::Value constructor seems to go into
which of course only needs to be done when a fresh cling::Value is really necessary.
According to callgrind, this seems to eat more than 50 % of the cycles when the called function itself is lightweight and returns a POD type.
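
To illustrate the pattern, here is a minimal sketch. It is not the actual ROOT or cling code: `HeavyValue`, `g_constructions`, `callee`, `exec_always_fresh` and `exec_reuse` are hypothetical names made up for this example, and `HeavyValue` only stands in for a wrapper whose construction is expensive. The first function builds a fresh temporary on every call and moves it into the result, the second writes into the result the caller already owns.

```cpp
#include <cassert>
#include <utility>

// Counts constructions of the hypothetical wrapper below.
int g_constructions = 0;

// Hypothetical stand-in for a value wrapper that is expensive to construct.
// This is NOT cling::Value; it only records how often it gets constructed.
struct HeavyValue {
    long payload = 0;

    HeavyValue() { ++g_constructions; }
    HeavyValue(HeavyValue&& other) noexcept : payload(other.payload) {
        ++g_constructions;
    }
    HeavyValue& operator=(HeavyValue&& other) noexcept {
        payload = other.payload;
        return *this;
    }
};

// A lightweight, POD-returning function, as in the case described above.
inline long callee() { return 42; }

// Pattern from the report: construct a fresh temporary on every call,
// move it into the result, then destroy the temporary.
inline void exec_always_fresh(HeavyValue& ret) {
    HeavyValue tmp;
    tmp.payload = callee();
    ret = std::move(tmp);
}

// Possible alternative: write into the result the caller already provides,
// so no wrapper is constructed on this path.
inline void exec_reuse(HeavyValue& ret) {
    ret.payload = callee();
}
```

Calling `exec_always_fresh` in a loop pays for one construction, one move and one destruction per call, while `exec_reuse` pays for none of them. Whether the real code can reuse the caller's value in all branches is something I have not checked.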