Details
Sub-task
Status: Closed
High
Resolution: Fixed
6.10/04
None
Description
Report from Dan Riley:
I have a couple of more immediate questions about the TBufferMerger from my scaling studies. The first has to do with this commit,
https://github.com/root-project/root/commit/a3b9d864a0017479a694f0d6dddb926f4f79f80b
Prior to that commit, the TBufferMerger::WriteOutputFile() central loop had a fairly limited lock on gROOTMutex:
{
   TDirectory::TContext ctxt;
   TMemFile *tmp;
   {
      R__LOCKGUARD2(gROOTMutex);
      tmp = new TMemFile(fName.c_str(), buffer->Buffer() + buffer->Length(), length, "READ");
   }
   buffer->SetBufferOffset(buffer->Length() + length);
   merger.AddAdoptFile(tmp);
   merger.PartialMerge();
   merger.Reset();
}
Following that commit, and on into the current version, the gROOTMutex lock expanded to encompass the entire partial merge:
{
   R__LOCKGUARD(gROOTMutex);
   memfile.reset(new TMemFile(fName.c_str(), buffer->Buffer() + buffer->Length(), length, "read"));
   buffer->SetBufferOffset(buffer->Length() + length);
   merger.AddFile(memfile.get(), false);
   merger.PartialMerge();
}
The expanded scope of the gROOTMutex lock seriously limits how well the TBufferMerger scales for our purposes (partly for reasons related to my next question). Is that scope expansion necessary? Is there any prospect of reducing it back to the original limited scope of creating the TMemFile?
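To illustrate why the scope of the lock matters for scaling, here is a minimal toy sketch. It does not use ROOT at all: gToyMutex is a hypothetical stand-in for gROOTMutex, the 20 ms sleep is a hypothetical stand-in for the partial merge, and PeakConcurrentMerges is an illustrative helper, not part of TBufferMerger. It only shows the general effect of holding a global lock across the long phase versus only across the short setup phase.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

std::mutex gToyMutex; // hypothetical stand-in for gROOTMutex

// Runs nThreads workers and returns the peak number of workers that were
// inside the long "merge" phase at the same time.
int PeakConcurrentMerges(bool lockWholeMerge, int nThreads)
{
   std::atomic<int> active{0};
   std::atomic<int> peak{0};

   // Stand-in for the expensive part (the partial merge in the report).
   auto mergePhase = [&] {
      int now = ++active;
      int seen = peak.load();
      // Record a new peak if this thread observed a higher concurrency.
      while (now > seen && !peak.compare_exchange_weak(seen, now)) {
      }
      std::this_thread::sleep_for(std::chrono::milliseconds(20));
      --active;
   };

   auto worker = [&] {
      if (lockWholeMerge) {
         // Wide scope: the lock is held for setup and for the whole merge.
         std::lock_guard<std::mutex> guard(gToyMutex);
         mergePhase();
      } else {
         {
            // Narrow scope: the lock covers only the short setup step.
            std::lock_guard<std::mutex> guard(gToyMutex);
         }
         mergePhase();
      }
   };

   std::vector<std::thread> threads;
   for (int i = 0; i < nThreads; ++i)
      threads.emplace_back(worker);
   for (auto &t : threads)
      t.join();
   return peak.load();
}
```

With the wide scope the peak is always 1, because every thread that needs the lock waits for the full merge to finish. With the narrow scope the long phases are free to overlap, although how much they actually overlap depends on scheduling.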
The second question is, unfortunately, far more amorphous. With the prototype I'm working with, I see the writer thread for the AOD output (but not the MINIAOD output) consuming far more CPU time than I would have expected, and performance traces suggest that most of it is spent in the zlib deflate() routine (typical stack trace below). This is something of a mystery, as the TBufferMergerFiles being merged are LZMA compressed and should (I think) have compression fully applied before being passed to the TBufferMerger. Do you know of any ROOT file internal structure that might account for the zlib compression?
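When comparing the compression settings reported for different files, a small decoding helper may be handy. The sketch below assumes ROOT's convention of encoding a compression setting as algorithm * 100 + level, with algorithm 1 for ZLIB and 2 for LZMA; that convention should be checked against Compression.h in the ROOT version in use. ToyCompression, DecodeSetting and AlgorithmName are hypothetical names for illustration, not ROOT API, and the sketch does not by itself explain the deflate() time.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: the two parts of a packed compression setting.
struct ToyCompression {
   int algorithm;
   int level;
};

// Assumed encoding: setting = algorithm * 100 + level.
ToyCompression DecodeSetting(int setting)
{
   return {setting / 100, setting % 100};
}

// Assumed algorithm numbering: 1 = ZLIB, 2 = LZMA. Anything else is
// reported generically rather than guessed at.
std::string AlgorithmName(int algorithm)
{
   switch (algorithm) {
   case 1: return "ZLIB";
   case 2: return "LZMA";
   default: return "other/default";
   }
}
```

Under these assumptions a setting of 207 decodes to LZMA at level 7 and 101 decodes to ZLIB at level 1, so printing the decoded settings of the inputs and of the output side by side would show whether they agree.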