[ROOT-6664] Problems fitting sub-ranges with RooChebychev Created: 08/Sep/14  Updated: 21/Mar/19  Resolved: 29/Sep/14

Status: Closed
Project: ROOT
Component/s: RooFit/RooStats
Affects Version/s: None
Fix Version/s: 5.34/24

Type: Bug Priority: High
Reporter: Cristiano Alpigiani Assignee: Wouter Verkerke
Resolution: Fixed Votes: 7
Labels: None

Mac OS X 10.9.4 (X86-64)
ROOT 5.34/20
Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
Target: x86_64-apple-darwin13.3.0
Thread model: posix

Attachments: Text File Debug_report.txt     File FitUnblindedDataset.C     File RooChebychev.cxx     File RooChebychev.h    



I am performing a fit of a blinded dataset using a RooChebychev.
I have already posted this issue on the ROOT forum in June, but I didn't get any reply. Now we are a little stuck with our analysis because we need to be able to perform this kind of fit.

I did lots of checks trying to debug the problem, but I didn't manage to find a good solution.

The attached macro shows the problem. The macro is divided into "TEST1" and "TEST2".
– TEST1 generates a dataset using a 1st order Chebychev in the mass region [4700,5900], then builds a "blinded" dataset by reducing the full dataset to the two sub-regions [4700,5100] and [5500,5900]. It then performs two fits:
1) fit the "full" dataset and plot the results (lines 75-79)
2) fit the "blinded" dataset and plot the results (lines 86-90).

– TEST2 is similar to TEST1, but uses a different PDF (1st order Chebychev + Gaussian).

If I run this test using a RooExponential, the fit is OK on both the full and the blinded dataset (you can try that by commenting out RooChebychev and uncommenting RooExponential). If I run the test using RooChebychev, the fit on the two sub-ranges makes ROOT crash and exit because of the assert() in RooChebychev.cxx (lines 141-142). The attached report contains the backtrace obtained by running lldb on my laptop.

I added some prints to RooChebychev.cxx


Looking at RooChebychev.h, if I understood correctly, "_x" should correspond to the variable "mass" (in my case), so xminfull = _x.min() should always be equal to 4700 and xmaxfull = _x.max() should always be equal to 5900.
When I try to do the fit in the two sub-ranges I get this output (before ROOT crashes):

_x.min(rangeName) = 4700
_x.max(rangeName) = 5900
xminfull = 4700
xmaxfull = 5100

As you can see, it seems that the ranges are swapped, and so the assert() fails and exits ROOT. "rangeName" should be "SB1" or "SB2" (defined in lines 54-55).

I have tried to investigate the problem in more detail, but I didn't find a proper solution. Maybe there is a problem in how RooChebychev.cxx deals with the ranges (maybe some problem in RooProxy?).
In RooAbsOptTestStatistic.cxx, lines 286-365, RooFit makes adjustments for the fit ranges. I saw that, after commenting out line 311, ROOT does not crash. The fit with the Chebychev only then seems to be OK, except for the normalisation, which is wrong. The Chebychev+Gaussian fit converges, but the chi2 is huge (so the fitted curve does not describe the data and the normalisation is wrong).

I hope these tests can help in figuring out where the problem is. Let me know if I can help with some other tests or if you need more information.

Thanks a lot in advance!



Comment by Lorenzo Moneta [ 18/Sep/14 ]

There is no reason for the assert to be there. Sometimes one wants to compute the integral of the pdf in a range larger than the one it is defined on. Since it is a polynomial, there is no problem in evaluating the function outside its range.
The given test program has been tested using numerical integration and the same result is obtained.
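The point about integrating beyond the defined range can be checked with a small standalone sketch. It does not use ROOT, and all function names are hypothetical. It assumes a first-order shape 1 + c1*T1(x'), where x' maps [xmin, xmax] linearly onto [-1, 1], and compares a closed-form integral with a midpoint-rule estimate over a range that extends past xmax.

```cpp
#include <cassert>
#include <cmath>

// Map x from the definition range [xmin, xmax] onto the Chebychev domain [-1, 1].
// Values outside the range simply map outside [-1, 1].
double toChebDomain(double x, double xmin, double xmax) {
    assert(xmax > xmin);  // only the definition range itself must be well formed
    return -1.0 + 2.0 * (x - xmin) / (xmax - xmin);
}

// First-order Chebychev shape: 1 + c1 * T1(x'), with T1(x') = x'.
double cheb1(double x, double c1, double xmin, double xmax) {
    return 1.0 + c1 * toChebDomain(x, xmin, xmax);
}

// Closed-form integral over [a, b]; a and b may lie outside [xmin, xmax].
double cheb1IntegralAnalytic(double a, double b, double c1,
                             double xmin, double xmax) {
    const double ta = toChebDomain(a, xmin, xmax);
    const double tb = toChebDomain(b, xmin, xmax);
    const double halfWidth = 0.5 * (xmax - xmin);  // dx = halfWidth * dx'
    return halfWidth * ((tb - ta) + 0.5 * c1 * (tb * tb - ta * ta));
}

// Midpoint-rule integral, used as an independent cross-check.
double cheb1IntegralNumeric(double a, double b, double c1,
                            double xmin, double xmax, int steps = 1000) {
    const double h = (b - a) / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; ++i) {
        sum += cheb1(a + (i + 0.5) * h, c1, xmin, xmax);
    }
    return sum * h;
}
```

For a shape defined on [4700, 5100] and integrated over [4700, 5900], both functions agree, which illustrates that the polynomial is well defined outside its definition range.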

Comment by Lorenzo Moneta [ 22/Sep/14 ]

The problem that remains after removing the assert is due to the definition of RooChebychev (and it is the same for RooBernstein).
The evaluation of the polynomial depends on the defined range. Therefore, when doing a simultaneous fit on two separate ranges (e.g. the sidebands), RooFit defines two RooChebychev objects, each one with its own range.
The effective parameters of the polynomial therefore depend on the range, and they end up being different for the two ranges.
This is not the case when using a RooPolynomial, where the value returned by ::evaluate does not depend on the range.
I am not sure what the best solution for this problem is.
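The range dependence described above can be made concrete with a minimal standalone sketch (no ROOT; the function names are hypothetical). It assumes the variable is mapped linearly from [xmin, xmax] onto [-1, 1] before the polynomial is evaluated, so the same coefficient c1 corresponds to a different slope in the original variable for each range.

```cpp
#include <cassert>
#include <cmath>

// First-order Chebychev shape 1 + c1 * x', where x' maps [xmin, xmax] onto [-1, 1].
double cheb1(double x, double c1, double xmin, double xmax) {
    assert(xmax > xmin);
    const double xScaled = -1.0 + 2.0 * (x - xmin) / (xmax - xmin);
    return 1.0 + c1 * xScaled;
}

// Effective slope of the shape in units of the original variable:
// d/dx [1 + c1 * x'(x)] = 2 * c1 / (xmax - xmin).
double slopeInX(double c1, double xmin, double xmax) {
    assert(xmax > xmin);
    return 2.0 * c1 / (xmax - xmin);
}
```

With c1 = -0.3, the slope computed from the sideband range [4700, 5100] is three times the slope computed from the full range [4700, 5900], and the value at mass = 5000 differs between the two (0.85 versus 1.15). A shape written directly in the original variable, as in RooPolynomial, has no such dependence.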


Comment by Lorenzo Moneta [ 22/Sep/14 ]

Thinking about it more, a possible solution is to add two new data members (_xmin, _xmax) to RooChebychev, which are assigned from the current variable range in the constructor and cannot be modified afterwards. This solution fixes the problem observed in the test program.
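The idea of freezing the range at construction can be sketched as follows. This is a toy illustration under stated assumptions, not the code committed to ROOT: `Variable` and `FixedRangeCheb1` are hypothetical stand-ins, and only a first-order polynomial is handled.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the fit variable: its range can change after construction.
struct Variable {
    double min;
    double max;
};

// Toy first-order Chebychev that copies the variable range once, at construction,
// and uses that frozen reference range for every later evaluation.
class FixedRangeCheb1 {
public:
    FixedRangeCheb1(const Variable& x, double c1)
        : c1_(c1), refMin_(x.min), refMax_(x.max) {
        assert(refMax_ > refMin_);
    }

    double evaluate(double x) const {
        const double xScaled = -1.0 + 2.0 * (x - refMin_) / (refMax_ - refMin_);
        return 1.0 + c1_ * xScaled;
    }

private:
    double c1_;
    double refMin_;  // copied once, never updated
    double refMax_;  // copied once, never updated
};
```

Changing the range of the variable after the object has been built (for example, narrowing it to a sideband) leaves the value returned by `evaluate` unchanged, so the coefficient keeps the same meaning in every sub-range.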

Comment by Lorenzo Moneta [ 25/Sep/14 ]

The attached version of RooChebychev seems to fix the problem. The fix needs to be confirmed before committing it.

Comment by Lorenzo Moneta [ 29/Sep/14 ]

The problem is now fixed in 5.34, 6.02 and master by a new version of RooChebychev that uses a reference range for evaluating the polynomial.
A similar fix is expected for RooBernstein.

Comment by Konstantin Schubert (Inactive) [ 16/Dec/14 ]

To be more precise, this was fixed in the patch release v5-34-22
