[ROOT-6069] Add Anderson-Darling test for histograms Created: 12/Feb/14  Updated: 15/May/19  Resolved: 25/Sep/14

Status: Closed
Project: ROOT
Component/s: Math Libraries
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Medium
Reporter: Lorenzo Moneta Assignee: Lorenzo Moneta
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: AD_Test.C, ad_test_hist.C, testBinnedAD.C
Originator Email: Kuhan.Wang@cern.ch
Bug / Feature: Feature request
Severity: 3 - Normal
Development:

 Description   

Request from Kuhan Wang (ATLAS PhD student) to add the Anderson-Darling (AD) test for histograms, in order to have more sensitivity to the tails in low-statistics regimes.

From a private email:

I have attached the code (AD_Test.C). I also attached a macro that generates pairs of histograms with data drawn at random from Gaussian distributions and then compares the two histograms of each pair by means of the AD test and the KolmogorovTest given in TH1. The paper I took this from is here: http://arxiv.org/pdf/0804.0380v1.pdf (page 16).

I hope I haven't written any bugs into the code but let me know if you guys see something.

I wrote this on my own so I am not sure if there is some ROOT coding convention that I may or may not be following.

Cheers,

Kuhan



 Comments   
Comment by Lorenzo Moneta [ 18/Feb/14 ]

Also investigate the code proposed in ROOT-4876:
https://gist.github.com/4586791

Comment by Lorenzo Moneta [ 17/Sep/14 ]

Reattached the files that were lost due to a JIRA failure.

Comment by Lorenzo Moneta [ 25/Sep/14 ]

The Anderson-Darling test has been added for histograms, following the formula in the cited paper, which is derived from formula (6) in the paper "K-Sample Anderson-Darling Tests" by Scholz and Stephens.
The new function TH1::AndersonDarlingTest uses the implementation in the GoFTest class, which computes the p-value and the test statistic. After the fixes applied in ROOT-6666 it works fine.
The implementation proposed above is not used, since it is limited to histograms with identical axes. The current implementation works instead for any two histograms, whatever axes they have.

The attached macro shows the p-value distributions obtained when comparing two identical histograms with the AD, KS, or Chi2 test.

Lorenzo

Generated at Wed Sep 18 15:22:24 CEST 2019 using Jira 7.13.1#713001-sha1:5e06076c2d215a6f699b7e5c90ab2fae7ba5a1ce.