Learning to increase matching efficiency in identifying additional b-jets in the tt̅bb̅ process
The tt̅H(bb̅) process is an essential channel for revealing the properties of the Higgs boson, but it has an irreducible background from the tt̅bb̅ process, which produces a top quark pair in association with a b quark pair. Understanding the tt̅bb̅ process is therefore crucial for improving the sensitivity of searches for the tt̅H(bb̅) process. To this end, when measuring the differential cross-section of the tt̅bb̅ process, we need to distinguish the b-jets originating from top quark decays from the additional b-jets originating from gluon splitting. Since there are no simple identification rules, we adopt deep learning methods that learn from data to identify the additional b-jets in tt̅bb̅ events. Specifically, by exploiting the special structure of the tt̅bb̅ event data, we propose several loss functions that can be minimized to directly increase the matching efficiency, that is, the accuracy of identifying the additional b-jets. Using synthetic data, we discuss the difference between our method and another deep learning approach based on binary classification (arXiv:1910.14535). We then verify, using simulated tt̅bb̅ event data in the lepton+jets channel from pp collisions at √(s) = 13 TeV, that additional b-jets can be identified more accurately by increasing the matching efficiency directly rather than the binary classification accuracy.
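To make the distinction between matching efficiency and binary classification accuracy concrete, here is a minimal Python sketch. It is not the paper's implementation. It assumes that each event has exactly two additional b-jets, that a model assigns one score per jet, and that the two highest-scoring jets are taken as the predicted additional pair. The function names, the 0.5 threshold, and the toy scores are all hypothetical and serve only as illustration.

```python
import numpy as np

def select_additional_pair(jet_scores):
    """Indices of the two highest-scoring jets in one event, as a sorted tuple."""
    top_two = np.argsort(jet_scores)[-2:]
    return tuple(sorted(int(i) for i in top_two))

def matching_efficiency(events_scores, true_pairs):
    """Fraction of events whose selected pair equals the true additional b-jet pair."""
    matched = sum(
        select_additional_pair(scores) == tuple(sorted(truth))
        for scores, truth in zip(events_scores, true_pairs)
    )
    return matched / len(true_pairs)

def binary_accuracy(events_scores, true_pairs, threshold=0.5):
    """Per-jet accuracy when each jet is classified independently (assumed threshold)."""
    correct = 0
    total = 0
    for scores, truth in zip(events_scores, true_pairs):
        scores = np.asarray(scores)
        labels = np.zeros(len(scores), dtype=bool)
        labels[list(truth)] = True
        correct += int(np.sum((scores > threshold) == labels))
        total += len(scores)
    return correct / total

# Toy data (made up): two events with four jets each; jets 0 and 1 are the
# true additional b-jets in both events.
toy_scores = [
    [0.9, 0.8, 0.7, 0.1],  # ranking is right, but jet 2 also passes the threshold
    [0.4, 0.3, 0.2, 0.1],  # ranking is right, but no jet passes the threshold
]
toy_truth = [(0, 1), (0, 1)]

print(matching_efficiency(toy_scores, toy_truth))
print(binary_accuracy(toy_scores, toy_truth))
```

In this toy case the within-event ranking is correct for both events, so the pair is always matched, while the thresholded per-jet accuracy is lower. The two quantities can therefore disagree, which is why a loss aimed at the matching efficiency may behave differently from a binary classification loss.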