Review

VBF Event Classification with Recurrent Neural Networks at ATLAS’s LHC Experiment

by Silvia Auricchio 1,2,*,†, Francesco Cirotto 1,2,*,† and Antonio Giannini 3,*,†

1 Dipartimento di Fisica “Ettore Pancini”, Università Degli Studi di Napoli Federico II, Complesso Univ. Monte S. Angelo, Via Cinthia, 21-Edificio 6, 80126 Napoli, Italy
2 INFN-Sezione di Napoli, Complesso Univ. Monte S. Angelo, Via Cinthia-Edificio 6, 80126 Napoli, Italy
3 Center of Innovation and Cooperation for Particle and Interaction (CICPI), University of Science and Technology of China (USTC), No. 96, JinZhai Road, Baohe District, Hefei 230026, China
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Appl. Sci. 2023, 13(5), 3282; https://doi.org/10.3390/app13053282
Submission received: 11 February 2023 / Revised: 24 February 2023 / Accepted: 24 February 2023 / Published: 4 March 2023
(This article belongs to the Special Issue Machine Learning Applications in Atlas and CMS Experiments at LHC)

Abstract

A novel machine learning (ML) approach based on a recurrent neural network (RNN) for event topology identification in high energy physics (HEP) is presented. The vector-boson fusion (VBF) production mechanism arising in proton–proton collisions is predicted both by the current theoretical model of particle physics, the standard model, and by its extensions that foresee potential new physics phenomena. This physical process has a well-defined event topology in the final state and a distinctive detector signature. In this work, an ML approach based on the RNN architecture is developed to deal with hadronic-only event information in order to enhance the acceptance of this production mechanism in physics analyses of the data. This technique was applied to a physics analysis in the context of high-mass diboson resonance searches using data collected by the ATLAS experiment.
Keywords:
ML; RNN; physics; ATLAS; diboson

1. Introduction

Nowadays, machine learning (ML) algorithms are widely used in many scientific fields as well as in many industrial applications. High energy physics (HEP) collaborations have progressively adopted ML techniques in recent years, publishing many results that show significant improvements with respect to standard data analysis techniques. Specifically, both the ATLAS and CMS [1,2] collaborations have invested effort in deep learning (DL) applications; indeed, the benefit of such techniques is maximised by the huge amount of data available at the Large Hadron Collider (LHC) [3], produced in collisions between proton beams. Many different physics problems have been addressed, such as the reconstruction and identification of physics objects, the classification of physics processes, anomaly detection and many others.
In this growing environment, a novel deep learning approach based on a recurrent neural network (RNN) was developed and is presented in this work. The technique was applied to the categorisation task of the production mechanism of high-mass particles in the context of new resonance searches within the ATLAS collaboration.
The ATLAS detector is a multi-purpose experiment that has been gathering data at the Large Hadron Collider at CERN since 2010. The second data-taking period, named Run-2, ended in 2018; many physics results based on it have been released, and many others are expected soon, while a new round of data gathering, Run-3, is ramping up and is expected to collect data until 2024. Each sub-detector has the specific task of reconstructing a different kind of particle by measuring its kinematic quantities. In this work, final states with leptons and jets are considered. A jet of hadrons is a typical signature produced by the hadronization of quarks and gluons created during collisions. Specifically, the ATLAS jet reconstruction employs an anti-kt algorithm based on calorimeter cell information [4].
Useful information about the nature of the production mechanism of new heavy particles might be found in the reconstructed events, specifically in the hadronic part of the events. Since the experimental environment of the data collected at the LHC is dominated by hadronic secondary processes, this task becomes more challenging, and machine learning approaches can help improve the performance. The physics problem can be translated into a general binary classification task that aims to categorise the events according to the event topology of the initial state that produces the new particle; therefore, it represents a signal-to-signal classification problem, and it was one of the first RNN-based approaches developed and published by the ATLAS collaboration.
In this paper, an overview of the method is presented, together with its application to a diboson search performed by the ATLAS collaboration.

2. An Example of a Combinatorial Problem: The VBF Categorization Task

The standard model (SM) of particle physics is the theoretical framework describing the current knowledge of fundamental interactions. In this context, and in its extensions involving beyond-standard-model (BSM) theories, boson particles can be produced through several mechanisms: gluon–gluon fusion (GGF), Drell–Yan (DY) and vector-boson fusion (VBF) are some of the most common mechanisms that can take place at the LHC. In the case of the SM Higgs boson, GGF is the dominant production mechanism, occurring in roughly 90% of cases, while VBF is the second most common, accounting for less than 10% of the production rate. Examples of Feynman diagrams are shown in Figure 1 for GGF and VBF production of a high-mass particle X. In these diagrams, the X resonance decays into two vector bosons.
According to the theoretical expectation, each production mechanism is characterised by a different set of physics objects that are involved in the process and reconstructed in the final state of the event. In particular, the VBF mechanism is expected to produce two extra jets in the event with distinctive features with respect to the GGF mechanism. In addition, given the complexity of the interactions occurring at a hadron collider and the detector effects, each event has a different jet multiplicity, which may reach up to tens of jets. Indeed, besides the main physics process summarised by the Feynman diagram (the hard process), many other interactions take place, such as interactions among the spectator partons (quarks and gluons inside the protons), known as the underlying event, and additional proton–proton interactions in the same or neighbouring bunch crossings of the LHC beams, known as pile-up interactions.
In the past, the classification of such VBF events was strongly driven by the identification, among the reconstructed jets in the event, of the two jets directly related to the topology of the process. This represents a combinatorial problem, in which the number of possible pairs of objects grows quickly as the total number of available objects (N) increases. For instance, with N = 3 only 3 pairs are possible, but with N = 6 the number of possible pairs increases to 15 (Figure 2).
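For reference, the number of unordered pairs among N objects is N(N − 1)/2; the short Python check below (illustrative only, not part of the analysis code) reproduces the counts quoted above:

```python
# Number of distinct unordered pairs among N jets: C(N, 2) = N*(N-1)/2.
from math import comb

for n in (3, 4, 6, 10):
    print(f"N = {n:2d} jets -> {comb(n, 2):3d} possible pairs")
# N = 3 gives 3 pairs, N = 6 gives 15: the combinatorics quickly becomes unfavourable.
```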
Of course, the misidentification rate of a pairing algorithm increasingly affects the outcome of the combinatorial problem as the jet multiplicity grows. Furthermore, given the geometrical acceptance of the ATLAS detector, one or more jets might be completely lost or wrongly identified during object reconstruction, which further increases the misidentification (Figure 3).
In such a situation, since many traditional pairing algorithms based on physics quantities yield low accuracy, a machine learning approach was expected to be beneficial. In particular, rather than trying to solve a combinatorial problem that fails by construction in a significant fraction of events, because the true pair is not present in the data at all, the best strategy was found to be not solving the combinatorial problem explicitly and instead letting an ML approach exploit all the information encoded in the data.

3. A Recurrent Neural Network Using Jets to Solve an Event Level Problem

A recurrent neural network is a special type of artificial neural network which uses sequences of input data; time series represent one of the most common inputs used for RNNs, but they are just one of the possible uses of such networks. The deep learning algorithms that constitute a recurrent neural network are commonly used to solve problems in which the order of the input data matters; examples are speech recognition, language translation, natural language processing and image captioning. Very popular applications, such as Google Translate [5,6], voice search tools [7] and Siri [8], rely on this kind of technology. Like all neural networks, an RNN relies on a training dataset to learn the features of the problem and to adjust the weights of the model to solve it. The main difference with respect to other traditional architectures, such as feedforward and convolutional neural networks (CNNs), is that the order of the input data in the sequence plays a crucial role in the training process; in practice, the output of the network depends on this order.
Another peculiar characteristic of a recurrent neural network is that the parameters are shared across the steps of the input sequence within each layer. Indeed, feedforward networks have dedicated weights for each node, while recurrent neural networks reuse the same weight parameters at every step of the sequence. These weights are then adjusted during the learning process through gradient descent and backpropagation through time.
A very common example used to explain the power of a recurrent neural network is speech recognition. Let us consider a phrase such as “may we meet again”; these four words have a well-defined meaning in the English language when they are presented in this order. Therefore, a neural network that is supposed to learn its meaning has to process not only the information about the four words themselves but also the order in which they are presented.
As hinted at so far, recurrent neural networks have a very distinctive way of dealing with input data. Usually, with feedforward networks, machine learning is used to map many inputs into one or multiple outputs. With the recurrent neural network philosophy, it is possible to map a sequence of input data to one output (many-to-one) or to many outputs that can even form a sequence themselves (many-to-many); the outputs can refer to the same scale as the input sequence (text translation) or to a different scale (future time predictions). Furthermore, the architecture can be designed with an additional degree of complexity; indeed, many sequences can be used as inputs. This will be the setup used in the application presented in this work.
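As a minimal illustration of the many-to-one versus many-to-many mappings mentioned above (a toy sketch with arbitrary shapes, not taken from the paper), the Keras LSTM layer exposes this choice through its return_sequences flag:

```python
# Many-to-one vs many-to-many sequence mapping with a Keras LSTM layer.
import numpy as np
from tensorflow import keras

x = np.random.rand(4, 10, 3).astype("float32")  # 4 toy sequences, 10 steps, 3 features each

many_to_one = keras.layers.LSTM(8)                          # returns only the last output
many_to_many = keras.layers.LSTM(8, return_sequences=True)  # returns one output per step

print(many_to_one(x).shape)   # (4, 8)
print(many_to_many(x).shape)  # (4, 10, 8)
```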
Many different types of recurrent neural networks exist; the differences concern specialised features of the architecture at the level of the “cell” of the RNN layer. Examples of different RNN types are bidirectional recurrent neural networks (BRNNs) [9], long short-term memory (LSTM) networks [10] and gated recurrent units (GRUs) [11]. BRNNs are able to propagate information not only in the forward direction of the input sequence but also in the backward direction; in other words, if the network is dealing with sentences as inputs, the prediction for the early words in a sequence is driven not only by the words before them but also by the words that follow. LSTM is a very popular type of RNN. It was introduced by Sepp Hochreiter and Jürgen Schmidhuber as a solution to the vanishing gradient problem [10]. In order to improve the quality of the predictions across the input sequences, these networks have inner cells, each with three gates: an input, an output and a forget gate; in practice, these gates control the flow of information and, therefore, the output of the network. Finally, GRU-based networks also address the short-term memory problem of RNN models. They have two gates in the inner cells, a reset and an update gate. Similar to the gates of the LSTMs, the reset and update gates control how much and which information to exploit.
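The gating logic described above can be sketched as follows; this is a didactic re-implementation of a single LSTM step in plain NumPy (toy dimensions, not the Keras/Theano code used in the analysis):

```python
# One step of a basic LSTM cell: input (i), forget (f) and output (o) gates
# control how the cell state c and the hidden state h are updated.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """x: input vector; h_prev, c_prev: previous hidden/cell states;
    W, U, b: dicts of weights and biases for the gates and the candidate state."""
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])   # input gate
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])   # forget gate
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])   # output gate
    g = np.tanh(W["g"] @ x + U["g"] @ h_prev + b["g"])   # candidate cell state
    c = f * c_prev + i * g                               # new cell state
    h = o * np.tanh(c)                                   # new hidden state
    return h, c

# Toy dimensions: 4 input features, 3 hidden units.
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(3, 4)) for k in "ifog"}
U = {k: rng.normal(size=(3, 3)) for k in "ifog"}
b = {k: np.zeros(3) for k in "ifog"}
h, c = lstm_step(rng.normal(size=4), np.zeros(3), np.zeros(3), W, U, b)
```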
The semi-leptonic diboson event topology results in a clear experimental signature: one vector boson (W or Z) decays into a lepton pair and the other into a quark pair. The leptonic decay is used to trigger the events recorded by the ATLAS experiment, and it allows a strong reduction of the huge amount of purely quantum chromodynamics (QCD) events.
The jets recorded in the event carry information both about the hadronic decay of one of the two reconstructed bosons (W/Z → qq) and about the additional jets expected from the vector-boson fusion production mechanism. Concurrent interactions (pile-up) and the remainder of the proton collision (underlying event) produce additional jets in the event beyond those coming from the hard process described by the Feynman diagrams, and the final jet multiplicity in reconstructed events is relatively high.
A simple experimental representation of a generic event is shown in Figure 4. The two blue arrows represent the clear signature of the boson decaying leptonically (a Z boson in this example), while the red triangles represent all the reconstructed jets. If the event was produced via VBF, four jets coming from the hard process could, in principle, be recorded in that sequence, and they carry useful information.
In order to improve the categorisation of the observed events into the GGF and VBF categories, a machine learning approach was introduced. Indeed, it was observed that for this binary classification problem, a standard data analysis using jet identification algorithms and a cut-based selection would result in a rather inefficient confusion matrix (CM) [12].
Machine learning approaches have been shown to perform best when used in looser phase spaces and with low-level input variables. Therefore, the traditional jet pair selection (aimed at identifying the jets coming from the VBF process) and the selection cuts were removed. Furthermore, the lowest-level variables available at the event level were considered as input to the ML algorithm; in particular, the full four-momenta of the jets in the sequence were used. Given the variable jet multiplicity per event, an approach involving recurrent neural network layers was adopted to build a deep neural network (DeepNN).
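As an illustration of this input format (a sketch with invented numbers; the actual ATLAS event data model is not shown), each event can be represented as a variable-length sequence of jet four-momentum components:

```python
# Each event is a sequence of jets, each jet a four-momentum (pt, eta, phi, E).
# The number of jets varies event by event, which is what motivates an RNN.
import numpy as np

events = [
    np.array([[120.0,  2.1,  0.3, 450.0],   # event with 3 jets
              [ 95.0, -1.8,  2.9, 310.0],
              [ 40.0,  0.2, -1.1,  42.0]]),
    np.array([[210.0,  0.5,  1.4, 240.0],   # event with 2 jets
              [ 75.0, -2.5, -0.7, 460.0]]),
]
print([e.shape for e in events])  # [(3, 4), (2, 4)] -- ragged by construction
```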

4. Architecture of the Model

A DeepNN was designed for this binary classification task, with a single standard neuron in the last layer used to obtain one output score. Two hidden layers are based on recurrent cells: a total of 25 long short-term memory (LSTM) cells with a tanh activation function form the core of the architecture. The last output of the second RNN layer is fed into a single node with a sigmoid activation function, which provides the final score. In order to reduce possible overfitting during training, 30% of the connections among the inner layers are randomly dropped (a method called dropout). Finally, the model was trained with the Adam optimiser. The network was trained for up to 200 epochs, reaching a stable plateau and showing good agreement between training and test performance with no sign of overtraining. Roughly two hundred thousand signal events were simulated using Monte Carlo generators [13] and used for the training. Figure 5 shows a simplified scheme of the architecture employed in this analysis, while Figure 6 shows details of the logic of an LSTM cell.
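The sketch below assembles a Keras model consistent with the description above (two LSTM layers of 25 units with tanh activation, 30% dropout, a single sigmoid output node, Adam optimiser). It is an approximation for illustration only, not the published ATLAS model, whose exact architecture and weights are available on HEPData [17]; in particular, the Masking layer and the placeholder values MAX_JETS and N_FEATURES are assumptions about how padded jet sequences might be handled.

```python
# Sketch of a binary classifier with two stacked LSTM layers, assuming
# zero-padded input sequences of shape (MAX_JETS, N_FEATURES).
from tensorflow import keras
from tensorflow.keras import layers

MAX_JETS, N_FEATURES = 6, 4   # placeholder values for illustration

model = keras.Sequential([
    keras.Input(shape=(MAX_JETS, N_FEATURES)),
    layers.Masking(mask_value=0.0),                              # skip zero-padded jets (assumption)
    layers.LSTM(25, activation="tanh", return_sequences=True),   # first recurrent layer
    layers.Dropout(0.3),                                         # 30% dropout against overfitting
    layers.LSTM(25, activation="tanh"),                          # second recurrent layer, last output only
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),                       # single output score in [0, 1]
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```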
The model was implemented and trained using the Keras package [14] with Theano [15] as the backend for the mathematical computation. The architecture, as well as the final weights, was published through the HEPData record of the ATLAS paper [16,17], and it is available for reproducing the results in any theoretical or experimental framework of interest.
The choice of using recurrent layers was mainly driven by a physics constraint: since the objects used as input to the network are the jets reconstructed in the event, they constitute a set with a naturally variable length. Recurrent layers represent one of the possible choices for dealing with such inputs without any further data manipulation. In addition, the RNN architecture allows the jets to be treated as a sequence of objects; therefore, all the physical features used as inputs are treated as sequences. In this way, any hidden correlation within the sequence of objects is, in principle, captured.
The hyperparameters used are the result of a simple optimisation; dedicated scans over the hyperparameter space were conducted to check the impact on the performance and on the stability of the training when the corner regions are explored. A set was selected that prioritised both the goodness of the model, quantified as the relative distance between the training and test datasets, and the optimal performance. In addition, the smoothness of the final score was considered, in order to avoid large changes in the selection efficiencies when the cut is applied in the physics analysis.
In RNN architectures, the maximum length of the sequences used as inputs can be modified; in some sense, it represents an additional hyperparameter that can be tuned. The performance was tested as a function of this parameter, and physics considerations were taken into account. The fraction of events with more than 5–6 jets is relatively small, and the additional information coming from those events does not really improve the performance; on the other hand, if the sequence length is too short, significant physics information is lost and the performance breaks down. The optimal choice was made considering the full event topology: in addition to the two real VBF jets (if reconstructed), the other jets come from the boson decaying hadronically, and the maximum number of jets was optimised on this basis. An additional practical benefit of this design is that the approach is completely transferable to other topologies in which the decay channel of the final state is different.
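A possible way to prepare such fixed-length inputs is sketched below (an assumption about the preprocessing, with a placeholder maximum of six jets): each event's jet list is zero-padded or truncated to the chosen maximum length, keeping the leading jets.

```python
# Zero-pad (or truncate) each event's jet list to a fixed MAX_JETS length;
# padded entries are all zeros so that a Masking layer can skip them.
import numpy as np

def pad_jets(events, max_jets=6, n_features=4):
    out = np.zeros((len(events), max_jets, n_features), dtype=np.float32)
    for i, jets in enumerate(events):
        n = min(len(jets), max_jets)
        out[i, :n, :] = jets[:n]
    return out

toy_events = [np.random.rand(3, 4), np.random.rand(8, 4)]  # toy events with 3 and 8 jets
padded = pad_jets(toy_events)
print(padded.shape)  # (2, 6, 4): the 8-jet event is truncated, the 3-jet one padded
```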

5. Performance of the Method

The RNN model was developed in order to classify the production mechanism of both SM and BSM events. Examples of new hypothetical models that predict a VBF signal production at the LHC are the heavy vector triplet (HVT) [18] and Randall–Sundrum (RS) frameworks [19,20,21,22]. For these models, multiple independent signal hypotheses were considered, varying the spin and the mass of the particle: spin values of 0, 1 and 2 and masses in the 0.3–5 TeV range. The final state in which the model was studied is called semi-leptonic: the signal particle X decays into two vector bosons (Z or W), which subsequently decay into two leptons and two quarks. According to the boson decaying leptonically, three distinct channels are possible: 0-lepton (X → ZV → ννqq), 1-lepton (X → WV → ℓνqq) and 2-lepton (X → ZV → ℓℓqq). Here, V denotes a vector boson (Z or W) and ℓ a lepton (an electron e or a muon μ).
The model reaches high performance for this classification task, with values of the area under the ROC curve of ∼85–90%; the performance was found to be stable across the different signal hypotheses. The discrimination power is illustrated by the RNN score distributions for a GGF/DY signal hypothesis and for a VBF hypothesis in Figure 7, where the three different spin hypotheses, corresponding to the RS radion, HVT and RS graviton BSM models, are shown in the 2-lepton decay channel. The performance of the RNN score was found to be quite stable across the different spin hypotheses considered; this is related to the fact that the main features of this topology depend only weakly on the spin of the produced resonance, and the kinematics of the two extra jets produced in the VBF mechanism dominate with respect to the other production mechanisms in all the theoretical scenarios.
The RNN score was used to categorise the observed events; in particular, a cut at 0.8 was applied. Therefore, events with an RNN score ≥ 0.8 were categorised as VBF events, and events with an RNN score < 0.8 as GGF/DY (no-VBF) events. Figure 8 shows the signal efficiency in the VBF category for the VBF and GGF/DY signal hypotheses.
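As an illustration of this categorisation step (toy arrays, not the analysis code), the cut on the network output and a ROC-AUC check can be written as:

```python
# Categorise events by the RNN output score: >= 0.8 -> VBF category,
# < 0.8 -> GGF/DY (no-VBF) category; optionally compute the ROC AUC.
import numpy as np
from sklearn.metrics import roc_auc_score

scores = np.array([0.92, 0.15, 0.81, 0.40, 0.66])  # toy RNN scores
labels = np.array([1,    0,    1,    0,    1])     # toy truth: 1 = VBF, 0 = GGF/DY

is_vbf_category = scores >= 0.8
print("VBF-category event indices:", np.flatnonzero(is_vbf_category))
print("ROC AUC:", roc_auc_score(labels, scores))
```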
The choice of the cut was made to preserve a background contamination from other SM processes in the VBF category similar to that observed in previous studies with standard data analysis techniques [12]. The improved confusion matrix enhances the VBF signal efficiency from 10% at low mass up to 60% in the high-mass range, and therefore the final sensitivity to such BSM models was improved. Such an improvement is quite significant in the context of BSM searches, especially when it comes from a single source: in this case, an ML approach used to solve the classification problem improves the confusion matrix and, in the end, the results of the physics analysis. In general, the result of a physics analysis may be improved by using more data collected by the experiment, but larger datasets may take many years to collect. Therefore, significant improvements introduced by a new data analysis strategy may allow better physics results to be reached earlier.
Furthermore, this RNN-based approach naturally recovers a significant part of the signal events that would otherwise be lost due to the geometrical acceptance of the detector. Indeed, the two VBF jets are produced in the forward region of the detector, at high pseudo-rapidity (η), and the resulting separation between the two jets increases with the X particle mass. A relative fraction of events with just one VBF jet reconstructed, equal to about 20% in the low-mass range and 50% in the high-mass range, is expected [23]; recovering these events represents an additional conceptual improvement coming from the use of this novel architecture.
Finally, the setup proved to be quite general and easily extendable to other applications. Specifically, no significant dependence on the signal mass hypothesis was found during the training phase; the performance was also unaffected when the leptonic decay channel was changed; indeed, the topology of the initial state is completely independent of the final-state decay channel.

6. Application to an ATLAS Search and Results

This approach was validated using the data collected by the ATLAS experiment during Run-2. A high-mass diboson semi-leptonic search was performed by the ATLAS collaboration [16] using the approach presented in this work to categorise data events into VBF and GGF/DY categories. Despite the lower production rate, the VBF channel represents a promising signature, since the SM background contribution is much more suppressed.
The analysis of diboson events has a long history in the context of high-energy particle searches. The observation and characterisation of this kind of event represented a crucial test of the electro-weak (EW) and Higgs sectors of the SM. In particular, the behaviour of high-energy W_LW_L (longitudinally polarised WW) scattering was a key argument supporting the presence of a new scalar boson needed to complete the SM framework [24]. A new scalar boson was then discovered by observing ZZ events in the so-called golden channel [25]; the new particle was proven to be consistent with the prediction of the Higgs boson. However, the Higgs boson discovery is not the end of the story: the cross section of very high-mass W_LW_L scattering has not been directly measured, so the diboson channel might still hide new physics.
This remains one of the most promising channels in which to access and observe new physics. Different theoretical models foresee the presence of new high-mass resonances that may decay into diboson systems, and all the possible final states are investigated.
For the semi-leptonic search, a complex analysis selection was designed to enhance the sensitivity to these signal models in dedicated regions where a signal is expected to be found (signal regions, SRs). The first selection applied is the reconstruction of the vector boson which decays into leptons: candidate events are selected according to the number of leptons (with a maximum of 2 leptons per event) and then assigned to the 0-, 1- or 2-lepton channel. After the leptonic selection, the VBF versus GGF/DY categorisation is performed via the RNN score. Once the categorisation has been applied, the identification of the hadronically decaying vector boson is performed. The complete analysis flow is shown in Figure 9. Multiple SRs are designed for each channel and category; the details of the applied selections can be found in [16].
There are many physics processes with final states similar to those of the signals of interest; these events constitute the background sources of the analysis. The impact of the most relevant backgrounds is estimated with dedicated methods. In this analysis, dedicated control regions (CRs) were designed: each CR is intended to estimate a specific source of background, which is constrained by comparison with data. Finally, a statistical procedure based on a maximum likelihood fit was performed in a combined CRs+SRs fit; the outcome of the procedure tests the consistency between the observed data and the theoretical expectations and would reveal any deviation from the SM background processes.
Examples of signal regions are shown in Figure 10; the distributions show the expected background coming from the standard model processes and the observed data for the final discriminating variable in the 2-lepton and 1-lepton channels. In this search, the reconstructed transverse mass (m_T) of the diboson system for the 0-lepton channel and the diboson invariant mass for the 1-lepton (m_ℓνjj) and 2-lepton (m_ℓℓjj) channels were used to extract possible new signals. Typically, the presence of a new heavy resonance would manifest itself as a resonant peak above the standard model background visible in the invariant mass distribution, or as a large enhancement in the transverse mass distribution.
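For reference, a generic transverse-mass definition for a visible system plus missing transverse momentum is sketched below (the textbook form, written here as an illustration; the exact definition used in the ATLAS search is specified in [16]):

```python
# Generic transverse mass of a visible system plus missing transverse momentum:
# m_T^2 = (E_T_vis + E_T_miss)^2 - |p_T_vis + p_T_miss|^2.
import math

def transverse_mass(et_vis, px_vis, py_vis, met, met_phi):
    px_miss = met * math.cos(met_phi)
    py_miss = met * math.sin(met_phi)
    mt2 = (et_vis + met) ** 2 - (px_vis + px_miss) ** 2 - (py_vis + py_miss) ** 2
    return math.sqrt(max(mt2, 0.0))   # guard against negative rounding
```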
Since no significant deviation from the SM predictions was observed, the data were used to constrain the production cross sections of such models.
These results improved the previous constraints on such new physics models. Finally, they contribute significantly to the combination results involving many other channels in the context of the ATLAS heavy-resonance combination [26].

7. Conclusions

Machine learning applications are spreading in many fields, including particle physics. We presented an ML approach involving a recurrent neural network for a signal-to-signal classification task, namely the categorisation of the two main production mechanisms in high-mass resonance searches. The designed algorithm exploits reconstructed jet information and leads to significant improvements with respect to a typical analysis performed with standard selections. The application of RNN architectures in physics analyses is relatively novel in the ATLAS collaboration; this kind of architecture was proven to be helpful in diboson semi-leptonic final states. The significant improvement obtained with this technique opens up many possible extensions to other applications. Similar channels in which one of the two bosons is replaced by a Higgs boson represent a natural extension of this approach, because such signatures are predicted by the same theoretical models; dedicated ATLAS results are planned to be released in the summer of 2023. Furthermore, this approach can be extended to other analyses exploiting fully hadronic final states in VV/Vh/hh channels and to fully leptonic final states that are more sensitive to other theoretical models, such as the Georgi–Machacek charged Higgs [27]. Finally, the approach can be extended to exploit other similar signatures, such as the Higgs-to-four-leptons measurement [28]. The existing ATLAS measurement of electro-weak diboson production in association with two jets [29] was updated to use this kind of approach, exploiting a higher jet multiplicity and additional jet features; the updated result is expected to be released later in 2023.

Author Contributions

The authors (S.A., F.C. and A.G.) equally contributed to the conceptualization, the writing, review and editing of this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. ATLAS Collaboration. The ATLAS Experiment at the CERN Large Hadron Collider. JINST 2008, 3, S08003.
2. The CMS Collaboration; Chatrchyan, S.; Hmayakyan, G.; Khachatryan, V.; Sirunyan, A.M.; Adam, W.; Bauer, T.; Bergauer, T.; Bergauer, H.; Dragicevic, M.; et al. The CMS Experiment at the CERN LHC. JINST 2008, 3, S08004.
3. Evans, L.; Bryant, P. LHC Machine. JINST 2008, 3, S08001.
4. ATLAS Collaboration. Jet energy scale and resolution measured in proton–proton collisions at √s = 13 TeV with the ATLAS detector. Eur. Phys. J. C 2021, 81, 689.
5. Google Brain Team. A Neural Network for Machine Translation, at Production Scale. 2016. Available online: https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html (accessed on 11 February 2023).
6. Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to Sequence Learning with Neural Networks. arXiv 2014, arXiv:1409.3215.
7. Graves, A.; Mohamed, A.R.; Hinton, G. Speech Recognition with Deep Recurrent Neural Networks. arXiv 2013, arXiv:1303.5778.
8. Siri Team. Hey Siri: An On-Device DNN-Powered Voice Trigger for Apple’s Personal Assistant. 2017. Available online: https://machinelearning.apple.com/research/hey-siri (accessed on 11 February 2023).
9. Schuster, M.; Paliwal, K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681.
10. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
11. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv 2014, arXiv:1412.3555.
12. ATLAS Collaboration. Searches for heavy ZZ and ZW resonances in the ℓℓqq and ννqq final states in pp collisions at √s = 13 TeV with the ATLAS detector. J. High Energy Phys. 2018, 2018, 9.
13. Alwall, J.; Frederix, R.; Frixione, S.; Hirschi, V.; Maltoni, F.; Mattelaer, O.; Shao, H.S.; Stelzer, T.; Torrielli, P.; Zaro, M. The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations. J. High Energy Phys. 2014, 2014, 79.
14. Chollet, F. Keras. 2015. Available online: https://keras.io (accessed on 11 February 2023).
15. Theano Development Team. Theano: A Python framework for fast computation of mathematical expressions. arXiv 2016, arXiv:1605.02688.
16. ATLAS Collaboration. Search for heavy diboson resonances in semileptonic final states in pp collisions at √s = 13 TeV with the ATLAS detector. Eur. Phys. J. C 2020, 80, 1165.
17. ATLAS Collaboration. Search for Heavy Diboson Resonances in Semileptonic Final States in pp Collisions at √s = 13 TeV with the ATLAS Detector. HEPData: RNN Model Files. 2020. Available online: https://www.hepdata.net/record/94809 (accessed on 11 February 2023).
18. Pappadopulo, D.; Thamm, A.; Torre, R.; Wulzer, A. Heavy Vector Triplets: Bridging Theory and Data. J. High Energy Phys. 2014, 9, 060.
19. Goldberger, W.D.; Wise, M.B. Modulus stabilization with bulk fields. Phys. Rev. Lett. 1999, 83, 4922–4925.
20. Goldberger, W.D.; Wise, M.B. Phenomenology of a stabilized modulus. Phys. Lett. B 2000, 475, 275–279.
21. Oliveira, A. Gravity particles from Warped Extra Dimensions, predictions for LHC. arXiv 2014, arXiv:1404.0102.
22. Agashe, K.; Davoudiasl, H.; Perez, G.; Soni, A. Warped Gravitons at the LHC and Beyond. Phys. Rev. D 2007, 76, 036006.
23. Giannini, A. Machine Learning Methods for Diboson Searches in Semi-Leptonic Final States with the ATLAS Experiment at LHC. Ph.D. Thesis, Università degli Studi di Napoli Federico II, Naples, Italy, 20 April 2020.
24. Szleper, M. The Higgs boson and the physics of WW scattering before and after Higgs discovery. arXiv 2014, arXiv:1412.8367.
25. ATLAS Collaboration. Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC. Phys. Lett. B 2012, 716, 1–29.
26. Combination of Searches for Heavy Resonances Using 139 fb−1 of Proton–Proton Collision Data at √s = 13 TeV with the ATLAS Detector. Technical Report, CERN, Geneva, 2022. Available online: https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/CONFNOTES/ATLAS-CONF-2022-028 (accessed on 11 February 2023).
27. Ghosh, N.; Ghosh, S.; Saha, I. Charged Higgs boson searches in the Georgi-Machacek model at the LHC. Phys. Rev. D 2020, 101, 015029.
28. ATLAS Collaboration. Higgs boson production cross-section measurements and their EFT interpretation in the 4ℓ decay channel at √s = 13 TeV with the ATLAS detector. Eur. Phys. J. C 2020, 80, 957.
29. ATLAS Collaboration. Search for electroweak diboson production in association with a high-mass dijet system in semileptonic final states in pp collisions at √s = 13 TeV with the ATLAS detector. Phys. Rev. D 2019, 100, 032007.
Figure 1. Feynman diagrams of the production of a high-mass particle X that decays into a diboson system; GGF (left) and VBF (right) are the two dominant channels.
Figure 2. Schemes representing the combinatorial pairing problem, given N objects with N = 3 (a) and N = 6 (b).
Figure 3. Schemes representing the combinatorial pairing problem, given N = 6 objects; the collection is made of the true pair (green elements) and other objects (blue elements). If the true pair of objects is present in the collection, the tagging efficiency depends only on the accuracy of the algorithm (a); otherwise, the efficiency is even lower if one object is not present in the collection at all (b).
Figure 4. Schematic representation of an event reconstructed at the ATLAS experiment; many jets are present in the event since the LHC is a hadronic collider.
Figure 5. Simplified view of the neural network used in this work; RNN layers are the core of this network that is built to have up to N jets as input and only one output node according to the binary classification aim.
Figure 6. In-depth view of an LSTM cell, which is the building block of an RNN layer.
Figure 7. RNN score distributions for the production of a 1 TeV resonance in the signal models considered for this search, in the 2-lepton channel.
Figure 8. The fractions of signal events passing the VBF requirement on the RNN score as functions of the resonance mass, for both VBF and GGF production in the 2-lepton channel.
Figure 9. Selection flow and signal region definitions of the X → VV search in semi-leptonic final states. Here, ℓ and h denote, respectively, a leptonic and a hadronic vector-boson decay.
Figure 10. Comparisons of the observed data and the expected background distributions of m_ℓℓjj in the VBF resolved 2-lepton SR (left) and of m_ℓνjj in the VBF resolved 1-lepton SR (right). The bottom panels show the ratios of the observed data to the background predictions. The hatched bands represent the uncertainties in the total background predictions, combining statistical and systematic contributions.
