Listing 5.1: Using QRules to generate the allowed transitions for the decay \(J/\psi \to K^0 \varSigma^+ \bar{p}\).

import qrules

reaction = qrules.generate_transitions(
    initial_state=[("J/psi(1S)", (-1, +1))],
    final_state=["K0", "Sigma+", "p~"],
    allowed_interaction_types="strong",
    mass_conservation_factor=0.0,
    max_angular_momentum=3,
    topology_building="isobar",
)

5 The ComPWA project
In this chapter, we discuss how the Common Partial-Wave Analysis (ComPWA) project has evolved from a single C++ framework for amplitude analysis to a collection of more dynamic Python tools and documentation workflows that use the techniques described in Chapter 4. The end of the chapter describes how we exposed and extended different functionalities of the original ComPWA framework as easily installable Python packages that can be used independently of each other.
5.1 C++ origins
As described in Section 4.1, most analysis code in high-energy physics and amplitude analysis is written in C++ to facilitate efficient computation and manipulation of large datasets. However, the complexity of C++ can be a barrier to implementing and extending large amplitude models. To address this, many projects organise their analysis code as a framework that (1) can build up amplitude models dynamically at runtime and (2) offers a high-level interface through which users define these models in a configuration file that is loaded at runtime without requiring recompilation.
Originally, the Common Partial-Wave Analysis (ComPWA) project also started as a single C++ framework for amplitude analysis. The initial requirement specifications were drafted in 2012 with the intention of designing a collaboration-independent framework that could formulate arbitrary amplitude models, be flexibly configured, and document the model in a human-readable format [1]. Inspiration was taken from frameworks like Tara for the Crystal Barrel experiment [2], PWA2000 [3], ROOTPWA for the COMPASS experiment, GPUPWA for BESIII [4], and Laura++ for BaBar and LHCb [5]. The idea was to create a more modular framework, with interfaces that are abstract enough to allow for different implementations of spin formalisms, estimator methods, minimiser strategies and algorithms, and data sample formats for different collaborations [6].
Figure 5.1 shows the core modules of the original ComPWA framework. Further details can be found in the doctoral theses of M. Michel, P. Weidenkaff, and S. Pflüger [7–9]. The modules provide interfaces that could essentially operate independently of each other and served as protocols for streaming data through the framework. Four-momenta data could be either loaded from different file formats or generated using different Monte Carlo generators. Physics, such as a coherent sum of amplitudes, would be implemented through a function tree (model) that can operate on those data. Different estimator implementations could quantify the discrepancy between the model and the data sample. And finally, different optimiser implementations could tweak the parameters in the function in order to minimise the estimator value. The key point is that the interfaces were general and simple enough that they encapsulated the expected data flow, but still allowed for extending the functionality through specific implementations of the interfaces. ComPWA followed the classic definition of a framework as an object-oriented abstract design, with the preference that classes inherit from at most one (abstract) base class [10; 7, §3.5].
The concept of a function tree deserves special attention. As we will see later, the concept is widely used in other computational frameworks. ComPWA implemented models in the physics module as tree-like representations, with the nodes representing mathematical operations on branches (data points or numbers) or other nodes. This not only provided an interface that other modules can call agnostically, but also facilitated techniques for optimising computations. A case in point is that one can track which nodes depend on certain parameters. If parameters are kept fixed during a minimisation process, the corresponding nodes can be precomputed and cached. In addition, specific nodes can be optimised directly based on their structure. For example, a node with one child can be directly ‘collapsed’ (restructured) into a single node. Initial tests showed speed-ups of up to a factor of 50 for certain parameter combinations [7, p. 82].
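The caching idea behind the function tree can be sketched in a few lines of Python. The following is a hypothetical sketch of the concept, not ComPWA's actual C++ interface: each node caches its value and is invalidated only when a parameter it depends on changes, so unchanged subtrees are never recomputed.

```python
class Node:
    """Base class: tracks which parent nodes observe this node."""

    def __init__(self):
        self.observers = []

    def invalidate(self):
        # Propagate invalidation upwards to all dependent nodes.
        for node in self.observers:
            node.invalidate()


class Parameter(Node):
    """Leaf node holding a tunable value, e.g. a fit parameter."""

    def __init__(self, value):
        super().__init__()
        self._value = value

    def set(self, value):
        self._value = value
        self.invalidate()  # caches that depend on this parameter expire

    def evaluate(self):
        return self._value


class SumNode(Node):
    """Operation node that sums its children and caches the result."""

    def __init__(self, *children):
        super().__init__()
        self.children = children
        self._cache = None
        for child in children:
            child.observers.append(self)

    def invalidate(self):
        self._cache = None
        super().invalidate()

    def evaluate(self):
        if self._cache is None:  # recompute only after invalidation
            self._cache = sum(child.evaluate() for child in self.children)
        return self._cache


a = Parameter(1.0)
b = Parameter(2.0)
tree = SumNode(SumNode(a, b), Parameter(3.0))
tree.evaluate()  # computes and caches: 6.0
a.set(10.0)      # invalidates only the sums that depend on a
tree.evaluate()  # recomputes: 15.0
```

In a real minimiser loop, fixed parameters never trigger `set()`, so the subtrees that depend only on them stay cached, which is the source of the speed-ups described above.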
Function trees could also be easily serialised, which was intended as a way to document an analysis as well as to export and modify models for reevaluation, as intended by the original requirement specifications [1]. The fact that function trees could be built up from a human-readable format led to the idea of an “expert system” that could automatically generate models based on a set of rules. In the case of amplitude analysis, this means generating decay topologies based on quantum number conservation rules and using that as a template to build up a function tree for the amplitude model.
Finally, a Python interface, pycompwa, was developed to provide a more dynamic workflow with the ComPWA framework [11]. First, it provided Python bindings to the C++ implementations of each of the core modules, so that the user could interact with the framework through Python scripts and Jupyter notebooks without having to compile the C++ code. Second, it was complemented by the expertsystem, which was implemented purely in Python [12]. The combination of the bindings and the expertsystem allowed for a more dynamic workflow with higher-level analysis code.
5.2 Transition to Python
As discussed in Section 4.1, array programming allows for concise and expressive code that is easier to read and write. While array-oriented code has been popular in the Python community since the 2000s with libraries like NumPy, it gained more traction with the emergence of Machine Learning (ML) libraries like Numba since 2012 [13], TensorFlow in 2015 [14], PyTorch in 2016 [15], and JAX in 2018 [16]. Importantly, these libraries represent their computations (such as neural networks) in the form of a computational graph, which is similar to the function tree concept in ComPWA. Computational graphs, however, usually provide the more advanced features discussed in Section 4.1, such as optimisations through backtracing on the JIT-compiled functions that those graphs represent, or automatic differentiation.
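As a minimal illustration of what such computational graphs offer, the following sketch traces a toy intensity function with JAX. The function itself is a simplified Breit–Wigner-like stand-in, not ComPWA code: `jax.jit` traces it into a graph and compiles it, while `jax.grad` derives a second graph for its derivative.

```python
import jax
import jax.numpy as jnp


# Toy intensity with a Breit-Wigner-like shape (illustrative stand-in).
def intensity(mass, m0, width):
    return 1 / ((mass**2 - m0**2) ** 2 + (m0 * width) ** 2)


# jax.jit traces the function into a computational graph and JIT-compiles it;
# jax.grad builds a new graph for the derivative with respect to m0.
fast_intensity = jax.jit(intensity)
intensity_gradient = jax.jit(jax.grad(intensity, argnums=1))

masses = jnp.linspace(0.5, 2.5, 5)
values = fast_intensity(masses, 1.0, 0.1)  # vectorised over the mass array
slope = intensity_gradient(0.5, 1.0, 0.1)  # scalar derivative at mass=0.5
```

Automatic differentiation of the intensity with respect to model parameters is exactly what gradient-based optimisers need during a fit, which is why these graph-based libraries map so naturally onto the function-tree design.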
The popularity of these ML frameworks, and the fact that they are developed by companies and large groups of professional software developers, makes the effort of maintaining one's own function-tree implementation futile. In 2020, it was therefore decided to reimplement the ComPWA framework purely using the Python APIs of those libraries. The decision was further encouraged by initiatives like TensorFlowAnalysis and its follow-up AmpliTF for the amplitude analysis community [17; 18], as well as zfit, which serves as an alternative to RooFit with TensorFlow as computational backend [19].
As a first step, we reimplemented the ComPWA framework using only TensorFlow as computational backend, following the design of Figure 5.1. The first release candidate, v0.0.0-alpha0, was published on July 9th, 2020. It reproduced the standard test cases of the original ComPWA framework and demonstrated that similar fit performance could be achieved with TensorFlow as computational backend. Unit tests and tutorial examples in the repository at that point in time showed the feasibility of the reimplementation.
The TensorFlow computational graphs were still built up from the same XML files generated from the expertsystem that were used as input for the ComPWA C++ framework, which we used as a cross-check for the new implementation. This led us to the realisation that we could use the same input file for constructing computational graphs with different computational libraries. Similarly to AmpliTF, we therefore implemented the amplitude model builder with JAX, next to the existing TensorFlow implementation.
As a final step that led towards the adoption of a CAS (Section 4.2), we explored alternative representations of the XML format for the generated amplitude model. The fact that amplitude models are inherently large mathematical expressions suggested that we adopt a standardised format, such as MathML, for serialising such expressions. Constructing a tree of mathematical nodes to serve as a template for such a serialisation format is essentially a reimplementation of the expression trees used in Computer Algebra Systems, or, for instance, Python's built-in operator module [20]. We therefore decided to formulate the amplitude model in the expertsystem using SymPy.
As noted in Section 4.2.2, SymPy has the additional benefit that it already provides code generation capabilities for expression serialisation as well as for different programming languages. The rest of the software packages therefore became more of a wrapper around SymPy code generation that streamlines the process of performing amplitude analyses with different computational backends, following the original design of the ComPWA framework of Figure 5.1. All in all, the transition to Python and the adoption of a CAS removed the workload of maintaining a computational backend, while opening the door to more flexible workflows, a much easier developer experience (Section 5.4), and the more advanced computational features described in Section 4.1. This led us to transition the project completely to Python.
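These code generation capabilities can be illustrated with a short sketch. A simplified relativistic Breit–Wigner serves as a stand-in for a full amplitude model; `lambdify()` turns the symbolic expression into a NumPy function, and the same expression can be printed as source code for other targets.

```python
import numpy as np
import sympy as sp

# A simplified relativistic Breit-Wigner as a stand-in for a full model.
s, m0, w0 = sp.symbols("s m0 Gamma0", positive=True)
bw = 1 / (m0**2 - s - sp.I * m0 * w0)

# lambdify() generates a numerical NumPy function from the expression...
bw_numpy = sp.lambdify((s, m0, w0), bw, modules="numpy")
values = bw_numpy(np.linspace(0.5, 2.5, 100), 1.0, 0.1)

# ...and the same expression can be rendered as source code as well.
python_code = sp.pycode(bw)
```

Swapping `modules="numpy"` for another backend (or using SymPy's other code printers) is the mechanism that lets one symbolic model drive several computational backends.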
5.3 Main Python packages
Python makes it easier to split up frameworks into smaller, specialised packages and allows for a fast release flow. Naturally, during the transition to Python, the ComPWA framework evolved into a collection of dedicated Python packages that can mostly be used independently of each other. Figure 5.2 shows the components of the original C++ framework and the new core packages that the ComPWA project has evolved into. The ComPWA framework provided Python bindings to its C++ backend through the pycompwa package. This package was replaced by the Python-only TensorWaves package, which streamlines the conversion of SymPy expressions to numerical functions and provides additional functionality for performing amplitude analyses. The expertsystem was split into the QRules and AmpForm packages, which together provide the necessary tools for formulating symbolic amplitude models. All packages are available on PyPI as well as conda-forge, so that they can be easily installed or used as libraries in downstream projects.
QRules
Many steps in the construction of an amplitude model can be standardised, especially once a decay is modelled with the helicity formalism (Chapter 3). Deciding which decays are allowed, and what resonances one can expect, is something one can work out on paper using quantum conservation laws. This is, however, error-prone and something that can be automated. Already at the inception of the ComPWA project [1], this led to the idea of developing a rule-based “expert system” that can automatically generate decay topologies based on quantum number conservation rules.
An expert system emulates the decision-making ability of a human expert. It consists of two main components: a knowledge base and an inference engine. In the case of formulating particle decays, the knowledge base consists of conservation laws, such as charge conservation, and input information about the decay, such as the initial state and final state. The inference engine can be any algorithm that deduces what kind of decay topologies can connect the initial and final state, based on the facts and rules in the knowledge base. The “expert” implements knowledge through the knowledge base, while a user can use the inference engine to query (“ask advice”) the knowledge base.
The inference engine can be implemented in many ways, such as a decision tree, a set of rules, or a neural network. The latter has become more popular in recent years. At the time of writing, the main inference engine used in the ComPWA project is a Constraint Satisfaction Problem (CSP) solver [21]. In most cases, the user specifies outside constraints, such as the initial and final state, expected decay topology types, and the conservation rules that apply (such as parity conservation in the case of a strong-force decay). The CSP solver then tries to satisfy these constraints by building decay topologies and filling them with quantum numbers that comply with the conservation rules.
As described in Section 5.1, the ComPWA framework originally included a package named expertsystem. However, since this also had the responsibility of formulating amplitude models and exporting them to XML, we extracted a specialised package called QRules (PyPI name qrules) that only contains the particle reaction solver [22].
As an example, the following code (Listing 5.1) shows how a user can request QRules to find which decays can occur in a \(J/\psi\) meson decaying to \(K^0_S\), \(\varSigma^+\), and \(\bar{p}\) if one assumes sequential two-body decays (isobar) and applies conservation rules for the strong force. QRules works with flavour eigenstates, because it can only handle the quantum numbers of the particles. In this case, the user has to select \(K^0\) instead of the short-lived mass eigenstate \(K^0_S\). Additional information, such as the maximum angular momentum and a factor that specifies how far resonances may lie outside the kinematically allowed phase space, can optionally be specified.
In this example, the user has specified that the initial state does not have longitudinal polarisation, which is indicated by the (-1, +1) in the initial state. This is the case in the BESIII experiment, where the \(J/\psi\) meson is produced in \(e^+ e^-\) collisions. The allowed_interaction_types argument specifies which conservation rules (Table 1.2) are enforced at each node of the sequential decay chains built in the “isobar” topology (two-body decay chain). The CSP solver is then instructed to restrict intermediate states to those with an orbital angular momentum of at most \(L=3\) and a mass conservation factor of \(f=0\), which means that any intermediate state \(r\) must lie within \(m_r \pm f \, \Gamma_r\) of the available kinematic phase space.
[Figure caption fragment: nodes (0) and (1) represent decay vertices with information such as angular momentum.]
Internally, QRules first builds a list of all possible decay topologies for three-body decay (Figure 5.3). It then fetches the quantum numbers of the initial- and final-state particles (outer edges) that the user has specified in the definition of the problem. By default, this information is obtained from the Particle Data Group [23] through the particle package [24], but the user can also provide a database of custom particle definitions when searching for unknown or exotic resonances. Next, QRules fills the inner edges of the topologies with quantum numbers, conservation rules, and expected quantum number domains.
Figure 5.4 shows a QRules rendering of one such filled topology. It represents only one of many possible graphs, since the CSP inference engine explores the full set of topologies from Figure 5.3 as well as different combinations of quantum numbers. The outer edges correspond to the initial- and final-state particles, with one particular set of allowed spin projections shown in brackets. Each node and intermediate edge is annotated with conservation rules, together with priorities that determine the order in which they are applied. They also specify domains for each quantum number type considered at that node or edge during the CSP solving process. For instance, baryon number conservation has the highest priority at both nodes, with possible values of –1, 0, or +1 on the intermediate edge. Strangeness has slightly lower priority and can take values from –3 to +3.
The CSP solver now determines which sets of quantum numbers can be assigned to the graph edges and nodes, while satisfying the conservation rules at each two-body decay node and the fixed quantum numbers on the outer edges. Figure 5.5 shows one of the solutions for the problem defined in Figure 5.4. Here, edges and nodes are assigned explicit quantum numbers such as spin magnitude \(J\), spin projection \(J_3\), parity \(P\), strangeness, and others, along with Particle Data Group (PDG) identifiers of the matching particle definitions. Quantum numbers with value \(0\) are omitted from the listing. For example, the antiproton \(\bar{p}\) (PDG ID -2212) carries no strangeness, while the \(K^0\) meson (PDG ID 311) carries no baryon number.
Finally, QRules searches through the particle database for candidates that match the quantum number sets assigned to the intermediate edges of the graph. Figure 5.6 shows the decay topologies that QRules identified for the problem defined in Listing 5.1, collapsed into two graphs. The full reaction object also records the allowed combinations of spin projections. In this case, only two decay-chain topologies were found: one with a subsystem \(\varSigma^* \to K^0_S \bar{p}\) and another with \(N^* \to K^0_S \varSigma^+\). QRules did not find transitions for a third subsystem, since no \(K^*\) mesons were found with masses that satisfy conservation in the topology \(J/\psi \to K^0_S (K^* \to \bar{p} \varSigma^+)\). This is a consequence of the mass_conservation_factor chosen in Listing 5.1, together with the particle definitions available in the default database provided by the particle package.
While the figures shown in this section illustrate the strategy that QRules uses to generate allowed decay topologies, the system allows for other solver strategies. For example, instead of using sequential-decay topologies, the system can also check whether a particle reaction is allowed at all by checking whether the quantum numbers of the initial and final state simply add up and comply with conservation rules (such as net charge conservation). The system is also designed to be extensible to more general reactions, for instance involving an initial state of multiple particles. For the purposes of this thesis, this functionality has not been developed further.
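This simpler pre-check, verifying that the additive quantum numbers of the initial and final state add up, can be sketched as follows. The quantum-number table is illustrative (charge, baryon number, strangeness only), whereas QRules applies many more conservation rules.

```python
# Additive quantum numbers per particle: (charge, baryon number, strangeness).
# Illustrative values; the real implementation reads them from the PDG.
QUANTUM_NUMBERS = {
    "J/psi(1S)": (0, 0, 0),
    "K0": (0, 0, 1),
    "Sigma+": (1, 1, -1),
    "p~": (-1, -1, 0),
}


def is_allowed(initial_state, final_state):
    """Check that every additive quantum number sums up on both sides."""
    initial = [QUANTUM_NUMBERS[p] for p in initial_state]
    final = [QUANTUM_NUMBERS[p] for p in final_state]
    return all(
        sum(initial_values) == sum(final_values)
        for initial_values, final_values in zip(zip(*initial), zip(*final))
    )


is_allowed(["J/psi(1S)"], ["K0", "Sigma+", "p~"])  # True: all sums are zero
```

Dropping the antiproton from the final state, for instance, violates both charge and baryon number conservation, so the check returns `False` without any topology ever being built.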
AmpForm
As can already be seen from Figure 5.5, the decay graphs that QRules generates contain much more information than just which resonances can be expected. For example, the graphs contain LS-coupling values, parities, spin projections, and masses and widths of the particles. This information can be used to formulate an amplitude model with the helicity formalism (Chapter 3).
For this purpose, we extracted a second package, AmpForm (PyPI name ampform), from the original expertsystem package [25]. The package formulates amplitude models symbolically using SymPy rather than serialising them directly to XML, as happened in the original C++ framework. As such, AmpForm serves both as an extension of SymPy – a library of expression classes that are useful for amplitude analysis – and as a tool for formulating amplitude models using different spin formalisms. In the usual workflow, particle reaction graphs generated by QRules serve as input to AmpForm’s amplitude model builder, but the user can also use the expression classes directly to build up their own amplitude models or any other parametrisations that are of interest to the analysis.
Amplitude model building
Listing 5.2 shows how the amplitude model is built up for the decay \(J/\psi \to K^0_S \varSigma^+ \bar{p}\) that was generated with QRules in Listing 5.1. It first creates an amplitude model builder, which is a factory class that can be configured dynamically before formulating the model. Next, a dynamics builder class is instantiated that can formulate a relativistic Breit–Wigner function with an energy-dependent width and two vertex form factors for a given intermediate edge. This dynamics builder is then assigned to each of the intermediate edges in the decay graph. Finally, the formulate() method returns a class structure that contains all the relevant definitions for an amplitude model.
import ampform
from ampform.dynamics.builder import RelativisticBreitWignerBuilder

model_builder = ampform.get_builder(reaction)
dynamics_builder = RelativisticBreitWignerBuilder(
    energy_dependent_width=True,
    form_factor=True,
)
for resonance in reaction.get_intermediate_particles():
    model_builder.dynamics.assign(resonance.name, dynamics_builder)
model = model_builder.formulate()
model.intensity
\(\displaystyle \sum_{m_{0}\in\left\{1,-1\right\}} \sum_{m_{1}=0} \sum_{m_{2}=-1/2}^{1/2} \sum_{m_{3}=-1/2}^{1/2}{\left|{A^{12}_{m_{0}, m_{1}, m_{2}, m_{3}} + A^{13}_{m_{0}, m_{1}, m_{2}, m_{3}}}\right|^{2}}\)
The model is formulated in terms of SymPy expressions, which can be combined into a single expression for generating numerical code of the full amplitude model. At the top level, this is represented by model.intensity. As its rendering in Listing 5.2 shows, it is a coherent sum over amplitudes (excluding the longitudinal polarisation of the \(J/\psi\) meson). The amplitudes \(A^{12}_{m_0,m_1,m_2,m_3}\) and \(A^{13}_{m_0,m_1,m_2,m_3}\) describe the transition of a \(J/\psi\) with spin projection \(m_0\) into the final state \(K^0_S\), \(\varSigma^+\), and \(\bar{p}\) with spin projections \(m_1\), \(m_2\), and \(m_3\), respectively. They correspond to intermediate states in chain \((12)3\) (subsystem \(K^0_S\varSigma^+\)) and chain \((13)2\) (subsystem \(K^0_S\bar{p}\)), respectively. Note that this amplitude model does not take spin alignment of the baryons in the final state into account and naively implements polarisation by excluding the longitudinal polarisation of the \(J/\psi\) meson (\(m_0 \neq 0\)).
model.amplitudes
\(\displaystyle \begin{aligned} A^{12}_{-1, 0, - \frac{1}{2}, - \frac{1}{2}} \;&=\; \textstyle \frac{C_{J/\psi \xrightarrow[S=1]{L=0} N^*_{1} \overline{p}; N^*_{1} \xrightarrow[S=1/2]{L=1} K^0_S \varSigma^+} \Gamma_{N^*_{1}} m_{N^*_{1}} C^{\frac{1}{2},\frac{1}{2}}_{0,0,\frac{1}{2},\frac{1}{2}} C^{1,0}_{0,0,1,0} C^{1,0}_{\frac{1}{2},- \frac{1}{2},\frac{1}{2},\frac{1}{2}} C^{\frac{1}{2},\frac{1}{2}}_{1,0,\frac{1}{2},\frac{1}{2}} \mathcal{F}_{1}\left(m_{12}^{2}, m_{1}, m_{2}\right) D^{\frac{1}{2}}_{- \frac{1}{2},\frac{1}{2}}\left(- \phi^{12}_{1},\theta^{12}_{1},0\right) D^{1}_{-1,0}\left(- \phi_{12},\theta_{12},0\right)}{- m_{12}^{2} + \left(m_{N^*_{1}}\right)^{2} - i m_{N^*_{1}} \Gamma_{1}\left(m_{12}^{2}\right)} \\ \;&+\; \textstyle \cdots \\ \;&+\; \textstyle \frac{C_{J/\psi \xrightarrow[S=2]{L=2} N^*_{2} \overline{p}; N^*_{2} \xrightarrow[S=1/2]{L=1} K^0_S \varSigma^+} \Gamma_{N^*_{2}} m_{N^*_{2}} C^{\frac{1}{2},\frac{1}{2}}_{0,0,\frac{1}{2},\frac{1}{2}} C^{\frac{3}{2},\frac{1}{2}}_{1,0,\frac{1}{2},\frac{1}{2}} C^{2,-1}_{\frac{3}{2},- \frac{3}{2},\frac{1}{2},\frac{1}{2}} C^{1,-1}_{2,0,2,-1} \mathcal{F}_{1}\left(m_{12}^{2}, m_{1}, m_{2}\right) D^{1}_{-1,-1}\left(- \phi_{12},\theta_{12},0\right) D^{\frac{3}{2}}_{- \frac{3}{2},\frac{1}{2}}\left(- \phi^{12}_{1},\theta^{12}_{1},0\right)}{- m_{12}^{2} + \left(m_{N^*_{2}}\right)^{2} - i m_{N^*_{2}} \Gamma_{2}\left(m_{12}^{2}\right)} \\ A^{12}_{-1, 0, - \frac{1}{2}, \frac{1}{2}} \;&=\; \textstyle \frac{C_{J/\psi \xrightarrow[S=1]{L=0} N^*_{1} \overline{p}; N^*_{1} \xrightarrow[S=1/2]{L=1} K^0_S \varSigma^+} \Gamma_{N^*_{1}} m_{N^*_{1}} C^{\frac{1}{2},\frac{1}{2}}_{0,0,\frac{1}{2},\frac{1}{2}} C^{1,0}_{0,0,1,0} C^{1,0}_{\frac{1}{2},\frac{1}{2},\frac{1}{2},- \frac{1}{2}} C^{\frac{1}{2},\frac{1}{2}}_{1,0,\frac{1}{2},\frac{1}{2}} \mathcal{F}_{1}\left(m_{12}^{2}, m_{1}, m_{2}\right) D^{\frac{1}{2}}_{\frac{1}{2},\frac{1}{2}}\left(- \phi^{12}_{1},\theta^{12}_{1},0\right) D^{1}_{-1,0}\left(- 
\phi_{12},\theta_{12},0\right)}{- m_{12}^{2} + \left(m_{N^*_{1}}\right)^{2} - i m_{N^*_{1}} \Gamma_{1}\left(m_{12}^{2}\right)} \\ \;&+\; \textstyle \cdots \\ \;&+\; \textstyle \frac{C_{J/\psi \xrightarrow[S=2]{L=2} N^*_{2} \overline{p}; N^*_{2} \xrightarrow[S=1/2]{L=1} K^0_S \varSigma^+} \Gamma_{N^*_{2}} m_{N^*_{2}} C^{\frac{1}{2},\frac{1}{2}}_{0,0,\frac{1}{2},\frac{1}{2}} C^{\frac{3}{2},\frac{1}{2}}_{1,0,\frac{1}{2},\frac{1}{2}} C^{2,-1}_{\frac{3}{2},- \frac{1}{2},\frac{1}{2},- \frac{1}{2}} C^{1,-1}_{2,0,2,-1} \mathcal{F}_{1}\left(m_{12}^{2}, m_{1}, m_{2}\right) D^{1}_{-1,-1}\left(- \phi_{12},\theta_{12},0\right) D^{\frac{3}{2}}_{- \frac{1}{2},\frac{1}{2}}\left(- \phi^{12}_{1},\theta^{12}_{1},0\right)}{- m_{12}^{2} + \left(m_{N^*_{2}}\right)^{2} - i m_{N^*_{2}} \Gamma_{2}\left(m_{12}^{2}\right)} \\\vdots \qquad \\ A^{13}_{1, 0, \frac{1}{2}, \frac{1}{2}} \;&=\; \textstyle \frac{C_{J/\psi \xrightarrow[S=0]{L=1} \varSigma^*_{1} \varSigma^+; \varSigma^*_{1} \xrightarrow[S=1/2]{L=0} K^0_S \overline{p}} \Gamma_{\varSigma^*_{1}} m_{\varSigma^*_{1}} \left(C^{\frac{1}{2},- \frac{1}{2}}_{0,0,\frac{1}{2},- \frac{1}{2}}\right)^{2} C^{0,0}_{\frac{1}{2},\frac{1}{2},\frac{1}{2},- \frac{1}{2}} C^{1,0}_{1,0,0,0} \mathcal{F}_{0}\left(m_{13}^{2}, m_{1}, m_{3}\right) D^{\frac{1}{2}}_{\frac{1}{2},- \frac{1}{2}}\left(- \phi^{13}_{1},\theta^{13}_{1},0\right) D^{1}_{1,0}\left(- \phi_{13},\theta_{13},0\right)}{- m_{13}^{2} + \left(m_{\varSigma^*_{1}}\right)^{2} - i m_{\varSigma^*_{1}} \Gamma_{1}\left(m_{13}^{2}\right)} \\ \;&+\; \textstyle \cdots \\ \;&+\; \textstyle \frac{C_{J/\psi \xrightarrow[S=2]{L=1} \varSigma^*_{2} \varSigma^+; \varSigma^*_{2} \xrightarrow[S=1/2]{L=2} K^0_S \overline{p}} \Gamma_{\varSigma^*_{2}} m_{\varSigma^*_{2}} C^{\frac{1}{2},- \frac{1}{2}}_{0,0,\frac{1}{2},- \frac{1}{2}} C^{1,-1}_{1,0,2,-1} C^{\frac{5}{2},- \frac{1}{2}}_{2,0,\frac{1}{2},- \frac{1}{2}} C^{2,-1}_{\frac{5}{2},- \frac{1}{2},\frac{1}{2},- \frac{1}{2}} \mathcal{F}_{2}\left(m_{13}^{2}, m_{1}, 
m_{3}\right) D^{1}_{1,-1}\left(- \phi_{13},\theta_{13},0\right) D^{\frac{5}{2}}_{- \frac{1}{2},- \frac{1}{2}}\left(- \phi^{13}_{1},\theta^{13}_{1},0\right)}{- m_{13}^{2} + \left(m_{\varSigma^*_{2}}\right)^{2} - i m_{\varSigma^*_{2}} \Gamma_{2}\left(m_{13}^{2}\right)} \\ \end{aligned}\)
Listing 5.3 shows some of the amplitudes contained in the model.amplitudes mapping, with the \(\mathrm{\LaTeX}\) rendering generated directly by the codebase. For example, the first amplitude, \(A^{12}_{-1,0,-\frac{1}{2},-\frac{1}{2}}\), is a coherent sum of all amplitudes in the subsystem \(J/\psi[-1] \to \bar{p}[-\frac{1}{2}]\,N^*\) for all resonances \(N^* \to K^0_S[0] \varSigma^+[-\frac{1}{2}]\). To save space, \(N^*\) and \(\varSigma^*\) resonances are marked with an index rather than their full name. The model is formulated in the canonical basis using the transformation of Equation (3.17). AmpForm takes the \(LS\) couplings generated by QRules, but it can also formulate models in the helicity basis. The structure of each amplitude is always the same: a complex-valued coefficient (scaling factor with phase), two form factors \(\mathcal{F}_{\!L}, \mathcal{F}_{\!\ell}\) for angular momentum \(L\) in the production vertex and \(\ell\) in the decay vertex, two Clebsch–Gordan coefficients and one Wigner \(D\)-function per vertex, and a Breit–Wigner function with energy-dependent width for the intermediate edge (resonance) in each graph.
The amplitudes contain symbols that can be interpreted as either parameters (such as the resonance mass \(m_{N^*_1}\) and width \(\Gamma_{N^*_1}\)) or kinematic variables (such as the invariant mass \(m_{12}\) and helicity angle \(\theta_1^{12}\) of particle \(1\) in the rest frame of particles \(1\) and \(2\)). AmpForm provides suggested default values for the parameters by extracting them from the particle database that the user provides in Listing 5.1. It also defines symbolic expressions for computing the kinematic variables from the three four-momenta of the final state. For example, \(\phi_1^{12}\) is defined as the azimuthal angle of the four-momentum of particle \(1\) boosted into the rest frame of the sum of the four-momenta of particles \(1\) and \(2\), with the \(z\) axis rotated into the direction of movement (see Equation (3.9)).
import sympy as sp

phi = sp.Symbol("phi_1^12", real=True)
model.kinematic_variables[phi]
\(\displaystyle \phi\left(\boldsymbol{B_z}\left(\frac{\left|\vec{{p}_{12}}\right|}{E\left({p}_{12}\right)}\right) \boldsymbol{R_y}\left(- \theta\left({p}_{12}\right)\right) \boldsymbol{R_z}\left(- \phi\left({p}_{12}\right)\right) p_{1}\right)\)
Together, all attributes of an amplitude model object generated by AmpForm contain the information to evaluate the intensity for a random point in phase space. For this, the amplitudes of Listing 5.3 are symbolically substituted into the main intensity definition of Listing 5.2. The resulting expression, as well as the definitions in model.kinematic_variables, can be used as a template for generating numerical code for high-performance computations (see Section 5.3.4).
Nested expression definitions
In principle, amplitude models can be fully implemented with SymPy directly. However, amplitude models contain so many mathematical operations that it is hard to render them concisely. For this reason, AmpForm provides a few tools that help to define amplitude model expressions in a nested form that is understandable, making it more suitable for a self-documenting workflow (Section 4.3). The expression \(\mathcal{F}_2\left(m_{12}^2,m_1,m_2\right)\) in Listing 5.3 is an example of this and is built up of other expression classes, like the Blatt–Weisskopf function and a squared breakup momentum function.
from ampform.dynamics import FormFactor
m_12, m_1, m_2 = sp.symbols("m_12 m_1 m_2", nonnegative=True)
expr = FormFactor(m_12**2, m_1, m_2, angular_momentum=2)
\(\displaystyle \begin{aligned} \mathcal{F}_{2}\left(m_{12}^{2}, m_{1}, m_{2}\right) \;&=\; \sqrt{B_{2}^2\left(q^2_{12}\left(m_{12}^{2}\right)\right)} \\ B_{2}^2\left(q^2_{12}\left(m_{12}^{2}\right)\right) \;&=\; \frac{13 q^2_{12}\left(m_{12}^{2}\right)^{2}}{q^2_{12}\left(m_{12}^{2}\right)^{2} + 3 q^2_{12}\left(m_{12}^{2}\right) + 9} \\ q^2_{12}\left(m_{12}^{2}\right) \;&=\; \frac{\left(m_{12}^{2} - \left(m_{1} - m_{2}\right)^{2}\right) \left(m_{12}^{2} - \left(m_{1} + m_{2}\right)^{2}\right)}{4 m_{12}^{2}} \\ \end{aligned}\)
Nested expression classes like these can be easily extended using the decorator function, @unevaluated, provided by AmpForm. This allows for extending amplitude builders with large mathematical definitions. A simplified example is shown in Listing 5.4. The expression class is ‘unfolded’ to its full form using SymPy’s doit() method.
Listing 5.4: A custom expression class defined with AmpForm's @unevaluated decorator to postpone evaluation of the expression until it is explicitly requested. The class is used to define a custom expression for \(f(x,y) = x^2 + y^2\). The second snippet uses AmpForm's aslatex() helper function to render \(f\) and its doit() definition.
import sympy as sp
from ampform.sympy import unevaluated


@unevaluated
class MyExpr(sp.Expr):
    x: sp.Symbol
    y: sp.Symbol
    _latex_repr_ = "f({x}, {y})"

    def evaluate(self) -> sp.Expr:
        x, y = self.args
        return x**2 + y**2


a, b = sp.symbols("a b")
expr = MyExpr(a, b**2)

from IPython.display import Math

from ampform.io import aslatex

Math(aslatex({expr: expr.doit()}))
\(\displaystyle \begin{aligned} f(a, b^{2}) \;&=\; a^{2} + b^{4} \\ \end{aligned}\)
Cached expression unfolding
Combining all the definitions into one large expression template can be slow when using SymPy. AmpForm therefore provides ways to automatically cache the result of this “expression unfolding” to disk. All of these functions are available through the ampform.sympy.cached module, which caches expensive CAS operations to disk. Listing 5.5 shows two equivalent ways of fully unfolding the amplitude model expression that was generated with Listing 5.2.
Listing 5.5: AmpForm provides a module, ampform.sympy.cached, that caches expensive CAS operations to disk. The first example shows how the full intensity expression is constructed from an amplitude model object and the second example shows how the cached.unfold() convenience function simplifies that procedure.
from ampform.sympy import cached

intensity_expr = cached.doit(model.intensity)
expr_with_amplitudes = cached.xreplace(intensity_expr, model.amplitudes)
full_expression = cached.doit(expr_with_amplitudes)

full_expression = cached.unfold(model)

These are two examples of additional functionality that AmpForm provides to facilitate working with large symbolic expressions for amplitude analysis. It contains more functionality, such as an UnevaluatableIntegral class that postpones evaluation of integrals that cannot be solved analytically, which is essential when parametrising decay dynamics with analytically continued lineshapes.
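The idea behind postponing such integrals can be sketched by hand. The following is a simplified illustration, not the actual UnevaluatableIntegral implementation (which hooks into SymPy's code generation): the integral stays symbolic in the model and is only dispatched to numerical quadrature when the expression is evaluated.

```python
import sympy as sp
from scipy.integrate import quad

# The integrand exp(sin(s*x)) has no elementary antiderivative, so the
# integral is kept symbolic and evaluated numerically on demand.
x, s = sp.symbols("x s", real=True)
integral = sp.Integral(sp.exp(sp.sin(s * x)), (x, 0, 1))
integrand = sp.lambdify((x, s), integral.function, modules="numpy")


def evaluate_integral(s_value: float) -> float:
    """Numerically evaluate the symbolic integral for a given parameter."""
    value, _abserr = quad(integrand, 0, 1, args=(s_value,))
    return value
```

For example, at \(s=0\) the integrand reduces to \(e^0 = 1\), so the quadrature returns the exact value \(1\) of the integral over \([0, 1]\).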
AmpForm-DPD
The central physics contribution of this thesis is the implementation of spin-aligned amplitude models for three-body decays using Dalitz-plot decomposition. AmpForm can handle arbitrary multi-body decays and works well as long as the decay does not involve chains with differing topologies that include final-state particles with spin. In such cases, it becomes difficult to formulate spin-aligned models symbolically, since the Wigner rotation angles must be computed numerically (Equation (3.20)). To address this, a specialised package was developed alongside AmpForm: AmpForm-DPD (PyPI: ampform-dpd). This package builds amplitude models with SymPy using the DPD method [26], in particular the unpolarised intensity of Equation (3.24) with the Dalitz-plot function of Equation (3.26). In the long term, the goal is to merge AmpForm-DPD back into AmpForm and extend spin alignment to arbitrary multi-body decays [27].
Listing 5.6 shows how an amplitude model is formulated with AmpForm-DPD. The procedure is similar to that of Listing 5.2; the main difference is that the amplitude model is built up from a custom class structure that represents decay chains in three-body decays. The rendered intensity expression is the implementation of the unpolarised differential cross section of Equation (3.24). As opposed to Listing 5.2, AmpForm-DPD correctly sums over all \(J/\psi\) helicities to get the unpolarised intensity. The trivial helicity \(\lambda_1=0\) for \(K^0_S\) has been omitted from the amplitude indices.
Note that the Wigner \(d\)-functions for the Wigner rotations appearing in the Dalitz-plot function of Equation (3.26) appear in this top-level intensity expression rather than in the amplitudes \(A^k_{\lambda_0\lambda_2\lambda_3}\). These individual amplitudes are shown in Listing 5.7. The amplitude rendering is clearer than that of Listing 5.3, because it uses AmpForm’s @unevaluated expression classes of Listing 5.4 to denote the vertex functions with \(\mathcal{F}\) and Breit–Wigner functions with \(\mathcal{R}\). AmpForm-DPD also formulates the model with one complex-valued coupling per decay node rather than with one coefficient per chain. The angles are formulated using Equation (3.27) and are shown in Listing 5.8.
from ampform_dpd import DalitzPlotDecompositionBuilder
from ampform_dpd.adapter.qrules import to_three_body_decay
from ampform_dpd.dynamics.builder import (
    formulate_breit_wigner_with_form_factor,
)

decay = to_three_body_decay(reaction.transitions)
model_builder = DalitzPlotDecompositionBuilder(decay)
for chain in model_builder.decay.chains:
    model_builder.dynamics_choices.register_builder(
        chain, formulate_breit_wigner_with_form_factor
    )
dpd_model = model_builder.formulate(reference_subsystem=2)
dpd_model.intensity
\(\displaystyle \sum_{\lambda_{0}=-1}^{1} \sum_{\lambda_{2}=-1/2}^{1/2} \sum_{\lambda_{3}=-1/2}^{1/2}{\left|{\sum_{\lambda_0'=-1}^{1} \sum_{\lambda_2'=-1/2}^{1/2} \sum_{\lambda_3'=-1/2}^{1/2}{A^{2}_{\lambda_0' \lambda_2' \lambda_3'} d^{\frac{1}{2}}_{\lambda_2',\lambda_{2}}\!\!\left(\zeta^2_{2(2)}\right) d^{\frac{1}{2}}_{\lambda_3',\lambda_{3}}\!\!\left(\zeta^3_{2(2)}\right) d^{1}_{\lambda_{0},\lambda_0'}\!\!\left(\zeta^0_{2(2)}\right) + A^{3}_{\lambda_0' \lambda_2' \lambda_3'} d^{\frac{1}{2}}_{\lambda_2',\lambda_{2}}\!\!\left(\zeta^2_{3(2)}\right) d^{\frac{1}{2}}_{\lambda_3',\lambda_{3}}\!\!\left(\zeta^3_{3(2)}\right) d^{1}_{\lambda_{0},\lambda_0'}\!\!\left(\zeta^0_{3(2)}\right)}}\right|^{2}}\)
Listing 5.7: The amplitudes of the DPD model, rendered with AmpForm’s aslatex() helper function, which renders mappings of expressions as a single \(\mathrm{\LaTeX}\) block.
from IPython.display import Math
from ampform.io import aslatex
Math(aslatex(dpd_model.amplitudes))
\(\displaystyle \begin{aligned} A^{2}_{-1, 0, - \frac{1}{2}, - \frac{1}{2}} \;&=\; \textstyle 2 \sum_{\lambda_{R}=-3/2}^{3/2}{- \delta_{-1, \lambda_{R} + \frac{1}{2}} \mathcal{F}_{1}\left(m_{0}^{2}, m_{\varSigma^*_{0}}, m_{2}\right) \mathcal{F}_{2}\left(\sigma_{2}^{2}, m_{1}, m_{3}\right) \mathcal{H}^\mathrm{decay}_{\varSigma^*_{0}, - \frac{1}{2}, 0} \mathcal{H}^\mathrm{production}_{\varSigma^*_{0}, \lambda_{R}, - \frac{1}{2}} \mathcal{R}_{2}\left(\sigma_{2}, m_{\varSigma^*_{0}}, \Gamma_{\varSigma^*_{0}}\right) d^{\frac{3}{2}}_{\lambda_{R},- \frac{1}{2}}\left(\theta_{31}\right)} \\ \;&+\; \textstyle \cdots \\ \;&+\; \textstyle \sum_{\lambda_{R}=-5/2}^{5/2}{- \delta_{-1, \lambda_{R} + \frac{1}{2}} \mathcal{F}_{1}\left(m_{0}^{2}, m_{\varSigma^*_{2}}, m_{2}\right) \mathcal{F}_{2}\left(\sigma_{2}^{2}, m_{1}, m_{3}\right) \mathcal{H}^\mathrm{decay}_{\varSigma^*_{2}, - \frac{1}{2}, 0} \mathcal{H}^\mathrm{production}_{\varSigma^*_{2}, \lambda_{R}, - \frac{1}{2}} \mathcal{R}_{2}\left(\sigma_{2}, m_{\varSigma^*_{2}}, \Gamma_{\varSigma^*_{2}}\right) d^{\frac{5}{2}}_{\lambda_{R},- \frac{1}{2}}\left(\theta_{31}\right)} \\ A^{3}_{-1, 0, - \frac{1}{2}, - \frac{1}{2}} \;&=\; \textstyle 2 \sum_{\lambda_{R}=-3/2}^{3/2}{\delta_{-1, \lambda_{R} + \frac{1}{2}} \mathcal{F}_{2}\left(m_{0}^{2}, m_{N^*_{2}}, m_{3}\right) \mathcal{F}_{1}\left(\sigma_{3}^{2}, m_{1}, m_{2}\right) \mathcal{H}^\mathrm{decay}_{N^*_{2}, 0, - \frac{1}{2}} \mathcal{H}^\mathrm{production}_{N^*_{2}, \lambda_{R}, - \frac{1}{2}} \mathcal{R}_{1}\left(\sigma_{3}, m_{N^*_{2}}, \Gamma_{N^*_{2}}\right) d^{\frac{3}{2}}_{\lambda_{R},\frac{1}{2}}\left(\theta_{12}\right)} \\ \;&+\; \textstyle \cdots \\ \;&+\; \textstyle \sum_{\lambda_{R}=-3/2}^{3/2}{\delta_{-1, \lambda_{R} + \frac{1}{2}} \mathcal{F}_{0}\left(m_{0}^{2}, m_{N^*_{2}}, m_{3}\right) \mathcal{F}_{1}\left(\sigma_{3}^{2}, m_{1}, m_{2}\right) \mathcal{H}^\mathrm{decay}_{N^*_{2}, 0, - \frac{1}{2}} 
\mathcal{H}^\mathrm{production}_{N^*_{2}, \lambda_{R}, - \frac{1}{2}} \mathcal{R}_{1}\left(\sigma_{3}, m_{N^*_{2}}, \Gamma_{N^*_{2}}\right) d^{\frac{3}{2}}_{\lambda_{R},\frac{1}{2}}\left(\theta_{12}\right)} \\\vdots \qquad \\ A^{3}_{1, 0, \frac{1}{2}, \frac{1}{2}} \;&=\; \textstyle 2 \sum_{\lambda_{R}=-3/2}^{3/2}{\delta_{1, \lambda_{R} - \frac{1}{2}} \mathcal{F}_{2}\left(m_{0}^{2}, m_{N^*_{2}}, m_{3}\right) \mathcal{F}_{1}\left(\sigma_{3}^{2}, m_{1}, m_{2}\right) \mathcal{H}^\mathrm{decay}_{N^*_{2}, 0, \frac{1}{2}} \mathcal{H}^\mathrm{production}_{N^*_{2}, \lambda_{R}, \frac{1}{2}} \mathcal{R}_{1}\left(\sigma_{3}, m_{N^*_{2}}, \Gamma_{N^*_{2}}\right) d^{\frac{3}{2}}_{\lambda_{R},- \frac{1}{2}}\left(\theta_{12}\right)} \\ \;&+\; \textstyle \cdots \\ \;&+\; \textstyle \sum_{\lambda_{R}=-3/2}^{3/2}{\delta_{1, \lambda_{R} - \frac{1}{2}} \mathcal{F}_{0}\left(m_{0}^{2}, m_{N^*_{2}}, m_{3}\right) \mathcal{F}_{1}\left(\sigma_{3}^{2}, m_{1}, m_{2}\right) \mathcal{H}^\mathrm{decay}_{N^*_{2}, 0, \frac{1}{2}} \mathcal{H}^\mathrm{production}_{N^*_{2}, \lambda_{R}, \frac{1}{2}} \mathcal{R}_{1}\left(\sigma_{3}, m_{N^*_{2}}, \Gamma_{N^*_{2}}\right) d^{\frac{3}{2}}_{\lambda_{R},- \frac{1}{2}}\left(\theta_{12}\right)} \\ \end{aligned}\)
Math(aslatex(dpd_model.variables))
\(\displaystyle \begin{aligned} \theta_{31} \;&=\; \operatorname{acos}{\left(\frac{2 \sigma_{2} \left(- m_{2}^{2} - m_{3}^{2} + \sigma_{1}\right) - \left(m_{0}^{2} - m_{2}^{2} - \sigma_{2}\right) \left(- m_{1}^{2} + m_{3}^{2} + \sigma_{2}\right)}{\sqrt{\lambda\left(m_{0}^{2}, m_{2}^{2}, \sigma_{2}\right)} \sqrt{\lambda\left(\sigma_{2}, m_{3}^{2}, m_{1}^{2}\right)}} \right)} \\ \theta_{12} \;&=\; \operatorname{acos}{\left(\frac{2 \sigma_{3} \left(- m_{1}^{2} - m_{3}^{2} + \sigma_{2}\right) - \left(m_{0}^{2} - m_{3}^{2} - \sigma_{3}\right) \left(m_{1}^{2} - m_{2}^{2} + \sigma_{3}\right)}{\sqrt{\lambda\left(m_{0}^{2}, m_{3}^{2}, \sigma_{3}\right)} \sqrt{\lambda\left(\sigma_{3}, m_{1}^{2}, m_{2}^{2}\right)}} \right)} \\ \zeta^0_{2(2)} \;&=\; 0 \\ \zeta^2_{2(2)} \;&=\; 0 \\ \zeta^3_{2(2)} \;&=\; 0 \\ \zeta^0_{3(2)} \;&=\; - \operatorname{acos}{\left(\frac{- 2 m_{0}^{2} \left(- m_{2}^{2} - m_{3}^{2} + \sigma_{1}\right) + \left(m_{0}^{2} + m_{2}^{2} - \sigma_{2}\right) \left(m_{0}^{2} + m_{3}^{2} - \sigma_{3}\right)}{\sqrt{\lambda\left(m_{0}^{2}, m_{3}^{2}, \sigma_{3}\right)} \sqrt{\lambda\left(m_{0}^{2}, \sigma_{2}, m_{2}^{2}\right)}} \right)} \\ \zeta^2_{3(2)} \;&=\; \operatorname{acos}{\left(\frac{2 m_{2}^{2} \left(- m_{0}^{2} - m_{1}^{2} + \sigma_{1}\right) + \left(m_{0}^{2} + m_{2}^{2} - \sigma_{2}\right) \left(- m_{1}^{2} - m_{2}^{2} + \sigma_{3}\right)}{\sqrt{\lambda\left(m_{0}^{2}, m_{2}^{2}, \sigma_{2}\right)} \sqrt{\lambda\left(\sigma_{3}, m_{2}^{2}, m_{1}^{2}\right)}} \right)} \\ \zeta^3_{3(2)} \;&=\; \operatorname{acos}{\left(\frac{2 m_{3}^{2} \left(- m_{0}^{2} - m_{1}^{2} + \sigma_{1}\right) + \left(m_{0}^{2} + m_{3}^{2} - \sigma_{3}\right) \left(- m_{1}^{2} - m_{3}^{2} + \sigma_{2}\right)}{\sqrt{\lambda\left(m_{0}^{2}, m_{3}^{2}, \sigma_{3}\right)} \sqrt{\lambda\left(\sigma_{2}, m_{3}^{2}, m_{1}^{2}\right)}} \right)} \\ \end{aligned}\)
TensorWaves
The last package in the toolchain is TensorWaves (PyPI name tensorwaves) [28]. At its core, it is a wrapper around SymPy’s code generation capabilities that makes it easier to work with large symbolic expressions and fit them to large data samples. As discussed in Section 5.2, TensorWaves started as a reimplementation of the ComPWA framework in TensorFlow and inherits many of the interfaces from the original design. Importantly, the interfaces are kept as simple as possible, so that it is easy to insert custom implementations into the workflow (open-closed principle).
The central interface of TensorWaves is the Function class. It is the most generic class for defining mathematical functions that take input and return output. More specific interfaces are derived from this generic class. Two examples relevant for amplitude analysis are a ParametrizedFunction and an Estimator. A ParametrizedFunction can be used to separate the parameters from the kinematic variables in an amplitude model. It takes a DataSample, which is a mapping of string keys to data arrays (like pre-computed helicity angles), and carries around scalar parameter values that can be tweaked separately. The Estimator represents a function that takes a set of scalar parameters and returns a single estimator value, such as a likelihood or a chi-square value. Both functions take mappings of kinematic variables (data columns) and parameters as input rather than a list of positional arguments. This allows for a large number of arguments to be handled without being constrained to valid Python variable names.
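The separation between data columns and parameters can be sketched in plain Python. This is an illustrative mock-up (the class name and attributes are hypothetical), not TensorWaves’ actual implementation:

```python
import numpy as np

class SketchParametrizedFunction:
    """Illustration of the ParametrizedFunction idea: kinematic variables
    enter as a mapping of column names to arrays (a DataSample), while
    scalar parameters are stored on the function object itself."""

    def __init__(self, function, parameters):
        self.function = function
        self.parameters = dict(parameters)

    def __call__(self, data):
        # Merge data columns and current parameter values into a single
        # keyword-argument mapping, so that arguments are matched by name
        # rather than by position.
        return self.function(**data, **self.parameters)

    def update_parameters(self, new_values):
        self.parameters.update(new_values)

intensity = SketchParametrizedFunction(
    lambda theta, coupling: np.abs(coupling * np.cos(theta)) ** 2,
    parameters={"coupling": 1.0},
)
data = {"theta": np.array([0.0, np.pi / 2])}
intensity(data)  # → array close to [1.0, 0.0]
```

Because arguments are looked up by name, a model with hundreds of parameters and helicity angles poses no problem, even when the symbol names are not valid Python identifiers.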
For performing a fit, TensorWaves provides an Optimizer interface. It takes an Estimator along with a mapping of parameter values that serves as the starting point in parameter space. At the time of writing, TensorWaves comes with two implementations of the Optimizer interface: Minuit2 [29; 30] and ScipyMinimizer [31]. Other optimisers, such as NLopt, could easily be implemented without redesigning the interfaces. In the analyses presented in this thesis, we found the Minuit2 implementation to be the fastest and most robust, so we did not invest in implementing other optimisers. In addition, benchmarks using TensorWaves indicate that array-oriented libraries are much more performant when the computation is parallelised in parameter space as well. This suggests that it may be more efficient to perform fits with optimisers that can be parallelised in parameter space, rather than with gradient-descent optimisers, which are inherently sequential in parameter space.
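The idea of an optimiser that operates on named parameter mappings can be sketched with SciPy. The estimator, parameter names, and wrapper function below are illustrative, not TensorWaves’ actual Optimizer implementation:

```python
import numpy as np
from scipy.optimize import minimize

# Toy data that follows y = 2x + 1 exactly:
data_x = np.array([0.0, 1.0, 2.0])
data_y = np.array([1.0, 3.0, 5.0])

def estimator(parameters):
    # A chi-square-like estimator that, like TensorWaves' Estimator,
    # takes a mapping of parameter names to scalar values.
    model = parameters["slope"] * data_x + parameters["offset"]
    return float(np.sum((model - data_y) ** 2))

def optimize(estimator, initial_parameters):
    # Convert the named parameters to the flat array that SciPy expects
    # and map the optimised array back to a name-value mapping.
    names = list(initial_parameters)
    x0 = np.array([initial_parameters[name] for name in names])
    result = minimize(lambda x: estimator(dict(zip(names, x))), x0)
    return dict(zip(names, result.x))

fit_result = optimize(estimator, {"slope": 0.0, "offset": 0.0})
# fit_result["slope"] ≈ 2, fit_result["offset"] ≈ 1
```

The thin name-to-array translation layer is all that is needed to plug a different minimiser into the same workflow.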
Behind these main interfaces, TensorWaves uses SymPy’s code generation capabilities to generate code for the mathematical functions. This makes it possible to switch between different computational backends, such as TensorFlow, JAX, and NumPy. The generated code is optionally JIT-compiled and wrapped in classes that follow interfaces such as ParametrizedFunction and Estimator. These interface classes are designed so that a large number of implementations can be used without having to rewrite the workflow. Listing 5.9 shows how the unfolded intensity expression for the DPD model in Listing 5.6 is used to generate a ParametrizedFunction that can be used for fitting.
Listing 5.9: Creating a ParametrizedFunction with TensorWaves. The symbols in parameter_defaults are marked as parameters in the generated ParametrizedFunction. Any remaining free symbols become arguments to the function call through an input DataSample.
from ampform.sympy import cached
from tensorwaves.function.sympy import create_parametrized_function

intensity_func = create_parametrized_function(
    expression=cached.unfold(dpd_model),
    parameters=dpd_model.parameter_defaults,
    backend="jax",
)
Internally, the created function is a JAX function of the lambdified form that is shown in Section 4.2.3, but with dozens of arguments. The ParametrizedFunction wraps this function, so that the input data columns and current parameter values are mapped to the correct arguments.
Ultimately, the goal of these packages is to translate the description of scattering problems into computational code that can efficiently evaluate amplitude models on large experimental data samples. Each package addresses a specific task within the workflow, and their combination provides a coherent toolchain for translating theoretical models into practical, high-performance analyses.
5.4 Developer Experience
As discussed in Section 5.2, the move to Python eliminated much of the burden of maintaining a high-performance computational backend in C++. This section highlights several ways in which Python also improves the developer experience within ComPWA and how this strengthens the prospects for academic continuity (see also Section 4.3.3). Beyond enabling amplitude analyses, the workflow and tools are designed to lower the entry barrier for new contributors, while ensuring that the code remains thoroughly tested and well documented. This is especially important in a small scientific community like hadron physics, where the roles of user and developer often overlap.
Source control
All ComPWA projects are hosted as Git repositories on GitHub under github.com/ComPWA. The repositories enforce a linear commit history through pull requests (PRs) that are squash-merged into single commits. This ensures that changes to the code base remain comprehensible and that code and documentation quality is guaranteed before a PR is merged. All repositories that host general-use packages are public and licensed under an Apache-2.0 license.
Repositories that contain source code for Python packages deploy the package to the Python Package Index (PyPI) as well as conda-forge and are versioned according to Semantic Versioning. These repositories are also automatically archived on Zenodo with a DOI, so that users can cite the code in their publications.
Pinned developer environment
All ComPWA repositories come with a pyproject.toml file that specifies the dependencies of the project, organised into dependency groups. In repositories where stability is preferred, lock files pin all dependencies to specific versions, ensuring that the environment is consistent both on the developer's operating system and in the continuous integration (CI) environment where the code is tested. Over time, we employed different strategies for generating lock files, but recently settled on uv for Python-only projects and Pixi for projects that require other languages, such as Julia. Both tools can generate and update cross-platform lock files quickly. Cloning the repository and activating the environment with these tools is all a user has to do to start developing.
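A sketch of how such a pyproject.toml could declare its dependency groups (the package name and group contents are illustrative, not taken from an actual ComPWA repository):

```toml
[project]
name = "my-analysis"  # hypothetical package name
requires-python = ">=3.10"
dependencies = ["ampform", "tensorwaves"]

[dependency-groups]
test = ["pytest"]
doc = ["sphinx"]
dev = [
    {include-group = "doc"},
    {include-group = "test"},
]
```

A developer then only installs the `dev` group, which transitively pulls in the groups needed for testing and building the documentation.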
Code quality assurance
All packages include unit tests that are automatically executed whenever a PR is submitted. Test coverage is tracked with the Codecov service to ensure the code remains well tested. In addition, the repositories provide tutorials in the form of Jupyter notebooks. These notebooks also act as integration tests, since they are executed during the documentation build triggered by each PR. The GitHub repositories are configured so that code can only be merged into the production branch via a PR that has passed these checks. AmpForm and TensorWaves further include continuous benchmarks that track the runtime of selected tests and issue warnings if changes to the code base result in performance regressions.
ComPWA enforces its style guidelines automatically rather than describing them in a document, so that the style can be continuously updated and developers can stay in the workflow without having to consult a guide. This is done with autoformatters such as Prettier and linters such as Ruff, which are enforced through Pre-commit: a set of checks that runs locally on every commit as well as on PRs, applying autofixes where possible. In projects with extensive documentation, spelling is checked through Pre-commit as well, with a CSpell hook. Other developer tools used by the ComPWA project can be found at compwa.github.io/develop [32].
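As an illustration, a minimal .pre-commit-config.yaml wiring up such a linter could look as follows (the pinned revision is illustrative and would be updated regularly):

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9  # illustrative pin
    hooks:
      - id: ruff
        args: [--fix]  # apply autofixes where possible
      - id: ruff-format
```

Running `pre-commit run -a` then applies all hooks to the full repository, both locally and in CI.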
Consistent and interlinked documentation
All ComPWA packages come with documentation of the API (function and class descriptions) as well as tutorials in the form of Jupyter notebooks. The text is kept minimal by linking to relevant resources as well as to other parts of the documentation. One example is the automatic linking of code examples to the corresponding function descriptions in the API, so that the main text does not have to explain each code example in detail. The interlinking of the documentation is continuously tested as well, to ensure that links do not break. On each release of a package, a separate, versioned website is hosted through the Read the Docs documentation hosting service. In addition, links to the APIs of external dependencies point to the specific version of the dependency at the time the documentation was built, so that links in older versions of the documentation remain valid. All of these techniques keep the documentation consistent and up-to-date, even as the code changes, and reduce its maintenance load.
Rapid prototyping
Python is a high-level language that allows for rapid prototyping of new features. This is especially important in a research environment, where new ideas and concepts need to be tested quickly. The ability to write code in a more human-readable way allows for faster iteration and experimentation. We test and document ideas in the form of Jupyter notebooks named “Technical Reports” hosted on compwa.github.io/report. The notebooks are automatically run, tested, and deployed as webpages for preservation.