LA-UR-22-31213

*Fall DNP*

Friday Oct. 28$^{th}$ 2022

Theoretical Division

**Los Alamos National Laboratory Caveat**

The submitted materials have been authored by an employee or employees of Triad National Security, LLC (Triad) under contract with the U.S. Department of Energy/National Nuclear Security Administration (DOE/NNSA).

Accordingly, the U.S. Government retains an irrevocable, nonexclusive, royaltyfree license to publish, translate, reproduce, use, or dispose of the published form of the work and to authorize others to do the same for U.S. Government purposes.

**We take a Bayesian approach:** our network inputs are sampled based on nuclear data uncertainties

Our network outputs are therefore **statistical**

We can represent outputs by a collection of Gaussians (for masses we only need 1)

Our network also provides a well-quanitfied estimate of uncertainty (fully propagated through the model)

**Network topology:** fully connected $\sim$4-8 layers; 8 nodes; regularization via weight-decay

Last layer converts output to Gaussian ad-mixture

Bishop (1994 & 2006) • Lovell (FRIB-TA early career award winner!), Mohan, Sprouse & **Mumpower** PRC 106 014305 (2022)

Writing down the many-body quantum mechanical Hamlitonian is hard; want to find ground state

Set: $f(\vec{\boldsymbol{x}}) = \vec{\boldsymbol{y}}$

$f$ will be a neural network (a model that can change)

$\vec{\boldsymbol{y}_{obs}}$ are the observations (e.g. Atomic Mass Evaluation) we want to match $\vec{\boldsymbol{y}}$ with

But what about $\vec{\boldsymbol{x}}$?

Why not a *physically motivated* feature space!?

proton number ($Z$), neutron number, ($N$), nucleon number ($A$), etc.

This is a **contrasting approach** than other avenues pursued in the literature

Past work relies on matching model residuals: $\vec{\boldsymbol{y}}$ = (model output) - data

Lovell (FRIB-TA early career award winner!), Mohan, Sprouse & **Mumpower** PRC 106 014305 (2022)

We train on mass data using a log-loss, $\mathcal{L}_1$

Training takes a considerably long time

Now we optimize data, $\mathcal{L}_1$, **and** introduce a second (physical) constraint, $\mathcal{L}_2$

We have tried many physics-based losses, this particular one (Garvey-Kelson relations) performs well

Garvey & Kelson PRL 16 197 (1966) • **Mumpower**, Sprouse, Lovell, & Mohan PRCL 106 021301 (2022)

$\mathcal{L}_\textrm{total} = \mathcal{L}_1 + \lambda_\textrm{phys} \mathcal{L}_2$ ; $\lambda_\textrm{phys}=1$ in this case

Good match to data does not imply good physics (although by happenstance it is in the end in this case)!

Garvey & Kelson PRL 16 197 (1966) • **Mumpower**, Sprouse, Lovell, & Mohan PRCL 106 021301 (2022)

Results plotted relative to the Los Alamos FRDM model

Modern mass models draw a 'straight line' through data

Train on only a few select nuclei (450 total, at random, across the chart of nuclides)

The rest of the chart (including AME points) are predictions

If we use no physics information in training the spread in predictions is large

Trends can be obtained, but models lack accuracy

Including physics constraint provides superior model predictions

Probabilistic network → well-defined uncertainties predicted for every nucleus ( ◼ 1$\sigma$, ◼ 2$\sigma$ ◼ 3$\sigma$)

New data (unknown to the model) are in agreement with the MDN model predictions

Model is now trained on only 20% of the AME and predicting 80%!

The training set is __random__; different datasets perform differently

Using GK relations alone (red) quickly diverges due to accumulated errors from repeated iteration

Importantly, our model training enforces a single iteration of GK across the chart of nuclides!

Our results are not limited to any mass region - all of them are described equally well ($Z \geq 20$)

Results are better than any existing mass model; uncertainties inherently increase with extrapolation

We can predict new measurements and capture trends (important for astrophysics)

One method is to calculate **Shapley values** (Shapely 1953 • Lundberg & Lee 2017)

Idea arises in Game Theory when a coalition of players (ML - features) optimize gains relative to costs

The players may contribute unequally, but the payoff is mutually beneficial

Applicable to many applications, ML in particular to understand feature importance

Lipovetsky, Stan, and Michael Conklin, Applied Stochastic Models in Business and Industry (2001) • Strumbelj, Erik, and Kononenko, Knowledge and information systems (2014) • Figure from SHAP Python package (https://shap.readthedocs.io/)

SHAP calculated for all of the AME (2020) [predictions!]

The model behaves like a theoretical mass model

Macroscopic variables most important followed by quantum corrections

The network is able to incorporate this knowledge and updated predictions via Bayes' theorem

New information provided at closed shell $N=126$

Sprouse *et al.* in preparation (2022) • See Mengke Li's presentation for extrapolation quality [EE.00002 @ 10:42 am; room: Celestin C]!

My collaborators

A. Lovell, A. Mohan, & T. Sprouse

▣ Postdoc

We have developed an interpretable neural network capable of predicting masses

**Advances:**

We can predict masses accurately and suss out trends in data

Our network is probabilistic, each prediction has an associated uncertainty

Our model can extrapolate and retains the physics

The quality of the training data is important (not shown)

We can classify the information content of existing and future measurements with this type of modeling

We are now ready to study the impact of this modeling in astrophysics!

Results / Data / Papers **@** MatthewMumpower.com