Physical Review B
Neural network backflow for ab initio quantum chemistry
An-Jun Liu and Bryan K. Clark
Phys. Rev. B 110, 115137 – Published 18 September 2024
Abstract
The ground state of second-quantized quantum chemistry Hamiltonians provides access to an important set of chemical properties. Wave functions based on machine-learning architectures have shown promise in approximating these ground states in a variety of physical systems. In this paper, we show how to achieve state-of-the-art energies for molecular Hamiltonians using the neural network backflow (NNBF) wave function. To accomplish this, we optimize this ansatz with a variant of the deterministic optimization scheme based on selected configuration interaction introduced by Li et al. [J. Chem. Theory Comput. 19, 8156 (2023)], which we find works better than standard Markov chain Monte Carlo sampling. For the molecules we studied, NNBF gives lower-energy states than both Coupled Cluster with Single and Double excitations and other neural network quantum states. We systematically explore the role of network size as well as optimization parameters in improving the energy. We find that, while the number of hidden layers and determinants play a minor role in improving the energy, there are significant improvements in the energy from increasing the number of hidden units as well as the batch size used in optimization, with the batch size playing the more important role.
- Received 1 June 2024
- Revised 3 September 2024
- Accepted 6 September 2024
DOI: https://doi.org/10.1103/PhysRevB.110.115137
©2024 American Physical Society
Physics Subject Headings (PhySH)
- Research Areas: Electronic structure
- Physical Systems: Molecules
- Techniques: Ab initio calculations, Machine learning, Quantum chemistry methods
Condensed Matter, Materials & Applied Physics
Authors & Affiliations
An-Jun Liu* and Bryan K. Clark
- The Anthony J. Leggett Institute for Condensed Matter Theory and IQUIST and NCSA Center for Artificial Intelligence Innovation and Department of Physics, University of Illinois at Urbana-Champaign, IL 61801, USA
- *Contact author: anjunjl2@illinois.edu
Issue
Vol. 110, Iss. 11 — 15 September 2024
Article part of CHORUS
Accepted manuscript will be available starting 18 September 2025.
Images
Figure 1
Illustration of the neural network backflow (NNBF) architecture with an example input configuration: two spin-orbitals, with one spin-up electron occupying the first spin-orbital and one spin-down electron occupying the second. The neural network takes the configuration string as input and outputs a set of configuration-dependent single-particle orbitals. Rows of these orbitals are then selected based on the electron locations to form square matrices, from which determinants are computed. For this example, the first and last rows (the gray orbitals) are selected to compute the determinant. The sum of these determinants yields the amplitude of the input configuration.
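As a minimal sketch of this amplitude evaluation, the following Python snippet selects occupied rows from the network's configuration-dependent orbitals and sums the resulting determinants. The function and argument names, and the assumed orbital tensor layout (n_det, n_so, n_elec), are hypothetical placeholders rather than the authors' implementation:

```python
import numpy as np

def nnbf_amplitude(config, backflow_net):
    """Sketch of an NNBF amplitude evaluation for one configuration.

    config       : 0/1 occupation array over spin-orbitals, shape (n_so,)
    backflow_net : callable mapping config -> configuration-dependent
                   orbitals; the layout (n_det, n_so, n_elec) is assumed
    """
    occupied = np.flatnonzero(config)   # electron locations
    orbitals = backflow_net(config)     # (n_det, n_so, n_elec)
    # Select the rows at the occupied spin-orbitals, giving one
    # n_elec x n_elec matrix per determinant, then sum the determinants.
    squares = orbitals[:, occupied, :]  # (n_det, n_elec, n_elec)
    return np.linalg.det(squares).sum()
```

For the caption's two-spin-orbital example, config = np.array([1, 1]) selects the first and last rows, and each determinant is taken over a 2 x 2 matrix.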
Figure 2
A diagrammatic description of the workflow of fixed-size selected configuration (FSSC). Each circle represents one configuration. Initially, the algorithm initializes a parameterized wave function and a fixed-size core space. In each iteration, the amplitude moduli of all candidate configurations are computed, and the 10 largest unique ones (denoted by red configurations) are selected to form the new core space. The energy (depicted as the loss function in the orange box) and its gradient are estimated by constraining the sum to the core space, and the gradient is used to update the model parameters.
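The loop described in this caption can be sketched as follows; every callable here is a hypothetical placeholder for a component of the method (wave-function amplitudes, core-space expansion, and the core-space-restricted energy and gradient), not the authors' API:

```python
import numpy as np

def fssc_step(amplitude, expand_space, energy_and_grad, update_params,
              core_space, k=10):
    """One fixed-size selected configuration (FSSC) iteration.

    amplitude(configs)     -> wave-function amplitudes for a batch
    expand_space(core)     -> candidate configurations generated from
                              the current core space
    energy_and_grad(core)  -> energy estimate with its sums constrained
                              to the core space, plus the gradient
    update_params(grad)    -> one optimizer step on the network
    """
    candidates = expand_space(core_space)
    moduli = np.abs(amplitude(candidates))
    # Keep the k configurations with the largest unique amplitude moduli
    # (k = 10 in the caption's example) as the new fixed-size core space.
    _, first = np.unique(candidates, axis=0, return_index=True)
    top = first[np.argsort(-moduli[first])][:k]
    new_core = candidates[top]
    energy, grad = energy_and_grad(new_core)
    update_params(grad)
    return new_core, energy
```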
Figure 3
Comparison between the fixed-size selected configuration (FSSC) and Markov chain Monte Carlo (MCMC) schemes on lithium oxide. The red band represents the posttraining MCMC inference energy for the FSSC scheme, shown as a band around the mean. A moving average window of 100 is applied to improve readability.
Figure 4
Dissociation curve obtained with the neural network backflow (NNBF), Hartree-Fock (HF), Coupled Cluster with Single and Double excitations (CCSD), and Coupled Cluster with Single, Double, and perturbative Triple excitations (CCSD(T)) methods. The full configuration interaction (FCI) energy is used as the ground truth. The NNBF state is trained using the fixed-size selected configuration (FSSC) scheme, and the reported energy is computed exactly, as this remains feasible for this system.
Figure 5
Effects of network architecture on neural network backflow (NNBF) performance. Each point is one run of the same model. Double precision is employed for calculating the exact inference energy, as higher precision is necessary when the NNBF state closely approaches the true ground state. (a) Effect of network depth: the increase in performance levels off after two hidden layers are added to the NNBF states. (b) Effect of the number of hidden units: a wider hidden layer consistently improves accuracy, with the energy error decreasing steadily until it saturates at large widths. (c) Effect of the number of determinants: expanding the number of determinants reduces the energy error, with the improvement beginning to plateau after four determinants.
Figure 6
Effects of network architecture on neural network backflow (NNBF) performance. (a) Effect of network depth: the improvement with more layers quickly saturates after two layers are added to the NNBF states. Baselines from other neural network quantum states (NNQS) works are provided for comparison: QiankunNet [31], NAQS [28], MADE [29], Ref. [32], and RBM-SC [30]. The dark orange star denotes the best NNBF energy we have obtained. (b) Effect of the number of hidden units: a wider hidden layer continuously improves the accuracy, with the absolute error dropping steadily until it saturates at large widths. (c) Effect of the number of determinants: increasing the number of determinants reduces the energy error, but the improvement begins to plateau after reaching four determinants.
Figure 7
(a) Experiments exploring the impact of batch size on energy improvements with a (2, 256, 1) network on lithium oxide. The energy error decreases approximately following a power law in the batch size (R^2 = 0.900605). Some data points from Fig. 6 are included for comparison in the inset. The black star represents a trial trained using the sampling scheme from Ref. [30]; its position is determined by the average batch size at convergence. Its energy closely aligns with the fitted line for the fixed-size selected configuration (FSSC) scheme, indicating that using a dynamic batch size does not offer a noticeable improvement in training. (b) Demonstration of the effectiveness of the batch size scheduling (BSS) strategy, with a moving average window of 100 applied for improved readability. (c) Experiments examining the dependence of energy improvements on batch size on methane. Each data point represents the average and standard deviation of the exact neural network backflow (NNBF) energy across three seeds.
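The fit quoted in panel (a) can be reproduced, assuming the error follows a power-law form err ≈ A · B^alpha in the batch size B, by a least-squares regression in log-log space; the input arrays below are hypothetical measurements, not the paper's data:

```python
import numpy as np

def power_law_fit(batch_sizes, energy_errors):
    """Fit energy_error ~ A * batch_size**alpha and report R^2."""
    x = np.log(np.asarray(batch_sizes, dtype=float))
    y = np.log(np.asarray(energy_errors, dtype=float))
    alpha, log_a = np.polyfit(x, y, 1)     # slope, intercept in log-log
    residuals = y - (alpha * x + log_a)
    r_squared = 1.0 - residuals.var() / y.var()
    return alpha, np.exp(log_a), r_squared
```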
Figure 8
Optimization progress with and without configuration interaction with single and double excitations (CISD) pretraining. The training run comprises 20000 steps, and a moving average window of 1000 is applied for better readability.