
FermiNet: Quantum physics and chemistry from first principles

Unfortunately, an error of 0.5% is still not accurate enough to be useful to the working chemist. The energy in molecular bonds is just a small fraction of the total energy of a system, and correctly predicting whether a molecule is stable can often depend on just 0.001% of the total energy of the system, or about 0.2% of the remaining “correlation” energy.

For example, while the total energy of the electrons in a butadiene molecule is about 100,000 kcal per mole, the difference in energy between the different possible shapes of the molecule is only about 1 kcal per mole. Correctly predicting the natural shape of butadiene therefore demands the same relative accuracy as measuring the width of a football field to the nearest millimetre.
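Spelling out the arithmetic behind that analogy (taking the field as roughly 100 m across; the numbers are the ones quoted above, rounded):

```python
import math

total_energy = 100_000.0  # kcal per mole, total electronic energy of butadiene
gap = 1.0                 # kcal per mole, difference between the molecule's shapes

relative_accuracy = gap / total_energy  # 1e-5, i.e. 0.001% of the total

# The same relative accuracy applied to a ~100 m football field:
field_m = 100.0
equivalent_m = field_m * relative_accuracy

assert math.isclose(relative_accuracy, 1e-5)
assert math.isclose(equivalent_m, 1e-3)  # about one millimetre
```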

With the advent of digital computing after World War II, scientists developed a wide range of computational methods that go beyond this mean-field description of electrons. While these methods come with a patchwork of approximations and shortcuts, they all generally fall somewhere on an axis that trades accuracy for efficiency. At one extreme are essentially exact methods that scale worse than exponentially with the number of electrons, making them impractical for all but the smallest molecules. At the other are methods that scale linearly but are not very accurate. These computational methods have had an enormous impact on the practice of chemistry – the 1998 Nobel Prize in Chemistry was awarded to the creators of many of these algorithms.

Fermionic neural networks

Despite the breadth of existing computational quantum mechanical tools, we felt that a new method was needed to address the problem of efficient representation. There’s a reason why the largest quantum chemistry calculations run to only tens of thousands of electrons, even with the most approximate methods, while classical techniques like molecular dynamics can handle millions of atoms.

The state of a classical system is easy to describe: we just keep track of the position and momentum of each particle. Representing the state of a quantum system is much harder: a probability must be assigned to every possible configuration of electron positions. This is encoded in the wave function, which assigns a positive or negative number to each configuration of electrons; the square of the wave function gives the probability of finding the system in that configuration.
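That contrast can be sketched in a few lines of Python; the particle count and the toy Gaussian wave function below are hypothetical stand-ins, not any real system:

```python
import numpy as np

n = 4  # hypothetical particle count

# Classical state: one position and one momentum vector per particle,
# 2 * n * 3 numbers in total.
positions = np.zeros((n, 3))
momenta = np.zeros((n, 3))

# Quantum state: a wave function mapping any configuration of the n
# particles to a signed amplitude; its square is the probability
# density of that configuration.
def psi(config):  # config has shape (n, 3)
    return np.exp(-np.sum(config ** 2))  # toy Gaussian stand-in

prob_density = psi(positions) ** 2
assert prob_density >= 0.0  # squares are never negative
```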

The space of all possible configurations is enormous – if you tried to represent it as a grid with 100 points along each dimension, the number of possible electronic configurations for a silicon atom would be greater than the number of atoms in the universe. This is exactly the kind of problem we thought deep neural networks could help with.
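The arithmetic behind that claim, as a quick sketch: a silicon atom has 14 electrons, each with three spatial coordinates, so a naive grid blows up as a power of 42.

```python
n_electrons = 14              # silicon atom
dims = 3 * n_electrons        # 42 coordinates describe one configuration

grid_points = 100 ** dims     # entries in a grid with 100 points per dimension
atoms_in_universe = 10 ** 80  # common order-of-magnitude estimate

assert grid_points == 10 ** 84
assert grid_points > atoms_in_universe
```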

In the past few years, there has been tremendous progress in representing complex, high-dimensional probability distributions with neural networks, and we now know how to train these networks efficiently and scalably. We speculated that, since these networks have already proven their ability to fit high-dimensional functions in artificial intelligence problems, perhaps they could be used to represent quantum wave functions as well.

Researchers such as Giuseppe Carleo, Matthias Troyer and others have shown how modern deep learning can be used to solve idealized quantum problems. We wanted to use deep neural networks to tackle more realistic problems in chemistry and condensed matter physics, and that meant including electrons in our calculations.

There is just one wrinkle when dealing with electrons. Electrons must obey the Pauli exclusion principle, which means that no two of them can be in the same state at the same time. This is because electrons are a type of particle known as fermions, which include the building blocks of most matter: protons, neutrons, quarks, neutrinos and so on. Their wave function must be antisymmetric: if you swap the positions of two electrons, the wave function is multiplied by -1. This means that if two electrons are on top of each other, the wave function (and hence the probability of that configuration) is zero.
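A minimal numerical sketch of that sign flip, with two toy single-electron functions f and g standing in for real orbitals:

```python
import numpy as np

def f(x):
    return np.exp(-np.sum(x ** 2))              # toy orbital (assumption)

def g(x):
    return np.sum(x) * np.exp(-np.sum(x ** 2))  # toy orbital (assumption)

def psi(x1, x2):
    # Antisymmetrized product of f and g: swapping the electrons
    # multiplies the result by -1.
    return f(x1) * g(x2) - f(x2) * g(x1)

x1 = np.array([0.1, -0.3, 0.5])
x2 = np.array([0.4, 0.2, -0.1])

assert np.isclose(psi(x1, x2), -psi(x2, x1))  # sign flip under exchange
assert np.isclose(psi(x1, x1), 0.0)           # zero when electrons coincide
```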

This meant that we had to develop a new type of neural network that was antisymmetric with respect to its inputs, which we called FermiNet. In most quantum chemistry methods, antisymmetry is introduced using a function called the determinant. The determinant of a matrix has the property that if you swap two rows, the output is multiplied by -1, just like a wave function for fermions.

So you can take a set of single-electron functions, evaluate them for every electron in your system, and pack all of the results into one matrix. The determinant of that matrix is then a properly antisymmetric wave function. The main limitation of this approach is that the resulting function – known as a Slater determinant – is not very general.
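A sketch of that construction in NumPy. The Gaussian single-electron functions and their centres are hypothetical; the row-swap sign flip comes for free from `np.linalg.det`:

```python
import numpy as np

centers = np.array([[0.0, 0.0, 0.0],   # hypothetical orbital centres
                    [1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])

def orbital(j, x):
    # Toy single-electron function: a Gaussian centred on centers[j].
    return np.exp(-np.sum((x - centers[j]) ** 2))

def slater(positions):
    # Entry [i, j] holds orbital j evaluated at electron i; the
    # determinant of this matrix is an antisymmetric wave function.
    n = len(positions)
    mat = np.array([[orbital(j, positions[i]) for j in range(n)]
                    for i in range(n)])
    return np.linalg.det(mat)

rng = np.random.default_rng(0)
pos = rng.normal(size=(3, 3))  # three electrons in three dimensions

# Swapping two electrons (two rows) multiplies the result by -1.
assert np.isclose(slater(pos), -slater(pos[[1, 0, 2]]))
```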

The wave functions of real systems are usually much more complex. The typical way to improve this is to take a large linear combination of Slater determinants – sometimes millions or more – and add some simple corrections based on pairs of electrons. Even then, this may not be enough to calculate the energies accurately.
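The improvement described above can be sketched the same way: a weighted sum of Slater determinants, each built from its own orbital set, is still antisymmetric. The orbital sets and coefficients here are hypothetical placeholders:

```python
import numpy as np

def gaussian(center):
    c = np.asarray(center)
    return lambda x: np.exp(-np.sum((x - c) ** 2))

# Two hypothetical orbital sets for a two-electron system.
orbital_sets = [
    [gaussian([0.0, 0.0, 0.0]), gaussian([1.0, 0.0, 0.0])],
    [gaussian([0.0, 1.0, 0.0]), gaussian([0.0, 0.0, 1.0])],
]
coeffs = [0.9, 0.1]  # hypothetical expansion coefficients

def multi_determinant(positions):
    # Weighted sum of Slater determinants; swapping two electrons flips
    # every determinant, so the sum stays antisymmetric.
    total = 0.0
    for orbitals, c in zip(orbital_sets, coeffs):
        mat = np.array([[orb(positions[i]) for orb in orbitals]
                        for i in range(len(positions))])
        total += c * np.linalg.det(mat)
    return total

pos = np.array([[0.2, -0.1, 0.3],
                [0.5, 0.4, -0.2]])
assert np.isclose(multi_determinant(pos), -multi_determinant(pos[[1, 0]]))
```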


2024-08-22 19:00:00
