Density functional theory calculations using the finite element method

We propose a method to solve Kohn–Sham equations and to calculate electronic states, total energy, and material properties of non-crystalline, non-periodic structures with l-dependent fully non-local real-space ab initio pseudopotentials using finite elements. Contrary to the variety of well established k-space methods, which are based on Bloch’s theorem and applicable to periodic structures, we do not assume periodicity in any respect. Precise ab initio environment-reflecting pseudopotentials that have been applied in the k-space, plane wave approach so far, are connected with real space finite-element basis in this work. The main expected asset of the present approach is the combination of efficiency and high precision of ab initio pseudopotentials with applicability not restricted to periodic environment.


INTRODUCTION
In this paper we propose a method to solve Kohn-Sham equations and to calculate electronic states, total energy, and material properties of non-crystalline, non-periodic structures with l-dependent fully non-local real-space ab initio pseudopotentials using finite elements.
Contrary to the variety of well-established k-space methods that are based on Bloch's theorem and applicable to periodic structures, we do not assume periodicity in any respect.Precise ab initio environmentreflecting pseudopotentials, which have been applied in the k-space, plane wave approach so far, are connected with real space finite-element basis in our work.The main expected asset of the present approach is the combination of efficiency and high precision of ab initio pseudopotentials with applicability not restricted to periodic environment.
In Section 2 we discuss a Density Functional Theory (DFT) and how it is applied to electronic structure calculations.We try to derive all equations that are needed for our work and it should serve as a tutorial.In Section 3 we show some preliminary results of our code for one atomic calculations.In the next section we show incorporation of pseudopotentials together with equations and some other technical difficulties that have to be overcome in order to solve the problem.Our own contribution is to sort all the ideas out and present them in a fashion applicable to our problem.In the last section we show how to solve the Kohn-Sham equations using a finite element method -again pieces of the information could be found elsewhere in the literature, but our contribution is showing how the density functional theory problem with pseudopotentials could be solved using the finite element method.

Introduction
This section serves as an introduction to DFT and we try to derive all equations that are used in our program.We tried not to skip any important steps; however, we suggest that the literature be consulted if something is not clear enough.Excellent references are [2], [5], and [6].

Many-body Schrödinger equation
We use the Born-Oppenheimer approximation, which says that the nuclei of the treated atoms are seen as fixed.A stationary electronic state (for N electrons) is then described by a wave function Ψ(r 1 , r 2 , • • • , r N ) fulfilling the many-body Schrödinger equation is the kinetic term, Û = i< j U(r i , r j ) = 1 2 i, j U(r i , r j ), is the electron-electron interaction term, and is the interaction term between electrons and nuclei, where R k are positions of nuclei and Z k the number of nucleons in each nucleus (we are using atomic units).So for one atomic calculation with the atom nucleus in the origin, we have just v(r i ) = − Z |r i | .The formula |Ψ| 2 = Ψ * Ψ gives the probability density of measuring the first electron at the position r 1 , the second at r 2 , . . ., and the Nth electron at the position r N .The normalization is such that |Φ| 2 d 3 r 1 d 3 r 2 . . .
Integrating |Ψ| 2 over the first N − 1 electrons is the probability density that the Nth electron is at the position r N .Thus the probability density n(r) that any of the N electrons (i.e the first, or the second, or the third, . . ., or the Nth) is at the position r is called the particle (or charge or electron) density and is therefore given by Thus Ω n(r) d 3 r gives the number of particles (and also the amount of charge) in the region of integration Ω. Obviously n(r) The energy of the system is given by where It needs to be stressed that generally E is not a functional of n alone, only the V [n] is.In the next section we show, however, that if the |Ψ is a ground state (of any system), then E becomes a functional of n.

The Hohenberg-Kohn theorem
The SE gives the map where Ψ is the ground state and C is bijective (one-to-one correspondence), because to every V we can compute the corresponding Ψ from SE and two different V and V (differing by more than a constant) give two different Ψ, because if V and V gave the same Ψ, then by substracting Adding these two inequalities together gives which for n = n gives 0 < 0, which is nonsense, therefore n = n .So we have proved that for a given ground state density n 0 (r) (generated by a potential V0 ) it is possible to calculate the corresponding ground state wavefunction Ψ 0 (r 1 , r 2 , • • • , r N ), in other words, Ψ 0 is a unique functional of n 0 : The ground state energy E 0 is also a functional of n 0 We define an energy functional where |Ψ[n] is any ground state wavefunction (generated by an arbitrary potential), that is, n is a ground state density belonging to an arbitrary system.E 0 , which is generated by the potential V 0 , can then be expressed as and for n = n 0 we have (from the Ritz principle) and one has to minimize the functional E v 0 [n]: The term in (4) is universal in the sense that it does not depend on V0 .It can be proven [2] that F[n] is a functional of n for degenerated ground states too, so (5) stays true as well.
The ground state densities in (4) and ( 5) are called pure-state V-representable because they are the densities of the (possible degenerate) ground state of the Hamiltonian with some local potential v(r).One may ask the question if all possible functions are V-representable (this is called the V-representability problem).The question is relevant, because we need to know which functions should be taken into account in the minimization proccess (5).Even though not every function is V-representable [2], every density defined on a grid (finite of infinite) that is strictly positive, normalized, and consistent with the Pauli principle is ensemble V-representable.Ensemble V-representation is just a simple generalization of the above, for details see [2].In plain words, we are fine.
The functional E v 0 [n] in (5) depends on the particle number N, so in order to get n, we need to solve the variational formulation Let the n N (r) be the solution of (6) with a particle number N and the energy E N : The Lagrangian multiplier µ is the exact chemical potential of the system

The Kohn-Sham equations
Consider an auxiliary system of N noninteracting electrons (noninteracting gas): Then the many-body ground state wavefunction can be decomposed into single particle orbitals where From (6) we get The solution to this equation gives the density n s .Now we want to express the energy in (2) using T s and E H for convenience, where E H is the classical electrostatic interaction energy of the charge distribution n(r): or equivalently So from (4) we get The rest of the energy is denoted by E xc = U − E H + T − T s and it is called the exchange and correlation energy functional.From (6) so we arrive at The solution to this equation gives the density n.Comparing (11) to (7) we see that if we choose then n s (r) ≡ n(r).So we solve the Kohn-Sham equations of this auxiliary non-interacting system which yield the orbitals ψ i that reproduce the density n(r) of the original interacting system The sum is taken over the lowest N energies.Some of the ψ i can be degenerated, but it does not matter -the index i counts every eigenfunction including all the degenerated ones.In plain words, the trick is in realizing that the ground state energy can be found by minimizing the energy functional ( 4) and in rewriting this functional into the form (9), which shows that the interacting system can be treated as a noninteracting one with a special potential.

The exchange and correlation term
The exchange and correlation functional can always be written in the form where the ε xc (r ; n) is called the xc energy density.Unfortunately, no one knows ε xc (r ; n) exactly (yet).The most simple approximation is the local density approximation (LDA), for which the xc energy density ε xc at r is taken as that of a homogeneous electron gas (the nuclei are replaced by a uniform positively charged background, density n = const) with the same local density: The xc energy density ε LD xc of the homogeneous gas can be computed exactly [5]: where the ε LD x is the electron gas exchange term given by [5] ε The rest of ε LD xc is hidden in ε LD c (n) for which there does not exist any analytic formula, but the correlation energies are known exactly from quantum Monte Carlo (QMC) calculations by Ceperley and Alder, see for example the review article [6] for more information.The energies were fitted by Vosko, Wilkes, and Nussair (VWN) with ε LD c (n) and they got accurate results with errors less than 0.05 mRy in ε LD c , which means that ε LD c (n) is virtually known exactly.The VWN result: 72744, c = 12.93532,A = 0.0621814, and r s is the electron gas parameter, which gives the mean distance between electrons (in atomic units): .
The xc potential is then computed from (14): The LDA, although very simple, is surprisingly successful.More sophisticated approximations exist, for example the generalized gradient approximation (GGA), which sometimes gives better results than the LDA, but is not perfect either.Other options include orbital-dependent (implicit) density functionals or linear response type functionals, but this topic is still evolving.The conclusion is that the LDA is a good approximation to start with, and only when we are not satisfied, we will have to try some more accurate and modern approximation.
RLDA: Relativistic corrections to the energy-density functional were proposed by MacDonald and Vosko and basically are just a change in ε LD c .
We also need to calculate these derivatives:

Iteration to self-consistency
The V H and V xc potentials in the Kohn-Sham equations (12) depend on the solution n, thus the KS equations need to be iterated to obtain a self-consistent density.One can regard the KS procedure as a nonlinear operator F, which satisfies (at the Mth iteration) and the problem is to find the self-consistent density, which satisfies Due to the long-range nature of the Coulomb interaction, a small change in the input density n M can lead to a relatively large change in the output density Fn M , thus it is not possible to use the output density itself as the input density for the next iteration, because large unstable charge oscillations arise.Rather it is essential to mix input and output densities in an appropriate manner to obtain a new input density.
The n(r) is in practice defined on some grid, or using coefficients of plane waves, local orbitals, or the like, which means that the precise relation in the case of a grid (or some other basis like plane waves can be used instead of |r i ) and n(r So the problem is in solving the equation where x denotes a vector in many dimensions (the number of points in the grid).It can also be expressed in the form of the residual R Almost all of the methods start with approximating where the Jacobian We want R(x M+1 ) = 0, so substituting that into (15) we get If we knew the Jacobian exactly, this would be the multidimensional Newton-Raphson method, but we can only make approximations to J using a sequence of J 0 , J 1 , J 2 , . . .: and the rate of convergence is determined by the quality of the Jacobian.Methods of this type are called quasi-Newton-Raphson methods.
The simplest approach is to use the linear mixing scheme for which where 0 < α ≤ 1 is the mixing parameter, the working value is somewhere around α = 0.1 to α = 0.3.Unfortunately, this procedure is slow, besides we do not explore all the possible densities with this mixing, which means that we do not get the correct density with any accuracy, because we get stuck at a 'stiff' situation for which continued iteration does not improve the distance |R(x M )| between input and output densities.On the other hand, it is very easy to implement and it works in most cases, although slowly.Surprisingly, a very good method is this: and at every iteration adjust the parameters β i according to this very simple algorithm In our tests it behaved almost as well as the second Broyden method.
A more sophisticated approach is the Broyden update, which updates the J successively at every iteration.The first Broyden method uses this formula: which has the disadvantage that we need to compute the inverse Jacobian in (16) at every iteration, which is impossible in our case.The second Broyden method updates the inverse Jacobian directly using this formula starting with the linear mixing: It is impossible to store the whole dense matrix of the inverse Jacobian, but fortunately it is not necessary, realizing that formula (17) has a very simple structure [7]: so the whole inverse Jacobian can be written as and we only need to know how to apply such a Jacobian to an arbitrary vector, which is needed in ( 18) and (16): Thus instead of the whole dense matrix, we only need to save the vectors u and v from every iteration.
Vanderbilt and Louie [8] suggested a modified Broyden method, which incorporates weights, but Eyert [3] showed that if all the weights are used to tune the iteration process to its fastest convergence, they, in fact, cancel out.The result of the scheme is called by Eyert the generalized Broyden method, whose scheme shown by Eyert is exactly the same as for the Anderson mixing: which according to Eyert should converge even faster than the second Broyden method, but it does not in our own implementation.Addition of ω 0 is just for numerical stability; a good value is ω = 0.01, but it can also be switched off by ω 0 = 0.The p is the number of last iterations to use, a good value according to Eyert is p = 5, β M should not influence the convergence much for p = 5.
The problem with n is that there are two conditions that need to be satisfied n > 0 and the normalization n(r)d 3 r = Z.
The Newton method converges to the correct norm, but slowly.The condition n > 0 however causes great instability.One option could be to use a logistic function like r)  for sufficiently large C and solve for x, which can be both positive and negative.But a more elegant solution is to mix V h +V xc instead of the densities.

EXAMPLE OF Pb ATOM CALCULATION
To illustrate the explained theory, we will show how to calculate the Pb atom.We have N = 82 and and we need to sum over the lowest 82 eigenvalues in (13).The way to do that is to calculate the eigenfunction for the lowest n and l by iterating n = 0, 1, 2, . . .and taking l = 1, 2, . . ., n − 1 for each n, while summing the number of states s = 2(2l + 1) for each pair n, l, until we exceed Z and then taking another n just to be sure not to miss any state belonging to the lowest Z states.It turns out that the combination n, l does not change for an atom and it is equal to the well-known configuration of electrons that can be found in any chemistry textbook, as an example, for Pb we get (first number is n, the letter is l and the number in superscript gives the number of times the particular eigenvalue needs to be taken into account in the sum): 1S 2 , 2S 2 , 2P 6 , 3S 2 , 3P 6 , 3D 10 , 4S 2 , 4P 6 , 4D 10 , 4F 14 , 5S 2 , 5P 6 , 5D 10 , 6S The corresponding radial charge density is plotted in Fig. 1.Our code works for any atom (see e.g. a result for boron in Fig. 2).

Hermitian operators in spherical symmetry
We show that every Hermitian operator V in the spherical symmetric problem ( where the operator Vl = lm| V |lm has matrix elements Proof.Matrix elements of a general Hermitian operator V are In spherical symmetry, we have where R is the rotation operator (it is unitary).We have thus derived V (Rr, Rr ) = V (r, r ) true for any R, which means that the kernel only depends on ρ, ρ , and r • r , where r = ρ r and r = ρ r .So we obtain where In Dirac notation: From the above derivation we see that we must have where the operator Vl = lm| V |lm only acts on the radial part of the wavefunction and according to (20) it does not depend on m.Also according to (20) its matrix elements are

Nonlocal pseudopotentials
A nonlocal pseudopotential V is just a general Hermitian operator.We only want to construct pseudopotentials in the spherical problem, so every pseudopotential can be written in the form (19).In practice we only use either \it local\/ (the operator V is local) or \it semilocal\/ (the operator V is radially local, but angularly nonlocal) pseudopotential.
Local potential (radially and angularly local) is defined by: So we can simply write Thus it turned out that the kernel is local and does not depend on l and we get So we recover (21).But we are just fooling around, there is nothing new in these formulas.For a semilocal potential (radially local, but angularly nonlocal), the kernel cannot depend on m and is radially local, so: so the kernel is local and does depend on l and we simply write We can also calculate the same result explicitly in the r representation: and we recover (22).So, to sum it up: a semilocal pseudopotential is a general Hermitian operator in the spherically symmetric problem (i.e.V = R −1 V R) and radially local.All such operators can be written in the form ( 22).Now, it can be shown that if we make the assumption of radial locality, we will get 'correct' wavefunctions and energies in the linear approximation.We generally only take a few terms in the expansion (22), usually only V 0 , V 1 , and V 2 , sometimes also V 3 and V 4 .

Separable potentials
The pseudopotential above has the form which was first introduced by Hamman, Schlüter and Chiang [4].Or, equivalently, in the r representation: The first term does not cause a problem.Let us denote the second term (which is semilocal) simply by v: Let us choose a complete but otherwise arbitrary set of functions |φ i (they contain both a radial and an angular dependence) and define a matrix U by the equation So any Hermitian operator (including v) can be transformed exactly into the following form We diagonalize the matrix U by choosing such functions | φi for which the matrix φ j |v| φk (and hence the corresponding matrix U) is equal to \one.We can find such functions for example using the Gram-Schmidt orthogonalization procedure on |φ i with a norm f |v|g (for functions f and g); more on that later.Then We could take any |φ i and orthogonalize them.However, as we have v in the form of ( 22), we will be using |φ i in the form |φ i = |R nl |lm , because it turns out we will only need to orthogonalize the radial parts.The first term in (23) then corresponds to the KB potential.We of course take more terms and get accurate results without ghost states.
Let us look at the orthogonalization.We start with the wavefunctions: where R nl (ρ) = ρ|R nl and i goes over all possible triplets (nlm), for example in this order (but any order is fine): We can also relate the i and n, l, m using this formula The operator v acts on these |φ i like this Now we need to construct a new orthogonal set of functions | φi satisfying φi |v| φ j = δ i j .
This can be done using several methods.We chose the Gram-Schmidt orthogonalization procedure, which works according to the following scheme: We can verify by a direct calculation that this procedure ensures φi |v| φ j = δ i j .
It may be useful to compute the normalization factors explicitly: We can also write down a first few orthogonal vectors explicitly: which means that φ i |v|φ j = 0 if |φ i and |φ j have different l or m.In other words |φ i and |φ j for different |lm are already orthogonal.Thus the Gram-Schmidt orthogonalization procedure only makes the R nl orthogonal for the same |lm .To get explicit expressions for | φi , we simply use the formulas above and get: where we have constructed new | Rnl from original |R nl : For every V l , we construct and finally we arrive at the separable form of the l dependent pseudopotential Note: the V l is actually V l −V loc , but this is just a detail.
To have some explicit formula, let us write how the separable potential acts on a wavefunction: What we are actually doing is making the local potential V l nonlocal using: where or in r representation: which is useful when computing integrals of this type because the integral on the left hand side actually represents N 2 integrals, where N is the number of basis vectors |φ i .The sum on the right hand side however only represents K • N integrals, where K is the number of terms taken into account in (26).Of course, taking only a finite number of terms in (26) is just an approximation to Vl .In our case, we do not need these 1D integrals (which can be easily computed directly, because V l is local and the basis functions φ i are nonzero only around a node in the mesh, which means that the matrix V i j is sparse), but 3D integrals, where the angular parts of V are nonlocal and the radial part is local (so the matrix V i j is dense).Therefore, the above procedure is the only way how to proceed, because we decompose the matrix V i j into the sum of matrices in the form p i p * j , which can easily be handled and solved.
The scheme for the separation described above works for any functions R nl (ρ).Because of the form of the expansion (26) however, we will use R nl from one atomic calculation.We need to approximate V l by as few terms as possible.So imagine how the V l (ρ) acts on the lowest radial function in the l subspace, which is |R l+1;l , and we see that all the terms in (26) except the first one V l | Rl+1;l Rl+1;l |V l give zero, because they are orthogonal to |R l+1;l .For the function |R l+2;l all the terms except the first two are zero, because Rnl |V 0 |R l+2;l = 0 only for n = l + 1 or n = l + 2 (because the vectors |R l+1;l and |R l+2;l span the same subspace as | Rl+1;l and | Rl+2;l and using (24)).For functions that are a little different from all |R nl (n > l), we will generally not get precise results taking any (finite) number of terms in (26), but the higher terms should give smaller and smaller corrections.
To sum it up: we take all the V l in (25) as we did in ( 22).Theoretically we should take Rnl for all n = l + 1, l + 2, l + 3, . . ., but practically it suffices to only take several Rnl for a given l from one atomic calculation.Let us give an example: we are calculating 14 electrons, so we will only take into account the lowest 14 eigenvalues in the Kohn-Sham equations, which are |φ 1 up to |φ 14 .The lowest radial functions in each l subspace are |φ i for i = 1, 3, 4, 5, 10, 11, 12, 13, 14.On these 9 functions we get a precise result with only one term in the expansion (26).For the other 5 functions (i = 2, 6, 7, 8, 9) we will have to take into account more terms.Let us look in more detail at the case l = 0 (i.e.i = 1, 2, 6).Then and for the case i = 1 we see that one term in (26) is enough: For the case i = 2 we get the correct result with 2 terms in (26) For the case i = 6 we need to take into account 3 terms etc.We can see from this example that taking |R nl from one atomic calculation, we get precise results (with the same atom) only taking into account a finite number of terms in (26), for 14 electrons actually only 3 terms.For several atoms calculation, we will not get precise results, but it should be a sufficiently good approximation.
The described method is general, the only drawback is that if we do not take functions |R nl that are similar to the solution, we need to take a large number of terms in (26), resulting in many matrices of the form p i p * j , which we do not want, even though, theoretically we can get a solution with any precision we want taking more and more terms in (26).

FINITE ELEMENT METHOD
This chapter explains FEM and gives concrete formulas which are needed in the calculation.

Weak formulation of the Schrödinger equation
One particle Schrödinger equation is We multiply both sides by a test function v and integrate over the whole volume we are interested in and using the vector identity which is the weak formulation.The problem reads: find a function ψ such that (28) holds for every v.
The boundary condition of all functions is that they disappear at the boundary (infinity) and have zero normal derivative there.That follows from the physical requirement of normalizability of the wavefunction.

Finite elements
We choose a basis φ i and substitute φ i for v and expand ψ = q j φ j h2 2m ∇φ j • ∇φ i dV + φ i V φ j dV q j = Eφ j φ i dV q j + h2 2m which can be written in a matrix form where Usually we set F i = 0. We decompose the domain into elements and compute the integrals as the sum over elements.For example: where K E i j is the integral over one element only The integral is computed numerically using a Gauss integration: x q are Gauss points (there are N q of them), w q is the weight of each point, and the Jacobian | det J( xq )| is there because we are actually computing the integral on the reference element instead in the real space.
The surface integrals are computed similarly, but in our case they are all zero, as the normal derivative of the wavefunction is zero at the boundary.
The actual assembly is a little more complex (you need to loop over all elements, calculate the integrals above, and put the numbers to the correct place in the global matrix), but the above sketch should make the main idea clear.
As to the concrete form (shape) of the basis (shapefunctions), it is well known that quadratic elements are generally more precise than linear, and that cubes are generally more precise compared to tetrahedra.However, in practice that needs to be tried on a case by case basis, also depending on the mesh.Our code can work with both linear and quadratic elements and with tetrahedra, cubes, prisms, and pyramids.

Pseudopotentials formulation
There are no problems with other matrix elements in (29) except V i j = φ i (r)V φ j (r)d P l (r • r )V l (ρ) Rnl (ρ)φ i (r)φ j (r )V l (ρ ) Rnl (ρ )d 3 rd 3 r , which is a real number, thus nlm p i p * j is also a real number, which means that we can calculate with only the real parts of the matrix p i p * j , because the imaginary parts cancel out in the result: Just do not confuse the basis function φ i (r) with the spherical integration variable φ .

CONCLUSION
We proposed a method to solve Kohn-Sham equations and to calculate electronic states and other properties of non-crystalline, non-periodic structures with l-dependent fully non-local real-space ab initio pseudopotentials using finite elements, together with some preliminary results of our program.We believe this new approach could yield some new interesting results in electronic structure calculations.
which is in contradiction to the assumption that V and V differ by more than a constant.
Similarly, from the ground state wavefunction Ψ we can compute the charge density n giving rise to the map D : Ψ → n, which is also bijective, because to every Ψ we can compute n from (1) and two different Ψ and Ψ give two different n and n , because different Ψ and Ψ give