Solving discrete finite element systems rely heavily on the computation of matrix-vector products with sparse matrices. These matrix-vector products tend to dominate the total amount of work required for the average finite element program, and, as these matrices can be quite large and do not fit into the cache of a modern machine, accessing the data from RAM has become a major bottleneck in finite element computations. Instead of computing the entries of and storing a system matrix, matrix-free methods define an operation on a vector as a loop over the cells in the domain, applying small, dense matrices defined locally on the cell, and summing the results over the entire domain. In this way, one replicates the action of a matrix while never actually storing a matrix. These methods show significant gains in the time of a matrix-vector product as compared to traditional matrix-based methods when using degree 2 and higher finite elements .
In many applications inside ASPECT, much of the computational time is spent in solving the Stokes equations. These equations are discretized with a degree k finite element for velocity and degree k - 1 for pressure, where k ≥ 2. Traditionally, the discretized system is solved using GMRES with a suitable preconditioner based on an Algebraic Multigrid v-cycle. Now a matrix-free method for solving the Stokes system has been implemented based on a Geometric Multigrid method (developed in ) that sees improvements in GMRES iterations, especially for adaptive computations due to the more robust, geometric definition of the multigrid hierarchy of subproblems, as well as improvements in overall runtime from the combination of lower iterations and faster matrix-vector products. This new method also allows for larger computations as storing the matrix has dominated the memory consumption of the matrix-based method.
Figure 2 demonstrates the strong scaling of the GMRES solve with GMG preconditioning on the Nsinker benchmark in ASPECT whose solution is given in Figure 1. The computations are performed on Stokes systems from globally refined meshes with up to 3.4 billion unknowns and run on up to 24,576 processors. We see scaling to around 20K unknowns per processor. Figure 3 gives a comparison of the new GMG method to the current AMG method. There we demonstrate the weak scaling of the GMRES solve on the same Nsinker benchmark. Each data point represents a solve on a mesh which comes from adaptive refinement of the previous mesh with a Kelly estimator, roughly doubling the problem size (18M unknowns to 2.2B unknowns). The GMG-based solve is anywhere from 2.5-4x faster than the AMG-based solve, and exhibits better scaling.
The current implementation has many features (free-slip, no-slip and prescribed boundaries, incompressible and (explicitly defined) compressible higher-order finite elements), however one drawback is that coefficient averaging must be used for viscosity. Over the next few months we will look at tackling this problem by including a projection of viscosity to a degree 2 space and interpolating to quadrature points, hopefully recovering the regularity in the solution required for optimal convergence rates. Additionally, implicit compressibility, free-surface, and melt transport have not been implemented, and remain an open question.
Contributed by Thomas C. Clevenger
 M. Kronbichler and K. Kormann. A generic interface for parallel cell-based finite element operator application. Computers & Fluids, 63:135-147, 2012.
 T.C. Clevenger, T. Heister, G. Kanschat, and M. Kronbichler. A flexible, parallel, adaptive geometric multigrid method for fem. submitted.
Figure 1. Solution of the Nsinker benchmark in ASPECT.
Figure 2. GMRES solve time to reduce the residual by 1e6 for the 3D Stokes equations of the Nsinker benchmark in ASPECT (4 sinkers, viscosity ratio of 1e4) for the new matrix-free solver with a GMG preconditioner. Discretized with Taylor-Hood elements [Q2]d X Q1. The plot represents strong scaling for 4 different meshes created by global refinement of the unit cube.
Figure 3. GMRES solve time to reduce the residual by 1e6 for the 3D Stokes equations of the Nsinker benchmark in ASPECT (4 sinkers, viscosity ratio of 1e4) for a matrix-based solve with an AMG preconditioner (current solver in most ASPECT applications) and the new matrix-free solve with a GMG preconditioner. Discretized with Taylor-Hood elements [Q2]d X Q1. The plot represents weak scaling where the first data point is a mesh with 18M unknowns solved with 48 processors, then each subsequent data point comes from an adaptive refinement step based on a Kelly estimator in which the problem size is roughly doubled (the last data point is 2.2B unknowns on 6,144 processors). The red dashed line represents the ideal scaling of the Geometric Multigrid method based on the imbalance of the parallel partition of cells due to adaptivity (see ).