[aspect-devel] Assembly speedup
Wolfgang Bangerth
bangerth at math.tamu.edu
Tue Oct 8 14:57:21 PDT 2013
> from my profiles the walltime in local_assemble_stokes_system for instance
> is spend in this nested loop
> if (rebuild_stokes_matrix)
> for (unsigned int i=0; i<dofs_per_cell; ++i)
> for (unsigned int j=0; j<dofs_per_cell; ++j)
> data.local_matrix(i,j) += ( eta * 2.0 *
> (scratch.grads_phi_u[i] * scratch.grads_phi_u[j])
> -
> (material_model->is_compressible()
> ?
> eta * 2.0/3.0 *
> (scratch.div_phi_u[i] * scratch.div_phi_u[j])
> :
> 0)
> - (pressure_scaling *
> scratch.div_phi_u[i] *
> scratch.phi_p[j])
> - (pressure_scaling *
> scratch.phi_p[i] *
> scratch.div_phi_u[j]))
> *
> scratch.finite_element_values.JxW(q);
>
> moving the
> material_model->is_compressible()
> ?
>
> outside the loop made this part of the code run 35% faster.
> other then that i do not observe any conditionals or are calls
> like scratch.phi_p[i] more complex then just returning a value?
Just for everyone's reference -- using the last few patches makes
assembly on Timo's recently added 3dbox.prm significantly faster
(timings for the first 20 time steps on 16 processors):
revision 1930 (baseline):
+---------------------------------------------+------------+------------
| Total wallclock time elapsed since start | 43.4s |
| | |
| Section | no. calls | wall time | % of total
+---------------------------------+-----------+------------+------------
| Assemble Stokes system | 23 | 7.83s | 18%
| Assemble temperature system | 23 | 19.4s | 45%
| Build Stokes preconditioner | 4 | 2.81s | 6.5%
| Build temperature preconditioner| 23 | 0.735s | 1.7%
| Solve Stokes system | 23 | 8.19s | 19%
| Solve temperature system | 23 | 1.13s | 2.6%
| Initialization | 4 | 0.128s | 0.29%
| Postprocessing | 21 | 0.741s | 1.7%
| Refine mesh structure, part 1 | 3 | 0.401s | 0.92%
| Refine mesh structure, part 2 | 3 | 0.103s | 0.24%
| Setup dof systems | 4 | 1.52s | 3.5%
+---------------------------------+-----------+------------+------------
revision 1931 (only loop over temperature/compositional fields in
advection assembly):
+---------------------------------------------+------------+------------
| Total wallclock time elapsed since start | 31.6s |
| | |
| Section | no. calls | wall time | % of total
+---------------------------------+-----------+------------+------------
| Assemble Stokes system | 23 | 7.83s | 25%
| Assemble temperature system | 23 | 6.87s | 22%
| Build Stokes preconditioner | 4 | 2.83s | 9%
| Build temperature preconditioner| 23 | 0.736s | 2.3%
| Solve Stokes system | 23 | 8.73s | 28%
| Solve temperature system | 23 | 1.21s | 3.8%
| Initialization | 4 | 0.125s | 0.4%
| Postprocessing | 21 | 0.741s | 2.3%
| Refine mesh structure, part 1 | 3 | 0.398s | 1.3%
| Refine mesh structure, part 2 | 3 | 0.104s | 0.33%
| Setup dof systems | 4 | 1.54s | 4.9%
+---------------------------------+-----------+------------+------------
revision 1932 (move is_compressible() out of the inner loop of Stokes
assembly):
+---------------------------------------------+------------+------------
| Total wallclock time elapsed since start | 27.7s |
| | |
| Section | no. calls | wall time | % of total
+---------------------------------+-----------+------------+------------
| Assemble Stokes system | 23 | 5.31s | 19%
| Assemble temperature system | 23 | 6.97s | 25%
| Build Stokes preconditioner | 4 | 2.79s | 10%
| Build temperature preconditioner| 23 | 0.719s | 2.6%
| Solve Stokes system | 23 | 7.5s | 27%
| Solve temperature system | 23 | 1.09s | 3.9%
| Initialization | 4 | 0.124s | 0.45%
| Postprocessing | 21 | 0.739s | 2.7%
| Refine mesh structure, part 1 | 3 | 0.399s | 1.4%
| Refine mesh structure, part 2 | 3 | 0.104s | 0.37%
| Setup dof systems | 4 | 1.53s | 5.5%
+---------------------------------+-----------+------------+------------
So temperature assembly now only takes 35% of the time it used to take,
Stokes assembly 68%. Good job, Thomas & Timo!
I'm working on filtering the local systems, let's see if that helps some
more.
Best
Wolfgang
--
------------------------------------------------------------------------
Wolfgang Bangerth email: bangerth at math.tamu.edu
www: http://www.math.tamu.edu/~bangerth/
More information about the Aspect-devel
mailing list