[aspect-devel] Assembly speedup
Wolfgang Bangerth
bangerth at math.tamu.edu
Tue Oct 8 15:42:32 PDT 2013
> revision 1932 (move is_compressible() out of the inner loop of Stokes
> assembly):
> +---------------------------------------------+------------+------------+
> | Total wallclock time elapsed since start    |      27.7s |            |
> |                                             |            |            |
> | Section                         | no. calls |  wall time | % of total |
> +---------------------------------+-----------+------------+------------+
> | Assemble Stokes system          |        23 |      5.31s |        19% |
> | Assemble temperature system     |        23 |      6.97s |        25% |
> | Build Stokes preconditioner     |         4 |      2.79s |        10% |
> | Build temperature preconditioner|        23 |     0.719s |       2.6% |
> | Solve Stokes system             |        23 |       7.5s |        27% |
> | Solve temperature system        |        23 |      1.09s |       3.9% |
> | Initialization                  |         4 |     0.124s |      0.45% |
> | Postprocessing                  |        21 |     0.739s |       2.7% |
> | Refine mesh structure, part 1   |         3 |     0.399s |       1.4% |
> | Refine mesh structure, part 2   |         3 |     0.104s |      0.37% |
> | Setup dof systems               |         4 |      1.53s |       5.5% |
> +---------------------------------+-----------+------------+------------+
And this is after revision 1948 where I filter out all degrees of
freedom in the temperature assembly that I don't care about:
+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |      26.1s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| Assemble Stokes system          |        23 |      5.37s |        21% |
| Assemble temperature system     |        23 |      6.13s |        23% |
| Build Stokes preconditioner     |         4 |      2.81s |        11% |
| Build temperature preconditioner|        23 |     0.726s |       2.8% |
| Solve Stokes system             |        23 |      6.64s |        25% |
| Solve temperature system        |        23 |      1.14s |       4.3% |
| Initialization                  |         4 |     0.125s |      0.48% |
| Postprocessing                  |        21 |     0.742s |       2.8% |
| Refine mesh structure, part 1   |         3 |     0.399s |       1.5% |
| Refine mesh structure, part 2   |         3 |     0.104s |       0.4% |
| Setup dof systems               |         4 |      1.52s |       5.8% |
+---------------------------------+-----------+------------+------------+
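
To put numbers on the change between the two runs, here is a small Python
sketch that compares a few rows of the tables. The wall times are copied
from the tables above; the dictionary and function names are just
illustrative.

```python
# Wall times in seconds, copied from the two timing tables above.
r1932 = {
    "Total": 27.7,
    "Assemble Stokes system": 5.31,
    "Assemble temperature system": 6.97,
    "Solve Stokes system": 7.5,
}
r1948 = {
    "Total": 26.1,
    "Assemble Stokes system": 5.37,
    "Assemble temperature system": 6.13,
    "Solve Stokes system": 6.64,
}


def relative_change(before, after):
    """Signed fractional change going from `before` to `after`."""
    return (after - before) / before


# Print one comparison line per section.
for section, before in r1932.items():
    after = r1948[section]
    change = relative_change(before, after)
    print(f"{section:<28s} {before:5.2f}s -> {after:5.2f}s ({change:+.1%})")
```

With these numbers, temperature assembly is about 12% faster and the total
run about 6% faster, while Stokes assembly changes by about 1%.
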
The overall difference is probably almost in the noise here, but the change
should help significantly with the problem Thomas sees on many processors.
In any case, we're now at less than 1/3 of the original time for
temperature assembly :-)
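
As a rough illustration of the filtering idea (this is not the actual
ASPECT code from revision 1948), the Python sketch below restricts the
local assembly loops to the degrees of freedom that belong to the
temperature component. It assumes each local degree of freedom is tagged
with a component number; the names `temperature_dofs` and `assemble_local`
and the component numbering are hypothetical.

```python
def temperature_dofs(dof_components, temperature_component):
    """Return the local indices whose component is the temperature."""
    return [i for i, c in enumerate(dof_components)
            if c == temperature_component]


def assemble_local(dof_components, temperature_component, entry):
    """Fill a local matrix, visiting only temperature/temperature pairs.

    `entry(i, j)` stands in for the (expensive) quadrature computation
    of one matrix entry. Returns the matrix and the number of entries
    that were actually evaluated.
    """
    n = len(dof_components)
    dofs = temperature_dofs(dof_components, temperature_component)
    local_matrix = [[0.0] * n for _ in range(n)]
    evaluations = 0
    for i in dofs:
        for j in dofs:
            local_matrix[i][j] = entry(i, j)
            evaluations += 1
    return local_matrix, evaluations


# Toy cell with 8 local dofs (assumed numbering): components 0 and 1
# are velocity, 2 is pressure, 3 is temperature.
components = [0, 0, 1, 1, 2, 3, 3, 3]
TEMPERATURE = 3
matrix, evaluations = assemble_local(components, TEMPERATURE,
                                     lambda i, j: 1.0)
print(evaluations, "entries evaluated instead of", len(components) ** 2)
```

In this toy cell only the 3 x 3 temperature block is evaluated rather than
all 8 x 8 pairs; how large the saving is in practice depends on the element
and on how expensive each entry is.
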
@Thomas: Can you see whether that makes a difference?
@Timo: Want to re-run your 3d simulation with the same setup and compare
results on your end?
Best
Wolfgang
--
------------------------------------------------------------------------
Wolfgang Bangerth email: bangerth at math.tamu.edu
www: http://www.math.tamu.edu/~bangerth/