[aspect-devel] Assembly speedup

Thomas Geenen geenen at gmail.com
Fri Oct 11 01:40:12 PDT 2013


With respect to local_assemble_advection_system, I see a speedup of almost
20x using linear elements for temperature.
However, copy_local_to_global on 512 cores still takes too much time.
With the new patches it runs 10% faster, but a lot of time is still spent
inserting matrix values for off-process entries.
I will do some timing using MPI_Wtime to make sure we are not looking at
profiling overhead.
If that gives the same results, I will post this on the Trilinos forum.
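
A rough sketch of the kind of manual timing meant here, in case it helps
others reproduce it. It brackets a code section with wall-clock reads and
accumulates the total per section name, so the numbers do not depend on a
profiler. Everything in it is an assumption for illustration: SectionTimer
and fake_copy_local_to_global are hypothetical names, and time.perf_counter
stands in for MPI_Wtime so the example runs without MPI. In a real parallel
run one would usually also synchronize the ranks before starting the clock
and compare the maximum over all ranks.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class SectionTimer:
    """Accumulate wall-clock time per named section (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        # 'clock' is any zero-argument function returning seconds.
        # Assumption: in an MPI code this would be the MPI wall clock.
        self.clock = clock
        self.total = defaultdict(float)
        self.calls = defaultdict(int)

    @contextmanager
    def section(self, name):
        start = self.clock()
        try:
            yield
        finally:
            # Record the elapsed time even if the timed code raises.
            self.total[name] += self.clock() - start
            self.calls[name] += 1


def fake_copy_local_to_global(n):
    # Hypothetical stand-in for the real work being timed.
    return sum(i * i for i in range(n))


timer = SectionTimer()
for _ in range(3):
    with timer.section("copy_local_to_global"):
        fake_copy_local_to_global(10_000)

for name, seconds in timer.total.items():
    print(f"{name}: {timer.calls[name]} calls, {seconds:.6f}s")
```

If the totals measured this way match what the profiler reports, the time
is real and not instrumentation overhead.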

cheers
Thomas



On Wed, Oct 9, 2013 at 12:42 AM, Wolfgang Bangerth
<bangerth at math.tamu.edu> wrote:

>
>> revision 1932 (move is_compressible() out of the inner loop of Stokes
>> assembly):
>> +---------------------------------------------+------------+------------+
>> | Total wallclock time elapsed since start    |      27.7s |            |
>> |                                             |            |            |
>> | Section                         | no. calls |  wall time | % of total |
>> +---------------------------------+-----------+------------+------------+
>> | Assemble Stokes system          |        23 |      5.31s |        19% |
>> | Assemble temperature system     |        23 |      6.97s |        25% |
>> | Build Stokes preconditioner     |         4 |      2.79s |        10% |
>> | Build temperature preconditioner|        23 |     0.719s |       2.6% |
>> | Solve Stokes system             |        23 |       7.5s |        27% |
>> | Solve temperature system        |        23 |      1.09s |       3.9% |
>> | Initialization                  |         4 |     0.124s |      0.45% |
>> | Postprocessing                  |        21 |     0.739s |       2.7% |
>> | Refine mesh structure, part 1   |         3 |     0.399s |       1.4% |
>> | Refine mesh structure, part 2   |         3 |     0.104s |      0.37% |
>> | Setup dof systems               |         4 |      1.53s |       5.5% |
>> +---------------------------------+-----------+------------+------------+
>>
>
> And this is after revision 1948 where I filter out all degrees of freedom
> in the temperature assembly that I don't care about:
>
> +---------------------------------------------+------------+------------+
> | Total wallclock time elapsed since start    |      26.1s |            |
> |                                             |            |            |
> | Section                         | no. calls |  wall time | % of total |
> +---------------------------------+-----------+------------+------------+
> | Assemble Stokes system          |        23 |      5.37s |        21% |
> | Assemble temperature system     |        23 |      6.13s |        23% |
> | Build Stokes preconditioner     |         4 |      2.81s |        11% |
> | Build temperature preconditioner|        23 |     0.726s |       2.8% |
> | Solve Stokes system             |        23 |      6.64s |        25% |
> | Solve temperature system        |        23 |      1.14s |       4.3% |
> | Initialization                  |         4 |     0.125s |      0.48% |
> | Postprocessing                  |        21 |     0.742s |       2.8% |
> | Refine mesh structure, part 1   |         3 |     0.399s |       1.5% |
> | Refine mesh structure, part 2   |         3 |     0.104s |       0.4% |
> | Setup dof systems               |         4 |      1.52s |       5.8% |
> +---------------------------------+-----------+------------+------------+
>
> This is probably almost in the noise, but should help significantly with
> the problem Thomas sees on many processors. In any case, we're now at less
> than 1/3 of the time for temperature assembly :-)
>
>
> @Thomas: Can you see whether that makes a difference?
>
> @Timo: Want to re-run your 3d simulation with the same setup and compare
> results on your end?
>
>
> Best
>  Wolfgang
>
>
> --
> ------------------------------------------------------------------------
> Wolfgang Bangerth               email:            bangerth at math.tamu.edu
>                                 www: http://www.math.tamu.edu/~bangerth/
>
>