[aspect-devel] Assembly speedup
Wolfgang Bangerth
bangerth at math.tamu.edu
Fri Oct 11 05:37:34 PDT 2013
On 10/11/2013 03:40 AM, Thomas Geenen wrote:
> With respect to local_assemble_advection_system, I see a speedup of almost 20x
> using linear elements for temperature.
> However, copy_local_to_global on 512 cores still takes too much time.
> With the new patches it runs 10% faster, but a lot of time is still spent
> inserting matrix values for off-process entries.
Thanks for doing these timings. I believe that individual calls to
local_assemble_t and copy_l_to_g are too fast for per-call measurements to
give you reliable information. How did the timing summaries that are printed
every tenth time step change when using 512 cores? Those accumulate time over
all individual calls, of course, so they are not as fine-grained, but I
believe they give a good picture anyway.
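To illustrate the point about accumulating over many calls, here is a minimal sketch of a timer that sums the wall time of many short calls instead of reporting each one separately. The class name AccumulatingTimer and its member functions are hypothetical and are not part of ASPECT or deal.II. The sketch uses std::chrono so that it compiles on its own; in an MPI run one could presumably read the clock with MPI_Wtime instead.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of ASPECT or deal.II): accumulates the
// wall time of many short calls so that the reported total is large
// compared to the clock resolution and the per-call timer overhead.
class AccumulatingTimer
{
public:
  using clock = std::chrono::steady_clock;

  // Begin timing one call; the timer must not already be running.
  void start()
  {
    assert(!running);
    running = true;
    t0 = clock::now();
  }

  // End timing one call and add its duration to the running total.
  void stop()
  {
    const clock::time_point t1 = clock::now();
    assert(running);
    running = false;
    total += std::chrono::duration<double>(t1 - t0).count();
    ++n_calls;
  }

  // Wall time summed over all start()/stop() pairs, in seconds.
  double total_seconds() const { return total; }

  // Number of completed start()/stop() pairs.
  std::size_t calls() const { return n_calls; }

  // Average time per call in seconds; zero if nothing was timed yet.
  double mean_seconds() const
  {
    return n_calls == 0 ? 0.0 : total / static_cast<double>(n_calls);
  }

private:
  clock::time_point t0;
  double            total   = 0.0;
  std::size_t       n_calls = 0;
  bool              running = false;
};
```

One would wrap each call to the function of interest in start() and stop() and only look at total_seconds() and mean_seconds() after many calls, for example once every ten time steps.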
> I will do some timing using MPI_Wtime to make sure we are not looking at
> profiling overhead.
> If that gives the same results, I will post this on the Trilinos forum.
I think that would be useful. Feel free to CC this mailing list as well.
Best
W.
--
------------------------------------------------------------------------
Wolfgang Bangerth email: bangerth at math.tamu.edu
www: http://www.math.tamu.edu/~bangerth/