[aspect-devel] Assembly speedup

Fri Oct 11 05:37:34 PDT 2013

On 10/11/2013 03:40 AM, Thomas Geenen wrote:
> With respect to local_assemble_advection_system, I see a speedup of almost
> 20x using linear elements for temperature.
> However, copy_local_to_global on 512 cores still takes too much time.
> With the new patches it runs 10% faster, but a lot of time is still spent
> inserting matrix values for off-process entries.

Thanks for doing these timings. I believe that when you measure
local_assemble_t and copy_l_to_g individually, each call is too short to give
you any reliable information. How did the timing report that is printed every
tenth time step change when using 512 cores? It accumulates the time over all
individual calls, of course, so it is not as fine-grained, but I believe it
gives a good picture anyway.
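
To illustrate the point about accumulating over many calls rather than
reading off a single short one, here is a minimal sketch. It is not the timer
code ASPECT or deal.II actually uses: the class AccumulatingTimer, the
functions dummy_work and time_many_short_calls, and the workload size are all
made up for illustration. It reads std::chrono::steady_clock; in an MPI run
one would presumably read MPI_Wtime() at the same two places instead.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: sums the wall time of many short timed sections,
// so that the total is large compared to the clock resolution.
class AccumulatingTimer
{
public:
  void start()
  {
    t0      = clock_type::now();
    running = true;
  }

  void stop()
  {
    assert(running); // stop() must follow a matching start()
    total += std::chrono::duration<double>(clock_type::now() - t0).count();
    ++n_calls;
    running = false;
  }

  double      total_seconds() const { return total; }
  std::size_t calls() const { return n_calls; }

private:
  using clock_type = std::chrono::steady_clock;

  clock_type::time_point t0;
  double                 total   = 0.0;
  std::size_t            n_calls = 0;
  bool                   running = false;
};

// Stand-in for one short operation (e.g. a single local assembly call).
double dummy_work(const std::size_t n)
{
  volatile double s = 0.0; // volatile keeps the loop from being optimized out
  for (std::size_t i = 0; i < n; ++i)
    s = s + 1.0 / (1.0 + static_cast<double>(i));
  return s;
}

// Time n_calls short operations and return the accumulated result.
AccumulatingTimer time_many_short_calls(const std::size_t n_calls)
{
  AccumulatingTimer timer;
  for (std::size_t c = 0; c < n_calls; ++c)
    {
      timer.start();
      dummy_work(100);
      timer.stop();
    }
  return timer;
}
```

Dividing total_seconds() by calls() then gives an average cost per call that
is far less sensitive to clock resolution and timer overhead than any single
measurement.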

> I will do some timing using MPI_Wtime to make sure we are not looking at
> profiling overhead.
> If that gives the same results, I will post this on the Trilinos forum.

I think that would be useful. Feel free to CC: this mailing list here as well.

Best
  W.

-- 
------------------------------------------------------------------------
Wolfgang Bangerth               email:            bangerth at math.tamu.edu
                                 www: http://www.math.tamu.edu/~bangerth/