[aspect-devel] Aspect Parallel Initialization
Eric Heien
emheien at ucdavis.edu
Tue Nov 27 23:36:12 PST 2012
The slowdown turned out to have a rather unexpected cause. Every core writes the parameter file at the same time, so the filesystem runs into locking contention that slows down all nodes. This would also explain the linear relationship between initialization time and the number of cores. To fix this, I changed the code to write the parameter file only on the root node, which seems to solve the problem.
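A minimal sketch of the root-only write pattern, assuming the rank is passed in as a plain integer. The helper name write_parameters_on_root and its signature are hypothetical and are not taken from the ASPECT source; in an MPI program the rank would typically come from MPI_Comm_rank, which is left out here so the example builds with the standard library alone.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper, not ASPECT's actual code: write the parameter text
// to `path` only when called on the root rank (rank 0).
// Returns true if this caller wrote the file successfully, and false if it
// skipped the write (non-root rank) or the write failed.
bool write_parameters_on_root(int rank,
                              const std::string &path,
                              const std::string &parameters)
{
  assert(rank >= 0);  // ranks are non-negative

  if (rank != 0)
    return false;     // non-root ranks never touch the filesystem

  std::ofstream out(path);
  if (!out)
    return false;     // could not open the file for writing

  out << parameters;
  out.close();
  return !out.fail();
}
```

With this guard the shared filesystem sees one writer regardless of the number of ranks, so the cost of this step no longer grows with the core count.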
-Eric
On Nov 27, 2012, at 10:38 AM, Timo Heister wrote:
> Hi Eric,
>
> it would be great if you could tell us where the time is spent. 30
> minutes sounds WAY too long.
>
> We had noticed a regression inside the
> TrilinosWrappers::BlockSparseMatrix::reinit() call a while ago, that
> we never fixed. You might want to put timing around that call.
>
> Let me know what you find!
>
> On Tue, Nov 27, 2012 at 11:44 AM, Eric Heien <emheien at ucdavis.edu> wrote:
>> While running some tests on TACC Ranger I noticed that Aspect can take a long time to initialize. This appears to be a function of the number of nodes rather than model size, and becomes quite significant for large-scale runs. When using 16 nodes (256 cores), initialization takes 30 minutes; extrapolating from the trend, using 10,000 cores would require about a day to initialize. Before I delve into the reasons for this, has anyone encountered this, and does anyone know why it takes so long?
>>
>> -Eric
>>
>> _______________________________________________
>> Aspect-devel mailing list
>> Aspect-devel at geodynamics.org
>> http://geodynamics.org/cgi-bin/mailman/listinfo/aspect-devel
>
>
>
> --
> Timo Heister
> http://www.math.tamu.edu/~heister/