[aspect-devel] Error in moving temporary files during output
Rene Gassmoeller
rengas at gfz-potsdam.de
Mon Jan 13 09:05:21 PST 2014
Dear all,
I have a follow-up question on the input/output issue Wolfgang had. I am
currently doing scaling tests on a Cray XC30 with up to 4000 cores and am
getting the same error messages. I am writing to a designated $WORK
directory that is intended for data output; however, there is also a fast
local $TMPDIR directory on each compute node.
My question is: is the "Number of grouped files = 1" parameter always
useful for a large parallel computation on a system with MPI I/O, or is
this cluster-specific (in which case I will just contact the system
administrators for help)?
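For reference, here is roughly where the setting would go in the parameter
file (a sketch based on ASPECT's visualization postprocessor parameters;
please check the manual for your version, as subsection names may differ):

```
subsection Postprocess
  set List of postprocessors = visualization
  subsection Visualization
    # Write one grouped file per time step via MPI I/O instead of one
    # file per MPI rank, avoiding the move of many small temporary files
    set Number of grouped files = 1
  end
end
```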
Another thing I would like to mention: this system only allows static
linking. With my limited knowledge, I was only able to compile ASPECT by
commenting out the option to dynamically load external libraries at the
user's request. Could somebody who introduced the possibility of
dynamically loading libraries at runtime comment on the work needed to
make this a compile-time switch? I don't know much about this, otherwise I
would look for a solution myself. If this creates a longer discussion, I
will open a new thread on the mailing list.
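To illustrate what such a compile-time switch might look like (a minimal
sketch, not ASPECT's actual code; the macro name ASPECT_USE_SHARED_LIBS
and the function load_plugin are hypothetical):

```cpp
#include <iostream>
#include <string>

#ifdef ASPECT_USE_SHARED_LIBS
#  include <dlfcn.h>
#endif

// Attempt to load a user plugin at runtime. When the build is
// statically linked (switch off), we only emit a warning and
// report failure instead of calling dlopen().
bool load_plugin(const std::string &path)
{
#ifdef ASPECT_USE_SHARED_LIBS
  void *handle = dlopen(path.c_str(), RTLD_NOW);
  if (handle == nullptr)
    {
      std::cerr << "Could not load " << path << ": "
                << dlerror() << '\n';
      return false;
    }
  return true;
#else
  std::cerr << "Ignoring plugin " << path
            << ": dynamic loading is disabled in this static build\n";
  return false;
#endif
}
```

With ASPECT_USE_SHARED_LIBS left undefined, the function compiles without
any dependency on libdl, which is what a fully static build would need.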
Thanks for comments and suggestions,
Rene
On 10/14/2013 05:02 PM, Wolfgang Bangerth wrote:
> On 10/11/2013 01:32 PM, Timo Heister wrote:
>>> I'm running this big computation and I'm getting these errors:
>>
>> For a large parallel computation I would use MPI I/O using
>> set Number of grouped files = 1
>
> Good point. I had forgotten about this flag.
>
>>
>>> ***** ERROR: could not move /tmp/tmp.gDzuQS to
>>> output/solution-00001.0007.vtu *****
>>
>> Is this on brazos? Are you writing/moving into your slow NFS ~/? I got
>> several GB/s writing with MPI I/O to the parallel filesystem (was it
>> called data or fdata?).
>
> Yes. I was just writing to something under $HOME. Maybe not very smart...
>
>
>> retry and otherwise abort the computation I would say.
>
> OK, that's what I'm doing now. It shouldn't just fail any more now.
>
> Best
> W.
>
>