[aspect-devel] A weird error with MPI postprocessing

Timo Heister timo.heister at gmail.com
Mon Feb 29 06:24:41 PST 2016


Shangxin,

I have no clue why it could fail. What did you set "Number of grouped
files" to? How many MPI ranks? What kind of filesystem are you writing
to? Could you be running out of disk space?

Maybe we, or the MPI implementation, are leaking file handles. Can you
check whether the same thing happens after ~3000 postprocessing steps
if you run a simple 2d convection-box?
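
One way to look for leaked file handles, as a rough sketch only: on
Linux, /proc/<pid>/fd lists the descriptors a process currently has
open. The helper below (count_open_fds is a made-up name, not part of
ASPECT or deal.II) counts them. It assumes a Linux /proc filesystem,
and it only sees OS-level descriptors, so whether handles held inside
the MPI library show up there is an assumption to verify.

```python
import os


def count_open_fds(pid="self"):
    """Return the number of open file descriptors of a process.

    Assumption: Linux, where /proc/<pid>/fd has one entry per open
    descriptor. `pid` is a numeric PID or "self" for this process.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))
```

Calling count_open_fds(<pid of one MPI rank>) every few minutes on a
compute node and watching whether the number keeps growing with each
visualization output would hint at a leak.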

On Mon, Feb 29, 2016 at 12:06 AM, Shangxin Liu <sxliu at vt.edu> wrote:
> Hi,
>
> Recently, when running time-dependent cases, my jobs sometimes fail in the
> postprocessing step after running for more than ~20 hours. I paste the
> error here (from one of my cases):
>
> ----------------------------------------------------
>
> Exception on MPI process <76> while running postprocessor
> <N6aspect11Postprocess13VisualizationILi3EEE>:
>
>
> --------------------------------------------------------
>
> An error occurred in line <6156> of file
> </home/shangxin/sources/dealii/source/base/data_out_base.cc> in function
>
>     void dealii::DataOutInterface<dim,
> spacedim>::write_vtu_in_parallel(const char*, MPI_Comm) const [with int dim
> = 3; int spacedim = 3; MPI_Comm = ompi_communicator_t*]
>
> The violated condition was:
>
>     err==0
>
> The name and call sequence of the exception was:
>
>     ExcMessage("Unable to open file with MPI_File_open!")
>
> Additional Information:
>
> Unable to open file with MPI_File_open!
>
> --------------------------------------------------------
>
>
> Aborting!
>
> ----------------------------------------------------
>
>
> This error usually appears only after dozens of hours of running, so it is
> hard to reproduce in a short test. It seems to be related to writing the
> visualization postprocessor output to files. If so, the output works for
> many time steps and then fails at a later one (in my case, the code crashed
> during postprocessing at time step ~3000). I'm using ASPECT and deal.II from
> GitHub and didn't modify anything in the postprocessing code.
>
> Any idea why this problem appears and how to solve it?
>
> Best,
> Shangxin
>
>
> _______________________________________________
> Aspect-devel mailing list
> Aspect-devel at geodynamics.org
> http://lists.geodynamics.org/cgi-bin/mailman/listinfo/aspect-devel

