[aspect-devel] Number of grouped files = 1 -> code doesn't exit after it's finished

Austermann, Jacqueline jaustermann at fas.harvard.edu
Tue Mar 22 18:43:25 PDT 2016


Hi Rene and everyone,

Thanks for the tips! I started having problems when the system admins updated Open MPI from 1.8.3 to 1.10.0. I can only get the 1.8.3 version to work, but that has become a little buggy since the system update. I tried setting the compiler paths correctly, but that could certainly be a place to double-check (thanks, Max, for your suggestion!). If I find an issue that is ASPECT-related, I will definitely share it.

Thanks everyone!
Jacky


On Mar 21, 2016, at 4:36 PM, Rene Gassmoeller <rene.gassmoeller at mailbox.org> wrote:

Hi Jacky,
did you use this option before, and did it work then? (And did something in your cluster environment change, maybe?) If its behavior changed, I might need to take another look at the recent changes. In fact, the behavior of this part (when the number of grouped files != 0) should not have changed.

But it really seems that the MPI-IO option is somehow implementation-dependent, or at least fragile. For example, on one of the clusters that I use, I recently noticed that I can set 'Number of grouped files = 24' and it works flawlessly, but 'Number of grouped files = 32' crashes every time (I am using more than 32 processes). It seems there is a limit on the number of concurrently active MPI communicators on this cluster. All in all, I think the MPI-IO feature is very nice and useful for large models, but it seems we cannot guarantee its functionality on all systems.
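
A minimal sketch of how ranks might be distributed over grouped output files, assuming each rank is assigned to file `rank % n_grouped_files` and each file is written collectively by its own sub-communicator. Both the modulo rule and the helper name `group_ranks` are assumptions for illustration, not taken from the ASPECT or deal.II sources. Under that assumption, the number of distinct groups is also the number of communicators active at the same time, which is the quantity suspected above of hitting a cluster limit.

```python
from collections import defaultdict


def group_ranks(n_processes, n_grouped_files):
    """Map each MPI rank to an output file index (hypothetical helper).

    Assumption: rank r writes to file r % n_grouped_files.
    A value of 0 is treated as "one file per rank".
    """
    if n_processes <= 0:
        raise ValueError("n_processes must be positive")
    if n_grouped_files < 0:
        raise ValueError("n_grouped_files must be non-negative")

    groups = defaultdict(list)
    for rank in range(n_processes):
        file_id = rank if n_grouped_files == 0 else rank % n_grouped_files
        groups[file_id].append(rank)
    return dict(groups)


if __name__ == "__main__":
    # Compare the two settings mentioned above for a 48-process run
    # (48 is an arbitrary example of "more than 32 processes").
    for n_files in (24, 32):
        groups = group_ranks(48, n_files)
        sizes = sorted({len(ranks) for ranks in groups.values()})
        print(f"{n_files} grouped files -> {len(groups)} groups, group sizes {sizes}")
```

With 48 processes, 24 grouped files gives 24 groups of two ranks each, while 32 grouped files gives 32 groups of uneven size (one or two ranks). Whether the group count, the uneven sizes, or something else entirely triggers the crash is not established here.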

If you get any insights from talking to your IT administrators, please let us know. The relevant lines of code to look at would be postprocess/visualization.cc:460-511 and deal.II/source/base/data_out_base.cc:6137-6198.

Hope you can figure out the problem,
Rene



On 03/19/2016 11:06 AM, Austermann, Jacqueline wrote:

Hi,

After updating my master with the upstream master, I’ve been running into the following problem: when I set Number of grouped files (in subsection Postprocess, subsection Visualization) to 1 (and probably to anything other than 0), the code runs without problems to the end (after it prints the table with the different run times) but then does not exit. It just hangs. I can cancel it manually and the output is correct, but if I don’t, it stays there until the time limit for the submitted run is exceeded. I tried this with a few cookbooks and always had the same problem (I attached a .prm that produces this error).
Let me know if you have any ideas why this could be.
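
For reference, the setting described above would look roughly like this in a parameter file. This is a minimal sketch of the relevant block only, not the attached .prm; the `List of postprocessors` line is an assumption about how visualization output is enabled in that run.

```
subsection Postprocess
  # Assumption: visualization is the postprocessor producing the output files
  set List of postprocessors = visualization

  subsection Visualization
    # 0 writes one file per process; any other value groups the output
    # into that many files per time step using MPI I/O
    set Number of grouped files = 1
  end
end
```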

Thanks!
Jacky







_______________________________________________
Aspect-devel mailing list
Aspect-devel at geodynamics.org
http://lists.geodynamics.org/cgi-bin/mailman/listinfo/aspect-devel

