[aspect-devel] Particle related error when running on more than 1 node

Eric Heien emheien at ucdavis.edu
Mon Jun 25 10:03:40 PDT 2012


Hi John,

I can't seem to reproduce this problem with your parameter file.  Does the error occur at the beginning of the simulation or after it has run for a while?  If it happens after a while, can you do a visualization of the particle movements until it crashes?
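
For reference, the assertion in the error quoted below (global_particles==global_sum_particles) amounts to summing the per-process particle counts and comparing the total against the number of particles originally generated. A minimal sketch of that idea follows. It is not the actual tracer.cc code: RankState, global_sum_particles() and particle_count_conserved() are hypothetical names used only for illustration, and a plain vector of per-rank counts stands in for the MPI sum-reduction.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one MPI process: it only knows how many
// particles it currently owns.
struct RankState
{
  std::size_t local_particles;
};

// Sum the local counts over all ranks. In a real parallel run this would
// be an MPI sum-reduction; here the ranks are simply entries of a vector.
std::size_t global_sum_particles(const std::vector<RankState> &ranks)
{
  return std::accumulate(ranks.begin(), ranks.end(), std::size_t{0},
                         [](std::size_t sum, const RankState &r)
                         { return sum + r.local_particles; });
}

// True if the total over all ranks still equals the number of particles
// that were originally generated (nothing lost, nothing duplicated).
bool particle_count_conserved(const std::vector<RankState> &ranks,
                              std::size_t global_particles)
{
  return global_sum_particles(ranks) == global_particles;
}

// Aborts (in debug builds) when the count has changed, analogous in
// spirit to the "Particle count changed." exception.
void check_particle_count(const std::vector<RankState> &ranks,
                          std::size_t global_particles)
{
  assert(particle_count_conserved(ranks, global_particles) &&
         "Particle count changed.");
}
```

If a particle is dropped while being handed from one process to another, the summed count comes up one short and the check fails, which would be consistent with an error that only shows up on some core counts.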

Thanks,

-Eric

On Jun 23, 2012, at 3:30 PM, John Naliboff wrote:

> Hi Eric,
> 
> Gotcha.  The simulation (.prm file attached) is pretty straightforward: a 2D box (1x1 dimensions) with cooling at the top and heating at the bottom.  The Rayleigh number is pretty high (1e8), but the model seems to be converging towards the predicted Nusselt number (i.e., when running on a single node).
> 
> For the even higher Rayleigh number simulations (5e8, 1e9) I'll comment out the check_particle_count() line, since the computation time increases noticeably as the Rayleigh number goes up.
> 
> John
> 
> <bkb1Ra1e8.prm>
> 
> 
> On Jun 22, 2012, at 10:49 PM, Eric Heien wrote:
> 
>> This means a particle fell out of the mesh and was lost - I haven't yet written code to put these particles back in the mesh.  It may indicate something non-physical about the simulation, but since it depends on the number of cores, it is more likely a bug I need to fix.  Can you send me the parameter file you're using?
>> 
>> If you want to ensure the run works anyway, you can try commenting out the call to check_particle_count() in tracer.cc.
>> 
>> -Eric
>> 
>> On Jun 23, 2012, at 2:53 AM, John Naliboff wrote:
>> 
>>> Hi all,
>>> 
>>> I came across an error that appears to be associated with tracking of particles across multiple nodes:
>>>     Exception on MPI process <0> while running postprocessor <N6aspect11Postprocess11ParticleSetILi2EEE>:
>>>     --------------------------------------------------------
>>>     An error occurred in line <484> of file <source/postprocess/tracer.cc> in function
>>>         void aspect::Postprocess::ParticleSet<dim>::check_particle_count() [with int dim = 2]
>>>     The violated condition was:
>>>         global_particles==global_sum_particles
>>>     The name and call sequence of the exception was:
>>>         ExcMessage ("Particle count changed.")
>>>     Additional Information:
>>>         Particle count changed.
>>>     --------------------------------------------------------
>>>     Aborting!
>>> 
>>> I received this error when I tried to run a model on 32 threads across 2 nodes.  I am currently running the same model successfully on 16 threads on 1 node.  The attached file (bkb1Ra1e8.o6876) contains the full error output.  The error also generated a number of "core" files in the model directory.  I last updated aspect via svn update two weeks ago, but perhaps this is an issue specific to deal.II?
>>> 
>>> Cheers,
>>> John
>>> 
>>> <bkb1Ra1e8.o6876>
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Aspect-devel mailing list
>>> Aspect-devel at geodynamics.org
>>> http://geodynamics.org/cgi-bin/mailman/listinfo/aspect-devel
