[CIG-SEISMO] segmentation faults on cluster nodes

Dimitri Komatitsch komatitsch at lma.cnrs-mrs.fr
Fri Feb 24 08:53:39 PST 2012


Hi,

What kind of machine / compiler are you using?

This could be caused by arrays that are allocated but for which we never
call "deallocate" at the end of the code (i.e., the code runs fine and
writes the seismograms to disk, but then complains at the end). Most
compilers have no problem with that, but some do. If so, is there an
option in your compiler to force automatic deallocation of arrays (for
instance, at the end of subroutine calls)?
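
To check whether every process fails in the same way, the error file can
be tallied per process. The following is only a minimal Python sketch: it
assumes the "[host:pid]" prefixes seen in the quoted log, and the name
"summarise" is hypothetical, not part of SPECFEM3D or of any MPI tool.

```python
import re
from collections import defaultdict

# Patterns for the per-process lines, e.g.
# "[node092:20021] Signal: Segmentation fault (11)"
SIGNAL = re.compile(r"\[(\w+):(\d+)\] Signal: ([A-Za-z ]+) \((\d+)\)")
ADDRESS = re.compile(r"\[(\w+):(\d+)\] Failing at address: (0x[0-9a-fA-F]+)")


def summarise(log_text):
    """Map (host, pid) to the signal and failing address reported for it."""
    # Strip e-mail quote markers and undo line wrapping so that entries
    # split across lines can still be matched.
    flat = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    flat = re.sub(r"\s+", " ", flat)

    summary = defaultdict(dict)
    for m in SIGNAL.finditer(flat):
        summary[(m.group(1), m.group(2))]["signal"] = f"{m.group(3)} ({m.group(4)})"
    for m in ADDRESS.finditer(flat):
        summary[(m.group(1), m.group(2))]["address"] = m.group(3)
    return dict(summary)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarise.py error_file.txt
    with open(sys.argv[1]) as handle:
        for (host, pid), info in sorted(summarise(handle.read()).items()):
            print(host, pid, info.get("signal"), info.get("address"))
```

If all processes report the same signal and the same failing address, that
would be consistent with a single failing deallocation at the end of the
run rather than with random memory corruption.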

Best regards,

Dimitri.

On 02/24/2012 04:58 PM, A. Datta wrote:
> Hi,
>
> I'm a first-year PhD student and have just started to use SPECFEM3D GLOBE
> (v5.1.3) to calculate some simple synthetics (forward modelling) with the
> S40RTS mantle model.
>
> The code runs fine, but at the end of the run I get an error file with a
> list of error messages as follows:
>
> [node092:20021] *** Process received signal ***
> [node060:16784] *** Process received signal ***
> [node092:20021] Signal: Segmentation fault (11)
> [node092:20021] Signal code: Address not mapped (1)
> [node092:20021] Failing at address: 0x5c91fcd8
> [node053:19456] *** Process received signal ***
> [node106:01357] *** Process received signal ***
> [node099:01255] *** Process received signal ***
> [node099:01255] Signal: Segmentation fault (11)
> [node099:01255] Signal code: Address not mapped (1)
> [node099:01255] Failing at address: 0x5c91fcd8
> [node085:01554] *** Process received signal ***
> [node039:06947] *** Process received signal ***
> [node039:06947] Signal: Segmentation fault (11)
> [node039:06947] Signal code: Address not mapped (1)
> [node039:06947] Failing at address: 0x5c91fcd8
> *** glibc detected *** /home/ad605/specfem_runs/s40rts_regular_22feb2012/bin/xspecfem3D: double free or corruption (out): 0x000000005c91fce0 ***
> *** glibc detected *** /home/ad605/specfem_runs/s40rts_regular_22feb2012/bin/xspecfem3D: free(): invalid pointer: 0x000000005c91fce0 ***
> [node043:04676] *** Process received signal ***
> [node103:01276] *** Process received signal ***
> [node105:01279] *** Process received signal ***
> [node105:01279] Signal: Segmentation fault (11)
> [node105:01279] Signal code: Address not mapped (1)
> [node105:01279] Failing at address: 0x5c91fcd8
> [node105:01279] [ 0] /lib/libpthread.so.0(+0xf8f0) [0x7f4ca47f18f0]
> [node105:01279] [ 1] /lib/libc.so.6(cfree+0x1d) [0x7f4ca44dce2d]
> [node105:01279] [ 2] /home/ad605/specfem_runs/s40rts_regular_22feb2012/bin/xspecfem3D(MAIN__+0x16cda) [0x47ac32]
> [node105:01279] [ 3] /home/ad605/specfem_runs/s40rts_regular_22feb2012/bin/xspecfem3D(main+0x2a) [0x4d7cca]
> [node105:01279] [ 4] /lib/libc.so.6(__libc_start_main+0xfd) [0x7f4ca447dc4d]
> [node105:01279] [ 5] /home/ad605/specfem_runs/s40rts_regular_22feb2012/bin/xspecfem3D() [0x405f49]
> [node105:01279] *** End of error message ***
>
> This is repeated several times in the file, for different nodes on the
> cluster where the code runs.
>
> Despite this, however, I get all the required seismograms in the output
> directory at the end of the run, and they look fine.
>
> But this error file still bothers me and I don't understand what it means.
> Does anyone have an idea, or do these error messages look familiar?
>
> Thanks very much in advance for your help!
> Arjun
>
>

-- 
Dimitri Komatitsch - komatitsch aT lma.cnrs-mrs.fr
CNRS Research Director (DR CNRS), Laboratory of Mechanics and Acoustics,
UPR 7051, Marseille, France    http://komatitsch.free.fr

