[CIG-SEISMO] CIG-SEISMO Digest, Vol 120, Issue 5

Daniel B. Peter daniel.peter at kaust.edu.sa
Mon Feb 12 14:29:20 PST 2018


Hi Moritz,

again, good point - you're pushing the limits :)

concerning the mesh, the GPU memory is just fine for that number of elements. 1M elements will take about 10GB memory for an elastic simulation. so you could even add more if you like for those two K40s. 

in your setup, it’s likely a problem with the seismogram allocations on the GPUs. this changed recently such that the full seismogram arrays are allocated on the GPU as well to speed up simulations even further. unfortunately, this comes with a hit on memory consumption. 

in your setup, you have about 10,000 stations and 111,111 time steps. depending on the number of receivers in a single slice and how they are shared between the 2 GPUs, a rough estimate for your case shows up to about 6.3GB additional memory for a single 3-component seismogram (e.g. displacement) per GPU (assuming you have an even split with 5,000 local stations per GPU). given the ~7.5GB from the mesh, this exceeds a single GPU memory of 12GB. 
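the estimate above can be reproduced with a few lines of arithmetic. this is only a rough sketch: it assumes single-precision values (4 bytes each) and one 3-component seismogram array held for the whole run, neither of which is stated explicitly in this thread. the function name is made up for illustration.

```python
# Rough seismogram memory estimate per GPU.
# Assumptions (not confirmed in the thread): 4-byte single-precision values,
# one 3-component seismogram array covering all time steps.

GIB = 1024 ** 3


def seismogram_bytes(n_local_stations, n_time_steps,
                     n_components=3, bytes_per_value=4):
    """Bytes needed to hold one seismogram array for the full run."""
    return n_local_stations * n_components * n_time_steps * bytes_per_value


# Numbers quoted in the message: 5,000 local stations per GPU, 111,111 steps.
seismo_gib = seismogram_bytes(5_000, 111_111) / GIB

mesh_gib = 7.5       # per-GPU mesh estimate quoted in the message
gpu_limit_gib = 12   # nominal memory of a single K40

total_gib = seismo_gib + mesh_gib
print(f"seismograms: {seismo_gib:.2f} GiB")
print(f"mesh + seismograms: {total_gib:.2f} GiB (limit {gpu_limit_gib} GiB)")
print("fits" if total_gib <= gpu_limit_gib else "does not fit")
```

under these assumptions the seismogram array alone comes out a little above 6 GiB, which is in line with the rough 6.3GB figure, and the sum with the mesh exceeds 12GB.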

as a quick workaround, you could split up the stations and run several simulations for different station setups. this limits the seismogram memory allocation per run. the other workarounds involve some changes in the code: (i) we could add checkpointing (again, the global version has it, so in principle it is easy to add to the Cartesian version), but this means you would have to run multiple simulations as well; (ii) we could reduce the seismogram allocations. well, both would be nice. let me focus on (ii) first, since we already have a parameter in the Par_file, NTSTEP_BETWEEN_OUTPUT_SEISMOS, which could be used here to limit the array allocation size.
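to get a feeling for the two options, here is a small sketch comparing them. it is not code from SPECFEM3D: the helper names, the 4 GiB seismogram budget and the value 1,000 for NTSTEP_BETWEEN_OUTPUT_SEISMOS are hypothetical, and single-precision (4-byte) values are assumed.

```python
import math

# Assumptions (hypothetical, for illustration only):
BYTES_PER_VALUE = 4   # single precision
N_COMPONENTS = 3      # 3-component seismogram
GIB = 1024 ** 3


def seismogram_bytes(n_stations, n_steps):
    """Bytes for one seismogram array of n_stations over n_steps."""
    return n_stations * N_COMPONENTS * n_steps * BYTES_PER_VALUE


def max_stations(budget_bytes, n_steps):
    """Largest station count whose seismogram array fits in the budget."""
    return budget_bytes // (N_COMPONENTS * n_steps * BYTES_PER_VALUE)


def runs_needed(n_stations, budget_bytes, n_steps):
    """Number of separate simulations if stations are split into batches."""
    return math.ceil(n_stations / max_stations(budget_bytes, n_steps))


stations_per_gpu = 5_000
full_steps = 111_111
budget = 4 * GIB      # hypothetical headroom left next to the mesh

# Workaround: split the stations over several runs.
n_runs = runs_needed(stations_per_gpu, budget, full_steps)

# Option (ii): allocate only a chunk of time steps at once.
chunk_steps = 1_000   # hypothetical NTSTEP_BETWEEN_OUTPUT_SEISMOS value
chunk_gib = seismogram_bytes(stations_per_gpu, chunk_steps) / GIB

print(f"runs needed when splitting stations: {n_runs}")
print(f"seismogram buffer with {chunk_steps}-step chunks: {chunk_gib:.3f} GiB")
```

with these made-up numbers, splitting the stations would need two runs, while a chunked allocation would shrink the buffer to a small fraction of a GiB. the real numbers depend on how much memory is actually free next to the mesh.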

best wishes,
daniel




> On Feb 12, 2018, at 7:25 PM, Fehr, Moritz <Moritz.Fehr at dmt-group.com> wrote:
> 
> Hi Daniel,
> 
> thanks a lot for adjusting the receiver detection routine. The routine itself now seems to work fine, but as soon as it finishes, the calculation aborts with CUDA memory errors (see the attached error files). I used two Tesla K40m GPUs (2 x 12 GB) and the model consists of 1,500,000 elements. I think the GPU memory should be sufficient for that model size. The same problem occurred with the V100 GPUs.
> Do you have any idea?
> 
> Thanks
> Mo
> 
> -----Original Message-----
> From: CIG-SEISMO [mailto:cig-seismo-bounces at geodynamics.org] On Behalf Of cig-seismo-request at geodynamics.org
> Sent: Thursday, 25 January 2018 11:02
> To: cig-seismo at geodynamics.org
> Subject: CIG-SEISMO Digest, Vol 120, Issue 5
> 
> Today's Topics:
> 
>   1. Re: SPECFEM3D: GPU memory usage limited (Daniel B. Peter)
>   2.  SPECFEM3D: GPU memory usage limited (Fehr, Moritz)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Wed, 24 Jan 2018 21:02:53 +0000
> From: "Daniel B. Peter" <daniel.peter at kaust.edu.sa>
> To: "cig-seismo at geodynamics.org" <cig-seismo at geodynamics.org>
> Subject: Re: [CIG-SEISMO] SPECFEM3D: GPU memory usage limited
> Message-ID: <5EC00584-31D0-4CFE-8708-4142FEC8D106 at kaust.edu.sa>
> Content-Type: text/plain; charset="utf-8"
> 
> hi Moritz,
> 
> is there also an error output?
> 
> I’m not aware that there should be such an issue with SPECFEM3D on this newest GPU hardware. running simulations on Pascal GPUs with multiple GB memory usage works just fine. so I would expect that the run exits because of another issue, not because of the GPU memory part.
> 
> your output_solver.txt stops at the receiver detection. based on your setup, which uses about 13,670 stations, i expect it to be a receiver detection issue. this routine works fine for a few hundred stations, but becomes very slow for more than a few thousand on a single process. this issue has been addressed in the global version, let me see if i can implement it in a similar way in the SPECFEM3D devel version.
> 
> many thanks for pointing this out,
> daniel
> 
> 
> 
> On Jan 24, 2018, at 6:35 PM, Fehr, Moritz <Moritz.Fehr at dmt-group.com> wrote:
> 
> Hello,
> 
> I have a problem simulating a CUBIT-meshed model (1,000,000 elements) on a Tesla V100-SXM2 GPU (Amazon cloud CPU/GPU cluster). I am using SPECFEM3D Cartesian (V3.0) and the newest CUDA 9 library. I want to share this issue: it seems that the GPU memory usage is limited to 420 MiB, although the maximum GPU memory is 16000 MiB.
> Do you have any idea about the origin of this limitation?
> 
> Thanks
> Mo
> 
> <image001.png>
> 
> 
> 
> ___________________________________________________________________________________________________
> Sitz der Gesellschaft/Headquarters: DMT GmbH & Co. KG * Am Technologiepark 1 * 45307 Essen * Deutschland/Germany Registergericht/County Court: Amtsgericht Essen * HRA 9091 * USt-ID DE 253275653 Komplementär/Fully Liable Partner: DMT Verwaltungsgesellschaft mbH, Essen Registergericht/County Court: Amtsgericht Essen * HRB 20420 Geschäftsführer/Board of Directors: Prof. Dr. Eiko Räkers (Vorsitzender/CEO), Dr. Maik Tiedemann, Ulrich Pröpper, Jens-Peter Lux Vorsitzender des Aufsichtsrates/Chairman of the Supervisory Board: Jürgen Himmelsbach TÜV NORD GROUP ___________________________________________________________________________________________________
> Diese Nachricht enthält vertrauliche Informationen und ist nur für den Empfänger bestimmt. Wenn Sie nicht der Empfänger sind, sollten Sie die E-Mail nicht verbreiten, verteilen oder diese E-Mail kopieren. Benachrichtigen Sie bitte den Absender per E-Mail, wenn Sie diese E-Mail irrtümlich erhalten haben und löschen dann diese E-Mail von Ihrem System.
> 
> This message contains confidential information and is intended only for the recipient. If you are not the recipient you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system.
> 
> 
> <CMTSOLUTION><output_solver.txt><gpu_device_info.txt><Par_file><output_generate_databases.txt><values_from_mesher.h>
> _______________________________________________
> CIG-SEISMO mailing list
> CIG-SEISMO at geodynamics.org
> http://lists.geodynamics.org/cgi-bin/mailman/listinfo/cig-seismo
> 
> 
> ________________________________
> This message and its contents including attachments are intended solely for the original recipient. If you are not the intended recipient or have received this message in error, please notify me immediately and delete this message from your computer system. Any unauthorized use or distribution is prohibited. Please consider the environment before printing this email.
> 
> ------------------------------
> 
> Message: 2
> Date: Thu, 25 Jan 2018 10:07:18 +0000
> From: "Fehr, Moritz" <Moritz.Fehr at dmt-group.com>
> To: "cig-seismo at geodynamics.org" <cig-seismo at geodynamics.org>
> Subject: [CIG-SEISMO]  SPECFEM3D: GPU memory usage limited
> Message-ID:
> <b0f6d12197f24424b880f2ae10910a39 at H04EXC13.netz.tuev-nord.de>
> Content-Type: text/plain; charset="utf-8"
> 
> Hi Daniel,
> 
> thanks for your rapid response. Sorry, I do not have an error output file, because I aborted the simulation myself (it was stuck in the receiver detection step). But I tried the same simulation with just one receiver and it works fine.
> Please let me know if you can fix the problem.
> 
> Thanks
> Mo
> 
> 
> 
> <error_message_000004.txt><error_message_000005.txt><error_message_000011.txt><error_message_000010.txt><error_message_000014.txt><error_message_000017.txt><error_message_000006.txt><error_message_000008.txt><error_message_000007.txt><error_message_000001.txt><error_message_000009.txt><output_solver.txt><gpu_device_info.txt><output_generate_databases.txt>


