[CIG-SEISMO] Error executing specfem3d with CUDA / GeForce GT 730

Tue May 8 14:42:07 PDT 2018

Hi Professor Daniel,

Thanks for your help. I'll try to test the suggested modifications in the
Makefile.

I was able to replace the previous card with a newer model. My advisor
purchased an NVIDIA Quadro P6000 board.

I followed the same compilation procedure and there were no errors. We
are now running our tests and learning more about SPECFEM3D.

Thank you again for your help.

best wishes,

Leandro Gazoni

2018-05-05 17:22 GMT-03:00 Daniel B. Peter <daniel.peter at kaust.edu.sa>:

> hi Leandro,
>
> your card has CUDA compute capability 2.1. that is Fermi architecture. you
> would need to compile with:
> GENCODE = -gencode=arch=compute_20,code=\"sm_21\"
> in the Makefile.
>
> note that CUDA 9 doesn’t support that architecture anymore, so you will
> have to use an older version, CUDA 8 or lower. hope this solves the issue
> (otherwise we’d have to rewrite these memory allocations - but since this
> is an old card anyway, check with a newer card). not sure if we want to
> support Fermi still - we’re going to Volta now, that is about 3
> generations ahead...
>
> best wishes,
> daniel
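
A minimal shell sketch of the naming rule behind the GENCODE line quoted
above, assuming the usual nvcc convention in which compute capability X.Y
maps to compute_XY / sm_XY. Capability 2.1 is the exception: it has no
compute_21 virtual architecture, so sm_21 is paired with compute_20. The
function name gencode_for is made up for illustration and is not part of
SPECFEM3D, and the escaped quotes used in the Makefile are left out here.

```shell
#!/bin/sh
# Hypothetical helper (not part of SPECFEM3D): print a -gencode flag for a
# compute capability given as "major.minor", e.g. "2.1" or "6.1".
gencode_for() {
  cc=$(printf '%s' "$1" | tr -d '.')   # "2.1" -> "21"
  case "$cc" in
    21) virt=20 ;;    # sm_21 has no compute_21; it uses compute_20
    *)  virt=$cc ;;   # assumption: otherwise virtual arch = real arch
  esac
  printf -- '-gencode=arch=compute_%s,code=sm_%s\n' "$virt" "$cc"
}

gencode_for 2.1   # GeForce GT 730 reporting capability 2.1 (Fermi)
gencode_for 6.1   # Quadro P6000 (Pascal)
```

The printed flag still has to be accepted by the installed nvcc; per the
message above, a capability 2.1 target needs CUDA 8 or lower.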
>
>
>
> > On May 4, 2018, at 1:39 AM, Leandro Gazoni <lgazoni at gmail.com> wrote:
> >
> > Hi Daniel,
> >
> > Thanks for your help! Regarding the architecture of the video card, I
> > believe the GT 730 is Kepler.
> >
> > I compiled Specfem3d using only the command "./configure --with-cuda",
> > without any arguments. I had also tried ./configure --with-cuda=cuda5
> > before; both generate the Makefile and compile without errors.
> >
> > The output of configure follows.
> > ## ---- ##
> > ## CUDA ##
> > ## ---- ##
> > checking for nvcc ... /usr/local/cuda-9.1/bin/nvcc
> > checking for cuda_runtime.h ... yes
> > checking nvcc compilation with cudaMalloc in -lcudart ... yes
> > checking nvcc linking with cudaMalloc in -lcudart ... yes
> > checking linking with cudaMalloc in -lcudart ... yes
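
Since configure picks up nvcc from /usr/local/cuda-9.1 here, a quick look at
the toolkit release can show whether Fermi targets (sm_20/sm_21) are still
available; that support was removed in CUDA 9. This is only a sketch: it
assumes that "nvcc --version" prints a line of the form
"Cuda compilation tools, release 9.1, V9.1.85", and both function names are
hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not part of SPECFEM3D or the CUDA toolkit).

# Read `nvcc --version` text on stdin and print the major release number.
cuda_major() {
  sed -n 's/.*release \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Succeed if a toolkit with this major release can still target Fermi
# (sm_20/sm_21); this holds only for releases before CUDA 9.
supports_fermi() {
  [ "$1" -lt 9 ]
}

# Usage, if nvcc is on the PATH:
#   major=$(nvcc --version | cuda_major)
#   supports_fermi "$major" || echo "CUDA $major cannot build for sm_21"
```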
> >
> >
> > The problem is when running the simple_model example.
> >
> >
> > nacib at jobi:~/Downloads/specfem3d/EXAMPLES/meshfem3D_examples/simple_model$ ./run_this_example.sh
> > running example: Qui Mai  3 13:24:35 -03 2018
> >
> >   setting up example...
> >
> >   running mesher...
> >
> >   running database generation...
> >
> >   running solver...
> >
> > Error in setConst_hprime_xx: invalid device symbol
> > The problem is maybe -arch sm_13 instead of -arch sm_11 in the Makefile, please doublecheck
> >
> > -----
> >
> > nacib at jobi:/usr/local/cuda-9.1/samples/bin/x86_64/linux/release$ ./deviceQuery
> > ./deviceQuery Starting...
> >
> >  CUDA Device Query (Runtime API) version (CUDART static linking)
> >
> > Detected 1 CUDA Capable device(s)
> >
> > Device 0: "GeForce GT 730"
> >   CUDA Driver Version / Runtime Version          9.1 / 9.1
> >   CUDA Capability Major/Minor version number:    2.1
> >   Total amount of global memory:                 1982 MBytes (2078605312 bytes)
> > MapSMtoCores for SM 2.1 is undefined.  Default to use 64 Cores/SM
> > MapSMtoCores for SM 2.1 is undefined.  Default to use 64 Cores/SM
> >   ( 2) Multiprocessors, ( 64) CUDA Cores/MP:     128 CUDA Cores
> >   GPU Max Clock rate:                            1400 MHz (1.40 GHz)
> >   Memory Clock rate:                             700 Mhz
> >   Memory Bus Width:                              128-bit
> >   L2 Cache Size:                                 131072 bytes
> >   Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
> >   Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
> >   Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
> >   Total amount of constant memory:               65536 bytes
> >   Total amount of shared memory per block:       49152 bytes
> >   Total number of registers available per block: 32768
> >   Warp size:                                     32
> >   Maximum number of threads per multiprocessor:  1536
> >   Maximum number of threads per block:           1024
> >   Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
> >   Max dimension size of a grid size    (x,y,z): (65535, 65535, 65535)
> >   Maximum memory pitch:                          2147483647 bytes
> >   Texture alignment:                             512 bytes
> >   Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
> >   Run time limit on kernels:                     Yes
> >   Integrated GPU sharing Host Memory:            No
> >   Support host page-locked memory mapping:       Yes
> >   Alignment requirement for Surfaces:            Yes
> >   Device has ECC support:                        Disabled
> >   Device supports Unified Addressing (UVA):      Yes
> >   Supports Cooperative Kernel Launch:            No
> >   Supports MultiDevice Co-op Kernel Launch:      No
> >   Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
> >   Compute Mode:
> >      < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> >
> > deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.1, NumDevs = 1
> > Result = PASS
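
The value that matters in the deviceQuery output above is the
"CUDA Capability Major/Minor version number" line. A small sketch for pulling
it out as an sm_XY name, assuming the line keeps the format shown in that
output; the function name sm_from_devicequery is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read deviceQuery output on stdin and print the first
# reported compute capability as an sm_XY name (e.g. 2.1 -> sm_21).
sm_from_devicequery() {
  sed -n 's/.*CUDA Capability Major\/Minor version number: *\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/sm_\1\2/p' | head -n 1
}

# Usage, from the CUDA samples release directory:
#   ./deviceQuery | sm_from_devicequery
```

For the output above this gives sm_21, which is a Fermi target rather than
a Kepler one.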
> >
> >
> > Thank you,
> > Best regards,
> > Leandro Gazoni
> >
> >
> > 2018-05-03 19:14 GMT-03:00 Daniel B. Peter <daniel.peter at kaust.edu.sa>:
> > hi Leandro,
> >
> > how did you compile SPECFEM3D?
> >
> > the GT 730 comes in two chip flavors, not sure which one you will have.
> > one of them supports CUDA compute capability 3.5 (Kepler). to try that
> > out, use something like:
> >
> > ./configure --with-cuda=cuda5 CUDA_FLAGS=.. CUDA_LIB=.. CUDA_INC=.. MPI_INC=..
> >
> > best wishes,
> > daniel
> >
> >
> > > On May 3, 2018, at 11:24 PM, Leandro Gazoni <lgazoni at gmail.com> wrote:
> > >
> > > Hello Dimitri,
> > >
> > > Many thanks for the reply! My first idea was to run a simple problem,
> > > so I chose the simple_model example. I modified it to generate a small
> > > mesh that can run on a GPU like the GeForce GT 730.
> > >
> > > I believe the problem is minor, but I do not know the reason for the
> > > message about the architecture of the board (sm_...). I look forward
> > > to a response from the experts.
> > >
> > >
> > > Thank you,
> > > Best regards,
> > > Leandro Gazoni
> > >
> > > 2018-05-03 16:05 GMT-03:00 Dimitri Komatitsch <komatitsch at lma.cnrs-mrs.fr>:
> > >
> > > Hi Leandro,
> > >
> > > Thanks for your message. I do not see any reason why a GeForce GT 730
> > > 64-bit card could not be used, but I am not an expert, thus let me cc
> > > four experts.
> > >
> > > Could it be that the example is big and thus you are running out of
> > > memory?
> > >
> > > Thank you,
> > > Best regards,
> > > Dimitri.
> > >
> > >
> > > On 05/03/2018 04:17 PM, Leandro Gazoni wrote:
> > > Hello everyone,
> > >
> > > My name is Leandro Gazoni, and I am a PhD student in computational
> > > mechanics at the Federal University of Rio de Janeiro, Brazil. I have
> > > been studying SPECFEM3D and was able to compile, run, and modify some
> > > examples using MPI.
> > >
> > > I would like to test the same problems with CUDA, but I have had
> > > problems running the example (EXAMPLES/meshfem3D_examples/simple_model)
> > > with a GeForce GT 730 64-bit GPU. Is it possible to run tests with this
> > > board?
> > >
> > > I can compile specfem3d without problems with ./configure --with-cuda,
> > > but when running the example I get the following error:
> > >
> > > nacib at jobi:~/Downloads/specfem3d/EXAMPLES/meshfem3D_examples/simple_model$ ./run_this_example.sh
> > > running example: Thu May 3 05:06:59 -03 2018
> > >
> > >     setting up example ...
> > >
> > >     running mesher ...
> > >
> > >     running database generation ...
> > >
> > >     running solver ...
> > >
> > > Error in setConst_hprime_xx: invalid device symbol
> > > The problem is maybe -arch sm_13 instead of -arch sm_11 in the Makefile, please doublecheck
> > >
> > >
> > > Regards,
> > > Leandro Gazoni
> > >
> > > ==============
> > > ParFile
> > > ==============
> > > # number of MPI processors
> > > NPROC                           = 1
> > >
> > > # time step parameters
> > > NSTEP                           = 10000
> > > DT                              = 0.01
> > >
> > > GPU_MODE                        = .true.
> > >
> > >
> > > ==============
> > > Mesh Parfile
> > > ==============
> > > # number of elements at the surface along edges of the mesh at the surface
> > > # (must be 8 * multiple of NPROC below if mesh is not regular and contains mesh doublings)
> > > # (must be multiple of NPROC below if mesh is regular)
> > > NEX_XI                          = 32
> > > NEX_ETA                         = 32
> > >
> > > # number of MPI processors along xi and eta (can be different)
> > > NPROC_XI                        = 1
> > > NPROC_ETA                       = 1
> > >
> > > # number of regions
> > > NREGIONS                        = 4
> > > # define the different regions of the model as :
> > > #NEX_XI_BEGIN  #NEX_XI_END  #NEX_ETA_BEGIN  #NEX_ETA_END  #NZ_BEGIN  #NZ_END  #material_id
> > > 1              32           1               32            1          4        1
> > > 1              32           1               32            5          5        2
> > > 1              32           1               32            6          15       3
> > > 14             25           7               19            7          10       4
> > >
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > CIG-SEISMO mailing list
> > > CIG-SEISMO at geodynamics.org
> > > http://lists.geodynamics.org/cgi-bin/mailman/listinfo/cig-seismo
> > >
> > >
> > > --
> > > Dimitri Komatitsch, CNRS Research Director (DR CNRS)
> > > Laboratory of Mechanics and Acoustics, Marseille, France
> > > http://komatitsch.free.fr
> > >
> >
> >
> >
>
>