Scalability Results

Model Description

These scalability tests were run using CitcomS 3.2.0 with the default configuration. The mesh for these tests is a regional cap with 129×129×129 nodes, so the total number of velocity unknowns is 129^3 × 3 ≈ 6.4 million. The model is run for 11 time steps, and the result reported is the total wall clock time. Each node on this cluster has two Xeon 5680 series 3.33 GHz hex-core processors with a 12 MB unified L3 cache and 24 GB RAM, for a total of 12 cores per node. The interconnect is QDR InfiniBand.
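As a quick check of the problem size quoted above, the unknown count can be reproduced with a few lines of Python. This is only an illustrative sketch; the variable names are not taken from CitcomS.

```python
# Mesh dimensions from the model description: 129 nodes in each direction.
nodes_per_side = 129
velocity_components = 3  # one unknown per velocity component at each node

total_nodes = nodes_per_side ** 3
velocity_unknowns = total_nodes * velocity_components

print(f"nodes: {total_nodes:,}")
print(f"velocity unknowns: {velocity_unknowns:,} (~{velocity_unknowns / 1e6:.1f} million)")
```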

Partition	Total Procs	Wall Time (sec)	Speedup	Scalability
1×1×1	1	47217	1.000	1.000
1×1×2	2	25466	1.854	0.927
1×1×4	4	14645	3.224	0.806
2×2×1	4	14438	3.270	0.818
2×2×2	8	8980	5.258	0.657
2×2×4	16	4432	10.654	0.666
4×4×1	16	5367	8.798	0.550
4×4×2	32	2460	19.194	0.600
4×4×4	64	1346	35.079	0.548
8×8×2	128	583	80.990	0.633
8×8×4	256	337	140.110	0.547
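
The Speedup and Scalability columns are consistent with the usual definitions: speedup is the single-processor wall time divided by the parallel wall time, and scalability (parallel efficiency) is the speedup divided by the total number of processors. The page does not state these formulas, so the sketch below is an assumption that happens to reproduce the tabulated values; the function names are illustrative.

```python
# Wall-clock times (seconds) copied from the table, keyed by total processor count.
# Where two partitions share a processor count, only one of them is listed here.
wall_times = {1: 47217, 2: 25466, 8: 8980, 32: 2460, 64: 1346, 256: 337}


def speedup(t_serial, t_parallel):
    """Ratio of the single-processor wall time to the parallel wall time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, nprocs):
    """Speedup divided by processor count (the 'Scalability' column)."""
    return speedup(t_serial, t_parallel) / nprocs


t1 = wall_times[1]
for nprocs, t in sorted(wall_times.items()):
    s = speedup(t1, t)
    e = efficiency(t1, t, nprocs)
    print(f"{nprocs:4d} procs: speedup {s:8.3f}, scalability {e:.3f}")
```

An efficiency of 1.000 would mean perfect linear scaling; the values in the table stay between roughly 0.55 and 0.93 across the tested partitions.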

The input file is available here. It is currently configured for a 1×1×1 processor partition. To use a different partition, change the nprocx, nprocy, and nprocz parameters. You must also create a folder named “scratch” in the working directory for the output files. The input file is written for the non-Python version of CitcomS, located at CitcomS-3.2.0/bin/CitcomSRegional.
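
Switching between partitions can be scripted. The sketch below assumes the input file stores each parameter on its own line in `name=value` form; the sample text and the helper names `set_partition` and `total_procs` are hypothetical and are not part of CitcomS. The total processor count to request from the MPI launcher is the product of the three values.

```python
import re


def set_partition(input_text, nprocx, nprocy, nprocz):
    """Return a copy of input_text with nprocx, nprocy and nprocz replaced.

    Assumes each parameter appears on its own line as 'name=value'.
    """
    values = {"nprocx": nprocx, "nprocy": nprocy, "nprocz": nprocz}
    for name, value in values.items():
        pattern = re.compile(rf"^(\s*{name}\s*=\s*)\S+", flags=re.MULTILINE)
        input_text, count = pattern.subn(rf"\g<1>{value}", input_text)
        if count == 0:
            raise ValueError(f"parameter {name} not found in input text")
    return input_text


def total_procs(nprocx, nprocy, nprocz):
    """Number of MPI processes needed for the given partition."""
    return nprocx * nprocy * nprocz


# Hypothetical fragment standing in for the real input file.
sample = "nprocx=1\nnprocy=1\nnprocz=1\n"
print(set_partition(sample, 2, 2, 4))
print("processes required:", total_procs(2, 2, 4))
```

The output directory still has to exist before the run; in Python, `os.makedirs("scratch", exist_ok=True)` in the working directory would create it.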

Created on 02 Mar 2022, Last modified on 02 Mar 2022