PC Cluster with GAMMA (Non-TCP/IP) Communication Software and Fortran 90

                            cpu/pe   walltime   (walltime - cpu/pe)
Xeon (2.4GHz) x 2            192       264         (72)
Xeon (2.4GHz) x 4            113       190         (77)
Pen 4 (3.0GHz) x 2           148       184         (36)

Timing measurements with a DFT molecular dynamics code
(ab initio molecular dynamics code)


● 181 (C,N,H) atoms, 60x60x192 grids, 2 SCF iterations

● Red Hat Linux 7.3, Portland Fortran/C compilers 5.1,
  ScaLAPACK package (self-compiled)

● Xeon 2.4GHz, SMP (dual), 533MHz FSB, Intel E7505 chipset,
  PC2100 512MBx2, Hyper-Threading off

● Pentium 4 3.0GHz, 800MHz FSB, Intel 875P chipset,
  PC3200 512MBx2

● Communications via Gigabit network:
  a) on-board NIC, b) on-board bcm5700, c) 3Com996

Oct.1, 2003/May 10, 2004


 Computation with two Pentium 4 machines is faster than with a 4-Xeon (dual CPU)
cluster, possibly because the two CPUs of each Xeon node share a single NIC or
because of the memory bandwidth limitation of the Xeon architecture. Performance
of the Pentium 4 cluster is excellent for the ab initio molecular dynamics code,
except for the communication overhead.
  Interestingly, the computation per processor on the Pentium 4 is faster than on
the Xeon CPUs by more than the ratio of the processors' clock speeds.
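
The clock-speed remark can be checked with a little arithmetic. The sketch below is only an illustration, not part of the original measurements: it assumes that the "Xeon (2.4GHz) x 2" and "Pen 4 (3.0GHz) x 2" rows are the comparable pair, and the function and variable names are hypothetical.

```python
# Figures copied from the timing tables on this page.
XEON_CLOCK_GHZ = 2.4
PEN4_CLOCK_GHZ = 3.0
XEON_X2_CPU_PER_PE_S = 192.0  # Xeon (2.4GHz) x 2, cpu/pe in seconds
PEN4_X2_CPU_PER_PE_S = 148.0  # Pen 4 (3.0GHz) x 2, cpu/pe in seconds


def speed_ratios(slow_clock, fast_clock, slow_time, fast_time):
    """Return (clock ratio, measured cpu-time ratio), fast CPU over slow CPU."""
    return fast_clock / slow_clock, slow_time / fast_time


clock_ratio, cpu_time_ratio = speed_ratios(
    XEON_CLOCK_GHZ, PEN4_CLOCK_GHZ, XEON_X2_CPU_PER_PE_S, PEN4_X2_CPU_PER_PE_S
)

print(f"clock ratio    (Pen 4 / Xeon): {clock_ratio:.2f}")
print(f"cpu/pe speedup (Xeon / Pen 4): {cpu_time_ratio:.2f}")
print("faster than clock scaling:", cpu_time_ratio > clock_ratio)
```

With these figures the clock ratio is 1.25, while the measured cpu/pe ratio is about 1.30, so the Pentium 4 gains slightly more than its clock advantage alone would give.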

 Recently, a significant speedup has been achieved by introducing the GAMMA
communication software (Genoa Active Message Machine; thanks to Dr. G. Ciaccio,
Genoa University; http://www.disi.unige.it/project/gamma/) on the Pentium 4 cluster.
Gigabit Ethernet with 3Com996 network interface cards is used. The communication
overhead has been almost eliminated by this non-TCP/IP method. The result is
close to the performance of the RISC machine IBM Power 4 "Regatta" (1.5GHz).


                            cpu/pe   walltime   (walltime - cpu/pe)
Pen 4 (3.0GHz) x 4            79       129         (56)
Pen 4 (3.0GHz) x 6            51        96         (45)

In the above, cpu/pe and walltime are both in seconds. The difference
(walltime - cpu/pe) is shown in parentheses.

Pen 4 (3.0GHz) x 4
                            cpu/pe   walltime   (walltime - cpu/pe)
MPI/GAMMA                     66        66         (0.1)
TCP/IP                        67        93         (26)
                            cpu/pe   walltime   (walltime - cpu/pe)
IBM Power 4 (1.5GHz) x 4      59        59         (0.1)
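
To put the effect of GAMMA in numbers, the communication overhead (walltime - cpu/pe) can be expressed as a fraction of the walltime. The following is a minimal sketch using the Pen 4 (3.0GHz) x 4 figures listed above. The dictionary layout and function name are hypothetical, and the overhead values are taken as listed rather than recomputed from the rounded timings.

```python
# Pen 4 (3.0GHz) x 4 timings in seconds, copied from the tables on this page.
runs = {
    "MPI/GAMMA": {"cpu_per_pe": 66.0, "walltime": 66.0, "overhead": 0.1},
    "TCP/IP": {"cpu_per_pe": 67.0, "walltime": 93.0, "overhead": 26.0},
}


def overhead_fraction(run):
    """Fraction of the walltime that is not spent in per-processor CPU time."""
    return run["overhead"] / run["walltime"]


fractions = {name: overhead_fraction(run) for name, run in runs.items()}
walltime_speedup = runs["TCP/IP"]["walltime"] / runs["MPI/GAMMA"]["walltime"]

for name, frac in fractions.items():
    print(f"{name:10s} overhead = {100.0 * frac:.1f}% of walltime")
print(f"walltime speedup from GAMMA: {walltime_speedup:.2f}x")
```

On these figures, TCP/IP spends about 28% of the walltime outside computation, while with MPI/GAMMA the overhead is a fraction of a percent.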

