
RELION 3.1 fastest ever benchmark on 2*GPUs and 4*GPUs

Relion 3.1 Benchmark

We continue running benchmark tests on the new LinuxVixion workstations based on NVIDIA Ampere family GPUs.


After testing GROMACS on a 2*GPU system (RTX-3080), it is now the turn of Relion 3.1. We ran the benchmark tests on an LVX SILENT 2*RTX-3090 and an LVX SILENT 4*RTX-3090, with 2 and 4 NVIDIA GeForce RTX 3090 GPUs, respectively.


These NVIDIA Ampere GeForce RTX 3090 GPUs, with 24 GB of memory, 10496 CUDA cores and a 1395 MHz clock speed, have proved to be faster than the previous Pascal, Turing and Volta NVIDIA GPU series.


Relion is one of the most popular software packages for cryo-EM analysis. We would like to compare our tests with those carried out in the past on different platforms and different GPUs.

The data from our Relion 3.1 benchmark tests show how much this technique has improved in recent years, compared with the original Titan X (Pascal) tests carried out at the MRC Laboratory of Molecular Biology (LMB), and even with the latest NVIDIA V100. You can find the original data on the web page "Benchmarks & computer hardware", by Sjors Scheres.


Results are for Plasmodium Ribosome Classification (2D, 3D) using Relion 3.1.

2 GPUs Workstation results

3D benchmark

time mpirun -n 3 `which relion_refine_mpi` --i Particles/shiny_2sets.star --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5 --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d --scratch_dir /data --gpu 0:1 --pool 100 --j 4 --dont_combine_weights_via_disc

Output: 71 m 14.239 s → 1hr 11m


Compared with the reference time on the original benchmark website, 4h 29m, our result of 1hr 11m is clearly much faster: roughly 3.8 times.

2D benchmark

time mpirun -n 3 `which relion_refine_mpi` --i Particles/shiny_2sets.star --ctf --iter 25 --tau2_fudge 2 --particle_diameter 360 --K 200 --zero_mask --oversampling 1 --psi_step 6 --offset_range 5 --offset_step 2 --norm --scale --random_seed 0 --o class2d --scratch_dir /data --gpu 0:1 --pool 100 --j 4 --dont_combine_weights_via_disc

Output: 361 m 33.032 s → 6h 1m.

Compared with the original benchmark (11h 2m), the GeForce RTX 3090 system is again faster, by a factor of roughly 1.8.
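The speedups for the 2-GPU workstation can be checked with a few lines of shell. This is only a minimal sketch: the helper names to_minutes and speedup are ours (hypothetical, not part of Relion), and the times are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Relion: convert a wall time to
# minutes and divide a reference time by a measured time.

to_minutes() {  # usage: to_minutes HOURS MINUTES SECONDS
    awk -v h="$1" -v m="$2" -v s="$3" 'BEGIN { printf "%.2f\n", h * 60 + m + s / 60 }'
}

speedup() {  # usage: speedup REFERENCE_MINUTES MEASURED_MINUTES
    awk -v r="$1" -v t="$2" 'BEGIN { printf "%.2f\n", r / t }'
}

# 3D classification: 71 m 14.239 s measured, 4h 29m in the original benchmark
ours_3d=$(to_minutes 0 71 14.239)
ref_3d=$(to_minutes 4 29 0)
echo "3D: ${ours_3d} min vs ${ref_3d} min, speedup $(speedup "$ref_3d" "$ours_3d")x"

# 2D classification: 361 m 33.032 s measured, 11h 2m in the original benchmark
ours_2d=$(to_minutes 0 361 33.032)
ref_2d=$(to_minutes 11 2 0)
echo "2D: ${ours_2d} min vs ${ref_2d} min, speedup $(speedup "$ref_2d" "$ours_2d")x"
```

The ratios compare different hardware generations and possibly different software versions, so they are best read as indicative only.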

4 GPUs Workstation results

We ran essentially the same example, but with the options mpirun -n 5 and --j 6. (Note that the original tests include no 2D classification results for 4*V100.)
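The rank counts used here (-n 3 for 2 GPUs, -n 5 for 4 GPUs) are consistent with the usual relion_refine_mpi layout of one coordinating rank plus one worker rank per GPU. A minimal sketch of that rule, assuming this is the layout intended; the function name mpi_ranks_for_gpus is ours (hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: one coordinating MPI rank plus one worker per GPU.
# Assumption: one worker rank per GPU, as the commands in this post suggest.

mpi_ranks_for_gpus() {  # usage: mpi_ranks_for_gpus NUMBER_OF_GPUS
    echo $(( $1 + 1 ))
}

for gpus in 2 4; do
    echo "${gpus} GPUs -> mpirun -n $(mpi_ranks_for_gpus "$gpus")"
done
```

Running more than one worker rank per GPU is also possible, so this is only the simplest mapping, not a tuning recommendation.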

3D benchmark

time mpirun -n 5 `which relion_refine_mpi` --i Particles/shiny_2sets.star --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5 --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d --scratch_dir /data --gpu --pool 100 --j 6 --dont_combine_weights_via_disc

Output: 42 m 58.354s

Again, the time we obtained in our benchmark is shorter than the original one (72 min).

2D benchmark

time mpirun -n 5 `which relion_refine_mpi` --i Particles/shiny_2sets.star --ctf --iter 25 --tau2_fudge 2 --particle_diameter 360 --K 200 --zero_mask --oversampling 1 --psi_step 6 --offset_range 5 --offset_step 2 --norm --scale --random_seed 0 --o class2d --scratch_dir /scratch --gpu --pool 100 --j 6 --dont_combine_weights_via_disc

Output: 200 m 4.513 s → 3h 20m. (The original tests include no 2D classification result for the 4*V100 system.)
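The four timings above also show how well the runs scale from 2 to 4 GPUs. The sketch below computes the speedup and the parallel efficiency (speedup divided by 2, since the GPU count doubles). The function name scaling is ours (hypothetical), and the 4-GPU runs also used --j 6 instead of --j 4, so this is not a pure GPU-count comparison.

```shell
#!/bin/sh
# Hypothetical helper: speedup and parallel efficiency (in percent)
# when going from the 2-GPU to the 4-GPU workstation.

scaling() {  # usage: scaling MIN_2GPU SEC_2GPU MIN_4GPU SEC_4GPU
    awk -v m2="$1" -v s2="$2" -v m4="$3" -v s4="$4" 'BEGIN {
        t2 = m2 + s2 / 60          # 2-GPU wall time in minutes
        t4 = m4 + s4 / 60          # 4-GPU wall time in minutes
        sp = t2 / t4               # speedup from doubling the GPUs
        printf "%.2f %.0f\n", sp, 100 * sp / 2
    }'
}

echo "3D (speedup, efficiency %): $(scaling 71 14.239 42 58.354)"
echo "2D (speedup, efficiency %): $(scaling 361 33.032 200 4.513)"
```

On these numbers the 2D classification scales somewhat better than the 3D one, although a single run per configuration is too little to draw firm conclusions.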
