System Configuration

Rockfish is a community-shared cluster at Johns Hopkins University, housed at the Maryland Advanced Research and Computing Center in Baltimore. It follows the “condominium model” with three main units. The first unit is based on an NSF Major Research Infrastructure grant, the second contains mainly medium-size condos (for example, DURIP/DoD and Deans’ contribution condos), and the third is a collection of condos belonging to individual research groups. All three units are shared by all users, with no physical separation.

Rockfish has 34,128 cores (711 nodes), with a combined theoretical peak performance of 3.3 PFLOPs and an Rmax of 2.1 PFLOPs. It has three parallel file systems (GPFS) with a total of ~13 PB of usable space. The cluster uses Mellanox InfiniBand HDR100 connectivity (1:1.5 topology).
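
The figures in the paragraph above imply two derived quantities that are useful when sizing jobs: the number of cores per node and the ratio of measured to theoretical performance. The short sketch below only does that arithmetic; the variable names are illustrative and not part of any Rockfish tooling.

```python
# Figures quoted in the overview paragraph above.
TOTAL_CORES = 34_128
TOTAL_NODES = 711
RPEAK_PFLOPS = 3.3  # combined theoretical performance
RMAX_PFLOPS = 2.1   # Rmax as quoted

# Average cores per node implied by the totals.
cores_per_node = TOTAL_CORES / TOTAL_NODES

# Fraction of theoretical peak reached by the quoted Rmax.
efficiency = RMAX_PFLOPS / RPEAK_PFLOPS

print(f"cores per node: {cores_per_node:.0f}")
print(f"Rmax / Rpeak:   {efficiency:.1%}")
```

With the numbers as quoted, this works out to 48 cores per node and an Rmax of roughly 64% of theoretical peak.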

The Rockfish cluster was ranked #414 on top500.org at 1.9 PFLOPs (June 2022) and 17th among US academic institutions.

Compute Hardware

| Original # Nodes | Current # Nodes | Type | CPU | GPU | RAM | Storage | Total Cores |
|---|---|---|---|---|---|---|---|
| 368 | 672 | Compute | Intel Xeon Gold Cascade Lake 6248R | N/A | 192GB TruDDR4 2933MHz | 1TB NVMe SSD | 32,256 |
| 10 | 22 | Large Memory | Intel Xeon Gold Cascade Lake 6248R | N/A | 1.5TB TruDDR4 2933MHz | 1TB NVMe SSD | 1,056 |
| 10 | 18 | GPU Nodes | Intel Xeon Gold Cascade Lake 6248R | 4x Nvidia A100 | 192GB TruDDR4 2933MHz | 1TB NVMe SSD | 864 |
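
As a quick consistency check on the hardware table, the sketch below recomputes each row's core count from its current node count. The value of 48 cores per node is an assumption inferred from the table itself (total cores divided by nodes); it is not stated as a specification in this section.

```python
CORES_PER_NODE = 48  # assumption: inferred from the table, not a stated spec
GPUS_PER_GPU_NODE = 4  # "4x Nvidia A100" in the GPU Nodes row

# (type, current node count, total cores listed in the table)
node_types = [
    ("Compute", 672, 32_256),
    ("Large Memory", 22, 1_056),
    ("GPU Nodes", 18, 864),
]

# Each listed core count should equal nodes x cores per node.
for name, nodes, listed_cores in node_types:
    assert nodes * CORES_PER_NODE == listed_cores, name

total_nodes = sum(nodes for _, nodes, _ in node_types)
total_cores = sum(cores for _, _, cores in node_types)
total_gpus = 18 * GPUS_PER_GPU_NODE

print(total_nodes, total_cores, total_gpus)
```

Every row is consistent with 48 cores per node. The rows sum to 712 nodes and 34,176 cores, one node more than the 711 nodes quoted in the overview; the source does not explain the difference.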

Storage

| Filesystem | System Type | Total Size | Block Size | Default Quota | Files Per TB | Backed Up? |
|---|---|---|---|---|---|---|
| /home/ | NVMe SSD | 20TB | 128KB | 50GB | N/A | Limited |
| /scratch4/ | IBM GPFS | 3.8PB | 4MB | 10TB | 2,000 | No |
| /scratch16/ | IBM GPFS | 3.6PB | 16MB | N/A | 1,000 | No |
| /data/ | IBM GPFS | 5.1PB | 16MB | 20TB | 400 | No |
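
The "Files Per TB" column can be turned into an approximate file-count limit for a given allocation. The sketch below assumes the limit scales linearly with the quota in TB, which this section does not state explicitly; the function name `max_files` and the 5 TB allocation on `/scratch16/` are hypothetical, since that filesystem has no default quota listed.

```python
# Files allowed per TB of quota, taken from the storage table.
FILES_PER_TB = {
    "/scratch4/": 2_000,
    "/scratch16/": 1_000,
    "/data/": 400,
}

# Default quotas in TB, where the table lists one.
DEFAULT_QUOTA_TB = {
    "/scratch4/": 10,
    "/data/": 20,
}

def max_files(filesystem, quota_tb):
    """Approximate file-count limit, assuming linear scaling with quota."""
    return FILES_PER_TB[filesystem] * quota_tb

for fs, quota_tb in DEFAULT_QUOTA_TB.items():
    print(fs, max_files(fs, quota_tb))

# Hypothetical 5 TB allocation on a filesystem without a default quota.
print("/scratch16/", max_files("/scratch16/", 5))
```

Under that assumption, the default quotas correspond to about 20,000 files on `/scratch4/` and 8,000 files on `/data/`, so workloads that produce many small files may reach the file limit well before the space limit.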

Partitions

| Partition | Available Nodes | Max Time (Hours) | Max Cores per Node | Max Memory per Node (MB) |
|---|---|---|---|---|
| defq | 667 | 1 / 72 | 48 | 192,000 |
| a100 | 17 | 1 / 72 | 48 | 192,000 |
| bigmem | 22 | 1 / 48 | 48 | 1,537,000 |
| v100 | 1 | 1 / 72 | 48 | 193,118 |
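
The per-node limits in the partition table can be used to estimate memory per core and to see which partitions could hold a single-node job. This is a minimal sketch built only from the table values; the helper names `mem_per_core_mb` and `partitions_that_fit` are hypothetical, and the check ignores time limits and GPU requirements.

```python
# partition -> (available nodes, max cores per node, max memory per node in MB)
PARTITIONS = {
    "defq": (667, 48, 192_000),
    "a100": (17, 48, 192_000),
    "bigmem": (22, 48, 1_537_000),
    "v100": (1, 48, 193_118),
}

def mem_per_core_mb(name):
    """Memory per core (MB, rounded down) if a node is shared evenly."""
    _, cores, mem_mb = PARTITIONS[name]
    return mem_mb // cores

def partitions_that_fit(cores, mem_mb):
    """Partitions whose single-node limits cover the requested cores and memory."""
    return [
        name
        for name, (_, max_cores, max_mem_mb) in PARTITIONS.items()
        if cores <= max_cores and mem_mb <= max_mem_mb
    ]

print(mem_per_core_mb("defq"))             # 4000
print(mem_per_core_mb("bigmem"))           # 32020
print(partitions_that_fit(48, 500_000))    # ['bigmem']
```

A job that needs more than about 4,000 MB per core on a fully used node exceeds what `defq` can provide, which is the case the `bigmem` partition covers.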