System Configuration
Rockfish is a community-shared cluster at Johns Hopkins University, housed at the Maryland Advanced Research Computing Center in Baltimore. It follows the “condominium model” and has three main units. The first unit is based on an NSF Major Research Infrastructure grant, the second contains mainly medium-sized condos (for example, DURIP/DoD and Deans’ contribution condos), and the third is a collection of condos belonging to individual research groups. All three units are shared by all users, with no physical separation.
Rockfish has 34,128 cores (711 nodes), a combined theoretical performance of 3.3 PFLOPs and an Rmax of 2.1 PFLOPs. Rockfish has three parallel file systems (GPFS) with a total of ~13PB of usable space. The cluster uses Mellanox InfiniBand HDR100 connectivity (1:1.5 topology).
The Rockfish Cluster was ranked #443 in top500.org (November 2023).
Compute Hardware
Original # Nodes | Current # Nodes | Type | CPU | GPU | RAM | Storage | Total Cores |
---|---|---|---|---|---|---|---|
386 | 768 | Compute | Intel Xeon Gold Cascade Lake 6248R | N/A | 192GB DDR4 2933MHz | 1TB NVMe SSD | 36,864 |
0 | 47 | Compute | Intel Xeon Gold Sapphire Rapids 6448Y | N/A | 256GB DDR5 4800MHz | 2TB NVMe SSD | 6,016 |
10 | 28 | Large Memory | Intel Xeon Gold Cascade Lake 6248R | N/A | 1.5TB DDR4 2933MHz | 1TB NVMe SSD | 1,344 |
10 | 18 | GPU Nodes | Intel Xeon Gold Cascade Lake 6248R | 4x Nvidia A100 40GB | 192GB DDR4 2933MHz | 1TB NVMe SSD | 864 |
0 | 6 | GPU Nodes | Intel Xeon Gold Ice Lake 6338 | 4x Nvidia A100 80GB | 256GB DDR4 3200MHz | 1.6TB NVMe SSD | 384 |
406 | 867 | Total | | | | | 45,472 |
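The totals row can be checked against the individual rows with a few lines of arithmetic. The sketch below copies the current node counts and core totals from the table and derives the cores per node that each row implies. The names `ROWS` and `cores_per_node` are illustrative only, and the calculation assumes every node within a row is identical.

```python
# Rows copied from the Compute Hardware table:
# (node type, CPU model, current node count, total cores).
ROWS = [
    ("Compute", "Xeon Gold 6248R", 768, 36_864),
    ("Compute", "Xeon Gold 6448Y", 47, 6_016),
    ("Large Memory", "Xeon Gold 6248R", 28, 1_344),
    ("GPU Nodes", "Xeon Gold 6248R", 18, 864),
    ("GPU Nodes", "Xeon Gold 6338", 6, 384),
]


def cores_per_node(nodes, total_cores):
    """Cores per node implied by one row (assumes identical nodes in the row)."""
    per_node, remainder = divmod(total_cores, nodes)
    if remainder:
        raise ValueError("total cores is not a multiple of the node count")
    return per_node


total_nodes = sum(nodes for _, _, nodes, _ in ROWS)
total_cores = sum(cores for _, _, _, cores in ROWS)

for kind, cpu, nodes, cores in ROWS:
    print(f"{kind:12s} {cpu:16s} {cores_per_node(nodes, cores):4d} cores/node")
print(f"Total: {total_nodes} nodes, {total_cores} cores")
```

The sums reproduce the 867 nodes and 45,472 cores shown in the totals row.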
Storage
Filesystem | System Type | Total Size | Block Size | Default Quota | Files per TB | Backed Up? |
---|---|---|---|---|---|---|
/home/ | NVMe SSD | 20TB | 128K | 50GB | N/A | Limited |
/scratch4/ | IBM GPFS | 1.9PB | 4MB | 10TB | 2,000K files per TB | No |
/scratch16/ | IBM GPFS | 1.9PB | 16MB | N/A | 1,000K files per TB | No |
/data/ | IBM GPFS | 5.1PB | 16MB | 1TB | 400K files per TB | No |
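To see what the per-TB file allowances mean in practice, the following sketch multiplies each default quota by its allowance. It assumes the file limit scales linearly with the quota and that 1 TB is 1,000,000 MB; neither assumption is stated on this page, and `FILESYSTEMS`, `max_files`, and `avg_file_size_mb` are illustrative names. `/scratch16/` is omitted because the table lists no default quota for it.

```python
# Values copied from the Storage table: default quota in TB and files allowed per TB.
FILESYSTEMS = {
    "/scratch4/": {"quota_tb": 10, "files_per_tb": 2_000_000},
    "/data/": {"quota_tb": 1, "files_per_tb": 400_000},
}


def max_files(quota_tb, files_per_tb):
    """File-count ceiling, assuming the allowance scales linearly with the quota."""
    return quota_tb * files_per_tb


def avg_file_size_mb(files_per_tb):
    """Average file size (MB) at which space and file count run out together.

    Assumes decimal units: 1 TB = 1,000,000 MB.
    """
    return 1_000_000 / files_per_tb


for path, fs in FILESYSTEMS.items():
    limit = max_files(fs["quota_tb"], fs["files_per_tb"])
    size = avg_file_size_mb(fs["files_per_tb"])
    print(f"{path:12s} up to {limit:,} files; break-even file size {size} MB")
```

Under these assumptions, a workload whose files average less than the break-even size would reach the file-count limit before it fills its quota.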
Partitions
Partition | Available Nodes | Max Time (Hours) | Max Cores per Node | Max Memory per Node (MB) |
---|---|---|---|---|
parallel | 768 | 1 / 72 | 48 | 192,000 |
a100 | 17 | 1 / 72 | 48 | 192,000 |
bigmem | 28 | 1 / 48 | 48 | 1,537,000 |
v100 | 1 | 1 / 72 | 48 | 193,118 |
ica100 | 8 | 1 / 72 | 64 | 256,000 |
express | 5 | 1 / 8 | 128 | 256,000 |
shared | 41 | 1 / 24 | 64 | 256,000 |
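The per-node limits above determine which partitions can hold a given single-node request. The sketch below is a hypothetical helper, not an ARCH tool: it copies the limits from the table and lists the partitions whose core and memory ceilings both cover a request. It ignores everything else a scheduler may enforce, such as time limits, GPU requirements, or account restrictions.

```python
# Limits copied from the Partitions table:
# name -> (available nodes, max cores per node, max memory per node in MB).
PARTITIONS = {
    "parallel": (768, 48, 192_000),
    "a100": (17, 48, 192_000),
    "bigmem": (28, 48, 1_537_000),
    "v100": (1, 48, 193_118),
    "ica100": (8, 64, 256_000),
    "express": (5, 128, 256_000),
    "shared": (41, 64, 256_000),
}


def fits_on_one_node(cores, mem_mb):
    """Return partitions whose per-node core and memory limits cover the request."""
    return [
        name
        for name, (_nodes, max_cores, max_mem_mb) in PARTITIONS.items()
        if cores <= max_cores and mem_mb <= max_mem_mb
    ]


# A 48-core job needing 500,000 MB only fits within the bigmem limits.
print(fits_on_one_node(48, 500_000))
# A 64-core job needing 200,000 MB exceeds the 48-core partitions.
print(fits_on_one_node(64, 200_000))
```

Treat the result as a first filter only; the GPU partitions (`a100`, `v100`, `ica100`) appear in the output whenever their limits fit, even for jobs that do not use a GPU.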