The Supercomputing.life platform is a network
of General Purpose Compute Nodes designed
by the Distributed Computing Research Group
at Balanced City¹.
Compute Node Architecture · Overview
A Compute Node is a group of computers designed
to work together on large projects.
Distributed computing research and experience in
performance engineering guided the design of the
General Purpose Compute Node, GPCN2021.
General Purpose Compute Node 2021
Each Compute Node has 15 individual Raspberry Pi (RPi) computers.
· 12 x RPi A+ perform primary computation,
the scientific work that the system is designed
to accomplish, using their 48 CPUs.
· 3 x RPi B orchestrate overall workflow,
data and communication, using their 12 CPUs.
raspberrypi.org
12 x Raspberry Pi Model A+ · 64bit ARM Cortex-A53 quad-core 1.4GHz
3 x Raspberry Pi Model B · 64bit ARM Cortex-A72 quad-core 1.5GHz
Each individual RPi operates a quad-core processor,
counted here as 4 Central Processing Units (CPUs).
A General Purpose Compute Node has 60 CPUs.
Each CPU is capable of performing an independent
processing task.
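The CPU totals above follow from simple multiplication. A minimal Python sketch of that arithmetic, using the board counts stated in the text; the variable names are chosen here for illustration only:

```python
# CPU count for one General Purpose Compute Node, using figures from the text.
CPUS_PER_RPI = 4      # each RPi has a quad-core processor
PRIMARY_RPIS = 12     # RPi A+ boards performing primary computation
MANAGER_RPIS = 3      # RPi B boards orchestrating workflow

primary_cpus = PRIMARY_RPIS * CPUS_PER_RPI   # CPUs doing scientific work
manager_cpus = MANAGER_RPIS * CPUS_PER_RPI   # CPUs doing orchestration
total_cpus = primary_cpus + manager_cpus     # all CPUs in the node

print(primary_cpus, manager_cpus, total_cpus)  # 48 12 60
```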
Compute Node Workflow · Primary Computation
RPi B managers divide the workload into partitions,
preparing data and computational tasks for up to
48 CPUs on the 12 RPi A+ primary computers.
Primary computers deliver processed data back
to managers upon completion.
Managers consolidate processed data, checking
for accuracy and consistency.
Lastly, managers send processed data from the
Compute Node to the project's final inventory.
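The workflow above (partition, compute, return, consolidate, deliver) can be sketched on a single machine. This is an illustrative stand-in, not the platform's actual software: a thread pool plays the role of the primary computers, and `partition`, `process_partition` and `run_node` are hypothetical names introduced for this example.

```python
from concurrent.futures import ThreadPoolExecutor

PRIMARY_CPUS = 48  # 12 RPi A+ x 4 CPUs, as stated in the text


def partition(items, n_parts):
    """Manager step: split items into contiguous, near-equal partitions."""
    n_parts = max(1, min(n_parts, len(items)))
    size, extra = divmod(len(items), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(items[start:end])
        start = end
    return parts


def process_partition(part):
    """Primary step (placeholder work): square every value in the partition."""
    return [x * x for x in part]


def run_node(items, n_workers=PRIMARY_CPUS):
    """Run the partition / compute / consolidate cycle once."""
    parts = partition(items, n_workers)
    # Workers stand in for primary computers; map() returns results in order.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(process_partition, parts))
    # Manager step: consolidate and run a simple consistency check.
    merged = [value for part in results for value in part]
    if len(merged) != len(items):
        raise ValueError("consolidated output does not match input size")
    return merged  # in a real node this would go to the final inventory


print(run_node(list(range(10)), n_workers=4))
```

On real hardware the thread pool would be replaced by whatever transport the managers use to reach the primary computers; the source does not describe that mechanism, so it is left out here.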
Compute Node Software · Scientific Libraries, Methods and Tools
Scientific libraries, methods and tools used on this
architecture are created de novo with inspiration from
leadership initiatives such as the NIH Biowulf Cluster ²
and Department of Energy National Laboratories ³.
· Supercomputing.life Applications
Supercomputing.life applications support life science
research and are developed with databases ⁴ provided
by the National Center for Biotechnology Information,
a division of the National Library of Medicine at the
National Institutes of Health in the United States.
These applications are collectively referred to as the
Learning Health Systems and are accessible through
the public interface at OpenMD.life.
This machine teaching platform benefits from the
natural intelligence of users around the world.
Operational Performance · Initial Deployment
Compute Nodes began scientific operation in Q2 2019
performing CPU intensive language processing for the
Learning Health Systems.
The utilization profile of a Compute Node typically has
20 to 35 primary compute CPUs operating in parallel
continuously for several hours up to several days.
Technical measures of processor clock speed are less
relevant than the number of independent processing
tasks performed simultaneously.
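As a rough illustration of that utilization profile, the share of primary CPUs in use can be computed directly. This sketch assumes the 48 primary CPUs of one node are the denominator; the function name is hypothetical.

```python
PRIMARY_CPUS = 48  # primary compute CPUs per node, from the text


def utilization(busy_cpus, total_cpus=PRIMARY_CPUS):
    """Fraction of primary CPUs running tasks at the same time."""
    return busy_cpus / total_cpus


# Typical range quoted in the text: 20 to 35 CPUs operating in parallel.
for busy in (20, 35):
    print(f"{busy} of {PRIMARY_CPUS} CPUs = {utilization(busy):.0%}")
```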
· Developer Productivity
Engineered to maximize performance of developers
through automation and ease of use, Compute Node
architecture eliminates productivity barriers for rapid
prototyping, programming, optimizing and deploying.
Easily deploying 1,000 computational tasks with no
complexity burden is a practical measure of system
success from the perspective of a developer.
· Operations per second
Computing benchmarks measure floating-point
operations per second (FLOPS).
The Compute Nodes described here are capable of...
· 64 GigaFLOPS peak CPU
· 85 GigaFLOPS peak CPU
Sustained operation is kept below 75% peak capacity,
approximately 50 GigaFLOPS of primary computation.
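The 75% ceiling can be checked with a short calculation. The source does not label which of the two peak figures applies to primary computation, so this sketch simply evaluates both; the association of 64 GigaFLOPS with the roughly 50 GigaFLOPS sustained figure is an assumption.

```python
SUSTAINED_FRACTION = 0.75  # sustained operation is kept below 75% of peak


def sustained_ceiling(peak_gflops, fraction=SUSTAINED_FRACTION):
    """Upper bound on sustained GigaFLOPS for a given peak figure."""
    return peak_gflops * fraction


for peak in (64, 85):  # the two peak figures listed in the text
    print(f"peak {peak} GigaFLOPS -> ceiling {sustained_ceiling(peak)} GigaFLOPS")
```

Under that assumption, 75% of 64 GigaFLOPS is 48 GigaFLOPS, which is close to the approximate 50 GigaFLOPS quoted.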
An indefinite number of Compute Nodes can be used
to accomplish computational goals.
This scalability is a motivating design requirement of
the Compute Node architecture.
· Large Scale · 10¹² FLOPS
25 Compute Nodes are estimated to sustain
1 TeraFLOPS running on 1,000 CPUs.
· Extreme Scale · 10¹⁵ FLOPS
25,000 Compute Nodes are estimated to sustain
1 PetaFLOPS running on 1,000,000 CPUs.
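The two scale estimates can be broken down into the per-node and per-CPU rates they imply. The following is a back-of-the-envelope sketch based only on the figures quoted above; `implied_rates` is a hypothetical helper, and linear scaling with no communication overhead is assumed.

```python
def implied_rates(target_flops, nodes, cpus):
    """Per-node and per-CPU figures implied by a scale estimate."""
    return {
        "flops_per_node": target_flops / nodes,
        "cpus_per_node": cpus / nodes,
        "flops_per_cpu": target_flops / cpus,
    }


large = implied_rates(10**12, nodes=25, cpus=1_000)            # Large Scale
extreme = implied_rates(10**15, nodes=25_000, cpus=1_000_000)  # Extreme Scale

for name, rates in (("large", large), ("extreme", extreme)):
    print(name, rates)
```

Both estimates work out to the same figures: 40 CPUs per node, 1 GigaFLOPS per CPU, and 40 GigaFLOPS per node, which sits below the approximate 50 GigaFLOPS sustained rate quoted for a single node.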
Compute Node Costs · Hardware
A Compute Node's hardware costs less than $1,000.
25 Compute Nodes containing 1,500 total CPUs cost
approximately $15,000 in hardware, subject to availability.
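The hardware figures translate into simple unit costs. A minimal sketch using only the numbers quoted above, with variable names chosen here for illustration:

```python
# Unit hardware costs implied by the 25-node estimate in the text.
TOTAL_COST_USD = 15_000  # approximate hardware cost for 25 Compute Nodes
NODES = 25
TOTAL_CPUS = 1_500       # 25 nodes x 60 CPUs

cost_per_node = TOTAL_COST_USD / NODES      # USD per Compute Node
cost_per_cpu = TOTAL_COST_USD / TOTAL_CPUS  # USD per CPU

print(f"${cost_per_node:.0f} per node, ${cost_per_cpu:.0f} per CPU")
```

The resulting $600 per node is consistent with the statement that a single Compute Node's hardware costs less than $1,000.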
Operating costs are primarily determined by software
and hardware engineering labor.
Overall Assessment · Summary
Overall performance has met high expectations with
no technical issues.
The current architecture has proven to be a scalable
and adaptive computational research platform for
machine learning and life science.