autonomys-controller: Responsible for proxying node RPC, used to manage cluster components.
sharded-cache: Sharded cache for pieces.
full-piece-sharded-cache: Full node of the sharded piece cache.
proof-server: GPU-based block generation, used for computing proofs.
plot-server: Plotting service, responsible for encoding data.
plot-client: Farming component, used for scanning disks and submitting solutions.
Architecture
Cluster management currently runs entirely over NATS, while cache data itself is transferred peer-to-peer (P2P) over TCP.
Recommended Software and Hardware Configuration
This software is supported only on Linux systems with NVIDIA GPUs.
Operating System and Dependency Software
Operating System: Ubuntu 22.04
GPU Driver Version: ≥ 525.60.13, or alternatively, install CUDA 12.4 directly.
File System: Ext4
Supervisor: 4
NATS Server: v2.10.22
numactl: Required for managing NUMA (Non-Uniform Memory Access) nodes
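The driver requirement above can be checked with a short script. The following is a minimal sketch, not part of the cluster software: `parse_version` and `driver_ok` are hypothetical helper names, and the installed version string is assumed to be supplied by the operator (for example, from `nvidia-smi --query-gpu=driver_version --format=csv,noheader`).

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '525.60.13' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum GPU driver version taken from the dependency list above.
MIN_DRIVER = parse_version("525.60.13")


def driver_ok(installed: str) -> bool:
    """Return True when the installed driver meets the documented minimum.

    Tuples compare element by element, so 550.54.14 passes and
    470.82.01 does not.
    """
    return parse_version(installed) >= MIN_DRIVER
```

For example, `driver_ok("550.54.14")` evaluates to `True`, while `driver_ok("470.82.01")` evaluates to `False`.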
Recommended Server Configuration
Server: Node
CPU: 64 cores
MEM: 64GB / 128GB
GPU: Required
SSD: 500GiB
Ethernet: at least 1 Gbps
Running Components: controller, autonomys-node, proof-server, nats-server

Server: Plotter
CPU: at least 30 cores per GPU
MEM: at least 64GB per GPU
GPU: Required
SSD: at least 1 TiB for caching plot data
Ethernet: at least 20 Gbps
Running Components: plot-server, sharded-cache, full-piece-cache

Server: Storage
CPU: depending on the storage capacity
MEM: depending on the storage capacity
GPU: Not Required
SSD: depending on the storage capacity
Ethernet: at least 20 Gbps
Running Components: plot-client
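Because the Plotter figures are stated per GPU, the minimum for a given machine follows from simple multiplication. The sketch below illustrates that arithmetic only; `PlotterMinimum` and `plotter_minimum` are hypothetical names, and treating the 1 TiB cache SSD as a per-server (not per-GPU) figure is an assumption.

```python
from dataclasses import dataclass

CORES_PER_GPU = 30     # "at least 30 cores per GPU"
MEM_GB_PER_GPU = 64    # "at least 64GB per GPU"
MIN_CACHE_SSD_TIB = 1  # "at least 1 TiB"; assumed to apply per server


@dataclass
class PlotterMinimum:
    gpus: int
    cpu_cores: int
    mem_gb: int
    cache_ssd_tib: int


def plotter_minimum(gpus: int) -> PlotterMinimum:
    """Scale the per-GPU recommendations to a Plotter with `gpus` GPUs."""
    if gpus < 1:
        raise ValueError("a Plotter needs at least one GPU")
    return PlotterMinimum(
        gpus=gpus,
        cpu_cores=gpus * CORES_PER_GPU,
        mem_gb=gpus * MEM_GB_PER_GPU,
        cache_ssd_tib=MIN_CACHE_SSD_TIB,
    )
```

For a Plotter with 4 GPUs, `plotter_minimum(4)` yields at least 120 CPU cores and 256 GB of memory.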
Best Practices
Note: The following names, IP addresses and other details are examples.
Environment Overview
Server: Node 1
IP Address: 192.168.1.1
Configuration: GPU * 1
Component: controller, autonomys-node, proof-server, nats-server

Server: Node 2
IP Address: 192.168.1.2
Configuration: GPU * 1
Component: controller, autonomys-node, proof-server, nats-server

Server: Node 3
IP Address: 192.168.1.3
Configuration: GPU * 1
Component: controller, autonomys-node, proof-server, nats-server

Server: Plotter 1
IP Address: 192.168.1.4
Configuration: GPU * 4
Component: autonomys-plot-server-0, autonomys-plot-server-1, autonomys-plot-server-2, autonomys-plot-server-3, sharded-cache, full-piece-cache

Server: Plotter 2
IP Address: 192.168.1.5
Configuration: GPU * 4
Component: autonomys-plot-server-0, autonomys-plot-server-1, autonomys-plot-server-2, autonomys-plot-server-3, sharded-cache, full-piece-cache

Server: Storage 1
IP Address: 192.168.1.6
Configuration: 8T NVMe SSD * 4 (/mnt/nvme0n1, /mnt/nvme0n2, /mnt/nvme1n2, /mnt/nvme1n1)
Component: autonomys-plot-client

Server: Storage 2
IP Address: 192.168.1.7
Configuration: 8T NVMe SSD * 4 (/mnt/nvme0n1, /mnt/nvme0n2, /mnt/nvme1n1, /mnt/nvme1n2)
Component: autonomys-plot-client
Cluster Start Command
Start by launching NATS, then follow the instructions below to configure Supervisor’s parameters. Once configured, simply run the following command to start all programs:
    supervisorctl start all
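The start order described here (NATS first, then everything managed by Supervisor) could be wrapped in a small script. This is only a sketch under assumptions: `wait_for_port` and `start_cluster` are hypothetical helpers, the host is the example address of Node 1, and 4222 is the default NATS client port, which may differ in your setup.

```python
import socket
import subprocess
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not reachable yet, retry shortly
    return False


def start_cluster(host: str = "192.168.1.1", port: int = 4222) -> None:
    """Start all Supervisor programs once NATS accepts connections.

    host and port are example values, not settings taken from the software.
    """
    if not wait_for_port(host, port):
        raise RuntimeError(f"NATS is not reachable at {host}:{port}")
    subprocess.run(["supervisorctl", "start", "all"], check=True)
```

Calling `start_cluster()` on a host where Supervisor is already configured runs the same `supervisorctl start all` command, but only after the NATS port responds.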
Supervisor Configuration
Node Configuration
Each node requires the deployment of 4 components: controller, autonomys-node, proof-server, and nats-server.
Explanation of Startup Command Parameters and Environment Variables:
--nats-server: This parameter is used to specify the address of the NATS server.
CUDA_VISIBLE_DEVICES: This environment variable is used to specify which GPU to use. For example, 0 represents GPU0, 1 represents GPU1, and so on.
Plotter Configuration (Example with 4 GPUs)
Each plotter requires the deployment of 3 components: autonomys-plot-server, autonomys-sharded-cache, and autonomys-full-piece-cache.
The autonomys-plot-server component retrieves pieces from both the autonomys-sharded-cache and autonomys-full-piece-cache components for use on the plotting drive.
Explanation of Startup Command Parameters and Environment Variables:
--nats-server: Specifies the address of the NATS server.
CUDA_VISIBLE_DEVICES: Sets the GPU to be used, where 0 represents GPU0, 1 represents GPU1, and so forth.
GPU_CONCURRENCY: Increasing this value raises GPU memory usage. Adjusting this variable may be beneficial when using GPUs of different models.
It is important to note that when using the numactl tool to bind CPU cores, you should consider the NUMA affinity of the GPU to achieve optimal performance.
You can use the nvidia-smi topo -m command to check the NUMA affinity of the GPU.
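The NUMA advice above can be sketched as one launch plan per GPU, combining `CUDA_VISIBLE_DEVICES` with a `numactl` prefix. Everything specific here is assumed for illustration: the GPU-to-NUMA mapping must be read from your own `nvidia-smi topo -m` output, the plot-server path and NATS URL are placeholders, and `plot_server_launch` and `plan_plotter` are hypothetical helpers.

```python
import shlex


def plot_server_launch(gpu_index: int, numa_node: int, server_cmd: list[str]) -> tuple[dict[str, str], str]:
    """Return (environment, command line) for one plot-server instance.

    The process is bound to the CPUs and memory of the NUMA node
    that the chosen GPU is attached to.
    """
    env = {"CUDA_VISIBLE_DEVICES": str(gpu_index)}
    argv = ["numactl", f"--cpunodebind={numa_node}", f"--membind={numa_node}", *server_cmd]
    return env, shlex.join(argv)


def plan_plotter(gpu_to_numa: dict[int, int], server_cmd: list[str]) -> dict[str, tuple[dict[str, str], str]]:
    """Build one launch plan per GPU, keyed by program name."""
    return {
        f"autonomys-plot-server-{gpu}": plot_server_launch(gpu, node, server_cmd)
        for gpu, node in sorted(gpu_to_numa.items())
    }


# Assumed topology for a 4-GPU plotter: GPUs 0-1 on node 0, GPUs 2-3 on node 1.
EXAMPLE_PLAN = plan_plotter(
    {0: 0, 1: 0, 2: 1, 3: 1},
    ["/path/to/plot-server", "--nats-server", "nats://192.168.1.1:4222"],
)
```

Each entry in `EXAMPLE_PLAN` holds the environment and command string for one of the four plot-server programs listed in the Environment Overview.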
--nats-server: Used to specify the address of the NATS server.
path=/path/to/plot-dir,sectors=8000: Specifies the file path for plots as well as the number of sectors for the plot, with 8000 as the sector count in this example.
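The `path=...,sectors=...` argument is a comma-separated list of `key=value` fields, so it can be built and checked mechanically. The sketch below assumes exactly the two fields shown above; `plot_spec` and `parse_plot_spec` are hypothetical helpers, the mount points come from the example Storage servers, and 8000 is the example sector count, not a recommendation.

```python
def plot_spec(path: str, sectors: int) -> str:
    """Format one plot argument in the path=...,sectors=... form."""
    if sectors <= 0:
        raise ValueError("sectors must be a positive integer")
    if "," in path:
        raise ValueError("path must not contain a comma (it separates fields)")
    return f"path={path},sectors={sectors}"


def parse_plot_spec(spec: str) -> tuple[str, int]:
    """Split a plot argument back into (path, sectors)."""
    fields = dict(item.split("=", 1) for item in spec.split(","))
    return fields["path"], int(fields["sectors"])


# One argument per example mount point on a Storage server.
EXAMPLE_SPECS = [
    plot_spec(mount, 8000)
    for mount in ("/mnt/nvme0n1", "/mnt/nvme0n2", "/mnt/nvme1n1", "/mnt/nvme1n2")
]
```

The appropriate sector count for a real disk depends on its capacity; the source gives only 8000 as an example.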
Appendix
Using the Command
Execute the command to manually initialize the cluster. The entire cluster will be reinitialized after n seconds.
The Autonomys Piece Conversion Tool allows you to convert data synchronized by autonomys-node into piece cache data. Please follow the steps below to export piece cache data:
You can download pre-synced node data from Baidu Cloud, with the file name node-db.tar.gz. After downloading and extracting, you’ll still need to sync the latest node data, but the process will be significantly faster.
Data Update: The data is current as of November 12, 2024, at 23:00 Singapore Time.
Note: This is raw node data, and it must be converted into piece data using the autonomys-export-piece tool before it can be used for plotting.