Set up Storage Gateway environment
The storage gateway server should be installed with:
- Singularity - petabyte-scale data onboarding and retrieval client tool.
Install Singularity:
- Lotus - lite client for interacting with the Filecoin chain. Used for legacy deals.
Install Lotus and configure a lotus-lite node:
- Boost client - new Lotus markets client for v1.2 deals with upgraded storage providers (SPs).
Install Boost client:
- Web server - hosts CAR files for online data transfer to SPs, e.g. Nginx.
- IPFS daemon - stores the Singularity dataset index to support retrievals.
An Ubuntu Linux instance is recommended. Singularity, the Lotus node, and Boost should be built and installed from source. The web server and IPFS daemon can be installed from binaries.
[TODO link to Ubuntu build script]
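Once the components are installed, a quick check that each one is reachable on the PATH can catch a failed build early. The following is a minimal sketch, not part of the official tooling; the command names in `REQUIRED_TOOLS` are assumptions about how each component is typically invoked and may differ on your system.

```python
import shutil

# Assumed command names for each component; adjust to match your installation.
REQUIRED_TOOLS = ["singularity", "lotus", "boost", "nginx", "ipfs"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All expected tools were found on the PATH.")
```

This only confirms that an executable with the expected name exists; it does not verify versions or configuration.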
Network data transfer is often a limiting factor when onboarding PiB-scale datasets. Determine the optimal data replication approach for the following paths:
- Source dataset replicated from the data owner to the storage gateway.
- Prepared dataset replicated from the storage gateway to each participating SP.
Compare online network transfer options with offline physical media transport, considering feasibility, cost, and transfer duration. The data transfer plan may also affect the optimal placement of the storage gateway: compare cloud and on-premises hosting, and consider the physical locations of the source dataset and the destination SPs.
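To put rough numbers on transfer duration when comparing online and offline options, a back-of-the-envelope estimate can help. The sketch below is illustrative only; the function name and the `efficiency` factor (the fraction of nominal link speed actually sustained, assumed here to default to 0.7) are assumptions, not measured values.

```python
PIB = 2**50  # bytes in one pebibyte


def transfer_days(dataset_bytes, link_bits_per_sec, efficiency=0.7):
    """Estimate the days needed to move `dataset_bytes` over a network link.

    `efficiency` is an assumed fraction of the nominal link speed that is
    actually sustained end to end.
    """
    if link_bits_per_sec <= 0:
        raise ValueError("link_bits_per_sec must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = dataset_bytes * 8 / (link_bits_per_sec * efficiency)
    return seconds / 86400


if __name__ == "__main__":
    # Compare a few nominal link speeds for a 1 PiB dataset.
    for gbps in (1, 10, 100):
        days = transfer_days(1 * PIB, gbps * 1e9)
        print(f"{gbps:>3} Gbit/s: {days:.1f} days")
```

Under these assumptions, 1 PiB over a 1 Gbit/s link takes roughly 149 days, which is the kind of result that makes offline media transport worth evaluating.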
Gateway storage should be sized to hold both the source dataset and the prepared CAR files. As a general guideline, size local storage at twice the source dataset size.
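The sizing guideline can be expressed as a small calculation. This is a minimal sketch; the function name and the example dataset size are hypothetical, and the multiplier of 2 simply encodes the guideline stated above.

```python
TIB = 2**40  # bytes in one tebibyte


def gateway_storage_bytes(source_bytes, multiplier=2.0):
    """Return the suggested local storage size for a given source dataset size."""
    if source_bytes < 0 or multiplier <= 0:
        raise ValueError("source_bytes must be >= 0 and multiplier must be > 0")
    return int(source_bytes * multiplier)


if __name__ == "__main__":
    source = 500 * TIB  # hypothetical 500 TiB source dataset
    needed = gateway_storage_bytes(source)
    print(f"Suggested gateway storage: {needed / TIB:.0f} TiB")
```

Raising `multiplier` leaves headroom if the source dataset is expected to grow during onboarding.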
Data preparation tasks are I/O-bound, so the storage gateway benefits from fast storage, such as NVMe drives or iSCSI-attached volumes.
Singularity can also be configured to run multiple workers for concurrent data preparation. If required, see deal_preparation_worker.num_workers in the Singularity configuration.