This operational guide walkthroughs a checklist of steps to onboard large sized datasets ( >100 TiB in size including all replicas), into the Filecoin network. This guide is suitable for data owners that are seeking an introduction to the onboarding process.
The end-to-end data flow is covered at an introductory level, from planning, through to retrieval testing. This guide is intended as a top-level guide that summarizes each step, while providing references to more detailed documentation.
This guide provides a walkthrough usage of the primary client-side tool set, namely Singularity, supported by Boost, and Lotus client.
- 1.Understand the large data onboarding reference architecture.
- 2.Select datasets for onboarding, estimate sizing.
- 3.Allocate storage funding with DataCap, or FIL tokens.
- 4.Select Storage Providers
- 5.Setup Storage Gateway environment
- 6.Prepare Data
- 7.Replicate Data to SPs, Propose Storage Deals
- 8.Retrieve Data
- 9.Plan next steps
If your organization's dataset is smaller (<100TiB), you may also consider alternatives that may provide simpler and accelerated onboarding paths, such as Chainsafe.storage, Estuary.tech, web3.storage, NFT.storage, and others listed at https://dataonboarding.filecoin.io/