A Pillar of Web3: An Overview of the Decentralized Storage Ecosystem

Foresight News, 2022-06-23 16:32:51

Filecoin, Arweave, Storj, Crust Network, Sia, and Swarm: which is the best decentralized storage solution?

If we want to go further in decentralizing the Internet, three pillars will eventually be needed: consensus, storage, and computation. If humanity succeeds in decentralizing all three, we will embark on the next stage of the Internet's journey: Web3.

Figure 1: Example projects for each Web3 pillar

Storage, the second pillar, is maturing rapidly, and a variety of storage solutions are already in use. This article explores the decentralized storage pillar in more depth.

This article is a summary of a longer research piece. The full-length version is available for download from decentralized storage on Arweave and Crust Network.

The need for decentralized storage

From the perspective of blockchain

From the blockchain's perspective, we need decentralized storage because blockchains themselves are not designed to store large amounts of data. The mechanism for reaching block consensus relies on small amounts of data (transactions) that are collected into blocks and quickly shared across the network for nodes to verify.

First, storing data in blocks is very expensive. At the time of writing, storing the complete BAYC #3368 on layer 1 would cost more than 18,000 dollars.

Figure 2: Projects with an active mainnet. A storage period of 200 years was chosen to match Arweave's definition of permanence. Source: network documentation, Arweave storage calculator
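To get a feel for why on-chain storage becomes expensive so quickly, the arithmetic can be sketched as below. This is only a rough illustration, not the calculation behind the figures above: the gas cost per storage slot is an approximation for Ethereum-style contract storage, and the file size, gas price, and ETH price in the example are hypothetical values chosen purely for demonstration.

```python
import math

# Illustrative parameters (assumptions, not figures from the article).
GAS_PER_WORD = 20_000   # approximate gas to write one new 32-byte storage slot
WORD_SIZE = 32          # bytes per storage slot


def onchain_storage_cost_usd(size_bytes: int, gas_price_gwei: float,
                             eth_price_usd: float) -> float:
    """Estimate the USD cost of writing size_bytes into contract storage."""
    words = math.ceil(size_bytes / WORD_SIZE)   # number of 32-byte slots needed
    gas = words * GAS_PER_WORD                  # total gas for all slot writes
    eth = gas * gas_price_gwei * 1e-9           # 1 gwei = 1e-9 ETH
    return eth * eth_price_usd


# Hypothetical example: a 100 KB image at 50 gwei and 2,000 USD per ETH.
cost = onchain_storage_cost_usd(100 * 1024, gas_price_gwei=50, eth_price_usd=2000)
print(f"{cost:,.2f} USD")
```

Because the cost grows linearly with file size and with the gas price, a congested network makes large files even more expensive to store.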

Second, if we wanted to store large amounts of arbitrary data in these blocks, network congestion would become severe, and the resulting gas wars would drive up the price of using the network. This is a consequence of the implicit time value of blocks: users who need to submit a transaction to the network at a certain time have to pay extra gas fees to have their transaction prioritized.

It is therefore recommended that NFT metadata and image data, as well as dApp front ends, be stored off-chain.

From the perspective of centralized networks

If storing data on-chain is so expensive, why not simply store it off-chain on a centralized network?

Centralized networks are susceptible to censorship and are mutable. Users therefore have to trust the data provider to keep their data safe. Nothing guarantees that the operator of a centralized network will live up to that trust: data may be erased deliberately or accidentally, for example because the provider changes its policies, hardware fails, or a third party attacks.

NFTs

With the floor prices of some NFT collections exceeding 100,000 dollars, and some NFTs worth as much as 70,000 dollars per KB of image data, a promise alone is not enough to ensure that the data is available at all times. Stronger guarantees are needed to ensure the immutability and persistence of the underlying NFT data.

Figure 3: Crypto Punk floor price based on the last sale (no floor price was available at the time of writing). Crypto Punk image size is based on the byte length of the on-chain byte string of Crypto Punks V2. Data as of May 10, 2022. Source: OpenSea, on-chain data, IPFS metadata

NFTs do not actually contain any image data. Instead, they only hold pointers to metadata and image data stored off-chain. Yet it is this metadata and image data that needs protecting: if it disappears, the NFT is nothing more than an empty container.

Figure 4: Simplified diagram of a blockchain, blocks, NFTs, and off-chain metadata

It could be argued that the value of an NFT is not primarily driven by the metadata and image data it refers to, but by the movement and the community ecosystem built around the collection. While that may be true, without the underlying data the NFT would be meaningless, and no community can form around something meaningless.

Beyond profile pictures and art collections, NFTs can also represent ownership of real-world assets such as real estate or financial instruments. Such data has extrinsic real-world value, which the NFT represents, so preserving every byte of the data behind the NFT is worth at least as much as the on-chain NFT itself.

dApps

If NFTs are goods that exist on the blockchain, then dApps can be thought of as services that exist on the blockchain and facilitate interaction with it. A dApp combines an off-chain front-end user interface with smart contracts that live on the network and interact with the blockchain. Sometimes dApps also have a simple back end that moves certain computations off-chain to reduce the gas required, lowering the cost end users incur for certain transactions.

Figure 5: Simplified diagram of a dApp interacting with the blockchain

Although the value of a dApp should be judged in the context of its purpose (for example DeFi, GameFi, social, metaverse, or name services), the value that dApps facilitate is staggering. In the 30 days before the time of writing, the top 10 dApps on DappRadar together facilitated more than 150 billion dollars in transfers.

Figure 6: The most popular dApps by US dollar volume, as reported by DappRadar on May 11, 2022

Although the core mechanics of a dApp are implemented in smart contracts, it is the front end that makes them accessible to end users. In a sense, then, ensuring the accessibility of a dApp's front end means ensuring the availability of the underlying service.

Figure 7: Aave founder Stani Kulechov tweeting that the Aave dApp front end went offline on January 20, 2022, but could still be accessed through a copy of the site hosted on IPFS

Decentralized storage reduces the risk of server failures, DNS hacks, and centralized entities removing access to a dApp front end. Even if development of a dApp stops, its smart contracts can still be accessed through the front end.

The decentralized storage landscape

Blockchains such as Bitcoin and Ethereum exist primarily to facilitate the transfer of value. Some decentralized storage networks take the same approach: they use a native blockchain to record and track storage orders, which represent a transfer of value in exchange for storage services. However, this is only one of many possible approaches. The storage space is broad, and over the years different solutions with different trade-offs and use cases have emerged.

Figure 8: Overview of selected decentralized storage protocols (non-exhaustive)

Despite their many differences, all of these projects have one thing in common: none of them replicates all data on all nodes, as the Bitcoin and Ethereum blockchains do. In decentralized storage networks, the immutability and availability of stored data are not achieved by having most of the network store and validate successively linked data, as is the case with Bitcoin and Ethereum. As mentioned earlier, though, many networks do choose to use a blockchain to track storage orders.

Having every node on a decentralized storage network store all data is not sustainable, because the overhead of running the network would quickly drive up storage costs for users and ultimately push the network toward centralization around the few node operators who can afford the hardware.

Decentralized storage networks therefore have some significant challenges to overcome.

The challenges of decentralized storage

Recalling the limitations of on-chain data storage described earlier, it is clear that a decentralized storage network must store data in a way that does not interfere with the network's value-transfer mechanism, while ensuring that data remains persistent, immutable, and accessible. In essence, a decentralized storage network must be able to store, retrieve, and maintain data, ensure that all participants in the network are incentivized for the storage and retrieval work they do, and uphold the trustless nature of a decentralized system.

These challenges can be summarized as the following questions:

  • Data storage format: are complete files or file fragments stored?

  • Data replication: across how many nodes is the data (complete files or fragments) stored?

  • Storage tracking: how does the network know where to retrieve files from?

  • Proof of stored data: are nodes storing the data they were asked to store?

  • Data availability over time: is the data still stored as time passes?

  • Storage price discovery: how is the cost of storage determined?

  • Persistent data redundancy: if nodes leave the network, how does the network ensure that the data remains available?

  • Data transmission: network bandwidth comes at a cost, so how can nodes be made to serve data when asked?

  • Network tokenomics: beyond ensuring that data is available on the network, how does the network ensure its own long-term existence?

The networks explored as part of this research employ a wide range of mechanisms and make different trade-offs to achieve decentralization.

Figure 9: Summary of technical design decisions of the reviewed storage networks

An in-depth comparison of the above networks for each challenge, along with a detailed profile of each network, can be found in the complete research article on Arweave or Crust Network.

Data storage format

Figure 10: Data replication and erasure coding

These networks use two main approaches to storing data: storing complete files and using erasure coding. Arweave and Crust Network store complete files, while Filecoin, Sia, Storj, and Swarm all use erasure coding. In erasure coding, data is broken into fixed-size fragments, each of which is expanded and encoded with redundant data. The redundant data stored in each fragment means that only a subset of the fragments is needed to reconstruct the original file.
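The idea that a subset of fragments is enough to rebuild a file can be illustrated with a deliberately simple toy code. The sketch below is not the scheme used by any of the networks above, which rely on more capable codes such as Reed-Solomon. It splits the data into k fragments and adds a single XOR parity fragment, so it can recover from the loss of any one fragment. The function names `encode` and `reconstruct` are made up for this illustration.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(frag_len * k, b"\0")      # pad so all fragments match in size
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, frags)             # XOR of all data fragments
    return frags + [parity]


def reconstruct(frags: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the data when at most one of the k + 1 fragments is missing (None)."""
    missing = [i for i, f in enumerate(frags) if f is None]
    if len(missing) > 1:
        raise ValueError("a single-parity code tolerates only one lost fragment")
    frags = list(frags)
    if missing:
        present = [f for f in frags if f is not None]
        # XOR of all surviving fragments equals the missing one.
        frags[missing[0]] = reduce(xor_bytes, present)
    return b"".join(frags[:-1])[:original_len]    # drop parity and padding


data = b"decentralized storage"
fragments = encode(data, k=4)
fragments[2] = None                               # simulate a node going offline
assert reconstruct(fragments, len(data)) == data
```

Production codes generalize this: with k data fragments and m redundant ones, any k of the k + m fragments suffice, at the cost of storing (k + m) / k times the original size.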

Data replication

In Filecoin, Sia, Storj, and Swarm, the network determines the number of erasure-coded fragments and the amount of redundant data stored in each fragment. Filecoin, however, also lets the user set a replication factor, which determines on how many separate physical devices the erasure-coded fragments should be replicated as part of a storage deal with a single storage miner. Users who want to store a file with a different storage miner must make a separate storage deal. Crust and Arweave let the network decide on replication, although on Crust the replication factor can also be set manually. On Arweave, the proof-of-storage mechanism incentivizes nodes to store as much data as possible, so the upper limit of replication on Arweave is the total number of storage nodes on the network.

Figure 11: The data storage format affects retrieval and reconstruction

The method used to store and replicate data affects how the network retrieves it.

Storage tracking

Once data has been stored and distributed among the nodes of the network in whatever form, the network needs to be able to keep track of it. Filecoin, Crust, and Sia all use a native blockchain to track storage orders, while storage nodes also maintain a local list of network locations. Arweave uses a blockchain-like structure. Unlike on blockchains such as Bitcoin and Ethereum, on Arweave nodes can decide for themselves whether to store the data from a block. If you compared the chains held by several Arweave nodes, they would therefore not be identical: some blocks would be missing on some nodes but could be found on others.

Figure 12: Illustration of three nodes in a blockweave

Finally, Storj and Swarm take two entirely different approaches. In Storj, a second node type called a satellite acts as the coordinator for a group of storage nodes, managing and tracking where data is stored. In Swarm, the address of the data is embedded directly in the data chunk, so when data is retrieved the network knows where to look based on the data itself.
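The Swarm-style idea of deriving an address from the data itself is commonly known as content addressing, and a minimal sketch of it follows. The `ContentStore` class is hypothetical, and SHA-256 is used only as a convenient stand-in for whatever hash function a real network uses. It is not Swarm's actual chunk format or API.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: a chunk's address is the hash of its bytes."""

    def __init__(self):
        self._chunks = {}

    def put(self, chunk: bytes) -> str:
        """Store a chunk and return the address derived from its content."""
        address = hashlib.sha256(chunk).hexdigest()
        self._chunks[address] = chunk
        return address

    def get(self, address: str) -> bytes:
        """Fetch a chunk and check that it still matches its address."""
        chunk = self._chunks[address]
        if hashlib.sha256(chunk).hexdigest() != address:
            raise ValueError("chunk does not match its address")
        return chunk


store = ContentStore()
addr = store.put(b"hello swarm")
assert store.get(addr) == b"hello swarm"
```

Because the address is a function of the content, no separate index is needed to map names to locations, and any tampering with a chunk is detectable by whoever retrieves it.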

Proof of stored data

Each network has its own unique way of proving that data is stored. Filecoin uses Proof of Replication, a proprietary proof-of-storage mechanism that first stores the data on a storage node and then seals it in a sector. The sealing process makes it possible to prove that two replicated fragments of the same data are each unique, which ensures that the correct number of copies is stored on the network (hence "Proof of Replication").

Crust breaks a piece of data into many small chunks, which are hashed into a Merkle tree. By comparing the hash of a piece of data stored on a physical storage device with the expected Merkle tree hash value, Crust can verify that a file is stored correctly. This is similar to Sia's approach, the difference being that Crust stores the entire file on each node, while Sia stores erasure-coded fragments. Crust can store the entire file on a single node and still achieve privacy by using the node's trusted execution environment (TEE), a sealed hardware component that even the hardware owner cannot access. Crust calls this proof-of-storage algorithm "Meaningful Proof of Work": meaningful because new hash values are calculated only when the stored data changes, which reduces meaningless operations. Both Crust and Sia store the Merkle tree root hash on the blockchain as the source of truth for verifying data integrity.
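The Merkle tree check described above can be sketched generically as follows. This is a textbook construction, not the exact tree layout used by Crust or Sia: SHA-256, the duplication of the last node on odd levels, and the function names are all assumptions made for the example. The key property is that a verifier holding only the root hash can check a single chunk with a short proof.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(chunks):
    """Compute the Merkle root of a non-empty list of chunks."""
    level = [h(c) for c in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])           # duplicate last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(chunks, index):
    """Return the sibling hashes needed to recompute the root from chunks[index]."""
    level = [h(c) for c in chunks]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1                   # neighbour within the same pair
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify(chunk, proof, root):
    """Check a chunk against the trusted root using its proof."""
    node = h(chunk)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


chunks = [b"chunk-0", b"chunk-1", b"chunk-2", b"chunk-3", b"chunk-4"]
root = merkle_root(chunks)                    # this is what would go on-chain
assert verify(chunks[3], merkle_proof(chunks, 3), root)
assert not verify(b"corrupted", merkle_proof(chunks, 3), root)
```

Storing only the root on the blockchain keeps the on-chain footprint constant regardless of file size, while the proof for any chunk grows only logarithmically with the number of chunks.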

Storj checks that data has been stored correctly through data audits. Data audits are similar to the way Crust and Sia use Merkle trees to validate data fragments. On Storj, once enough nodes have returned their audit results, the network can determine which nodes are faulty based on the majority response, rather than by comparing against a source of truth on a blockchain. This mechanism is a deliberate choice in Storj: its developers believe that reducing network-wide coordination through a blockchain improves performance in terms of speed (no need to wait for consensus) and bandwidth usage (no need for the whole network to interact with the blockchain regularly).
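The principle of flagging nodes by majority response, without a blockchain as the reference, can be sketched as below. This is a strong simplification and not Storj's actual audit protocol: it assumes every audited node reports a digest that should be identical across honest nodes, and the function name and node identifiers are invented for the example.

```python
from collections import Counter
from typing import Dict, Set


def find_faulty_nodes(responses: Dict[str, str]) -> Set[str]:
    """Return the nodes whose audit response disagrees with the majority response."""
    majority, count = Counter(responses.values()).most_common(1)[0]
    if count * 2 <= len(responses):
        # Without a strict majority there is no trustworthy reference value.
        raise ValueError("no majority response; audit is inconclusive")
    return {node for node, resp in responses.items() if resp != majority}


# Hypothetical audit round: node-c returns a different digest than the others.
audit = {"node-a": "9f2c", "node-b": "9f2c", "node-c": "0000", "node-d": "9f2c"}
assert find_faulty_nodes(audit) == {"node-c"}
```

The trade-off is visible even in this toy version: no consensus round is needed, but the result is only as reliable as the assumption that most audited nodes are honest.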

Arweave uses a cryptographic proof-of-work puzzle to determine whether files are stored. In this mechanism, for a node to be able to mine the next block, it needs to prove that it has access to the previous block and to another random block from the network's block history. Because data uploaded to Arweave is stored directly in blocks, proving access to these earlier blocks proves that the storage provider did in fact store the files correctly.
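The access requirement can be illustrated with a small sketch. It is a loose model of the idea rather than Arweave's real algorithm: deriving the random block from a SHA-256 hash of the previous block hash, and the names `recall_block_index` and `can_mine`, are assumptions made for this example.

```python
import hashlib
from typing import Dict


def recall_block_index(prev_block_hash: bytes, height: int) -> int:
    """Derive a pseudo-random index of an earlier block from the previous block hash."""
    digest = hashlib.sha256(prev_block_hash).digest()
    return int.from_bytes(digest, "big") % height


def can_mine(local_blocks: Dict[int, bytes], prev_block_hash: bytes, height: int) -> bool:
    """A node may attempt the next block only if it holds the previous and the recall block."""
    recall = recall_block_index(prev_block_hash, height)
    return (height - 1) in local_blocks and recall in local_blocks


height = 10                                   # blocks 0..9 exist; block 9 is the previous block
prev_hash = b"hash-of-block-9"
full_node = {i: b"block data" for i in range(height)}
assert can_mine(full_node, prev_hash, height)
```

Since the recall block cannot be predicted far in advance, a node that stores more of the history is eligible to mine more often, which is the incentive to keep old data around.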

Finally, Swarm also uses Merkle trees, the difference being that the Merkle tree is not used to determine file locations. Instead, the data chunks are stored directly in the Merkle tree. When data is stored on Swarm, the root hash of the tree (which is also the address of the stored data) serves as proof that the file was properly chunked and stored.

Data availability over time

Again, each network has a unique approach to establishing that data is stored over a specific period of time. In Filecoin, to reduce network bandwidth, storage miners must run the Proof of Replication algorithm continuously over the period for which the data is to be stored. The resulting hash for each period proves that the storage space was occupied by the correct data during that period, hence "Proof of Spacetime".

Crust, Sia, and Storj periodically verify random data fragments and report the results to their coordination mechanism: the blockchain for Crust and Sia, and the satellite nodes for Storj. Arweave ensures the consistent availability of data through its Proof of Access mechanism, which requires miners to prove not only that they have access to the last block, but also that they have access to a random historical block. Storing older and rarer blocks is incentivized because it increases the likelihood that a miner wins the proof-of-work puzzle, for which access to that particular block is a prerequisite.

Swarm, on the other hand, regularly runs a raffle that rewards nodes for holding less popular data over time, and also runs a proof-of-ownership algorithm for data that nodes have promised to store for a longer period.

Filecoin, Sia, and Crust require nodes to deposit collateral in order to become storage nodes, whereas Swarm requires it only for long-term storage requests. Storj requires no upfront collateral, but it withholds part of the miners' storage revenue. Finally, all networks pay nodes periodically for the periods during which the nodes can prove they stored data.

Storage price discovery

To determine storage prices, Filecoin and Sia use storage marketplaces in which storage providers set their asking price and storage users set the price they are willing to pay, along with other settings. The storage marketplace then connects users with storage providers that meet their requirements. Storj takes a similar approach, the main difference being that there is no single network-wide marketplace connecting all nodes on the network. Instead, each satellite has its own set of storage nodes that it interacts with.

Finally, Crust, Arweave, and Swarm let the protocol determine the storage price. Crust and Swarm allow some settings to be adjusted according to the user's file storage requirements, while files on Arweave are always stored permanently.

Persistent data redundancy

Over time, nodes will leave these open, public networks, and when a node disappears, so does the data it stores. The network must therefore actively maintain a certain level of redundancy in the system. Sia and Storj do so by replenishing lost erasure-coded fragments: they collect a subset of fragments, rebuild the underlying data, and then re-encode the file to recreate the missing fragments. In Sia, users must log in to the Sia client regularly for fragments to be replenished, because only the client can tell which data fragments belong to which data and which user. On Storj, the satellites are always online and regularly run data audits to replenish data fragments.

Arweave's Proof of Access algorithm ensures that data is regularly replicated throughout the network, while on Swarm data is replicated to nodes that are close to each other. On Filecoin, if data disappears over time and the remaining file fragments fall below a certain threshold, the storage order is reintroduced to the storage market so that another storage miner can take it over. Crust's replenishment mechanism is currently under development.

Incentivizing data transmission

Once data has been stored safely, users will eventually want to retrieve it. Because bandwidth comes at a cost, storage nodes must be incentivized to serve data when it is requested. Crust and Swarm use a debt and credit mechanism in which each node tracks the inbound and outbound traffic it exchanges with every node it interacts with. If a node only accepts inbound traffic but does not serve outbound traffic, it is deprioritized for future communication, which may affect its ability to accept new storage orders. Crust uses the IPFS Bitswap mechanism, while Swarm uses its own protocol called SWAP. Under Swarm's SWAP protocol, the network lets nodes pay off their debt (from accepting inbound traffic without serving enough outbound traffic) with stamps, which can be exchanged for its utility token.

Figure 13: Swarm Accounting Protocol (SWAP). Source: Swarm white paper
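The debt and credit bookkeeping described above can be sketched with a small per-peer ledger. This is an illustrative model only, not the Bitswap or SWAP wire protocol: the `PeerLedger` class, the byte-based balance, and the debt limit are assumptions introduced for the example.

```python
from collections import defaultdict


class PeerLedger:
    """Tracks, per peer, how much more data we have served than received."""

    def __init__(self, debt_limit: int):
        self.debt_limit = debt_limit
        self.balance = defaultdict(int)       # positive value: the peer owes us

    def record_served(self, peer: str, nbytes: int) -> None:
        """We sent nbytes to the peer, so its debt to us grows."""
        self.balance[peer] += nbytes

    def record_received(self, peer: str, nbytes: int) -> None:
        """The peer sent nbytes to us, which offsets its debt."""
        self.balance[peer] -= nbytes

    def settle(self, peer: str, amount: int) -> None:
        """The peer pays off part of its debt by other means (e.g. a token payment)."""
        self.balance[peer] -= amount

    def should_serve(self, peer: str) -> bool:
        """Keep serving peers whose debt is within the allowed limit."""
        return self.balance[peer] <= self.debt_limit


ledger = PeerLedger(debt_limit=100)
ledger.record_served("peer-a", 150)           # peer-a only takes data
assert not ledger.should_serve("peer-a")      # so it is deprioritized
ledger.settle("peer-a", 100)                  # until it settles its debt
assert ledger.should_serve("peer-a")
```

Peers that reciprocate never need to settle, so payment is only required from nodes that consume more bandwidth than they provide.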

This tracking of node generosity is also how Arweave ensures that data is transmitted on request. In Arweave, this mechanism is called Wildfire: nodes prioritize peers with a better ranking and ration their bandwidth usage accordingly. Finally, on Filecoin, Storj, and Sia, users ultimately pay for bandwidth, which incentivizes nodes to deliver data when it is requested.

Tokenomics

Tokenomics design ensures the stability of the network and also ensures that the network will exist in the long term, since data is ultimately only as permanent as the network itself. The table below gives a brief summary of tokenomics design decisions, along with the inflationary and deflationary mechanisms embedded in each design.

Figure 14: Tokenomics design decisions of the reviewed storage networks

Which network is best?

It cannot be said that one network is objectively better than another. There are countless trade-offs in designing a decentralized storage network. While Arweave is ideal for storing data permanently, it is not necessarily suitable for Web 2.0 industry players migrating to Web 3.0, since not all data needs to be stored permanently. However, one strong subsector of data does require permanence: NFTs and dApps.

Ultimately, design decisions depend on the purpose of the network.

Below are summary profiles of the various storage networks, which are compared with each other on the set of scales defined below. The scales reflect the dimensions along which these networks can be compared, but it should be noted that in many cases the approaches to overcoming the challenges of decentralized storage are neither better nor worse. They simply reflect design decisions.

  • Storage parameter flexibility: the extent to which users control file storage parameters

  • Storage persistence: the extent to which file storage can achieve theoretical persistence through the network (that is, without intervention)

  • Redundancy persistence: the network's ability to maintain data redundancy through replenishment or repair

  • Data transmission incentives: the extent to which the network ensures that nodes transmit data generously

  • Universality of storage tracking: the degree of consensus among nodes on where data is stored

  • Guaranteed data accessibility: the network's ability to ensure that no single participant in the storage process can remove access to files on the network

A higher score indicates greater capability in the respective dimension.

Filecoin's tokenomics support growing the total storage space of the network so that large amounts of data can be stored immutably. In addition, its storage algorithm is better suited to data that is unlikely to change much over time (cold storage).

Figure 15: Filecoin summary profile

Crust's tokenomics ensure hyper-redundancy and fast retrieval, making it suitable for high-traffic dApps and for fast retrieval of popular NFT data.

Crust scores lower on storage persistence because, without persistent redundancy, its ability to deliver permanent storage is severely limited. Nevertheless, persistence can still be achieved by manually setting an extremely high replication factor.

Figure 16: Crust summary profile

Sia is all about privacy. Users have to restore file health manually because nodes do not know which data fragments they are storing or which data those fragments belong to. Only the data owner can reconstruct the original data from the fragments on the network.

Figure 17: Sia summary profile

Arweave, by contrast, is all about permanence. This is also reflected in its endowment design, which makes storage more expensive but also makes Arweave an attractive choice for NFT storage.

Figure 18: Arweave summary profile

Storj's business model appears to heavily shape its billing and payment approach: Amazon AWS S3 users are more familiar with monthly billing. By removing the complex payment and incentive systems common in blockchain-based systems, Storj Labs sacrifices some decentralization but significantly lowers the barrier to entry for its key target group, AWS users.

Figure 19: Storj summary profile

Swarm's bonding curve model ensures that storage costs remain relatively low as more data is stored on the network, and its proximity to the Ethereum blockchain makes it a strong storage contender for more complex Ethereum-based dApps.

Figure 20: Swarm summary profile

There is no single best approach to the challenges facing decentralized storage networks. Depending on the purpose of the network and the problems it tries to solve, it must make trade-offs in both its technical design and its tokenomics.

Figure 21: Summary of strong use cases for the reviewed storage networks

In the end, the purpose of the network and the specific use cases it tries to optimize for determine its various design decisions.

The next chapter

Returning to the Web3 infrastructure pillars (consensus, storage, computation), we see that the decentralized storage space has a handful of strong players that have positioned themselves in the market for specific use cases. This does not rule out new networks optimizing existing solutions or capturing new niches, but it does raise a question: what's next?

The answer is computation. The next frontier on the way to a truly decentralized Internet is decentralized computation. At present, only a few solutions bring trustless, decentralized computation to market. Such solutions can power complex dApps and perform more complex computations at a far lower cost than executing smart contracts on a blockchain.

Internet Computer (ICP) and Holochain (HOLO) are networks with a strong position in the decentralized computation market at the time of writing. Even so, the computation space is not nearly as crowded as the consensus and storage spaces, so strong competitors will sooner or later enter the market and position themselves accordingly. Stratos (STOS) is one such competitor. Stratos offers a unique network design through its decentralized data mesh technology.

We see decentralized computation, and in particular the network design of the Stratos network, as an area for future research.

Written by: 0xPhillan, Fundamental Labs

Translated by: Tia

Copyright notice: This article was created by [Foresight News]. Please include a link to the original when reposting. Thank you. https://netfreeman.com/2022/174/202206231545189489.html