Consensus mechanism （Consensus Mechanism） It is an algorithm for blockchain transactions to reach distributed consensus , With the continuous promotion of blockchain Technology , Consensus mechanism is the core of blockchain , More and more people pay attention to it . Consensus mechanism plays an important role in protecting data consistency . This article selects 8 A common consensus mechanism , According to the principle of mechanism 、 The role in the running process 、 Algorithm flow and advantages and disadvantages , Proof of workload 、 Proof of interest 、 Capacity proof and other mechanisms are introduced in detail . meanwhile , This paper also makes a comparative analysis of similar mechanisms . So as to deepen people's understanding of the consensus mechanism , Accelerate the development of blockchain technology .
Blockchain is the underlying technology of bitcoin , Similar to a database ledger , The consensus mechanism is the core of the rules in decentralized distributed ledgers , It determines the security of the blockchain 、 Many important features such as scalability and decentralization .
Consensus mechanism refers to the process of reaching a unified agreement on the state of the network in a decentralized way . Also known as consensus algorithms , Helps verify and verify that information is added to the ledger , Make sure that only real transactions are recorded on the blockchain . therefore , Consensus mechanism is responsible for safely updating the state of data in distributed networks . Rules that have been hard coded into the protocol ensure that a unique data source can always be found and agreed upon in the global computer network . These rules protect the entire network , Realize a network without trust , Without central data or intermediaries .
The consensus mechanism is to decide which participating node should be used for bookkeeping , And important means to ensure the safe completion of transactions . Consensus mechanisms need to balance efficiency and security , The more complex the security measures are , The slower the corresponding processing time . And want to improve processing speed , Simplifying the complexity of security measures is a very important step .
Consensus mechanism satisfies both consistency and effectiveness . Consistency means that all honest nodes keep the same part of the blockchain prefix , The validity means that the information released by an honest node will be recorded in its own block by all other honest nodes . The consensus mechanism ensures that the blockchain is fault tolerant , So it's reliable and consistent . Unlike a centralized system , Users don't have to trust anyone in the system , The protocol rules embedded in the blockchain consensus mechanism can ensure that only valid and real transactions can be recorded in the public transparent account book , The protocol rules embedded in the network ensure that the state of the public ledger is always updated as the public consensus changes .
An important advantage of decentralized blockchain is the allocation of authorization , Anyone can get involved on the same basis . The consensus mechanism can ensure that there is no discrimination in the blockchain , So as to achieve fair distribution . Due to the open source nature of the public blockchain , So that anyone can monitor and verify that the underlying source code is fair to all participants in the network .
Consensus mechanism can be achieved by encouraging good behavior , In some cases , Punish bad actors to achieve this . For example, in the proof of workload mechanism , By rewarding the miners with bitcoin , Reward them with guarantees and verifications for every transaction . Any computing and security maintenance requires a lot of computing power and money , The consensus mechanism enables these resources to be better used to work for the system , Not for the system .
Common consensus mechanisms are ：PoW（ Proof of workload ）、PoS（ Proof of interest ）、DPoS（ Certificate of appointment ）、PBFT（ Practical Byzantine fault tolerance algorithm ）、POOL（ Validation pool ） etc. .
2 PoW：Proof of Work（ Proof of workload ）
2008 year , In the bitcoin white paper （Bitcoinswhitepaper） On ,PoW Get attention for the first time .PoW It's relying on machines for mathematical operations （ And or operation , Calculate a random number that meets the rules ） To get the right to book this time , Send the data to other nodes in the whole network , After verification by other nodes , Store the data after reaching a consensus .PoW It was originally an economic term , It's a measure set for a system to achieve a goal . A simple understanding is a proof , To make sure you've done a certain amount of work . The whole process of monitoring is usually extremely inefficient , Through the certification of the results of the work to prove the completion of the corresponding workload , It's a very efficient way .
Illustrate with examples ,10+？=12, Who can figure out the answer first , Who will harvest .
One sentence summary ： The more you do , The more you collect （ There is and only actual labor , To get results ）
chart 1: PoW How it works
2.1 A term is used to explain
hash function Hash The hash function is a one-way encryption function , Hash algorithm can get any number of data , And return a fixed length string , The string is completely unique for a particular input .
Nonce A random number that can only be used once .
The miners Miners Independent transaction processor in cryptocurrency network , The goal is to validate the transaction . It's also known as Full Node or Node.
Worker There must be a certain amount of work to be done , This amount is given by the job verifier ;
Workers can't find a way to get their work done quickly ;
Workers can't themselves ” Creating jobs ”, The work has to be released by the verifier .
Verifier Can quickly check whether the workload is up to standard .
1. Method is simple , Easy to implement .
2. There is no need to exchange additional information between nodes to reach a consensus （ Free access between nodes ）.
3. It costs a lot to destroy the system .
4. All nodes of the whole network are required to participate , Completely decentralized .
1. At present, bitcoin has attracted most of the world's computing power , The new blockchain must find a different hash algorithm , It's hard to use the same computing power as in the past to get the same security .
2. A lot of resources are wasted .
3. Consensus takes a long time , Not suitable for commercial applications （ It's easy to split , Need to wait for more than one confirmation , It is difficult to shorten the confirmation time of blocks ）.
4. There will never be finality , A checkpoint mechanism is needed to compensate for finality .
2.5 Application example
In bitcoin , Use PoW Confirm the validity of the block （ As long as CPU The amount of work consumed can satisfy the proof mechanism of the workload , Well, unless you do a lot of work again , The information in this block cannot be changed ）
2.6 Problem explanation
Why do you say PoW The problem of energy consumption is serious ？
—— Because the miners （Miners） Every guess of （guess） It takes a computer to produce a certain amount of energy . at present , The hash rate of the entire bitcoin network （Hash rate） by ～17,000,000 TH/s, That's the entire network per second 17,000,000,000,000,000,000 A guess （guess）. This energy demand is roughly the same as Hungarian consumption .
3 PoS：Proof of Stake（ Proof of interest ）
2012 year ,PoS As a little coin （Peercoin） For the first time .PoS yes PoW An upgraded consensus mechanism , You don't have to consume electricity to do calculations , According to the difficulty of obtaining the bookkeeping right of each node , Its equity is inversely proportional to its ownership , Reduce the difficulty of mining in equal proportion , To speed up the search for random numbers . To keep it simple ,PoS There are no miners in （Miners）, It's a verifier （Validators）. It's still a way to get accounting rights based on hash operation competition , Fault tolerance and PoW identical .PoS It's based on the amount of digital money that miners currently have , A system of interest distribution based on the amount and time of money you hold , stay PoS In mode , Yours “ dig ” The income is proportional to the age of your currency , It has nothing to do with the computing power of the computer .
Illustrate with examples ,PoS It's like money in our hands . When we have more money , The more rights you get in life . One sentence summary ： The more you hold , The more you get （ You get interest when you deposit money in the bank ）
3.1 A term is used to explain
Verifier Validators Validator , To verify the transaction , The verifier must bet a certain amount of stake,stake The size of the validator determines the possibility of the verifier verifying the next block .
1. To a certain extent, it shortens the time for consensus , More efficient transfer of funds .
2. There's no longer a lot of energy and computing power to mine .
1. Or mining , In essence, there is no solution to the pain point of commercial application .
2. All confirmation is just a probability expression , It's not a matter of certainty , In theory, there may be other attack effects .
3. Degree of decentralization , It is easy for the strong to be strong , Large holders of money earn interest by holding money , And then there's a monopoly problem .
3.4 Problem explanation
PoS In improving PoW After the problem of energy consumption , What other issues need to be considered ？
—— One of the most controversial issues is , If too much weight is given to a lot of wealth or old nodes , The Internet will quickly become unfair .
surface 1: PoW and PoS Comparison
4 DPoS：Delegated Proof of Stake（ Certificate of appointment ）
And PoS The main difference is that nodes elect several agents , After authorizing a vote to a proxy , Verification and bookkeeping by agent , The wallet is the status monitor . Its compliance regulation 、 performance 、 Resource consumption and fault tolerance PoS be similar . It's like a board vote , The coin holder throws a certain number of nodes , The node selects the agent , Acting on their behalf for verification and bookkeeping .
Illustrate with examples , If the holder A Supported the agent 50 Coins , Currency holder B Supported the agent 10 Coins , that A The voting weight is B Of 5 times .
One sentence summary ： The node elects several agents , Verification and bookkeeping by agent
4.1 Part of the
1. Select a group of block producers ： The election process is made up of token The holder decides , The performance of the elected producers will affect the performance of the entire network , Which in turn affects token Interests of holders .
2. Scheduling production
4.2 Algorithm flow （ Take normal operation and a few bifurcations for example ）
1. Normal operation ： Under normal circumstances, block producers produce in sequence , The interval is 3s. No one is missing production , Here's the longest chain （ The arrow points to the previous block ）.
chart 2: Normal operation
2. A few forks ： When not more than 1/3 Of nodes are malicious or not working , And when there's a bifurcation , The following figure B, This fork every 8 A block comes out in seconds , And every node that works normally 8 Second out 2 Block . The reason is according to A,B,C,A... The order of , Each node has to wait for the corresponding time to block . According to the longest chain principle , The system still works .
chart 3: A few forks
1. Greatly reduce the number of nodes participating in verification and accounting , It can achieve consensus verification in seconds .
2. Through the affirmative vote system , You can make sure that even if one person has 50％ Effective voting rights of , You can't choose a blocker alone , Ensure algorithm security .
3. When most people have problems ,DPoS You can still work .
1. The whole consensus mechanism still depends on token , A lot of commercial applications don't need tokens .
2. Weak centralization , The degree of decentralization is not high .
4.5 Application example
EOSIO：EOSIO By commission (DPoS) system maintenance , The system was originally developed by Larimer Created , It's still being Steemit Use .Dan Larimer For the first time in BitShares Use DPOS, It immediately became one of the fastest blockchains . later BM take DPOS introduce Steem,Steem It is considered to be one of the fastest and most stable blockchain projects , Handle more than 100 Million transactions , And use only one percent of its capacity .
5 PBFT：Practical ByzantineFault Tolerance（ Practical Byzantine fault tolerance algorithm ）
PBFT Is a state machine replication algorithm , Generally, there are three kinds of agreements ： Agreement of conformity (agreement)、 Checkpoint protocol (checkpoint) And view change protocol (view change). To ensure activity and safety （liveness and safety） The premise of providing (n-1)/3 Fault tolerance . On Distributed Computing , Different computers exchange information through , Try to reach a consensus ; But sometimes , The system coordinates the computer （Coordinator / Commander） Or member computers （Member/Lieutanent） May be due to system error and exchange wrong messages , Lead to impact on the final system consistency . The Byzantine general problem is based on how many wrong computers there are to find a possible solution , Although we can't find an absolute answer , But it can only be used to verify whether a mechanism works . The possible solution to the Byzantine problem is ： stay N ≥ 3F + 1 Consistency is possible to solve . among ,N Is the total number of computers ,F Total number of computers with problems . After information is exchanged between computers , Each computer lists all the information it gets , Take most of the results as the solution .
One sentence summary ： Every “ The general ” Run calculations or operations based on internal state and new messages , In order to reach a personal decision , Individuals will decide to share , On the basis of all the decisions, a consensus decision is made .
Agreement of conformity There are at least several stages ： request （request）、 Serial number assignment （pre-prepare） And response （reply）. Depending on the protocol design , It may involve interaction （prepare）, Serial number confirmation （commit） Equal stage .
Checkpoint protocol The algorithm sets periodic checkpoint protocol , Synchronize the servers in the system to the same state , The system will set up a check-point The timing of the , At this time, the log can be processed regularly 、 Save resources and correct server status in time .
The view is more Exchange agreement A replica node i It detects that the master node is doing evil or offline , There will be a timeout mechanism , Start view change , And number the view v Changed to: v+1, At the same time, do not accept the message except checkpoint （checkpoint）、 View change message (view-change) And new view messages （new-view） Other requests for information .
Based on fully homomorphic encryption PSI Fully homomorphic encryption technology enables the mathematical operation of plaintext to be carried out directly on the ciphertext without first decrypting the ciphertext . Early homomorphic encryption was inefficient , And its performance has only improved in recent years . Using fully homomorphic encryption PSI agreement , By encrypting one party with a smaller set and sending it to the other party , The other party is responsible for finding the intersection of two sets based on ciphertext , Then send the result to the other party to decrypt . Using fully homomorphic encryption PSI Can achieve relatively small interaction complexity , But its computational complexity is usually very high , This leads to inefficient protocols . therefore , Reducing computational complexity without sacrificing too much interaction complexity is based on fully homomorphic encryption PSI The main challenges to the agreement .
5.2 Algorithm flow
1. Client initiates message ： client C Send the operation request to the master node <REQUEST,o,t,c>
o: Request state machine operation
t: Client appended timestamp
c: Client flags
REQUEST: Include message content m, And a summary of the message d(m) etc.
2. Pre preparation stage : In the preparatory stage , The master node verifies the signature of the received client request message , Verification passed , Based on the current view v Assign a serial number n To the received client request , Then send pre prepared messages to all backup node groups < <PRE-PREPARE, v, n , d>, m>
v： View number m： Request message sent by client d： The request message m A summary of the n： Pre prepared message sequence number
The purpose of preparing messages is to prove , Make sure the request is in view v Is given the serial number n, So that messages can still be traced during view changes .
3. Preparation stage : Replica node i Received... From the master node PRE-PREPARE news , The following checks are required ：
• Master node PRE-PREPARE Is the message signature correct .
• Whether the current replica node has received one in the same v And the number is n, But the signature is different PRE-PREPARE Information .
• d And m Is the summary consistent .
• n Is it in the interval [h, H] Inside .
After the verification is passed , Replica node i Send preparation messages to all replica nodes <PREPARE,v, n, d, i>, And will prepare the message and prepare the message in the node i Preservation , be used for View Change Recover the uncompleted request operation in the process . When node i Received more than 2f+1 After preparing messages for nodes , You can enter the confirmation stage . It checks the view number of these messages v, Message number n And summary d.
4. Confirmation stage ： The node entering the confirmation stage sends a message to other nodes including the master node <COMMIT, v, n, d, i> Confirmation message . When node i Accept to 2f+1 individual m After confirming the message and satisfying the corresponding conditions , news m Changed to: committed-local state , Node execution m Request . At the completion of m After the requested operation , Each replica node sends a reply to the client . Get into reply Stage .
5. Respond to clients ： node i return <REPLY, v, t, c, i, r> To the client ,r： Request operation result . If the client receives f+1 A the same REPLY news , It indicates that the request initiated by the client has reached the consensus of the whole network , Otherwise, the client needs to determine whether to resend the request to the primary node .
chart 4: notes ： copy 0 It's the master node , copy 3 It's a failed node ,C The client side. .
1. The system can operate without the existence of money ,PBFT Algorithm consensus that each node is composed of business participants or regulators , Security and stability are guaranteed by business stakeholders .
2. The delay of consensus is about 2~5 Second , It basically meets the requirements of commercial real-time processing .
3. Consensus is efficient , High throughput , It can meet the demand of high frequency trading volume .
4. Power consumption mode without proof of workload , More energy saving and environmental protection .
1. Limited by the number of nodes and nodes need to be elected or licensed , Scalability and decentralization are weak .
2. Low fault tolerance , When there is 1/3 Or after the bookkeeper stops working , The system will not be able to provide services ; When there is 1/3 Or a combination of the above bookkeepers , And the other two islands are divided into two separate networks , Malicious bookkeepers can cause the system to bifurcate , But it will leave behind cryptographic evidence .
6 POOL（ Validation pool ）
Verification pool mechanism is based on the combination of traditional distributed consistency technology and data verification mechanism , It makes in mature distributed consistent algorithms （Paxos、Raft） On the basis of , Second level consensus verification can be achieved without token .
1. It works without a token , In the mature distributed consistency algorithm （Paxos、Raft） On the basis of , Second level consensus verification .
2. Increase the speed of your application , Improve efficiency and reduce overhead .
1. Less decentralized than bitcoin .
2. More suitable for multi center business model with multi participation .
7 PoC：Proof of Capacity（ Proof of capacity ）
2015 year , The idea is Dziembowski And so on .PoC By allocating a certain amount of memory or disk space to address the challenges provided by service providers , It shows that someone has something to do with a service （ E.g. send email ） Have a legitimate interest in . although Ateniese Papers of others The name is also “Proof-of-space”, But it's actually an adoption of MHF（Memory Hard Function, A hash algorithm whose computation cost depends on memory ） Of PoW agreement .PoC It's a way to save computing time by caching a lot of data .
Illustrate with examples , Fill the hard drive with lottery tickets , Generate a random number , Then check the person who matches the most numbers . If you have the best match , You'll win the prize .
One sentence summary ： The more storage space , The more you collect （ There is and only actual labor , To get results ）
7.1 A term is used to explain
Hash value Shabal Shabal The algorithm is a very slow algorithm , Allows the input of any length of ordinal sequence , Even an empty sequence . Also suitable for byte streams of any length , But because of security , The suitable length should be less than 27 position . The input length can be the sum of any integer value 8 Multiple . If given a bit Sequence , Index it in left and right order , The index of the first place is 0. Use left and right to describe an ordered sequence of bits ： The first bit in the sequence is called the leftmost bit , The last one is called rightmost .
Plotting mapping Generate a file that stores a large number of precomputed hashes , After the memory is full of hashed files , You can still participate in the block creation process . This process is called drawing .
7.2 Algorithm flow
1. Hard drive drawing ： Depending on the size of the hard drive , It can take days or weeks to create a unique drawing file . When drawing is called Shabal Very slow hash value of . And SHA-256 Hash values are different ,Shabal It's a hash value that bitcoin miners use quickly . because Shabal Hash values are hard to calculate , So we have to pre calculate them and store them on the hard disk . This process is called drawing hard drives .
2. The actual mining of blocks ： Calculation 0 To 4085 The number of buckets between . Suppose your calculation is 42. And then you're going to scoop 42 spoon nonce 1 And then use the scoop data to calculate a period of time , This is called the deadline . For everything on the hard disk nonces Repeat the process . After calculating all the deadlines , You'll choose the minimum deadline . The deadline means “ Before you are allowed to forge a block , The number of seconds that must pass since the last block was forged ”. If no one else has forged a block during this time , You can forge a block and get a block reward .
chart 5: PoC The working process diagram of
1. You can use any ordinary hard drive , So other miners won't get an advantage from buying specialized equipment , For example, use ASIC Mining bitcoin .
2. Energy efficiency with hard drives is based on ASIC Of mining 30 times .
3. Capacity proves to be more decentralized , Because everyone has a hard drive . You can even get it from your Android Mining on the hard disk of the mobile phone .
4. Miners don't need to constantly upgrade their equipment . Old hard disks can store data like new ones .
5. The hard drive can be removed after mining , And use it for the original purpose .
1. Universal evidence of capacity extraction could lead to another arms race . Today people use tb The hard disk of , But in the end we'll see pb、exabytes and zettabytes.
2. Capacity proof is a relatively new technology , There is no rigorous testing and challenge in the real world .
3. at present , The data drawn by the hard drive is useless except for mining purposes . However , There are plans to use hard disks as redundant storage for important open source information . The hard disk can store maps 、 Wikipedia articles or other information worth preserving .
4. There are already malware mining bitcoin on people's computers . If capacity proof becomes popular , You might see malware plotting people's hard drives .
7.5 Problem explanation
Why do you say PoC It's a consensus mechanism for low power consumption ？
—— In the context of capacity proof , Because the mechanical hard disk is natural to （ Relative ） Electricity is not sensitive , The cost of electricity is no longer a knock on the door for miners to join the network in the short term Brick , And the wide adaptability of mechanical hard disk , It further reduces the difficulty for miners to join the network , At present, home computers are commonly used 1~2T Hard disk can be used as mining equipment . Further more , Capacity proof doesn't actually store network data , Even if the hard disk is damaged, it will not lead to the loss of network content , Avoided FIlecoin The impact of data loss on network usability in spatiotemporal proof .
chart 6: PoC and PoW Comparison
Paxos from Lamport On 1888 In 《The Part-Time Parliament》 The paper is published for the first time .Paxos The problem solved by the algorithm is the problem of distributed consistency , That is, how each process in a distributed system can achieve a certain value （ The resolution ） To reach an agreement .
Paxos The algorithm runs in an asynchronous system that allows downtime , Reliable messaging is not required , Message loss can be tolerated 、 Delay 、 Disorder and repetition . It uses most of (Majority) The mechanism guarantees 2F+1 The ability of fault tolerance , namely 2F+1 A system of nodes allows at most F Node failures at the same time . One or more proposed processes (Proposer) You can launch a proposal (Proposal),Paxos The algorithm makes one of all proposals , Agree on all the processes . The majority in the system also endorsed the proposal , That is, an agreement has been reached . At most, only one definite proposal can be agreed upon .
Proposer Make a proposal (Proposal).Proposal Information includes proposal number (Proposal ID) And the proposed value (Value).
Acceptor Participate in decision making , Respond Proposers Proposal for . received Proposal After that, we can accept the proposal , if Proposal Get the majority Acceptors Acceptance of , It is called the Proposal Approved .
Learner Not involved in decision making , from Proposers/Acceptors Learn about the latest agreed proposals （Value）.
8.2.1 Safety principles
Promise not to do wrong .
1. Voting on an instance can only have one value approved , There can be no case where an approved value is overridden by another value ;( Let's say that there is a value that is majority Acceptors approved , Then this value can only be learned ).
2. Each node can only learn the approved value , Can't learn values that are not approved .
8.2.2 The principle of survival
As long as most servers survive and communicate with each other , The following things should be done in the end ：
1. Finally, a proposed value will be approved .
2. A value has been approved , Other servers will eventually learn this value .
8.3 Algorithm flow
1. The first stage ：Prepare Stage .Proposer towards Acceptors issue Prepare request ,Acceptors For the received Prepare Request for Promise promise .
2. The second stage ：Accept Stage .Proposer Receive the majority Acceptors Committed Promise after , towards Acceptors issue Propose request , Acceptors For the received Propose Request for Accept Handle .
3. Learn Stage .Proposer After receiving the majority of Acceptors Of Accept after , It marks this time Accept success , The resolution came into being , Send the resulting decision to all Learners.
8.3.1 Algorithm description
1. Prepare: Proposer Generate globally unique and incremental Proposal ID ( You can use the timestamp to add Server ID), To all Acceptors send out Prepare request , There's no need to carry the content of the proposal here , Carry only Proposal ID that will do .
2. Propose: Proposer Receive the majority Acceptors Of Promise After response , Choose from the response Proposal ID The biggest proposal Value, As a proposal initiated this time . If all the proposals answered Value They are all null values , You can decide your own proposal Value. And then carry the current Proposal ID, To all Acceptors send out Propose request .
3. Acceptors received Prepare After the request , To make a “ Two promises , A response ”.
4. Accept: Acceptor received Propose After the request , Without breaking the promise you made before , Accept and persist the current Proposal ID And proposals Value.
5. Learn: Proposer Receive the majority Acceptors Of Accept after , The resolution came into being , Send the resulting decision to all Learners.
chart 7: Paxos The algorithm process
8.3.2 Two promises , A response 
1. Two promises ： No longer accept Proposal ID Less than or equal to the currently requested Prepare request ; No longer accept Proposal ID Less than the currently requested Propose request .
2. A response ： Don't violate previous promises , The reply has been Accept In the past proposal Proposal ID The biggest proposal Value and Proposal ID, If not, null value will be returned .
1. Efficient , Node communication does not need to verify the identity signature .
2. Paxos The algorithm has strict mathematical proof , The system is well designed .
3. Fault tolerance : Less than half of Acceptor invalid 、 Any number of Proposer invalid , Can run . once value The value is determined , Even if less than half of Acceptor invalid , This value can also be obtained , It's not going to change .
1. Engineering practice is difficult , Achieving industrial performance requires engineering optimization to varying degrees , And sometimes the deviation of engineering design will cause the collapse of the whole system .
2. Only applicable to permissionedsystems( The chain of private ), Only fault nodes can be accommodated (fault), No evil nodes (corrupt).
3. a CFT(Crash FaultTolerant), Byzantine fault tolerance is not supported (ByzantineFault Tolerance).
8.6 Problem explanation
Multi-Paxos and Paxos The relationship between ？
—— simple Paxos The algorithm runs through multiple rounds of Prepare/Accept Process to determine a value , We call the whole process a Instance.Multi-Paxos It's through Paxos Algorithm to determine many values , And the order of these values is exactly the same at each node , To sum up, it is to determine a global order .
Raft It started as a consensus algorithm for managing replication logs , It's a protocol for real world applications , It mainly focuses on the implementation and comprehensibility of the agreement .Raft It's a strong consensus agreement reached under a non Byzantine breakdown , Is a consistency algorithm for managing replication logs . It's primarily designed to be easy to understand , Therefore, it has chosen a very simple and clear solution in the conflict handling and other ways .Paxos The effective basic guarantee is ： Any two legal sets , There must be a public member . In addition to improving the performance of the whole system, distributed system also has an important feature of improving the reliability of the system . Providing reliability can be understood as the failure of one or more machines in the system will not make the system unavailable （ Or lose data ）. The key to ensure the reliability of the system is multiple copies （ That is, data needs to be backed up ）, Once there are multiple copies , Faced with the problem of consistency between multiple copies for so long . Consistency algorithm is used to solve the problem of data consistency among multiple copies in distributed environment .
The most famous consistency algorithm in the industry is famous Paxos（Chubby My author once said ： There is only one consistency algorithm in the world , Namely Paxos）.
but Paxos It's notoriously hard to understand , and Raft It is to explore a more understandable consistency algorithm .
Raft The system is required to have at most one at any time Leader, During normal work, only Leader and Followers.
Leader The leader Accept client requests , And to Follower Synchronization request log , When the log is synchronized to most nodes, tell Follower Submission log . All changes to the system go through Leader, A log will be written for each change (Log Entry).Leader The process after receiving the modification request is as follows , This process is called log replication (Log Replication)：
chart 8: Log copy
Follower Follower All nodes are represented by Follower The state of being . If not received Leader The news will become Candidate state . Accept and persist Leader Synchronized logs , stay Leader After the notification log can be submitted , Submission log .
Candidate The candidate Leader A temporary role in the election process . It will send to other nodes “ Canvassing for votes ”, If you get the majority of the tickets, it becomes Leader.
chart 9: The relationship between the three characters
9.2 Algorithm flow
1. Leader Election Leader election ： When Follower Not received within the election timeout Leader The heartbeats of , Convert to Candidate state . To avoid electoral strife , This timeout is a 150~300ms Random number between . generally speaking , stay Raft In the system ：
• Any server can be a candidate Candidate, It sends it to other servers Follower Make a request to elect yourself .
• The other servers agreed , issue OK. Be careful , If in the process , There is one Follower Downtime , There was no request for an election , At this point, candidates can choose themselves , As long as we reach N/2+1 Most of the tickets for , Candidates can still be Leader Of .
• So the candidate becomes Leader leaders , It can tell voters that Follower Issue commands , Such as bookkeeping .
• Notification of bookkeeping by heartbeat .
• Once this Leader collapsed , that Follower One of the candidates , And send out an invitation to vote .
• Follower After consent , It becomes Leader, Continue to undertake accounting and other guidance work .
2. Log Replication Log copy ：Raft The bookkeeping process is completed in the following steps ：
chart 10: Accounting process
For each new transaction , Repeat the process . In the process , In case of network communication failure , bring Leader Can't access most of Follower 了 , that Leader Only those that it can access can be updated normally Follower The server . And most servers Follower Because there is no Leader, They will re elect a candidate as Leader, And then this Leader Dealing with the outside world as a representative , If they are asked to add new transactions , This new Leader Just follow the above steps to inform the majority of Follower. When network communication is restored , The original Leader It becomes Follower, In the lost phase , This old man Leader Any update to is not a confirmation , Must roll back all , Receive new Leader New update of .
1. Than Paxos Algorithms are easier to understand , And it's easier to engineer .
2. Raft And Paxos As efficient as , Efficiency Raft Equivalent to (multi-)Paxos.
3. Apply to permissionedsystems( The chain of private ), Only fault nodes can be accommodated （CFT）, Evil nodes are not supported . The biggest fault tolerant node is (N-1)/2, among N Is the total number of nodes in the cluster . Enhanced Leader The status of , The whole agreement can be divided into two parts ：Leader In time , from Leader towards Follower Synchronous log ;Leader It doesn't work , Choose a new one Leader.
4. Emphasis on legality Leader The uniqueness agreement of , They come directly from Leader Of Degree describes the process of the protocol , Also from the Leader From the point of view of proof of correctness .
1. Only applicable to permissioned systems ( The chain of private ), Only fault nodes can be accommodated (CFT), No evil nodes .
surface 2: Raft and Multi-Paxos Comparison
 Beccuti, J., and Jaag, C. The bitcoin mining game:On the optimality of honesty in proof-of-work consensus mechanism. WorkingPapers (2017).
 From Wikipedia, t. f. e. Paxos (computer science). https://en.wikipedia.org/wiki/Paxos_(computer_science).
 From Wikipedia, t. f. e. Proof of work. https://en.wikipedia.org/wiki/Proof_of_work.
 Lamport, L. Paxos madesimple. https://www.microsoft.com/en-us/research/publication/paxos-
 Team, E. Proof of capacity explained: Theeco-friendly mining algorithm. https://www.coinbureau.com/education/proof-of-capacity-explained/.
 Du Jiangtian . Discussion on Hash Algorithm in workload proof mechanism of blockchain . Computer programming skills and maintenance No.394, 4 (2018), 42–44.
 Yang Yuguang , and Zhang Shuxin . Review and research for consensus mechanism of block chain Information security research 004, 4 (2018), 369–379.
 Liang Bin . from ” Bitcoin mining ” Look at the consensus mechanism of blockchain technology . China financial computer , 9 (2016), 45–46.
 Gan Jun , Li Qiang , Chen Zihao , and juck . The improvement of Byzantine fault tolerant consensus algorithm in blockchain . Computer application , 7 (2019).
 Zhu Chaofan , Guo Jinwei , and Cai Peng . be based on paxos The implementation and optimization of distributed consistency algorithm . Journal of East China Normal University ( Natural science edition ), 10 (2019), 168–177.
 Cheng Shuzhi , Shi Wenxuan , and Liu Li Ting . Overview of blockchain technology .
 Han Xuan , and Liu Yamin . Research on consensus mechanism in blockchain technology . Information network security , 9 (2017).
 Huang Jiacheng , Xu Xinhua , and Wang Shichun . The improvement scheme of the consensus mechanism of entrusting rights and interests proof . Computer application , 7 (2019),2162–2167