
Blockchain Primer: Ethereum Source Code Analysis, the fast sync Algorithm (Part 1)


In the second half of 2018, the blockchain industry has been gradually shedding the hype of its early days and returning to rationality. On the surface, demand for blockchain talent and the salaries it commands appear to be falling; in reality, it is precisely the receding of the initial bubble that has refocused attention on the technology that actually underpins blockchains.
This PR aggregates a lot of small modifications to core, trie, eth and other packages to collectively implement the eth/63 fast synchronization algorithm. In short, geth --fast.
Algorithm
The goal of the fast sync algorithm is to exchange processing power for bandwidth usage. Instead of processing the entire block-chain one link at a time, and replaying all transactions that ever happened in history, fast syncing downloads the transaction receipts along the blocks, and pulls an entire recent state database. This allows a fast synced node to still retain its status as an archive node containing all historical data for user queries (and thus not influence the network’s health in general), but at the same time to reassemble a recent network state at a fraction of the time it would take full block processing.
An outline of the fast sync algorithm would be:
Similarly to classical sync, download the block headers and bodies that make up the blockchain
Similarly to classical sync, verify the header chain’s consistency (PoW, total difficulty, etc.)
Instead of processing the blocks, download the transaction receipts as defined by the header
Store the downloaded blockchain, along with the receipt chain, enabling all historical queries
When the chain reaches a recent enough state (head - 1024 blocks), pause for state sync:
Retrieve the entire Merkle Patricia state trie defined by the root hash of the pivot point
For every account found in the trie, retrieve its contract code and internal storage state trie
Upon successful trie download, mark the pivot point (head - 1024 blocks) as the current head
Import all remaining blocks (1024) by fully processing them as in the classical sync
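The outline above maps fairly directly onto code. Below is a minimal, runnable Go sketch of the flow; every type and helper in it is a hypothetical stand-in for illustration only, not the actual go-ethereum API (the real implementation spans the core, trie and eth packages touched by this PR).

```go
// fastsync_outline.go -- an illustrative sketch of the fast sync outline.
package main

import "fmt"

const pivotOffset = 1024 // blocks behind the head that still get full processing

type header struct {
	number      uint64
	stateRoot   string // root of the Merkle Patricia state trie
	receiptHash string // hash committing to this block's receipts
}

// Stubs standing in for the real network and database operations.
func downloadAndStoreReceipts(receiptHash string) {}
func downloadStateTrie(stateRoot string)          {}
func processBlock(h header)                       {}

func fastSync(headers []header) error {
	if len(headers) <= pivotOffset {
		return fmt.Errorf("chain too short: fall back to classical sync")
	}
	pivot := len(headers) - 1 - pivotOffset

	// Steps 1-4: the header chain is assumed downloaded and verified
	// (PoW, total difficulty) as in classical sync; here we fetch and
	// store the receipts each header commits to, instead of
	// re-executing every transaction in history.
	for _, h := range headers[:pivot+1] {
		downloadAndStoreReceipts(h.receiptHash)
	}

	// Steps 5-7: at the pivot, pull the entire state trie plus every
	// account's contract code and storage trie, then adopt the pivot
	// block as the current head.
	downloadStateTrie(headers[pivot].stateRoot)

	// Step 8: import the remaining ~1024 blocks with full classical
	// processing on top of the freshly downloaded state.
	for _, h := range headers[pivot+1:] {
		processBlock(h)
	}
	return nil
}

func main() {
	chain := make([]header, 2048) // toy chain; the pivot lands at block 1023
	for i := range chain {
		chain[i] = header{number: uint64(i)}
	}
	fmt.Println(fastSync(chain)) // <nil> on success
}
```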
Analysis
By downloading and verifying the entire header chain, we can guarantee with all the security of the classical sync, that the hashes (receipts, state tries, etc) contained within the headers are valid. Based on those hashes, we can confidently download transaction receipts and the entire state trie afterwards. Additionally, by placing the pivoting point (where fast sync switches to block processing) a bit below the current head (1024 blocks), we can ensure that even larger chain reorganizations can be handled without the need of a new sync (as we have all the state going that many blocks back).
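In code, the pivot placement itself is trivial; the sketch below (a hypothetical helper mirroring the pivotOffset constant above, not the go-ethereum API) just makes the reorg-safety margin explicit.

```go
package main

import "fmt"

// choosePivot puts the fast sync pivot 1024 blocks behind the current
// head: everything after it is processed fully, so any reorganization
// up to that depth can be replayed from local state without a new sync.
func choosePivot(head uint64) uint64 {
	const reorgSafetyDepth = 1024
	if head < reorgSafetyDepth {
		return 0 // chain younger than the margin: just process every block
	}
	return head - reorgSafetyDepth
}

func main() {
	fmt.Println(choosePivot(1_500_000)) // state sync targets block 1498976
}
```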
Caveats
The historical block-processing based synchronization mechanism has two (roughly equally costly) bottlenecks: transaction processing and PoW verification. The baseline fast sync algorithm successfully circumvents the transaction processing, skipping the need to iterate over every single state the system was ever in. However, verifying the proof of work associated with each header is still a notably CPU-intensive operation.
However, we can notice an interesting phenomenon during header verification. With a negligible probability of error, we can still guarantee the validity of the chain, only by verifying every K-th header, instead of each and every one. By selecting a single header at random out of every K headers to verify, we guarantee the validity of an N-length chain with the probability of (1/K)^(N/K) (i.e. we have 1/K chance to spot a forgery in K blocks, a verification that’s repeated N/K times).
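As a concrete numerical check of this bound (my own arithmetic, using the N=2048, K=100 pair that the PR settles on below): (1/100)^(2048/100) = 10^(-40.96) ≈ 2^(-136), comfortably below the 2^(-128) collision probability that serves as the security target defined next.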
Let’s define the negligible probability Pn as the probability of obtaining a 256 bit SHA3 collision (i.e. the hash Ethereum is built upon): 1/2^128. To honor the Ethereum security requirements, we need to choose the minimum chain length N (below which we verify every header) and the maximum verification batch size K such that (1/K)^(N/K) <= Pn holds; equivalently, (N/K) * log2(K) >= 128. Calculating this for various {N, K} pairs is pretty straightforward, a simple and lenient solution being:
N     K     N     K     N     K     N     K
1024  43    1792  91    2560  143   3328  198
1152  51    1920  99    2688  152   3456  207
1280  58    2048  108   2816  161   3584  217
1408  66    2176  116   2944  170   3712  226
1536  74    2304  125   3072  179   3840  236
1664  82    2432  134   3200  189   3968  246
The above table should be interpreted in such a way that if we verify every K-th header, then after N headers the probability of a forgery is smaller than the probability of an attacker producing a SHA3 collision. It also means that if a forgery is indeed detected, the last N headers should be discarded as not safe enough. Any {N, K} pair may be chosen from the above table; to keep the numbers looking reasonable, we chose N=2048, K=100. This will be fine-tuned later, after being able to observe network bandwidth/latency effects and possibly behavior on more CPU-limited devices.
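The table can be reproduced directly from the constraint. The short Go program below is an illustrative check of my own, not code from the PR: for each N it finds the largest K with (1/K)^(N/K) <= 1/2^128, i.e. (N/K) * log2(K) >= 128.

```go
package main

import (
	"fmt"
	"math"
)

// maxK returns the largest verification batch size K for which spot
// checking one random header per batch keeps the forgery probability
// of an N-header chain, (1/K)^(N/K), at or below the SHA3 collision
// bound 2^-128. In log form the constraint is (N/K)*log2(K) >= 128.
func maxK(n int) int {
	best := 1
	for k := 2; k < n; k++ {
		if float64(n)/float64(k)*math.Log2(float64(k)) >= 128 {
			best = k // constraint still satisfied; remember the largest K
		}
	}
	return best
}

func main() {
	// Reproduces the {N, K} table above, stepping N by 128 as it does.
	for n := 1024; n <= 3968; n += 128 {
		fmt.Printf("N=%4d  K=%3d\n", n, maxK(n))
	}
}
```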
