Solana验证者(节点)攻略1-交易流程(ENG-CHN)

版权声明(Copyright Notice)

本文翻译自Jito Labs的文章:Solana Validator 101: Transaction Processing。已得到作者的授权。译者为@chainguys。转载请注明作者和译者。

(Copyright © 2021 by Jito Labs, translated by @chainguys)

本翻译文章所展示的一切信息都只是为了学习和交流目的,不能也不应成为任何财务或投资建议。
(All content shown is for communication and learning purposes only; it cannot and should not be viewed as any form of financial or investment advice.)

1. 欢迎词(Hello World)

Welcome to the first article in Jito Labs’ series on the Solana validator. In this article, we’ll do a technical deep dive on the lifecycle of a transaction as it moves from an unsigned transaction to a propagated block.

欢迎阅读Jito Labs关于Solana验证者(节点)系列文章的第一篇。在这篇文章里,我们将从技术角度深入剖析一笔交易的生命周期:从未签名的交易,一直到被广播的区块。

If you want to follow along in code, we’re working off commit cd6f931223. This is the master branch as of November 21, 2021, so validators are likely running an older version of the code on this date.

如果您想对照代码阅读,我们基于的是提交(commit)cd6f931223。这是截至2021年11月21日的主分支(master),因此在该日期验证者们运行的很可能是更旧的代码版本。

There is a lot of activity in this codebase, so always check out the Solana Github for the most up-to-date code. Load it up in your favorite IDE and click through definitions with the article to poke around.

这个代码库更新非常频繁,所以请务必到Solana Github查看最新的代码。您可以把代码加载到自己喜欢的IDE中,对照本文逐个点开定义来探索。

Jito Labs is building MEV infrastructure for Solana. If reverse engineering, hacking on code, or building high performance systems interests you, we’re hiring engineers! Please DM @buffalu__ on Twitter.

Jito Labs正在建设Solana MEV的基础设施。如果您对反向工程,编程,高性能系统感兴趣,请不要忘记我们正在招聘工程师,请推特私信 @buffalu__ 联系。

2. 一笔交易的历程(A whole picture of one transaction)

So you’re aping into some coins on Serum or Mango. You sign the transaction in your wallet and your transaction is quickly confirmed with a notification. What just happened?

现在你从Serum或者Mango买了几个币。你用钱包将交易签名,然后你的交易信息(向全网)通知后迅速获得确认。那么,这中间到底发生了什么?

Solana交易信息流

  1. The dapp builds a transaction to buy some amount of tokens.

  2. The dapp sends the transaction to your wallet (Phantom, Sollet, etc.) to be signed.

  3. The wallet signs the transaction using your private key and sends it back to the dapp.

  4. The dapp takes the signed transaction and uses the sendTransaction HTTP API call to send it to the RPC provider currently configured in the dapp.

  5. The RPC server sends your transaction as a UDP packet to the current and next validators on the leader schedule.

  6. The validator’s TPU receives the transaction, verifies the signature using CPU or GPU, executes it, and propagates it to other validators in the network.

  1. Dapp创建一笔购买代币的交易

  2. Dapp将交易信息传送给你的钱包(如Phantom,Sollet等)进行签名

  3. 钱包用你的私钥给交易签名并传回给dapp

  4. Dapp再将已签名的交易,用sendTransaction HTTP API发送给该Dapp目前所使用的RPC(Remote Procedure Call,即远程过程调用)服务器

  5. RPC服务器将你的交易信息作为一个用户数据协议资料包(UDP packet)发送给领导者列表中当前和下一个验证者

  6. 验证者的TPU收到该笔交易信息,然后用CPU或GPU来验证签名,然后执行,最后向网络中其他验证者广播这笔交易
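As a rough illustration of step 4, the sketch below builds the JSON-RPC body for a sendTransaction call without touching the network. The helper name build_send_transaction_request and the placeholder bytes are hypothetical; a real dapp would pass the serialized transaction returned by the wallet and POST the body to its RPC provider.

```python
import base64
import json


def build_send_transaction_request(signed_tx: bytes, request_id: int = 1) -> str:
    """Return the JSON-RPC body a dapp could POST to its RPC provider."""
    encoded = base64.b64encode(signed_tx).decode("ascii")
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "sendTransaction",
        # The transaction is passed as an encoded string; base64 is requested explicitly.
        "params": [encoded, {"encoding": "base64"}],
    }
    return json.dumps(body)


# Placeholder bytes standing in for a real wallet-signed transaction.
request_body = build_send_transaction_request(b"\x01" * 64)
```

The body is plain JSON over HTTP; it is the RPC server, not the dapp, that later turns the transaction into a UDP packet for the leader.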

Solana has a known leader schedule generated every epoch (~2 days), so the RPC server sends transactions directly to the current and next leader instead of gossiping them randomly, as Ethereum nodes do with the mempool. Read more about the comparison here.

Solana的领导者排期表(leader schedule)每个纪元(epoch,约2天)生成一次,并且是预先公开的,所以RPC服务器会把交易直接发送给当前和下一个“领导者”,而不是像以太坊mempool那样通过gossip随机向全网散播。您可以在此阅读更多关于两者对比的信息。
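The lookup an RPC server performs can be sketched as follows. This is a simplified model, not the validator's code: the schedule is a plain list of hypothetical validator names, the assumption that each leader holds 4 consecutive slots is ours rather than the article's, and wrapping around the list ignores the fact that a new schedule is generated every epoch.

```python
def current_and_next_leader(schedule: list[str], slot: int, slots_per_leader: int = 4):
    """schedule[i] is the (hypothetical) identity of the i-th leader in the epoch."""
    index = (slot // slots_per_leader) % len(schedule)
    following = (index + 1) % len(schedule)  # simplification: wraps within one schedule
    return schedule[index], schedule[following]


schedule = ["validator-A", "validator-B", "validator-C"]
# Slot 5 falls in the second leader's window (slots 4-7 under the assumption above).
targets = current_and_next_leader(schedule, slot=5)
```

Because the schedule is known ahead of time, this is a pure lookup with no network round trip, which is what lets the sender skip mempool-style gossip.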

3. 关于TPU

Let’s go deeper.

让我们再深入一点,聊一聊TPU(Transaction Processing Unit)

验证者和TPU总览

At this point, the RPC service that our dapp is using has received the transaction over HTTP, converted the transaction into a UDP packet, looked up the current leader’s info using the leader schedule, and has sent it to the leader’s TPU.

现在,dapp使用的RPC服务已经收到了HTTP发送过来的交易信息,并且将这些信息打包成了用户数据协议资料包(UDP packet),用“领导者机制”查找到了当前领导者的信息,并成功地将其发送到了领导者的TPU。

So what is this TPU thing? The transaction processing unit (TPU) processes transactions! It leverages message queues (Rust channels) to build a software pipeline consisting of multiple stages. The output of stage n is hooked into stage n + 1.

那到底什么是TPU?所谓的TPU,就是Transaction Processing Unit(交易处理单元)的缩写,顾名思义,TPU就是专门处理交易的!它使用消息队列(message queues,即Rust channels)来建立一个包含多个阶段(multiple stages)的软件管线(software pipeline)。第n个阶段的输出会接入第n+1个阶段。

下面我们将逐步讲解TPU中涉及的各个部分。
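The staged-pipeline idea can be sketched with Python threads and queues standing in for Rust channels. The stage names and the run_stage helper are hypothetical and only illustrate how the output of stage n feeds stage n + 1.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def run_stage(work, inbox: queue.Queue, outbox: queue.Queue) -> threading.Thread:
    """Start a thread that applies `work` to each item and passes the result on."""
    def loop():
        while True:
            item = inbox.get()
            if item is STOP:
                outbox.put(STOP)  # propagate shutdown to the next stage
                return
            outbox.put(work(item))

    thread = threading.Thread(target=loop)
    thread.start()
    return thread


# Three hypothetical stages chained by queues: fetch -> verify -> bank.
q0, q1, q2, q3 = (queue.Queue() for _ in range(4))
threads = [
    run_stage(lambda p: p + ["fetched"], q0, q1),
    run_stage(lambda p: p + ["verified"], q1, q2),
    run_stage(lambda p: p + ["executed"], q2, q3),
]
q0.put(["tx-1"])
q0.put(STOP)
for t in threads:
    t.join()
result = q3.get()
```

Each stage only knows about its inbox and outbox, so stages can run on separate threads and be tuned or replaced independently.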

3.1 读取阶段(FetchStage)

Solana validators are unusual among blockchain nodes in that they rely on UDP packets to communicate with each other. UDP is connectionless (no handshake) and doesn’t guarantee order or delivery of messages. This removes potential TCP back-pressure issues and connection management overhead while keeping server and networking setup simple for validator operators. However, it limits transaction and gossip message sizes, makes validators easier to DoS, and may cause transactions and other packets to get dropped.

与其他区块链相比,Solana节点最独特的地方就是它依赖用户数据协议资料包(UDP packets)来通信。UDP是无连接的(不需要握手/交握),也不保证消息有序和传输成功。这样做避免了可能的TCP背压(TCP back-pressure)问题和连接管理的开销,同时也可以让节点运行者更加方便地配置服务器和网络。但是,这也限制了交易和gossip消息的大小,使验证者更容易受到DoS攻击,同时也可能导致交易和其他数据包在传输时丢失。

The FetchStage has dedicated sockets for each packet type:

  • tpu: normal transactions (Serum orders, NFT minting, token transfers, etc.)
  • tpu_vote: votes
  • tpu_forwards: if the current leader can’t process all transactions, it forwards unprocessed packets to the next leader on this port.

读取阶段对不同类型的信息包有专门的套接字(socket):

  • tpu: 普通交易 (Serum 订单, NFT minting, 代币转移等)
  • tpu_vote: 投票
  • tpu_forwards: 如果当前领导者(leader)无法处理完所有交易,它会通过这个端口把未处理的信息包转发给下一个领导者(leader)

Packets are batched into groups of 128 and forwarded to the SigVerifyStage.

The sockets mentioned above are created here and stored in the ContactInfo struct. ContactInfo contains information on the node and is shared with the rest of the network through a gossip protocol.

信息包会按每128个一组进行批处理,然后传输到签名验证阶段(SigVerifyStage)。上文中提到的套接字(socket)在此处创建,并存储在ContactInfo结构体(struct)中。ContactInfo包含节点的信息,并通过gossip协议(gossip protocol)与网络中的其他节点共享。
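The batching step can be sketched as a small generator. Only the group size of 128 comes from the article; the function name batch_packets and the handling of a final partial batch are assumptions made for illustration.

```python
from typing import Iterable, Iterator

BATCH_SIZE = 128  # group size quoted in the article


def batch_packets(packets: Iterable[bytes], size: int = BATCH_SIZE) -> Iterator[list[bytes]]:
    """Yield lists of up to `size` packets, preserving arrival order."""
    batch: list[bytes] = []
    for packet in packets:
        batch.append(packet)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch (an assumption of this sketch)


# 300 dummy one-byte packets standing in for received UDP datagrams.
batches = list(batch_packets(bytes([i % 256]) for i in range(300)))
```

Handing packets downstream in groups amortizes per-item overhead, which matters when the next stage verifies signatures in bulk.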

3.2 签名验证阶段(SigVerifyStage)

The SigVerifyStage verifies signatures on packets and marks them for discard if they fail. As a reminder, as far as the software is concerned these are still packets with some metadata; it doesn’t yet know whether they are transactions or not.

签名验证阶段会验证信息包上的签名,如果验证失败,就把该信息包标记为丢弃(discard)。请注意,对软件来说,这些仍然只是带有一些元数据(metadata)的信息包,它此时还不知道这些是不是交易。
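The mark-for-discard behaviour can be sketched as below. The Packet class and the toy_verify check are hypothetical: a SHA-256 comparison stands in for the ed25519 signature verification a real validator performs, purely so the example runs with the standard library.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Packet:
    payload: bytes
    signature: bytes
    discard: bool = False  # metadata flag, set instead of deleting the packet


def toy_verify(packet: Packet) -> bool:
    # Stand-in check only: real validators verify ed25519 signatures.
    return packet.signature == hashlib.sha256(packet.payload).digest()


def sigverify(batch: list[Packet], verify=toy_verify) -> list[Packet]:
    """Flag packets that fail verification; the batch keeps its shape."""
    for packet in batch:
        if not verify(packet):
            packet.discard = True
    return batch


good = Packet(b"transfer", hashlib.sha256(b"transfer").digest())
bad = Packet(b"transfer", b"\x00" * 32)
checked = sigverify([good, bad])
```

Flagging rather than removing keeps batches a fixed shape, so later stages can simply skip anything marked for discard.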

If you have a GPU installed, the validator will use it for signature verification. This stage also contains logic for shedding excess packets under high load, which decides what to drop based on IP address.

如果你安装了GPU,验证者会用它来做签名验证。这个阶段还包含在高负载下处理过量信息包的逻辑,它会依据IP地址来丢弃信息包。

Votes and normal packets run in two separate pipelines. Packets arriving in the vote pipeline are filtered using simple logic to determine if the packet is a vote or not — this should help prevent DOS on the vote pipeline. This is another mitigation to avoid another network outage similar to the one earlier this year.

投票和正常的信息包会分别在两个独立的管线上处理。进入投票管线的信息包会经过简单的逻辑过滤,以判定它是不是一笔投票,这有助于防止针对投票管线的DoS攻击。这也是又一项缓解措施,用来避免再次出现类似今年(2021年)早些时候的网络中断。

3.3 银行阶段(BankingStage)

This is the meat of the validator and the hardest section to understand, at least from what we’ve reviewed so far 😉

这是验证者的核心部分,也是最难理解的部分(至少就我们目前读过的代码而言)😉

There are three packet types being sent to this stage:

  • Verified gossip vote packets
  • Verified tpu_vote packets
  • Verified tpu packets (normal transactions)

有三种已验证的信息包会被发送到这个阶段:

  • 已验证的gossip vote信息包(Verified gossip vote packets)
  • 已验证的tpu_vote信息包(Verified tpu_vote packets)
  • 已验证的tpu信息包(Verified tpu packets/normal transactions)

Each packet type has its own processing thread; the normal transaction pipeline has two threads.

每一种类型的信息包都有对应的线程来处理。针对普通的交易,共有两个线程进行处理。

When the current node becomes a leader, the packet goes through the following steps:

  1. Deserialize the packet into a SanitizedTransaction.
  2. Run the transaction through a Quality of Service (QoS) model. This selects transactions to execute depending on a few properties (signature, length of instruction data bytes, and some type of cost model based on access patterns for a given program id).
  3. The pipeline then grabs a batch of transactions to be executed. This group of transactions is greedily selected to form a parallelizable entry (a group of transactions that can be executed in parallel). To do this, it uses the isWritable flag that clients set on each account when building transactions, together with a per-account read-write lock, to rule out data races.
  4. Transactions are executed.
  5. The results are sent to the PohService and then forwarded to the broadcast stage to be shredded (packetized) and propagated to the rest of the network. They are also saved to the bank and accounts database.

若当前节点成为一个“领导者”,则信息包会经历以下“旅程”:

  1. 将信息包反序列化(Deserialize),生成一个纯净交易(SanitizedTransaction)

  2. 在服务质量(Quality of Service/QoS)模型下运行(Run)这笔交易。这将会根据一些特点(properties)(如签名,指令数据长度,以及基于特定程序id访问模式的费用模型)来选择执行的交易。

  3. 接着管线会取出一批待执行的交易。这组交易通过贪婪选择(greedily selected)组成一个可并行的条目(parallelizable entry,即一组可以并行执行的交易)。为此,管线会使用客户端在构建交易时设置的“是否可写(isWritable)”标记,再配合每个账户的读写锁(per-account read-write lock),来确保不会出现数据竞争(data race)。

  4. 交易被执行(executed)。

  5. 结果先被传送至PohService,然后被转发到广播阶段(broadcast stage)切分成碎片(shredded,即分包)并向全网传播。这些结果也会被保存(saved)到银行和账户数据库中(bank and accounts database)。

Transactions that aren’t processed will be buffered and retried on the next iteration. If a validator receives transactions before it’s time to produce blocks, or its time leading the network is up, it will forward the unprocessed transactions to the TPU forward port of leader n + 1.

那些没有被处理的交易会被缓冲(buffered),并在下一次迭代(iteration)中重试。如果一个验证者在轮到它出块之前就收到了交易,或者它担任“领导者”的时间已经用完,它就会把未处理的交易转发给下一个领导者(leader n + 1)的TPU转发端口(TPU forward port)。

3.4 并行处理(Parallel Processing)

One of Solana’s spicy features is the runtime’s ability to parallelize transaction processing enabled by the programming model’s separation of code and state (known as Sealevel). Transaction instructions are required to explicitly mark what accounts will be read from and written to.

Solana的一个突出特点是运行时(runtime)能够并行处理交易,这得益于其编程模型(programming model)将代码和状态分离(即Sealevel)。交易指令必须显式标明将要读取和写入哪些账户。

This can be a huge pain in the ass for developers, but it’s for a good reason — it allows the BankingStage to execute batches of transactions in parallel without running into concurrency issues.

这对开发者来说可能很痛苦(因为开发难度大),但这样做是有益的:它使得银行阶段(BankingStage)可以并行执行成批的交易(batches of transactions),而不会遇到并发问题(concurrency issues)。

Transactions are processed in batches. The entries produced by each batch contain a list of transactions that can be executed in parallel. Read-write locking is therefore needed both within each batch and across the batches being processed in other parallel pipelines.

交易是分批处理的。每个批次产生的条目(entries)包含一组可以并行执行的交易。因此,不仅每个批次内部需要读写锁(read-write lock/RW lock),与其他并行管线中正在处理的批次之间也需要读写锁。

Let’s walk through a few examples. In reality, transactions will access multiple accounts. For simplicity’s sake, imagine each box is a transaction that is accessing one account.

接下来就让我们一起再进一步研究几个例子。在现实中,交易会涉及多个账户。但此时,简单起见,我们先认为下图一个盒子就代表一次交易,也只对应一个账户。

多个批次可以同时被处理。当底部的管线没能拿到(grab)账户C的读写锁时,相关交易就会被缓冲,留到下一次迭代。

In the above example, account C is the only account shared between the two batches in the two separate pipelines. One batch is attempting to read from C and another attempting to write. In order to avoid any race conditions, the validator uses a read-write lock to ensure no data corruption. In this case, the top pipeline locks C first and starts execution. At the same time, the bottom pipeline attempts to execute the batch, but C is already locked, so it executes everything except for C and saves it for the next iteration.

在上一个例子(图)中,C是两条并行管线中的两个批次之间唯一共享的账户。一个批次正尝试读取C,而另一个则尝试写入C。为了防止任何竞态条件(race conditions),验证者使用读写锁来确保数据不被破坏。具体来说,顶部的管线会先将C账户“上锁”然后开始执行。同时,底部的管线也会尝试执行它的批次,但是C账户此时已经上锁,所以它会执行除C之外的所有交易,而涉及C的交易则留到下一次迭代时执行。

被执行的批次本身也需要是可并行的(parallelizable)。

In this example, multiple transactions read from and write to account C. Since the entries propagated to block producers need to support parallel execution, the batch takes the first access to C plus any later transactions that still satisfy the RW lock rules.

在这个例子中,有多笔交易需要读取和写入账户C。由于广播给区块生产者(block producers)的条目(entries)需要支持并行执行,所以该批次会选取对C的第一次访问,以及其后仍然符合读写锁规则的其他交易。
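The greedy selection described in this section can be sketched as follows. This is a simplified model under stated assumptions: the Tx class, the account names, and select_parallel_batch are hypothetical, and locks live in plain sets rather than in the validator's real per-account lock table.

```python
from dataclasses import dataclass, field


@dataclass
class Tx:
    name: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def select_parallel_batch(pending: list[Tx]) -> tuple[list[Tx], list[Tx]]:
    """Greedily pick transactions whose account locks do not conflict."""
    read_locked: set = set()
    write_locked: set = set()
    selected, retry = [], []
    for tx in pending:
        conflict = (
            tx.writes & (read_locked | write_locked)  # writers need exclusive access
            or tx.reads & write_locked                # readers only conflict with writers
        )
        if conflict:
            retry.append(tx)  # buffered for the next iteration
        else:
            selected.append(tx)
            read_locked |= tx.reads
            write_locked |= tx.writes
    return selected, retry


pending = [
    Tx("t1", reads={"A"}, writes={"C"}),
    Tx("t2", reads={"C"}),               # conflicts with t1's write to C
    Tx("t3", reads={"A"}, writes={"B"}),  # shared read of A is fine
    Tx("t4", writes={"C"}),              # conflicts with t1's write to C
]
selected, retry = select_parallel_batch(pending)
```

Here t1 and t3 form one parallelizable entry, while t2 and t4 wait for the next iteration. Multiple readers of the same account never block each other; only a write makes an account exclusive.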

3.5 历史证明服务(PohService)

Proof of History (PoH) is Solana’s technique for proving the passage of time. It is similar to a verifiable delay function (VDF). Read more about it here.

历史证明(PoH)是Solana用来证明时间流逝的技术。它与可验证延迟函数(VDF: verifiable delay function)类似,具体可以参考此处。

The PohService is responsible for generating ticks, which are units of time smaller than a slot (1 slot = N ticks). The service is pinned to a processor core and runs a hash loop like:

历史证明服务负责产生“滴答”(ticks),这是比“槽”(slot)更小的时间单位(1 slot = N ticks)。该服务会被绑定在一个处理器核心上,并运行如下的哈希循环:

import hashlib

output = b"solana summer"
while True:  # the loop never stops; each hash feeds the next
    output = hashlib.sha256(output).digest()

Each hash is computed from the previous one until a Record is received from the BankingStage, at which point the current hash and the mixin (a hash of all transactions in the batch) are combined. These records are converted to Entries so that they can be broadcast to the network via the BroadcastStage. The pseudo-code ends up looking something like:

在从银行阶段(BankingStage)收到(received)一份记录(Record)之前,每个哈希都由上一个哈希计算而来;收到记录时,当前哈希会与mixin(该批次中所有交易的哈希)合并。这些记录会被转化成条目(Entries),以便通过广播阶段(BroadcastStage)向全网广播。整个过程的伪代码如下:

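import hashlib
from queue import Empty, Queue

output = b"solana summer"
record_queue = Queue()  # the BankingStage pushes mixin hashes (bytes) here
while True:
    try:
        # A record is waiting: mix it into the hash chain.
        output = hashlib.sha256(output + record_queue.get_nowait()).digest()
    except Empty:
        # No record: keep hashing the previous output.
        output = hashlib.sha256(output).digest()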
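To show why this chain works as a record of ordering, here is a bounded sketch that generates a short chain and then re-checks it. The generate and verify helpers and the step encoding (None for a plain hash, bytes for a mixin) are assumptions made for illustration and are not the validator's entry format.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate(seed: bytes, steps: list) -> list[bytes]:
    """steps holds None for a plain hash step or bytes for a mixin."""
    output, chain = seed, []
    for mixin in steps:
        output = sha256(output + mixin) if mixin is not None else sha256(output)
        chain.append(output)
    return chain


def verify(seed: bytes, steps: list, chain: list[bytes]) -> bool:
    """Recompute the chain; a reordered or altered step changes every later hash."""
    return generate(seed, steps) == chain


seed = b"solana summer"
steps = [None, None, sha256(b"tx batch 1"), None]
chain = generate(seed, steps)
```

Generation is inherently sequential because each hash needs the previous one, whereas a verifier that already has the intermediate hashes can check segments independently.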

3.6 广播阶段(BroadcastStage)

This stage is responsible for broadcasting Entries generated by PohService to the rest of the network. These entries are converted to Shreds, which represent the smallest unit of a block and then sent to the rest of the network using a block propagation technique called Turbine. The tl;dr is that each node receives a partial view of the block from its parent node (up to 64KB packet size), and shares it with its own children and so on. It looks something like this:

这个阶段负责将历史证明服务(PohService)所产生的项(条目)(Entries)向全网广播。这些项(条目)先会被转化成代表一个区块最小单位的碎片(Shreds),然后再使用名为“涡轮”(Turbine)的区块传播技术向全网广播。简单来说,每个节点都只从父节点接收到区块的部分信息(信息包大小上限为64KB),再把它分享给自己的子节点,依此类推。整个过程看起来像:

涡轮
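The parent-to-children fan-out can be sketched with a generic k-ary tree. This is only a model of the idea: the numbering scheme, the fanout value, and both helper names are assumptions, and the real Turbine layout (how nodes are ordered and how many children each has) is not described in this article.

```python
def children_of(index: int, fanout: int, total: int) -> list[int]:
    """Children of node `index` when nodes 0..total-1 fill a tree level by level."""
    first = index * fanout + 1
    return [child for child in range(first, first + fanout) if child < total]


def hops_from_root(index: int, fanout: int) -> int:
    """Number of relays between the root (node 0) and node `index`."""
    hops = 0
    while index > 0:
        index = (index - 1) // fanout  # step up to the parent
        hops += 1
    return hops


kids = children_of(0, fanout=3, total=10)
deepest = hops_from_root(9, fanout=3)
```

With a fan-out of k, the number of hops grows only logarithmically with the number of nodes, so no single node has to send the block to everyone.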

On the receiving side, validators listen for shreds and convert them back into entries and then blocks. The details of that will be in a future article.

在接收端,验证者会监听碎片(shreds),并将它们重新转化为项(条目)(entries),进而还原成区块(blocks)。具体细节会在之后的文章中详述。

4. 感谢(Thank You)

As you may be able to tell, the validator is a very complex piece of engineering and we have just scratched the surface. It takes inspiration from operating systems and one can tell there are lots of systems-level tricks the team uses to increase the performance. Big kudos to the Solana engineering team and contributors for this amazing piece of engineering.

如您所知,验证者/节点是网络工程中非常复杂的一环,目前我们也只是浮光掠影地介绍了一番。这其中有来自操作系统技术的灵感,开发团队也使用了大量底层系统级别的技术来提升性能。让我们一起向Solana工程团队致以崇高敬意!

Stay tuned for our next pieces where we go more in depth into the challenges of building a more democratic MEV system on Solana.

请密切关注我们下一篇文章,我们将会详述如何在Solana上建设一个更加民主(democratic)的MEV系统。

Thank you to the Solana engineers for help in Discord and thank you for reading. Please follow @jito_network, @buffalu__, and @segfaultdoctor on Twitter to stay tuned for our next articles.

向在Discord频道帮助我们的Solana工程师们致谢,也感谢您的阅读!烦请您关注我们的推特账号:@jito_network、@buffalu__ 和 @segfaultdoctor,以便第一时间看到我们的后续文章。

If any of this is interesting to you, we’re hiring! Please DM @buffalu__ on Twitter.

另外,我们正在招聘!如果您有兴趣,请在推特上私信 @buffalu__
