Geth Source Code Series: Transaction Design and Implementation

2025-11-08 10:08:33

This article is the fourth in the Geth source code series. It covers Ethereum transactions: the design of the transaction mechanism, the transaction lifecycle, and finally how transactions are executed, including the interaction between the execution layer and the consensus layer.

1. Introduction to Transactions

The Ethereum execution layer can be seen as a transaction-driven state machine, where transactions are the only way to modify the state. Transactions can only be initiated by EOAs and will include a signature from the private key. After execution, the state of the Ethereum network is updated. The simplest transaction on the Ethereum network is the transfer of ETH from one account to another.

Note that as EIP-7702 lets EOAs delegate to contract code, the distinction between contract accounts and EOAs will gradually blur. In the current version, however, only EOAs controlled by private keys can initiate transactions.

Ethereum supports different types of transactions. When the Ethereum mainnet was first launched, it only supported one type of transaction. Subsequently, as Ethereum continued to upgrade, various types of transactions were gradually supported. The current mainstream transaction type supports dynamic fees through EIP-1559, with most users submitting this type of transaction. EIP-4844 introduced a cheaper data storage option for Layer 2 or other off-chain scaling solutions, while the latest Pectra upgrade introduced a transaction format through EIP-7702 that can extend EOAs into contracts.

As Ethereum evolves, it may support other transaction types in the future, but the overall processing flow for transactions will not change significantly. All transactions need to go through the process of transaction submission → transaction validation → entering the transaction pool → transaction propagation → packaging into blocks.

2. Evolution of Transaction Structure

Since the launch of the Ethereum mainnet, the transaction structure has undergone four major changes, laying a foundation for security and scalability, allowing Ethereum to add transaction types at a low cost in the future.

Preventing Cross-Chain Replay Attacks

The initial transaction structure is shown below. RLP(...) indicates that the transaction fields are RLP-encoded before being propagated and processed:

RLP([nonce, gasPrice, gasLimit, to, value, data, v, r, s])

The biggest issue with this structure is that it is not associated with a chain. Transactions generated on the mainnet could be executed on other chains arbitrarily. Therefore, EIP-155 embedded the chainId (e.g., mainnet ID=1) in the signature v value to isolate transactions across different chains, ensuring that transactions on each chain cannot be replayed on others.
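
The EIP-155 encoding of the chain ID into v can be illustrated with a minimal sketch. EIP-155 defines v = recoveryID + chainId × 2 + 35, where recoveryID is 0 or 1. The function names below are hypothetical and are not Geth's API:

```go
package main

import "fmt"

// eip155V computes the legacy signature v value defined by EIP-155:
// v = recoveryID + chainID*2 + 35, where recoveryID is 0 or 1.
func eip155V(chainID uint64, recoveryID uint8) uint64 {
	return uint64(recoveryID) + chainID*2 + 35
}

// chainIDFromV recovers the chain ID from v. Pre-EIP-155 signatures use
// v = 27 or 28 and carry no chain ID; those are reported with ok == false.
func chainIDFromV(v uint64) (chainID uint64, ok bool) {
	if v < 35 {
		return 0, false
	}
	return (v - 35) / 2, true
}

func main() {
	v := eip155V(1, 0) // mainnet (chain ID 1), recovery ID 0
	id, ok := chainIDFromV(v)
	fmt.Println(v, id, ok) // prints "37 1 true"
}
```

A node on another chain derives a different chain ID from v, so the signature check against its own chain ID fails and the transaction cannot be replayed there.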

Involved EIPs:

EIP-155

Standardization of Transaction Expansion

With the development of Ethereum, the initial transaction format could no longer meet the requirements of some scenarios, necessitating the addition of new transaction types. However, if transaction types are added arbitrarily, it could lead to management complexity and a lack of standardization. EIP-2718 defined the format for subsequent transactions, primarily defining the TransactionType || TransactionPayload structure, where:

TransactionType defines the class of transaction types, allowing for up to 128 types, which is sufficient for new transaction types.
TransactionPayload defines the data format for transactions, currently using RLP for data encoding, with potential upgrades to SSZ or other encodings in the future.
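
As a rough illustration of the TransactionType || TransactionPayload envelope, the sketch below classifies an encoded transaction by its first byte. EIP-2718 reserves type bytes in the range 0x00 to 0x7f, while a legacy transaction is a bare RLP list whose first byte is at least 0xc0. The function name classifyEnvelope is hypothetical; Geth's real decoding lives in Transaction.UnmarshalBinary:

```go
package main

import (
	"errors"
	"fmt"
)

// classifyEnvelope inspects the first byte of an encoded transaction.
// It returns whether the transaction is typed, its type byte, and the payload.
func classifyEnvelope(enc []byte) (typed bool, txType byte, payload []byte, err error) {
	if len(enc) == 0 {
		return false, 0, nil, errors.New("empty transaction encoding")
	}
	switch first := enc[0]; {
	case first >= 0xc0:
		// Legacy: the whole input is an RLP list, there is no type prefix.
		return false, 0, enc, nil
	case first <= 0x7f:
		// Typed: one type byte followed by the type-specific payload.
		return true, first, enc[1:], nil
	default:
		return false, 0, nil, errors.New("first byte is neither a type nor an RLP list prefix")
	}
}

func main() {
	// Illustrative bytes only, not a complete transaction.
	typed, txType, payload, err := classifyEnvelope([]byte{0x02, 0xc0})
	fmt.Println(typed, txType, len(payload), err)
}
```

Because the two ranges do not overlap, a decoder can tell legacy and typed transactions apart without any extra framing.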

This upgrade was completed in the Berlin upgrade. In addition to EIP-2718, this upgrade also introduced the Access List transaction type through EIP-2930, which allows users to pre-declare the contracts and storage they need to access in the transaction, reducing gas consumption during execution.

Involved EIPs:

EIP-2718
EIP-2930

Transformation of Ethereum's Economic Model

In the London upgrade, EIP-1559 introduced the Base Fee mechanism. Because the Base Fee is burned, it slows the net issuance of ETH and can even make ETH deflationary. Validators participating in staking can also earn additional income through tips (maxPriorityFeePerGas). EIP-1559 transactions inherit the Access List mechanism and are now the most common type of transaction. Furthermore, after the Paris upgrade (the Merge), Ethereum transitioned from PoW to PoS, rendering the previous mining economic model obsolete and entering the staking era.

Additionally, EIP-1559 introduced a target mechanism to dynamically adjust the Base Fee, which effectively gives Ethereum a form of load balancing. The target is half of the block Gas Limit. While blocks use more gas than the target, the Base Fee keeps rising, which encourages users to defer non-urgent transactions away from congested periods, alleviating overall congestion on the chain and improving user experience.

Involved EIPs:

EIP-1559

Adding Various Extended Transactions

After EIP-2718 and EIP-1559 defined the standards for extended transactions and the economic model, new transaction types began to be added. In the recent two upgrades, EIP-4844 and EIP-7702 were introduced. The former added the Blob transaction type, providing an ideal storage solution for off-chain scaling, with large space and low cost, and also featuring an economic model and load mechanism similar to EIP-1559. EIP-7702 allows EOAs to be transformed into smart contract accounts controlled by private keys, preparing for the large-scale adoption of account abstraction.

Involved EIPs:

EIP-4844
EIP-7702

3. Transaction Module Architecture

As the input to the Ethereum state machine, almost all main processes revolve around transactions. Before entering the transaction pool, transactions need to be validated for their format and signature. After entering the transaction pool, they need to be propagated among different nodes, selected by block-producing nodes from the transaction pool, executed on the EVM, and modify the state database, ultimately being packaged into blocks and transmitted between the execution layer and consensus layer.

For block-producing nodes and non-block-producing nodes, the transaction processing flow will differ slightly. Block-producing nodes are responsible for selecting transactions from the transaction pool, packaging them into blocks, and updating the local state database. Non-block-producing nodes only need to re-execute the transactions in the latest synchronized block to update their local state.

Types of Transactions

Currently, Ethereum supports a total of five types of transactions. The main structures of these transactions are similar, and different transactions are distinguished by the transaction type field, with some extended fields used for specific purposes.

LegacyTxType: The original format in use since genesis, based on a first-price auction model (users manually set gasPrice). Since EIP-155, the chainId is embedded by default to prevent cross-chain replay attacks. Its usage on the Ethereum mainnet is now relatively low; Ethereum remains compatible with this transaction type, but it is expected to be phased out gradually.
AccessListTxType: Pre-declares ("pre-warms") the addresses and storage slots a transaction will access, which can reduce the gas cost of those accesses. This feature has been inherited by subsequent transaction types; transactions of this type itself are relatively few.
DynamicFeeTxType: A transaction type that updates Ethereum's economic model, introducing the Base Fee and target mechanism and inheriting the Access List feature. This is currently the most mainstream transaction type.
BlobTxType: A transaction type specifically for off-chain scaling, allowing transactions to carry large amounts of low-cost data through a blob structure, reducing the cost of off-chain scaling solutions. It inherits the Access List and Dynamic Fee features, and the blobs in the transaction have a separate billing mechanism similar to EIP-1559.
SetCodeTxType: Allows an EOA to delegate to contract code so that it behaves like a contract account (the delegation can later be revoked through another transaction), with the corresponding contract code executing in the context of the EOA. It inherits the Access List and Dynamic Fee features.

Transaction Lifecycle

Once a transaction is packaged into a block, it has completed the modification of state data, and its lifecycle can be considered complete. During this process, a transaction will go through four phases:

Transaction Validation: Transactions submitted by EOAs undergo a series of basic validations before being added to the transaction pool.
Transaction Broadcasting: Newly submitted transactions to the transaction pool are broadcast to the transaction pools of other nodes.
Transaction Execution: Block-producing nodes select transactions from the transaction pool for execution.
Transaction Packaging: Transactions are packaged into blocks in a specific order (first distinguishing whether they are local transactions, then by gas fee size), ignoring those that fail validation.

Transaction Pool

The transaction pool is a temporary storage area for transactions. Before being packaged, transactions are stored in the transaction pool, and transactions in the pool are synchronized to other nodes while also synchronizing transactions from other nodes' transaction pools. Transactions submitted by users first enter the transaction pool, then trigger the consensus process through the consensus layer, driving transaction execution and packaging into blocks.

Currently, there are two types of implementations for the transaction pool:

Blob Transaction Pool (Blob TxPool)
Other Transaction Pool (Legacy TxPool)

Since Blob transactions carry data that is processed differently from other transactions, a separate transaction pool is used for processing. Although other types of transactions have inconsistent types, the synchronization and packaging processes between different nodes are fundamentally the same, so they are processed in the same transaction pool. Transactions in the transaction pool are submitted by the owners of external EOAs, and there are two ways to submit transactions to the pool:

SendTransaction
SendRawTransaction

With SendTransaction, the client sends an unsigned transaction object, and the node signs it using the private key corresponding to the transaction's from address. With SendRawTransaction, the transaction is signed in advance, and the signed transaction is submitted to the node. The latter is more commonly used; wallets such as MetaMask and Rabby work this way.
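
From the client side, submitting a pre-signed transaction amounts to a JSON-RPC call to eth_sendRawTransaction with the hex-encoded signed bytes as the only parameter. The sketch below only builds the request body; the type and function names are hypothetical, and the hex string is a placeholder rather than a real signed transaction:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a minimal JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string   `json:"jsonrpc"`
	ID      int      `json:"id"`
	Method  string   `json:"method"`
	Params  []string `json:"params"`
}

// buildSendRawTx builds the request body for eth_sendRawTransaction.
// signedTxHex must be the 0x-prefixed hex encoding of an already signed transaction.
func buildSendRawTx(id int, signedTxHex string) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "eth_sendRawTransaction",
		Params:  []string{signedTxHex},
	})
}

func main() {
	body, err := buildSendRawTx(1, "0xabcd") // placeholder payload
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```

Since the private key never leaves the wallet, the node only needs to decode and validate the bytes it receives.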

Taking SendRawTransaction as an example, after the node starts, it will launch an API module to handle various external API requests, with SendRawTransaction being one of the APIs. The source code is in internal/ethapi/api.go:

func (api *TransactionAPI) SendRawTransaction(ctx context.Context, input hexutil.Bytes) (common.Hash, error) {
	tx := new(types.Transaction)
	if err := tx.UnmarshalBinary(input); err != nil {
		return common.Hash{}, err
	}
	return SubmitTransaction(ctx, api.b, tx)
}

4. Core Data Structures

For the transaction module, the core data structures consist of two parts: one part represents the transaction data structure itself, and the other part represents the transaction pool structure for temporarily storing transactions. Since transactions in the transaction pool need to be propagated among different nodes, the implementation in the transaction pool relies on the underlying p2p protocol.

Transaction Structure

Using core/types/transaction.go, the Transaction structure unifies all transaction types:

type Transaction struct {
	inner TxData // The actual transaction data is stored here
	time  time.Time
	// ...
}

TxData is an interface type that defines the property retrieval methods that all transaction types need to implement. However, for transaction types like LegacyTxType, many fields may not exist, so existing fields will be used as substitutes or return empty values:

type TxData interface {
	txType() byte           // Transaction type
	copy() TxData           // Create a deep copy of transaction data
	chainID() *big.Int      // Chain ID, used to distinguish different Ethereum networks
	accessList() AccessList // Access list pre-declaring addresses and storage keys, used to reduce gas consumption (introduced in EIP-2930)
	data() []byte           // Input data for the transaction, used for contract calls or creation
	gas() uint64            // Gas limit, indicating the maximum gas that can be consumed by the transaction
	gasPrice() *big.Int     // Price per unit of gas (for Legacy transactions)
	gasTipCap() *big.Int    // Maximum tip (for EIP-1559 transactions)
	gasFeeCap() *big.Int    // Total fee cap (for EIP-1559 transactions)
	value() *big.Int        // Amount of ETH sent in the transaction
	nonce() uint64          // Transaction nonce, used to prevent replay attacks
	to() *common.Address    // Recipient address; nil if it is a contract creation
	rawSignatureValues() (v, r, s *big.Int)                    // Raw signature values (v, r, s)
	setSignatureValues(chainID, v, r, s *big.Int)              // Set signature values
	effectiveGasPrice(dst *big.Int, baseFee *big.Int) *big.Int // Calculate the actual gas price (considering baseFee)
	encode(*bytes.Buffer) error   // Encode the transaction into a byte stream
	decode([]byte) error          // Decode the transaction from a byte stream
	sigHash(*big.Int) common.Hash // Hash of the transaction to be signed
}

In addition to the required details for each transaction, each newly added transaction has its own extension.

In Blob transactions:

BlobFeeCap: The maximum fee the sender is willing to pay per unit of blob gas, similar to maxFeePerGas, but used only for calculating blob data fees.
BlobHashes: An array of versioned hashes, one per blob, each derived from the blob's KZG commitment. These are stored in the execution layer so that the integrity and authenticity of the blob data can be proven.
Sidecar: Contains the actual blob data and its proofs. This data is not stored as part of the execution layer's chain data; it is retained by the consensus layer only for a limited period and is not part of the transaction encoding included in blocks.

In SetCode transactions:

AuthList: A list of signed authorizations. Each entry is signed by an EOA and designates the contract code that the EOA delegates to, which is how EOAs gain smart contract capabilities.

All transaction types need to implement TxData, and the differentiated processing for each type of transaction will be implemented internally within the transaction type. This interface-oriented approach allows for easy addition of new transaction types without modifying the current transaction processing flow.

Transaction Pool Structure

Similar to the transaction structure, the transaction pool also adopts the same design pattern, using core/txpool/txpool.go's TxPool to manage the transaction pool uniformly, where SubPool is an interface that each specific implementation of the transaction pool must implement:

type TxPool struct {
	subpools []SubPool // Specific implementations of the transaction pool
	chain    BlockChain
	// ...
}
type LegacyPool struct {
	config      Config                       // Transaction pool parameter configuration
	chainconfig *params.ChainConfig          // Blockchain parameter configuration
	chain       BlockChain                   // Blockchain interface
	gasTip      atomic.Pointer[uint256.Int]  // Current minimum gas tip accepted
	txFeed      event.Feed                   // Transaction event publishing and subscription system
	signer      types.Signer                 // Transaction signature validator
	pending     map[common.Address]*list     // Currently processable transactions
	queue       map[common.Address]*list     // Temporarily unprocessable transactions
	// ...
}
type BlobPool struct {
	config  Config                           // Transaction pool parameter configuration
	reserve txpool.AddressReserver           // Address reserver ensuring an account is used by only one subpool
	store   billy.Database                   // Persistent data storage for transaction metadata and blob data
	stored  uint64                           // Size of the useful transaction data currently on disk
	limbo   *limbo                           // Persistent storage for blobs of included but not yet finalized transactions
	signer  types.Signer                     // Transaction signer used for sender recovery
	chain   BlockChain                       // Blockchain interface used to access state
	index   map[common.Address][]*blobTxMeta // Blob transactions grouped by account, sorted by nonce
	spent   map[common.Address]*uint256.Int  // Expenditure tracking per account
	// ...
}

The two transaction pools that currently implement SubPool are:

BlobPool: Used to manage Blob transactions.
LegacyPool: Used to manage all transactions other than Blob transactions.

The reason Blob transactions need to be managed separately from other transactions is that they may carry large amounts of blob data, while other transactions can be managed and synchronized directly in memory. Blob transactions' blob data requires persistent storage, so they cannot use the same management approach as other transactions.

5. Fee Mechanism

Because the EVM is Turing-complete, Ethereum cannot determine in advance whether a program will terminate (the halting problem), so it uses the Gas mechanism to bound execution and prevent certain malicious attacks. Gas also serves as the user's transaction fee. These were the two original purposes of Gas.

After years of development, Gas has become an important component of Ethereum's economic model, controlling the issuance of ETH and helping Ethereum achieve deflation. It can even dynamically adjust the traffic of the Ethereum network, enhancing user experience.

Ethereum's fee mechanism uses Gas to achieve various functions, including maintaining network security and balancing the economic model.

Gas

When processing transactions, every operation executed on the EVM consumes Gas, such as using memory, reading data, writing data, etc. Some operations consume more Gas than others; for example, a plain ETH transfer requires 21,000 Gas. Each transaction must set the maximum amount of Gas it may consume. If the Gas runs out, execution aborts, the state changes made by the transaction are reverted, and the consumed Gas is not refunded. This mechanism is how Ethereum sidesteps the halting problem.

The size of blocks in Ethereum is also limited by Gas rather than by byte size. The total Gas consumed by all transactions in a block cannot exceed the block's Gas limit. Gas is merely a unit of measurement for EVM execution; the fee for each transaction is paid in ETH according to the Gas it consumes. The price of Gas is usually expressed in Gwei, where 1 Ether = 10^9 Gwei.

In the current Ethereum network, the size limit for a block is 36M Gas. However, there is significant community demand to increase the block Gas limit to 60M, which is considered a reasonable choice. This increase would enhance the network's capacity without threatening its security and is currently being tested on the testnet. Meanwhile, some in the community argue that using Gas limits alone to control block size is unreasonable and that byte size limits should be introduced. These discussions are ongoing in the community.

EIP-1559

After the introduction of EIP-1559, the previous GasPrice was split into the Base Fee and the Priority Fee (maxPriorityFeePerGas). The Base Fee is completely burned, which curbs the growth of the ETH supply, while the Priority Fee is paid to the validator that proposes the block. Users can set maxFeePerGas in transactions to cap the total price they pay per unit of Gas.

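For a transaction to be included in a block, maxFeePerGas must be at least the current Base Fee; otherwise the transaction cannot be included and simply waits in the transaction pool (no fee is charged for a transaction that is never included). The Priority Fee actually paid is the smaller of maxPriorityFeePerGas and maxFeePerGas - Base Fee. The actual cost to the user is (Base Fee + Priority Fee) × Gas Used, and any amount reserved above this is refunded to the transaction initiator.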

Base Fee is dynamically adjusted based on the actual Gas usage in the block. The maximum Gas limit for the block is halved to define the target. If the actual usage of the previous block exceeds the target, the Base Fee for the current block increases; if it is below the target, the Base Fee decreases; otherwise, it remains unchanged.

Blob Transaction Fee Mechanism

The fee settlement for Blob transactions is divided into two parts. Execution gas is priced through EIP-1559 like any other transaction, while the blob data is priced through a separate Blob Fee mechanism. The Blob Fee adjusts based on how many blobs recent blocks carried relative to a per-block target (under the original EIP-4844 parameters the target was half of the maximum, 3 of 6 blobs; EIP-7691 in Pectra raised these to a target of 6 and a maximum of 9). There is no separate blob Priority Fee, since a Blob transaction can set the regular Priority Fee to encourage faster packaging.

6. Source Code Analysis of Transaction Processing Flow

Having detailed the design and implementation of the transaction mechanism in Ethereum, we will now analyze the code to introduce the specific implementation of transactions in Geth, including the processing flow of transactions throughout their lifecycle.

Transaction Submission

Whether submitting a transaction through SendTransaction or SendRawTransaction, the SubmitTransaction function in internal/ethapi/api.go is called to submit the transaction to the transaction pool.

In this function, two basic checks are performed on the transaction: one checks whether the gas fee is reasonable, and the other checks whether the transaction meets the EIP-155 specifications. EIP-155 solves the cross-chain transaction replay problem by introducing the chainID parameter in the transaction signature. This check ensures that when the node configuration enables EIP155Required, all transactions submitted to the transaction pool must comply with this standard.

After completing the checks, the transaction is submitted to the transaction pool, with the addition logic implemented in SendTx in eth/api_backend.go.

In the transaction pool, the Filter method matches the transaction to the corresponding transaction pool. Currently, there are two implementations of transaction pools; if it is a blob transaction, it will be placed in BlobPool; otherwise, it will go to LegacyPool.
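
The routing idea can be sketched with a simplified, hypothetical version of the SubPool interface. The types below are illustrative stand-ins, not Geth's definitions; the type bytes 0x00 to 0x04 correspond to the five transaction types described earlier:

```go
package main

import (
	"errors"
	"fmt"
)

// tx is a stand-in for a transaction that only exposes its type byte.
type tx struct{ Type byte }

// subPool is a hypothetical, reduced form of the SubPool interface.
type subPool interface {
	Name() string
	Filter(t tx) bool // reports whether this pool handles the transaction
}

type blobPool struct{}

func (blobPool) Name() string     { return "blobpool" }
func (blobPool) Filter(t tx) bool { return t.Type == 0x03 }

type legacyPool struct{}

func (legacyPool) Name() string { return "legacypool" }
func (legacyPool) Filter(t tx) bool {
	switch t.Type {
	case 0x00, 0x01, 0x02, 0x04: // legacy, access list, dynamic fee, set code
		return true
	}
	return false
}

// route returns the name of the first subpool that accepts the transaction.
func route(pools []subPool, t tx) (string, error) {
	for _, p := range pools {
		if p.Filter(t) {
			return p.Name(), nil
		}
	}
	return "", errors.New("no subpool accepts this transaction type")
}

func main() {
	pools := []subPool{legacyPool{}, blobPool{}}
	for _, typ := range []byte{0x02, 0x03} {
		name, err := route(pools, tx{Type: typ})
		fmt.Println(typ, name, err)
	}
}
```

Keeping the decision inside each subpool's Filter means the coordinating pool does not need to know which transaction types exist.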

At this point, the transaction submitted by the EOA has been placed in the transaction pool, and this transaction will begin to propagate within the transaction pool, entering the subsequent transaction packaging and execution flow.

If a pending transaction is resent before it is packaged, with new gasPrice and gasLimit settings, the original transaction in the pool is replaced by a re-signed transaction carrying the new values. This method can also be used to cancel a transaction that is no longer wanted.

func (api *TransactionAPI) Resend(ctx context.Context, sendArgs TransactionArgs, gasPrice *hexutil.Big, gasLimit *hexutil.Uint64) (common.Hash, error) {
	if sendArgs.Nonce == nil {
		return common.Hash{}, errors.New("missing transaction nonce in transaction spec")
	}
	if err := sendArgs.setDefaults(ctx, api.b, false); err != nil {
		return common.Hash{}, err
	}
	matchTx := sendArgs.ToTransaction(types.LegacyTxType)
	// Before replacing the old transaction, ensure the new transaction fee is reasonable.
	price := matchTx.GasPrice()
	if gasPrice != nil {
		price = gasPrice.ToInt()
	}
	gas := matchTx.Gas()
	if gasLimit != nil {
		gas = uint64(*gasLimit)
	}
	if err := checkTxFee(price, gas, api.b.RPCTxFeeCap()); err != nil {
		return common.Hash{}, err
	}
	// Iterate the pending list for replacement
	pending, err := api.b.GetPoolTransactions()
	if err != nil {
		return common.Hash{}, err
	}
	for _, p := range pending {
		wantSigHash := api.signer.Hash(matchTx)
		pFrom, err := types.Sender(api.signer, p)
		if err == nil && pFrom == sendArgs.from() && api.signer.Hash(p) == wantSigHash {
			// Match. Re-sign and send the transaction.
			if gasPrice != nil && (*big.Int)(gasPrice).Sign() != 0 {
				sendArgs.GasPrice = gasPrice
			}
			if gasLimit != nil && *gasLimit != 0 {
				sendArgs.Gas = gasLimit
			}
			signedTx, err := api.sign(sendArgs.from(), sendArgs.ToTransaction(types.LegacyTxType))
			if err != nil {
				return common.Hash{}, err
			}
			if err = api.b.SendTx(ctx, signedTx); err != nil {
				return common.Hash{}, err
			}
			return signedTx.Hash(), nil
		}
	}
	return common.Hash{}, fmt.Errorf("transaction %#x not found", matchTx.Hash())
}

Transaction Broadcasting

After a node receives a transaction submitted by an EOA, it needs to propagate it across the network. The txpool (core/txpool/txpool.go) provides the SubscribeTransactions method to subscribe to new events in the transaction pool. The Blob transaction pool and Legacy transaction pool implement subscription differently:

func (p *TxPool) SubscribeTransactions(ch chan<- core.NewTxsEvent, reorgs bool) event.Subscription {
	subs := make([]event.Subscription, len(p.subpools))
	for i, subpool := range p.subpools {
		subs[i] = subpool.SubscribeTransactions(ch, reorgs)
	}
	return p.subs.Track(event.JoinSubscriptions(subs...))
}

BlobPool distinguishes between two event sources:

discoverFeed: Contains only newly discovered transactions.
insertFeed: Contains all transactions, including those re-entering the pool due to reorganization.

func (p *BlobPool) SubscribeTransactions(ch chan<- core.NewTxsEvent, reorgs bool) event.Subscription {
	if reorgs {
		return p.insertFeed.Subscribe(ch)
	} else {
		return p.discoverFeed.Subscribe(ch)
	}
}

LegacyPool does not distinguish between new transactions and reorganization transactions; it uses a single txFeed to send all transaction events.

func (pool *LegacyPool) SubscribeTransactions(ch chan<- core.NewTxsEvent, reorgs bool) event.Subscription {
	return pool.txFeed.Subscribe(ch)
}

In general, SubscribeTransactions decouples the transaction pool from other components through an event mechanism. This subscription can be used by multiple modules, such as transaction broadcasting, transaction packaging, and external RPC, all of which need to listen to this process and respond accordingly.

At the same time, the p2p module (eth/handler.go) continuously listens for new transaction events. If a new transaction is received, it broadcasts the transaction:

// eth/handler.go broadcasts the transaction after a new transaction is generated
func (h *handler) txBroadcastLoop() {
	defer h.wg.Done()
	for {
		select {
		case event := <-h.txsCh: // Listening for new transaction information here
			h.BroadcastTransactions(event.Txs)
		case <-h.txsSub.Err():
			return
		}
	}
}

When broadcasting, transactions are first classified. Blob transactions and transactions exceeding a certain size are not propagated in full; only their hashes are announced. Ordinary transactions are eligible for direct propagation. For each transaction, the node selects the peers that do not yet have it and records, per peer, whether the full transaction will be sent directly or only its hash announced. This logic is implemented in the BroadcastTransactions method.

Once the classification of transactions is completed, transactions that can be propagated directly are sent, while blob transactions or large transactions only broadcast their hashes, retrieving the full transaction when needed.
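
The classification step can be illustrated with a simplified sketch. Everything here is an assumption made for illustration: the names are hypothetical, maxDirectSize stands for the unspecified size threshold, and the choice of sending full transactions to roughly the square root of the eligible peers reflects how Geth has historically limited direct propagation rather than the exact current logic:

```go
package main

import (
	"fmt"
	"math"
)

// txInfo holds the properties relevant to the broadcast decision.
type txInfo struct {
	IsBlob bool
	Size   uint64 // encoded size in bytes
}

// planBroadcast splits the peers that lack the transaction into those that
// receive the full transaction and those that only receive its hash.
func planBroadcast(t txInfo, peersWithoutTx []string, maxDirectSize uint64) (direct, announce []string) {
	if t.IsBlob || t.Size > maxDirectSize {
		// Blob and large transactions are never pushed in full.
		return nil, peersWithoutTx
	}
	// Assumption: push the full transaction to about sqrt(n) peers
	// and announce the hash to the remaining ones.
	n := int(math.Sqrt(float64(len(peersWithoutTx))))
	return peersWithoutTx[:n], peersWithoutTx[n:]
}

func main() {
	peers := []string{"p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8", "p9"}
	direct, announce := planBroadcast(txInfo{Size: 200}, peers, 4096)
	fmt.Println(len(direct), len(announce))
	direct, announce = planBroadcast(txInfo{IsBlob: true, Size: 200}, peers, 4096)
	fmt.Println(len(direct), len(announce))
}
```

Peers that only receive the hash can request the full transaction later if they do not obtain it from someone else first, which keeps bandwidth use bounded.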

Transactions that are only announced by hash are queued on the corresponding peer object until the announcement is sent.

The p2p module both broadcasts new transactions and receives new transactions from the p2p network. When the Ethereum instance is initialized in eth/backend.go, the p2p module is initialized and given the transaction pool interface. Once the p2p module is running, it parses transaction messages received from peers and adds the transactions to the transaction pool.

Specifically, when instantiating the handler, the method for obtaining transactions from other nodes is specified, using the TxFetcher in eth/fetcher to retrieve remote transactions. The TxFetcher uses the fetchTx method here to obtain remote transactions, which actually calls the RequestTxs method implemented in the eth/protocols/eth protocol to retrieve transactions:

// eth/backend.go New function
if eth.handler, err = newHandler(&handlerConfig{
	NodeID:         eth.p2pServer.Self().ID(),
	Database:       chainDb,
	Chain:          eth.blockchain,
	TxPool:         eth.txPool,
	Network:        networkID,
	Sync:           config.SyncMode,
	BloomCache:     uint64(cacheLimit),
	EventMux:       eth.eventMux,
	RequiredBlocks: config.RequiredBlocks,
}); err != nil {
	return nil, err
}
// eth/handler.go newHandler function, registering the process to obtain new transactions
fetchTx := func(peer string, hashes []common.Hash) error {
	p := h.peers.peer(peer)
	if p == nil {
		return errors.New("unknown peer")
	}
	return p.RequestTxs(hashes) // Request transactions from other nodes
}
addTxs := func(txs []*types.Transaction) []error {
	return h.txpool.Add(txs, false) // Add transactions to the transaction pool
}
h.txFetcher = fetcher.NewTxFetcher(h.txpool.Has, addTxs, fetchTx, h.removePeer)
// eth/handler_eth.go Handle method, after receiving new transactions, adds them to the transaction pool
for _, tx := range *packet {
	if tx.Type() == types.BlobTxType {
		return errors.New("disallowed broadcast blob transaction")
	}
}
return h.txFetcher.Enqueue(peer.ID(), *packet, false)
// eth/fetcher/tx_fetcher.go's Handle method will call the registered addTxs to add the transactions
for j, err := range f.addTxs(batch) {
	//….
}

The RequestTxs method sends a GetPooledTransactionsMsg message and receives a PooledTransactionsMsg response from the other node, which is processed by the backend's Handle method. There, the txFetcher's Enqueue method is called to add the transactions obtained from other nodes to the transaction pool.

The transaction pool also has a lazy loading design implemented through LazyTransaction in core/txpool/subpool.go. This mechanism reduces memory usage and improves transaction processing efficiency by storing key metadata of transactions and only loading the full transaction data when truly needed. This design plays a crucial role when Ethereum processes a large number of transactions, especially in scenarios like transaction pools and block packaging, where most transactions may ultimately not be included in a block, thus not requiring full loading of all transaction data.

type LazyTransaction struct {
	Pool      LazyResolver       // Transaction resolver to pull the real transaction up
	Hash      common.Hash        // Transaction hash to pull up if needed
	Tx        *types.Transaction // Transaction if already resolved
	Time      time.Time          // Time when the transaction was first seen
	GasFeeCap *uint256.Int       // Maximum fee per gas the transaction may consume
	GasTipCap *uint256.Int       // Maximum miner tip per gas the transaction can pay
	Gas       uint64             // Amount of gas required by the transaction
	BlobGas   uint64             // Amount of blob gas required by the transaction
}
func (ltx *LazyTransaction) Resolve() *types.Transaction {
	if ltx.Tx != nil {
		return ltx.Tx
	}
	return ltx.Pool.Get(ltx.Hash)
}

Additionally, since Ethereum is a permissionless network, nodes may receive some malicious requests from the network. In extreme cases, nodes may face DDoS attacks, so a series of measures are used to prevent malicious attacks from the network:

Basic transaction validation
Node resource limitations
Transaction eviction mechanisms
p2p network layer protection

Taking LegacyPool as an example (BlobPool also has similar mechanisms), before a transaction is added to the transaction pool, it first undergoes basic validation. In the core/txpool/validation.go's ValidateTransaction method, basic validations are performed on the transaction, including transaction type, size, Gas, etc. If any of these do not meet the requirements, the transaction will be rejected.

The transaction size is regulated using Slot, defined in core/txpool/legacypool/legacypool.go:

const (
	txSlotSize = 32 * 1024
	txMaxSize  = 4 * txSlotSize // 128KB
)

Each transaction cannot exceed 4 slots, and there are maximum slot limits for each account and the entire node. Once an account reaches its limit, it cannot submit new transactions. Once the node reaches its limit, it must evict old transactions. The truncatePending method in core/txpool/legacypool/legacypool.go will fairly evict transactions to prevent a single account from occupying too many resources in the transaction pool:

type Config struct {
	AccountSlots uint64
	GlobalSlots  uint64
}
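
The slot accounting can be sketched as below. The two constants come from the snippet above; numSlots mirrors the rounding-up calculation used for slot counting, while checkSize is a hypothetical helper added for illustration. Both take a plain byte size instead of a transaction to keep the sketch self-contained:

```go
package main

import (
	"errors"
	"fmt"
)

const (
	txSlotSize = 32 * 1024      // bytes covered by one slot
	txMaxSize  = 4 * txSlotSize // 128KB, the largest accepted transaction
)

// numSlots returns how many slots a transaction of the given encoded
// size occupies, rounding up to a whole number of slots.
func numSlots(size uint64) int {
	return int((size + txSlotSize - 1) / txSlotSize)
}

// checkSize rejects transactions larger than four slots.
func checkSize(size uint64) error {
	if size > txMaxSize {
		return errors.New("oversized transaction")
	}
	return nil
}

func main() {
	for _, size := range []uint64{200, 32 * 1024, 32*1024 + 1, 128 * 1024} {
		fmt.Println(size, numSlots(size), checkSize(size))
	}
}
```

Counting in slots rather than in transactions means a few large transactions cannot occupy the same pool capacity as many small ones while being charged as a single entry each.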

At the network layer, for blob transactions or transactions exceeding a certain size, the transaction content will not be propagated directly over the network; only the transaction hash will be propagated to avoid excessive data transmission that could lead to DDoS attacks.

Transaction Packaging

After a transaction is submitted to the transaction pool, it propagates among nodes in the Ethereum network. Once a validator attached to a node is selected as the block producer, it relies on the consensus layer and the execution layer together to construct the block.

The validator first triggers block construction from the consensus layer. Upon receiving the request, the consensus layer calls the Engine API of the execution layer, which is implemented in eth/catalyst/api.go. The consensus layer first calls the ForkchoiceUpdated API to send the block construction request. There are multiple versions of ForkchoiceUpdated, and the version called depends on the current network version. The call returns a PayloadID, which is then passed to the corresponding version of the GetPayload API to obtain the block construction result.

Regardless of which version of ForkchoiceUpdated is called, it ultimately calls the forkchoiceUpdated method to construct the block:

In the forkchoiceUpdated method, the current state of the execution layer is validated. If the execution layer is still synchronizing blocks, or if the finalized block does not meet expectations, the method returns an error message directly to the consensus layer, indicating that block construction has failed:
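The two-step exchange between the layers can be modeled with a small mock. This is a sketch under stated assumptions rather than Geth's Engine API code: the `executionLayer` type, its methods, and the error values are hypothetical, and the syncing case is modeled as a plain error for simplicity.

```go
package main

import (
	"errors"
	"fmt"
)

type payloadID uint64

// Hypothetical error values for this sketch.
var (
	errSyncing        = errors.New("execution layer is syncing")
	errUnknownPayload = errors.New("unknown payload")
)

// executionLayer is a hypothetical mock of the execution client.
type executionLayer struct {
	syncing  bool
	nextID   payloadID
	payloads map[payloadID][]string // payload ID -> transactions packed so far
}

// forkchoiceUpdated checks the node state, starts a build job, and returns its ID.
func (el *executionLayer) forkchoiceUpdated(txs []string) (payloadID, error) {
	if el.syncing {
		return 0, errSyncing
	}
	el.nextID++
	el.payloads[el.nextID] = txs
	return el.nextID, nil
}

// getPayload returns the block content built for the given ID.
func (el *executionLayer) getPayload(id payloadID) ([]string, error) {
	txs, ok := el.payloads[id]
	if !ok {
		return nil, errUnknownPayload
	}
	return txs, nil
}

func main() {
	el := &executionLayer{payloads: make(map[payloadID][]string)}
	id, err := el.forkchoiceUpdated([]string{"tx1", "tx2"})
	if err != nil {
		fmt.Println("build failed:", err)
		return
	}
	txs, _ := el.getPayload(id)
	fmt.Println("payload", id, "contains", txs)
}
```

Splitting the request and the retrieval into two calls lets the consensus layer start the build early and collect the result only when it actually needs to propose.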

After validating the information from the execution layer, the BuildPayload method in miner/miner.go is called to construct the block. The actual construction work is done by the generateWork method in miner/payload_building.go. Note that BuildPayload first generates an empty payload and returns its payloadID to the consensus layer. At the same time, it starts a goroutine that performs the real block packaging, continuously searching the transaction pool for higher-value transactions. After each repackaging, the payload is updated.
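The pattern of returning an empty payload immediately and improving it in the background can be sketched as follows. The `payload` type and the `buildPayload`, `update`, and `resolve` functions here are hypothetical simplifications. In particular, this sketch builds from a fixed list of candidate blocks and `resolve` waits for the builder to finish, whereas a real builder keeps running until it is told to stop.

```go
package main

import (
	"fmt"
	"sync"
)

// payload is a hypothetical stand-in for the block under construction.
type payload struct {
	mu   sync.Mutex
	fees uint64        // total fees of the best block built so far
	done chan struct{} // closed when the background builder stops
}

// update keeps a candidate only if it pays more than the current best.
func (p *payload) update(fees uint64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if fees > p.fees {
		p.fees = fees
	}
}

// resolve waits for the builder to stop and returns the best result.
func (p *payload) resolve() uint64 {
	<-p.done
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.fees
}

// buildPayload returns an empty payload right away and improves it in a
// background goroutine, one candidate block at a time.
func buildPayload(candidates []uint64) *payload {
	p := &payload{done: make(chan struct{})}
	go func() {
		defer close(p.done)
		for _, fees := range candidates {
			p.update(fees)
		}
	}()
	return p
}

func main() {
	p := buildPayload([]uint64{3, 9, 5})
	fmt.Println("best fees:", p.resolve())
}
```

The caller always has something to return, even if it asks immediately, and the result only ever gets better the longer the builder runs.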

The transaction packaging is completed through the fillTransactions method in miner/worker.go, which actually calls the Pending method of the txpool to retrieve the transactions to be packaged:

Before the end of the slot, the consensus layer calls the GetPayload API to obtain the final packaged block. If a submitted transaction is included in this block, it is executed by the EVM and modifies the state database. If it is not included this time, it waits for the next opportunity to be packaged.

Transaction Execution

During the transaction packaging process, the transactions are also executed in the EVM, obtaining the state changes after the transactions in the block are completed. This is also done in the generateWork function, which prepares the environment variables for the current block execution, mainly obtaining the latest block and the latest state database:

Here, the state represents the state database:

This forms a StateDB → stateObjects → stateAccount structure, representing the complete state database, the collection of account objects, and an individual account, respectively. The stateObject structure holds three storage sets: dirtyStorage, the state changed by the transaction being executed; pendingStorage, the state changed during execution of the current block; and originStorage, the original state. Ordered from newest to oldest, they are dirtyStorage → pendingStorage → originStorage. For a detailed analysis of storage, please refer to the previous analyses on storage:
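The newest-to-oldest ordering matters when reading a storage slot. The sketch below is a simplified model and not Geth's stateObject: keys and values are plain strings, and the `getState` and `finalise` methods are written for illustration only.

```go
package main

import "fmt"

type storage map[string]string

// stateObject is a simplified model of one account's storage layers.
type stateObject struct {
	dirtyStorage   storage // written by the transaction being executed
	pendingStorage storage // written earlier in the current block
	originStorage  storage // original values loaded from the database
}

// getState looks a slot up from the newest layer to the oldest.
func (s *stateObject) getState(key string) string {
	if v, ok := s.dirtyStorage[key]; ok {
		return v
	}
	if v, ok := s.pendingStorage[key]; ok {
		return v
	}
	return s.originStorage[key]
}

// finalise moves the finished transaction's writes into the block-level layer.
func (s *stateObject) finalise() {
	for k, v := range s.dirtyStorage {
		s.pendingStorage[k] = v
	}
	s.dirtyStorage = make(storage)
}

func main() {
	obj := &stateObject{
		dirtyStorage:   storage{},
		pendingStorage: storage{},
		originStorage:  storage{"balanceSlot": "1"},
	}
	fmt.Println(obj.getState("balanceSlot")) // read falls through to originStorage
	obj.dirtyStorage["balanceSlot"] = "2"
	fmt.Println(obj.getState("balanceSlot")) // read is served by dirtyStorage
	obj.finalise()
	fmt.Println(obj.getState("balanceSlot")) // read is served by pendingStorage
}
```

Keeping the layers separate means a failed transaction can be rolled back by discarding only its dirty writes, without disturbing what earlier transactions in the block have already changed.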

In the New method of eth/backend.go, the transaction pool's configuration is loaded at startup, which includes a Locals configuration. The addresses in this configuration are treated as local addresses, and transactions submitted from these local addresses are prioritized.

Once the current environment variables are obtained, the transactions can be executed. First, all transactions waiting to be packaged are retrieved, and local transactions are separated from normal transactions. Then the local and normal transactions are each packaged in order of fee, from high to low. The actual execution of transactions takes place in the commitTransactions method in miner/worker.go:
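The split-then-sort step can be sketched as below. The `tx` struct and the `orderForPacking` function are hypothetical, and the sketch ignores details a real builder must handle, such as keeping each account's transactions in nonce order.

```go
package main

import (
	"fmt"
	"sort"
)

// tx is a hypothetical pending transaction with its sender and fee tip.
type tx struct {
	From string
	Tip  uint64
}

// orderForPacking separates local from normal transactions, sorts each group
// by tip from high to low, and places the local group first.
func orderForPacking(pending []tx, locals map[string]bool) []tx {
	var local, normal []tx
	for _, t := range pending {
		if locals[t.From] {
			local = append(local, t)
		} else {
			normal = append(normal, t)
		}
	}
	byTip := func(txs []tx) {
		sort.SliceStable(txs, func(i, j int) bool { return txs[i].Tip > txs[j].Tip })
	}
	byTip(local)
	byTip(normal)
	return append(local, normal...)
}

func main() {
	pending := []tx{{"A", 1}, {"B", 9}, {"L", 2}, {"C", 5}}
	for _, t := range orderForPacking(pending, map[string]bool{"L": true}) {
		fmt.Printf("%s (tip %d)\n", t.From, t.Tip)
	}
}
```

Here the local sender L is packed first even though its tip is lower than those of B and C, which is what prioritizing local addresses means in practice.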

Ultimately, this calls the ApplyTransaction function, which executes the transaction in the EVM and modifies the state database:

func ApplyTransaction(evm *vm.EVM, gp *GasPool, statedb *state.StateDB, header *types.Header, tx *types.Transaction, usedGas *uint64) (*types.Receipt, error) {
	msg, err := TransactionToMessage(tx, types.MakeSigner(evm.ChainConfig(), header.Number, header.Time), header.BaseFee)
	if err != nil {
		return nil, err
	}
	// Create a new context to be used in the EVM environment
	return ApplyTransactionWithEVM(msg, gp, statedb, header.Number, header.Hash(), tx, usedGas, evm)
}

Transaction Validation

The sections above describe how transactions are packaged into blocks. In most cases, however, a node only validates blocks that have already been packaged by others rather than packaging blocks itself.

After the consensus layer synchronizes a new block, it passes that block to the execution layer through the Engine API, using the engine_NewPayload family of methods. These methods ultimately call the newPayload method, which assembles the consensus layer's payload into a block:

Then, it checks whether this block already exists locally. If it does, the method skips execution and immediately returns a valid status:

If the execution layer is still synchronizing, it temporarily cannot accept new blocks:

If all the above conditions are met, the block will be inserted into the blockchain. It is important to note that when inserting a block, the chain head is not directly specified, as the decision for the chain head involves choosing between chain forks, which relies on the consensus layer to determine:

The consensus layer then calls the ForkchoiceUpdated API, which invokes the SetCanonical method in core/blockchain.go to set the chain head:
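The separation between importing a block and choosing the chain head can be modeled with a small sketch. The `chain` type and its methods are hypothetical, and the sketch skips block execution entirely. Only the status names VALID and SYNCING are taken from the Engine API.

```go
package main

import "fmt"

type status string

const (
	statusValid   status = "VALID"
	statusSyncing status = "SYNCING"
)

// chain is a hypothetical model of the execution layer's view of the chain.
type chain struct {
	blocks  map[string]bool // block hashes stored locally
	syncing bool
	head    string
}

// newPayload stores a block but never moves the head.
func (c *chain) newPayload(hash string) status {
	if c.blocks[hash] {
		return statusValid // already known, nothing to execute
	}
	if c.syncing {
		return statusSyncing // cannot import while still syncing
	}
	c.blocks[hash] = true // inserted without setting the head
	return statusValid
}

// forkchoiceUpdated moves the head to a block chosen by the consensus layer.
func (c *chain) forkchoiceUpdated(hash string) bool {
	if !c.blocks[hash] {
		return false
	}
	c.head = hash
	return true
}

func main() {
	c := &chain{blocks: map[string]bool{"genesis": true}, head: "genesis"}
	fmt.Println(c.newPayload("b1"), "head:", c.head)
	fmt.Println(c.forkchoiceUpdated("b1"), "head:", c.head)
}
```

Keeping the two steps apart lets the execution layer hold several competing blocks at once while the consensus layer alone decides which fork is canonical.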

Another situation that can trigger the setting of the block head is block reorganization, which executes the reorg method in core/blockchain.go, where the current latest confirmed block head is also set.

Returning to the block execution process, the InsertBlockWithoutSetHead method in core/blockchain.go calls the insertChain method. In this method, a series of condition checks are performed, and after completion, the block processing begins:

The logic of the Process method is straightforward and similar to the transaction packaging process described earlier: transactions are executed one by one in the EVM, modifying the state database. The difference from packaging is that the transactions in the new block are replayed directly, with no need to retrieve them from the transaction pool.

7. Conclusion

Transactions are the only way to drive state changes in Ethereum. The processing of a transaction involves multiple stages: it is first validated, then submitted to the transaction pool, propagated to other nodes via the p2p network, and packaged into a block by a block-producing node. Finally, other nodes synchronize the block, execute its transactions, and apply the resulting state changes.

With the continuous development of the Ethereum protocol, it has evolved from initially supporting only one type of transaction to currently supporting five types. These different types of transactions allow Ethereum to adapt to various roles, serving as a platform for DApps as well as a settlement layer for Layer 2 or other off-chain scaling solutions. The recently added EIP-7702 has technically prepared Ethereum for large-scale adoption.
