Part 2 of the series: How does Bitcoin work?
This three-part series explains the technological underpinnings behind Bitcoin and its underlying distributed ledger technology (DLT). After the first part of the series introduced the history of Bitcoin and the first important building blocks, such as hash functions, private and public keys, the second part now deals with the consensus mechanism Proof-of-Work, signatures and the process of a Bitcoin transaction.
Consensus mechanism and network participants
In the decentralized Bitcoin system, the Proof-of-Work consensus algorithm sets the rules by which network participants, called nodes, agree on which transactions are valid and which are not. More generally, the algorithm ensures that the protocol, i.e. the rules of the blockchain, is followed. Three parties matter for the transaction process in the Bitcoin network: users, nodes and miners. Users want to transfer bitcoins from A to B and send their transaction to the nodes to do so. In exchange for having a transaction processed, users pay a transaction fee. Nodes receive the transaction information, such as the amount and the recipient, from the users and pass it on to the so-called mempool, the pool of open transactions from which the miners draw.
A full node stores a complete copy of the blockchain and keeps it up to date. Miners assemble transactions from the mempool into a block (more on this later) and then send it to the nodes for verification. The nodes then check the block and the transactions it contains.
But how exactly does a Bitcoin transaction work? The simplified flow of a Bitcoin transaction is shown in Figure 1 and discussed in detail below [1]. Suppose Anja wants to send a Bitcoin to Bernd. To do this, she needs several pieces of information. First, she needs her Private Key to prove that she is indeed the owner of the Bitcoin she would like to send. Second, she needs Bernd's Bitcoin address, so she knows where to send the Bitcoin. Anja can also optionally add a message text, analogous to the payment reference on a bank transfer. After this data has been entered, the transaction information is hashed by the hash function and the output is signed with the private key specified by Anja. This creates a digital signature. But how can Bernd or the nodes in the network be sure that Anja actually used the correct private key?
Figure 1: Process of a Bitcoin transaction
This is where Anja's Public Key comes in. It was originally derived from the private key using an elliptic curve, and the Bitcoin address was in turn generated from it [1]. This allows the network to validate that Anja used the correct private key without the nodes or other users, apart from Anja herself, knowing the private key. If Anja uses an incorrect private key, the public key cannot verify the digital signature and the transaction is rejected.
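The sign-and-verify step can be sketched in a few lines of Python. This is a minimal textbook ECDSA over secp256k1 (the elliptic curve Bitcoin uses), written for illustration only: the function names are hypothetical, the "transaction" is just a byte string rather than Bitcoin's real transaction serialization, and production wallets rely on audited libraries instead of code like this.

```python
import hashlib
import secrets

# secp256k1 domain parameters: field prime, group order, generator point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its negative
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with double-and-add."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def message_hash(message: bytes) -> int:
    """Double SHA-256 of the message, interpreted as an integer."""
    digest = hashlib.sha256(hashlib.sha256(message).digest()).digest()
    return int.from_bytes(digest, "big")


def sign(private_key: int, message: bytes):
    """Return an ECDSA signature (r, s) for the message."""
    z = message_hash(message)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh secret value per signature
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return (r, s)


def verify(public_key, message: bytes, signature) -> bool:
    """Check the signature using only the public key."""
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    z = message_hash(message)
    w = pow(s, -1, N)
    point = point_add(scalar_mult(z * w % N, G),
                      scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r


anja_private = secrets.randbelow(N - 1) + 1
anja_public = scalar_mult(anja_private, G)   # derived from the private key
transaction = b"Anja pays Bernd 1 BTC"       # stand-in for real transaction data
signature = sign(anja_private, transaction)

print(verify(anja_public, transaction, signature))              # valid signature
print(verify(anja_public, b"Anja pays Bernd 9 BTC", signature))  # altered data
```

The verifier only ever sees the public key, the transaction data and the signature, which is why the network can reject a transaction signed with the wrong key without learning the right one.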
If the transaction is valid, it is distributed to all nodes in the network and finally ends up in the miners' mempool. The miners take transactions from the mempool and "package" them into a block. A block is limited to roughly one megabyte in size, which corresponds to around 4,200 transactions at most. Due to the limited block size, only about seven transactions per second are currently possible on the Bitcoin network. In addition to the transaction data, a block also contains a block header.
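The figure of seven transactions per second follows from simple arithmetic, sketched here in Python. The function name is made up for this example, and the calculation assumes the roughly ten-minute interval between blocks that the network aims for.

```python
# Rough throughput estimate from the figures quoted in this article.
TRANSACTIONS_PER_BLOCK = 4_200       # approximate figure quoted above
BLOCK_INTERVAL_SECONDS = 10 * 60     # the network aims for one block every ten minutes


def transactions_per_second(tx_per_block: int, interval_seconds: int) -> float:
    """Average number of transactions the network can confirm per second."""
    return tx_per_block / interval_seconds


print(transactions_per_second(TRANSACTIONS_PER_BLOCK, BLOCK_INTERVAL_SECONDS))  # 7.0
```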
Block Header and Merkle Trees
The most important parts of the block header for basic understanding are the hash of the previous block, a Merkle root and the so-called nonce. Hash of the previous block means that the last block from the blockchain is hashed and this hash is then part of the block header in the new block. A Merkle Tree is similar in visual structure to a tree diagram (see Figure 2).
Figure 2: Visual representation of a Merkle Tree
The lowest branches of the Merkle Tree contain the hashes of the individual transactions of the block. These are then hashed in pairs until only a single hash remains at the end. This hash is called the Merkle Root and is included in the block header. The Merkle Root allows all transactions in the block to be represented by a single hash. If, for example, you swap HB with HC in Figure 2, you end up with a completely different hash for the Merkle Root, since HA and HC together result in a different hash than HA and HB did before. The same applies to the hash of HB and HD.
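The pairwise hashing can be illustrated with a short Python sketch. It uses double SHA-256 and, as Bitcoin does, duplicates the last hash when a level has an odd number of entries. The byte strings standing in for transactions and the function names are assumptions made for this example; real Bitcoin hashes serialized transactions and has byte-order conventions that are ignored here.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Hash the transactions, then hash pairs until one hash remains."""
    if not transactions:
        raise ValueError("a block needs at least one transaction")
    level = [double_sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


original = merkle_root([b"tx A", b"tx B", b"tx C", b"tx D"])
swapped = merkle_root([b"tx A", b"tx C", b"tx B", b"tx D"])  # B and C exchanged

print(original.hex())
print(swapped.hex())
print(original != swapped)
```

Swapping two transactions changes the pairs that are hashed together, so the two roots differ even though the set of transactions is the same.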
Nonce and Mining
The last component of the block header is the nonce. For the time being, you can imagine this as a free field into which a number between 0 and 2^32 will be inserted later. The block header is important because it is hashed twice with SHA-256. The special thing about this output is that it must start with a certain number of zeros for the block to be considered found.
Such a hash is found by the miners substituting numbers from zero to about 4.2 billion (2^32) for the nonce, which is the original free field. The higher the number of zeros has to be, the harder it is to find the hash. Since the SHA-256 hash function is not reversible, miners must keep trying through the numbers until they find a number that ensures that the hash starts with the number of zeros they are looking for. Then, in this context, a nonce is considered found.
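The trial-and-error search can be sketched as a toy miner in Python. The header here is an arbitrary byte string and the difficulty is expressed as a number of leading zero hex digits, both simplifications chosen for this example; the function names are hypothetical. The real protocol uses a fixed header layout and compares the hash against a numeric target.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as is done for the block header."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_without_nonce: bytes, zero_hex_digits: int, max_nonce: int = 2**32):
    """Try nonces in order until the hash starts with enough zeros.

    Returns (nonce, hash as hex string), or None if the nonce range is exhausted.
    """
    prefix = "0" * zero_hex_digits
    for nonce in range(max_nonce):
        candidate = header_without_nonce + nonce.to_bytes(4, "little")
        digest = double_sha256(candidate).hex()
        if digest.startswith(prefix):
            return nonce, digest
    return None


# Toy header: stands in for the previous block hash and the Merkle root.
result = mine(b"previous block hash | merkle root", zero_hex_digits=4)
print(result)
```

Each additional zero hex digit multiplies the expected number of attempts by 16, which is why the required number of zeros works as a difficulty dial.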
This process of proof-of-work is called mining and requires a great deal of computing power from the computers involved. In the past, mining could still be done profitably with conventional computers, but nowadays it is mainly done with so-called ASIC miners, which are only designed to find the nonce as fast as possible.
In general, mining is extremely costly due to hardware and electricity costs, which is why miners tend to be geographically located in regions with low electricity prices. Due to the high computing power and the associated high energy consumption, mining is also not environmentally friendly.
As soon as the first miner has found a matching nonce for its block, so that the hash of the block header starts with the required number of zeros, it forwards the found block to the nodes. The nodes check whether the protocol was followed for the block, e.g. whether a matching nonce was found and the maximum block size was not exceeded. If the protocol has been followed, the block is considered valid and is appended to the blockchain. Since the hash of the previous block is always included in each new block, each block builds on the one before it, resulting in a chain of blocks. Consequently, if a network participant were to attempt to modify data in an older block, the hash of that block would change, and the hashes of all subsequent blocks would change as well. All following blocks would therefore have to be modified and a new nonce would have to be found for each one, which would be extremely computationally intensive. For this reason, the blockchain is considered virtually unchangeable and resistant to manipulation.
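The chaining effect can be made concrete with a toy chain in Python. The block structure (a dictionary with a previous hash, some data and the block's own hash) and the function names are assumptions for this sketch, and the nonce search is left out so that only the linking is visible.

```python
import hashlib

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(prev_hash: str, data: str) -> str:
    """Hash of a toy block: previous block hash plus the block's data."""
    return hashlib.sha256((prev_hash + data).encode()).hexdigest()


def build_chain(payloads):
    """Build a list of blocks, each storing the hash of its predecessor."""
    chain, prev = [], GENESIS_PREV_HASH
    for data in payloads:
        current = block_hash(prev, data)
        chain.append({"prev_hash": prev, "data": data, "hash": current})
        prev = current
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that breaks the chain, or None."""
    prev = GENESIS_PREV_HASH
    for index, block in enumerate(chain):
        if block["prev_hash"] != prev:
            return index
        if block_hash(block["prev_hash"], block["data"]) != block["hash"]:
            return index
        prev = block["hash"]
    return None


chain = build_chain(["Anja pays Bernd 1 BTC",
                     "Bernd pays Anja 0.4 BTC",
                     "Anja pays Bernd 0.1 BTC"])
print(first_invalid_block(chain))   # None: the chain is consistent

chain[0]["data"] = "Anja pays Bernd 100 BTC"   # tamper with the oldest block
print(first_invalid_block(chain))   # 0: stored hash no longer matches the data

chain[0]["hash"] = block_hash(chain[0]["prev_hash"], chain[0]["data"])
print(first_invalid_block(chain))   # 1: the next block still points to the old hash
```

Repairing the tampered block only moves the inconsistency one block further along, so an attacker would have to redo every later block, including its proof-of-work.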
But why do miners incur immense hardware and power costs to find the nonce for their block? Basically, a miner's goal is to find a nonce for their block faster than anyone else. Once a miner finds a nonce for their block and the block is recognized as valid and added to the blockchain, all other miners stop searching for a nonce at that position in the chain. From that moment on, a new block has been added to the blockchain, and miners need the hash of that block in order to start working on the next one.
The first miner to find a valid nonce for their block receives a reward of 6.25 bitcoin (the block reward at the time of writing) plus the sum of the transaction fees in that block. Mining is therefore done because of the financial incentive. Originally, miners were paid a reward of 50 bitcoin per block. However, this reward is halved every 210,000 blocks (approximately every 4 years) until eventually all 21 million bitcoin are in circulation (approximately in the year 2140). After that, the miners are paid only the transaction fees as a reward.
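The halving schedule and the resulting supply cap can be reproduced with a few lines of Python. The sketch counts in satoshi, the smallest unit of bitcoin (100,000,000 satoshi per bitcoin, a detail not covered in this article), and halves the reward by integer division, which is why it eventually drops to zero. The function names are chosen for this example.

```python
SATOSHI_PER_BITCOIN = 100_000_000
INITIAL_REWARD = 50 * SATOSHI_PER_BITCOIN   # 50 bitcoin, in satoshi
HALVING_INTERVAL = 210_000                  # blocks between two halvings


def block_reward(height: int) -> int:
    """Block reward in satoshi at a given block height (fees excluded)."""
    halvings = height // HALVING_INTERVAL
    return INITIAL_REWARD >> halvings   # integer halving, rounds down


def total_supply() -> int:
    """Sum of all block rewards until the reward reaches zero, in satoshi."""
    total, height = 0, 0
    while block_reward(height) > 0:
        total += HALVING_INTERVAL * block_reward(height)
        height += HALVING_INTERVAL
    return total


print(block_reward(0) / SATOSHI_PER_BITCOIN)         # 50.0
print(block_reward(630_000) / SATOSHI_PER_BITCOIN)   # 6.25 (after three halvings)
print(total_supply() / SATOSHI_PER_BITCOIN)          # just under 21 million
```

Because each halving period issues half as many coins as the one before, the total converges to slightly less than 21 million bitcoin.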
As mentioned earlier, the difficulty of finding a nonce depends on how many zeros the hash of the block header has to start with. The more zeros are needed at the beginning, the more difficult it is to find a nonce. The difficulty is set so that the network takes about ten minutes to find a matching nonce for a block. In the event of deviations, the difficulty, i.e. the number of zeros searched for, is adjusted in order to continue to maintain the 10-minute rhythm.
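The adjustment can be sketched in Python as follows. The details below go beyond what this article covers and are stated as background: Bitcoin expresses the difficulty as a numeric target that the block hash must stay below (a smaller target means more leading zeros), recalculates it every 2,016 blocks, and limits each adjustment to a factor of four. The sketch omits other details of the real protocol, such as the compact target encoding, and its function name is hypothetical.

```python
BLOCKS_PER_PERIOD = 2_016            # blocks between two adjustments
TARGET_SPACING_SECONDS = 10 * 60     # desired time between blocks
EXPECTED_TIMESPAN = BLOCKS_PER_PERIOD * TARGET_SPACING_SECONDS  # two weeks


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last period actually took.

    Blocks found too quickly -> smaller target -> harder to find a hash.
    Blocks found too slowly  -> larger target  -> easier to find a hash.
    """
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN


old_target = 1 << 200   # arbitrary example target

# Blocks arrived every five minutes on average: the target is halved.
print(retarget(old_target, EXPECTED_TIMESPAN // 2) == old_target // 2)   # True
# Blocks arrived exactly on schedule: the target stays the same.
print(retarget(old_target, EXPECTED_TIMESPAN) == old_target)             # True
```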
Over time, the difficulty of Bitcoin has increased. There are two main reasons for this. First, the bitcoin price went up tremendously, which led to more and more new miners joining the network. Second, mining hardware is becoming more efficient. The consequence of both is that the hash power, i.e. the computing power devoted to bitcoin mining, in the network increases, and therefore the difficulty must be increased so that a new block continues to be found every ten minutes on average (Narayanan et al., 2016).
[1] For an even more detailed overview of the bitcoin transaction process, see: https://www.mme.ch/fileadmin/files/documents/Publikationen/Bitcoin_Luka_Mueller.pdf.