Seriously, What the Hell Is a Blockchain?
Blockchains are often explained with a lot of tech jargon by people in mathematics, cryptography, and network engineering. It turns out that blockchains are more straightforward than you might think, at least for the most part.
As a person born in 1995, I’ve always considered myself computer savvy, yet I struggled to understand what a blockchain was for quite some time. What finally clicked for me was starting with the most basic concepts and slowly building up from there.
So, in this explainer, we’ll start with the simple concept of a computer and build up to a blockchain. If you are already familiar with the concepts of servers, databases and distributed databases, feel free to skip to the blockchain section.
Fast Facts:
- A blockchain is a form of database, more specifically a distributed database.
- The data stored on a blockchain are cryptocurrency transactions.
- Blockchains store data (transactions) in chronological groups, known as blocks, instead of folders and tables like normal databases.
- Bitcoin’s blockchain is open and accessible to anyone, unlike a centralized database run by a company or government.
- Unlike databases where information can be added, removed or edited, blockchains can only be added to.
What is a Computer?
A computer is a piece of electronic equipment that can read and manipulate data. Computers come in many forms including desktops, laptops, tablets, gaming consoles and cellphones.
What is data?
Data is simply information, and it can come in endless formats ranging from videos and photos to text. In the past, we stored these types of information on physical objects like paper or film. With computers, we can keep this information digitally.
The combination of components in our computers allows us to access and alter all that data in a digital format quickly and easily.
What is a Server?
Servers are computers that host websites, files, databases or other services. When you want to access that website or service, you are accessing the server that houses it. For example, when you want to look at your Gmail inbox, you are accessing a Google server that is providing the service of Gmail.
Computers on the internet are identified by an IP address (Internet Protocol address), which works much like a mailing address. A website’s name is a human-friendly label for the IP address of the server that hosts the site. When you type google.com into your browser’s address bar, a lookup system called DNS (the Domain Name System) translates that name into an IP address and connects you to a server that holds Google.
Servers can be set up so that more than one has the same IP address, allowing large websites like Google to spread traffic among their thousands of servers.
What is a Database?
The next step toward understanding blockchains is understanding what a database is.
A database is a giant collection of information stored on servers that can be easily accessed, managed, and updated.
This large collection of information, or “data,” can sometimes require hundreds or thousands of servers operating in giant facilities known as server farms (huge buildings with thousands of computers).
Large internet companies like Amazon and Google use massive server farms to store their websites, apps, and users’ data. Typically, only a select number of approved people control these databases, and the databases exist in one central location. This means the data’s security depends entirely on the server farm not malfunctioning and on those with access not being compromised by hackers.
Data could be lost if a fire broke out at the farm or leaked if a hack occurred. The central location and control points make for obvious points of attack for hackers. For this reason, some databases are distributed among computers in different physical locations. Databases like this are called distributed databases.
What is a Distributed Database?
For security reasons, distributed databases are stored on servers in separate locations instead of one central location. In the context of a distributed database, these servers are often called nodes.
This way, if one location malfunctions or is hacked, it can be shut down while the nodes in other locations keep running and maintain the database.
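The idea can be sketched in a few lines of Python. This is only a toy model under simple assumptions: the `Node` and `DistributedDatabase` classes and the city names are made up for illustration, and every write is simply copied to every online node, which is far cruder than what real distributed databases do.

```python
class Node:
    """One server holding a full copy of the data (hypothetical, for illustration)."""

    def __init__(self, location):
        self.location = location
        self.online = True
        self.data = {}


class DistributedDatabase:
    def __init__(self, locations):
        self.nodes = [Node(loc) for loc in locations]

    def write(self, key, value):
        # Copy every write to all nodes that are currently online.
        for node in self.nodes:
            if node.online:
                node.data[key] = value

    def read(self, key):
        # Answer from the first online node that has the key.
        for node in self.nodes:
            if node.online and key in node.data:
                return node.data[key]
        raise KeyError(key)


db = DistributedDatabase(["Toronto", "Frankfurt", "Singapore"])
db.write("user:1", "alice")
db.nodes[0].online = False      # the Toronto location fails or is shut down
print(db.read("user:1"))        # still served by another location
```

Even with one location offline, the read succeeds because the other nodes hold the same data.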
Now that you understand the concepts up to this point, it should be easier to grasp blockchain because blockchain is really just a form of a distributed database.
What is a Blockchain?
You can think of a blockchain as a version of a database, more specifically, a distributed database. The main differences are the type of data it stores, the way it stores that data, who is allowed access, and the fact that data on a blockchain cannot be manipulated or deleted.
Note: Blockchains can be made “permissionless” (accessible to anyone like Bitcoin) or “permissioned” (built by a company or group that only gives certain people access). This article explains blockchain in the context of Bitcoin, which is permissionless.
What it stores: Bitcoin’s blockchain is a type of distributed database that stores Bitcoin transactions.
How it stores it: Instead of a typical database where information is stored in arbitrary folders, Bitcoin transactions are stored in “blocks.” As new transactions occur, they get grouped together in these so-called blocks.
These blocks only have room for so many transactions. Roughly every ten minutes, a new block of transactions is chained onto the previous block and added to the long chain of blocks (hence the name “blockchain”).
This creates a chronological history of transactions, much like a ledger, from the first transaction in the first block to the last transaction in the most recent block. The blockchain saves these blocks in a format that allows us to view a perfectly recorded history of Bitcoin transactions.
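One way to picture the “chaining” is that each block records a fingerprint (a hash) of the block before it. The sketch below is a simplified illustration, not Bitcoin’s actual block format: the field names, the JSON encoding and the single SHA-256 pass are assumptions made for readability.

```python
import hashlib
import json


def block_hash(block):
    # Hash a canonical JSON encoding of the block's contents.
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def add_block(chain, transactions):
    # Each new block records the hash of the block before it.
    previous = block_hash(chain[-1]) if chain else "0" * 64
    block = {
        "height": len(chain),
        "previous_hash": previous,
        "transactions": transactions,
    }
    chain.append(block)
    return block


chain = []
add_block(chain, ["alice pays bob 1"])
add_block(chain, ["bob pays carol 0.5", "carol pays dave 0.2"])
add_block(chain, ["dave pays alice 0.1"])

# Walk the ledger from the first block to the most recent one.
for block in chain:
    print(block["height"], block["previous_hash"][:12], block["transactions"])
```

Reading the list from start to finish gives the chronological history described above.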
Who is allowed access: Like a database, Bitcoin’s blockchain needs a collection of computers to function. And like distributed databases, Bitcoin’s blockchain is not stored in one central location. Instead, it is dispersed among many computers and locations. This way, if one computer goes down, plenty of others keep the data (the ledger of transactions) alive.
Governments or companies operate the computers that run typical databases, but Bitcoin relies on average individuals with personal computers. Those who wish to be a node to help run the blockchain download Bitcoin’s open-source software and the whole, or partial, history of Bitcoin transactions.
Transactions can’t be manipulated or deleted: Another fundamental difference between databases and Bitcoin is that, unlike a database where older data can be deleted or changed, Bitcoin transactions are irreversible. In that sense, Bitcoin’s blockchain is like a database that can only be added to, where transactions are never altered or removed.
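A rough sketch of why old entries are so hard to change, assuming each block stores a SHA-256 hash of its predecessor (the helper names `build_chain` and `is_valid` are hypothetical, and real Bitcoin blocks are structured differently): editing an old block changes its hash, so the next block no longer points at it and the tampering is easy to spot.

```python
import hashlib
import json


def block_hash(block):
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()


def build_chain(batches):
    chain = []
    for transactions in batches:
        previous = block_hash(chain[-1]) if chain else "0" * 64
        chain.append({"previous_hash": previous, "transactions": transactions})
    return chain


def is_valid(chain):
    # Every block must point at the hash of the block before it.
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = build_chain([["alice pays bob 1"], ["bob pays carol 1"], ["carol pays dave 1"]])
print(is_valid(chain))                               # True

chain[0]["transactions"] = ["alice pays mallory 1"]  # try to rewrite history
print(is_valid(chain))                               # False
```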
If Blockchain Is Just a Type of Database, What’s So Special About Bitcoin?
Not only that, but how does such a database maintain accurate data? And how does it remain secure if anyone can just start running a node and participate?
These are all great questions, and this is where Bitcoin truly becomes interesting. While the basic concept of Bitcoin’s blockchain is relatively simple, it has certain features that make it a major breakthrough in computer science.
A long-standing problem in computer science, known as the Byzantine Generals Problem, had no practical solution for open networks that anyone can join until Satoshi Nakamoto created Bitcoin. Robert Shostak first conceived and formalized the problem in 1978 during a NASA-sponsored computer science project.
An analogy to the problem, as described by researchers Leslie Lamport, Robert Shostak and Marshall Pease in their 1982 paper, goes like this:
“We imagine that several divisions of the Byzantine army are camped outside an enemy city, each division commanded by its own general. The generals can communicate with one another only by messenger. After observing the enemy, they must decide upon a common plan of action. However, some of the generals may be traitors, trying to prevent the loyal generals from reaching agreement.”
So, how do the generals ensure that they are all on the same page and that the information they have received is accurate? The battle could be lost if they don’t all work together.
Now imagine this but instead of generals, it is nodes in a database. If some nodes in a database malfunction and begin sending incorrect information to the others, how does the database form a consensus on the correct set of data?
While a centralized database operated by a government or company has administrators that can correct the problem, a distributed database with nodes run by random individuals on the internet, like a blockchain, may not be able to.
To solve this problem, Satoshi Nakamoto used a consensus mechanism called proof-of-work.
What Is a Consensus Mechanism?
A consensus mechanism is a system that allows nodes in a distributed computer system (database, blockchain or otherwise) to reach a “consensus” about the correct set of data. Simply put, it is a set of rules that allows everyone to agree on what is right or wrong.
This gives blockchain networks their security and allows the participants (nodes) to verify the authenticity of data (transactions) without having to trust each other.
Nakamoto used a consensus mechanism called Proof-of-Work (PoW) to solve the Byzantine problem, which involves the Bitcoin buzzword “mining.”
Proof-of-Work
To put it simply, proof-of-work is the process by which Bitcoin nodes compete for the right to update the blockchain with a new block of transactions. The competition is to solve an extremely difficult puzzle before other nodes do.
This puzzle is really hard to solve but, once solved, easy for the rest of the nodes to verify. So, the node must provide an answer, also known as a “proof,” that everyone else can then quickly check.
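A toy version of such a puzzle might look like the sketch below. It assumes the puzzle is “find a number (a nonce) that, combined with the block’s data, produces a SHA-256 hash starting with a certain number of zeros.” Bitcoin’s real puzzle has the same flavor but differs in the details (it hashes a block header twice and compares the result to a numeric target), and the function names and difficulty value here are made up for illustration.

```python
import hashlib


def meets_target(block_data, nonce, difficulty):
    # A proof is valid if the hash starts with `difficulty` zero hex digits.
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def mine(block_data, difficulty):
    # Hard part: try nonces one by one until a hash meets the target.
    nonce = 0
    while not meets_target(block_data, nonce, difficulty):
        nonce += 1
    return nonce


block_data = "alice pays bob 1"
difficulty = 4   # toy value; about 16**4 = 65,536 guesses on average
nonce = mine(block_data, difficulty)
print("attempts:", nonce + 1)
print("verified:", meets_target(block_data, nonce, difficulty))  # one hash to check
```

Finding the nonce takes many thousands of guesses, while checking it takes a single hash. Raising `difficulty` by one makes the search about 16 times longer without making verification any slower.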
One of the best analogies I have read for the complex puzzle nodes solve is from Nathaniel Popper’s book, Digital Gold.
“… it is relatively easy to multiply 2,903 and 3,571 using a piece of paper and pencil, but much, much harder to figure out what two numbers can be multiplied together to get 10,366,613.”
In this analogy, the node must determine what two numbers multiplied together result in 10,366,613 by guessing random combinations of numbers until the correct result is found. The node then provides the answer (the answer being 2,903 and 3,571), or “proof,” to other nodes who can then easily multiply the numbers and verify that it is correct.
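Popper’s analogy is easy to try out. The sketch below uses simple trial division rather than random guessing (a choice made here for reproducibility, not something the book specifies), and the function names are hypothetical.

```python
def find_factors(n):
    # Hard direction: trial division, testing one candidate after another.
    attempts = 0
    candidate = 2
    while candidate * candidate <= n:
        attempts += 1
        if n % candidate == 0:
            return candidate, n // candidate, attempts
        candidate += 1
    return None


def verify(a, b, n):
    # Easy direction: a single multiplication.
    return a * b == n


a, b, attempts = find_factors(10_366_613)
print(a, b, attempts)
print(verify(a, b, 10_366_613))
```

The search needs a few thousand divisions to find 2,903 and 3,571, while the check is one multiplication.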
Whoever solves the puzzle first gets to broadcast the block of transactions to the other nodes. This ensures that only someone who has invested enough energy and computational power earns the right to add new transactions to the ledger.
When the nodes receive the new block, they perform something like an audit of previous transactions to ensure that the new transactions add up correctly and that the correct amount of Bitcoin remains on the ledger.
After all the nodes verify that the transactions in the new block make sense against the previous ledger entries, the new block is chained to the previous block and forever saved to the blockchain. The node that solved the puzzle is then rewarded with Bitcoin.
This process is commonly referred to as “mining” as the computer work it takes a node to earn the Bitcoin reward can be thought of as the digital equivalent to the real-world work that mining gold requires.
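The audit step that nodes perform on a new block can be sketched roughly as follows. This is a simplification under stated assumptions: it tracks a running balance per person, whereas Bitcoin actually tracks individual unspent coins, and the names and amounts are invented for illustration.

```python
def validate_block(balances, transactions):
    """Return updated balances, or None if any transaction overspends."""
    updated = dict(balances)   # work on a copy; the ledger changes only if all pass
    for sender, receiver, amount in transactions:
        if amount <= 0 or updated.get(sender, 0) < amount:
            return None        # reject the whole block
        updated[sender] -= amount
        updated[receiver] = updated.get(receiver, 0) + amount
    return updated


balances = {"alice": 5, "bob": 2}

good_block = [("alice", "bob", 3), ("bob", "carol", 4)]
bad_block = [("bob", "carol", 10)]   # bob tries to spend coins he does not have

print(validate_block(balances, good_block))
print(validate_block(balances, bad_block))
```

The first block is accepted and the total number of coins stays the same; the second is rejected because it does not add up against the earlier ledger entries.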
Because it takes so much computational power to add a new block to the chain, it becomes prohibitively expensive to add fraudulent transactions, such as spending the same Bitcoin twice. An attacker would need to control over 50% of the network’s total computing power so that their version of the chain could outpace the one built by everyone else and be accepted as legitimate.
Given how large Bitcoin’s network has become today, the upfront cost of the computer equipment necessary to attempt such a thing would be prohibitive for almost any group or even government.
And even if an attack succeeded, people would discover the problem with the system and sell their holdings, devaluing the very currency the attacker was trying to counterfeit.
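The importance of the 50% threshold can be made concrete with a formula from the Bitcoin white paper, which treats the race between an attacker and the honest network like a gambler’s-ruin problem. If the attacker controls a share q of the computing power, the honest network controls p = 1 − q, and the attacker is z blocks behind, the chance of ever catching up is (q/p)^z when q is less than p, and 1 otherwise. The sketch below just evaluates that formula; the function name and the sample shares are chosen for illustration.

```python
def catch_up_probability(attacker_share, blocks_behind):
    # Gambler's-ruin result used in the Bitcoin white paper:
    # 1 if the attacker has at least half the power, else (q/p)**z.
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0
    return (q / p) ** blocks_behind


for share in (0.10, 0.30, 0.45, 0.51):
    print(share, catch_up_probability(share, blocks_behind=6))
```

With 10% of the computing power, the odds of rewriting six blocks are roughly two in a million; with 51%, success is only a matter of time.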
So the process of proof-of-work effectively solves the Byzantine problem because nodes can trust new transactions (the data on the blockchain) without needing to trust or know each other. And because Bitcoin rewards give participants an economic incentive to support the network rather than attack it, Bitcoin’s blockchain should remain Byzantine fault-tolerant for as long as people believe Bitcoin has value.
The combination of these features results in an immutable ledger of economic transactions that is controlled by the collective of its users rather than any company, government or group.