“Sharding” is a proposed method of splitting the infrastructure of Ethereum into smaller pieces with the goal of scaling the platform so it can support many more users than it currently does.
Ethereum is the second-largest blockchain and was designed to make it easier to build decentralized applications that would give users more control over their finances and online data, among other envisioned benefits. The idea is that these decentralized apps will spread, offering an alternative to apps – such as Robinhood or Twitter – that have a centralized point of control. Ethereum would thus serve as a “world computer,” open to all, that cannot be shut down.
However, to offer strong alternatives to existing apps, Ethereum will need to store massive amounts of data. For traditional apps, services like Amazon Web Services (AWS) store petabytes of data from thousands of applications. Right now, though, Ethereum is far from being able to store data as efficiently as a centralized web service like AWS. In fact, Ethereum has historically suffered platform-stopping performance lapses when a single app taxed the network.
Sharding is one possible method of enabling Ethereum to store more data, a step it needs to take before its method of running decentralized apps, or “dapps,” will be able to go mainstream.
Where is Ethereum data stored?
If applications no longer rely on intermediary services, where is all the data stored?
Under the hood, Ethereum is made up of a global network of nodes run by Ethereum users and companies. Each node stores Ethereum’s entire history. That means it stores all the data – which person sent a transaction on which date and how much money they sent – as well as smart contracts, code written to administer those funds with certain rules.
As you can imagine, this is a lot of data.
Why do multiple nodes need to store this entire elephant-sized history? This is what makes Ethereum decentralized, able to create applications that “no one can take down,” as the primary Ethereum website puts it.
If only a few people are capable of running these nodes because they’re so large, for instance, then the network is easier for individuals, or groups, to manipulate. If a single bad actor could commandeer enough of the nodes, they could rewrite Ethereum’s history. Theoretically, that could empower a person to give themselves more money at the expense of other Ethereum users.
That’s why the easier it is to run these nodes, the less likely that scenario becomes: Control ends up in the hands of more users. In turn, that makes it more likely that ether (or any cryptocurrency) can live up to its bold promises.
The problem is, these nodes typically require heavy-duty storage space and are complex to run and maintain.
Why does Ethereum need sharding?
Sharding could make running these full nodes easier.
According to block explorer Etherscan, Ethereum nodes that keep a full archive of the chain’s history already take up at least five terabytes of space, which is about 10 times what the average computer can hold.
And the nodes will only grow bigger and harder to run over time as more users join the platform.
Sharding is a common technique in computer science for scaling applications so they can support more data. If sharding can be properly implemented in Ethereum – which is still a big if – each user could store just a part of the history of changes to the database rather than the entire thing, as nodes on a blockchain typically do.
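To make the general technique concrete, here is a minimal sketch of how a generic application might shard a transaction history by hashing each sender to one of several pieces. It is not Ethereum's design: the shard count, the `shard_for` and `split_history` helpers, and the sample records are all hypothetical and chosen only for illustration.

```python
import hashlib

# Hypothetical shard count; not taken from any Ethereum specification.
NUM_SHARDS = 6


def shard_for(account: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an account name to a shard index using a stable hash."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a shard index.
    return int.from_bytes(digest[:8], "big") % num_shards


def split_history(transactions, num_shards: int = NUM_SHARDS):
    """Group transactions by the shard their sender hashes to."""
    shards = {i: [] for i in range(num_shards)}
    for tx in transactions:
        shards[shard_for(tx["sender"], num_shards)].append(tx)
    return shards


# Made-up sample data standing in for a history of changes.
history = [
    {"sender": "alice", "recipient": "bob", "amount": 5},
    {"sender": "bob", "recipient": "carol", "amount": 2},
    {"sender": "alice", "recipient": "carol", "amount": 1},
    {"sender": "dave", "recipient": "alice", "amount": 7},
]

shards = split_history(history)
for index, records in shards.items():
    print(index, records)
```

Each holder of a shard stores only the records that landed in that shard, which is the storage saving sharding is after. The same sender always maps to the same shard because the hash is deterministic.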
Why isn’t sharding a quick fix?
Sharding is harder than it sounds.
Let’s say we split up an Ethereum node – or “sharded” it – into six pieces.
Piece one needs to be able to know that the data coming from the other five pieces is correct. Otherwise it could be tricked into thinking a change was made that didn’t really occur. This turns out to be a hard problem to solve, and developers are still seeking a solution.
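One simple building block for this kind of check is a hash commitment: a piece publishes a short fingerprint of its data, and any other piece can confirm that the data it receives matches that fingerprint. The sketch below illustrates only that idea, with hypothetical function names and made-up records. It is not the solution Ethereum developers are working toward, and it leaves the hard part open, namely how piece one learns that the published fingerprint itself describes valid changes.

```python
import hashlib
import json


def commitment(records) -> str:
    """Return a SHA-256 fingerprint of a list of records."""
    # sort_keys gives a stable byte encoding for the same record contents.
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def accept_from_other_piece(records, published_commitment: str) -> bool:
    """Accept records only if they match the fingerprint that was published."""
    return commitment(records) == published_commitment


# Made-up data held by "piece two" and the fingerprint it publishes.
piece_two_records = [{"sender": "carol", "recipient": "dave", "amount": 3}]
published = commitment(piece_two_records)

# A forged copy that claims a change that never occurred.
tampered = [{"sender": "carol", "recipient": "dave", "amount": 300}]

print(accept_from_other_piece(piece_two_records, published))
print(accept_from_other_piece(tampered, published))
```

The check catches data that was altered after the fingerprint was published, but it assumes the fingerprint can be trusted in the first place.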
When will sharding go live on Ethereum?
Sharding has been an idea since Ethereum emerged in 2013. It is still not clear whether it will work, or when it will be added to Ethereum.
Sharding is a planned part of Ethereum 2.0, a series of upgrades to the Ethereum blockchain that officially began rolling out on Dec. 1, 2020. Sharding is more likely to be incorporated in the later stages of the upgrade because of its potential dangers and complexity.