Storing data on Bitcoin SV

synfonaut 2 weeks ago bitcoinstorage

The BSV community[^1] has not found consensus on how to store data on Bitcoin.

Despite what the Bitcoin SV Wiki says:

"Bitcoin's ability to store data on the blockchain enables any and all types of applications to be built on top of it."

You can't actually "store data on the blockchain." Some miner has to store it. But miners are telling users they'll prune data as soon as it's mined, and pruning just means deleting.

The "consensus" is that data services will appear to fill the gap, but this brings more questions than it answers.

BSV miners that plan on offering data services would do well to announce their policy on pruning and retrieving data, so developers can plan ahead and build with confidence.

What does a large transaction fee buy you?

What exactly are users paying for when they pay to put a large file on Bitcoin?

The "consensus" is that they're paying for being timestamped into a block.

Timestamping is just a way to prove that some data existed prior to some point in time. By putting it on Bitcoin you get really strong guarantees—it's an excellent use of Bitcoin.

But here's something non-technical people should know: you can convert an extremely large file into a very small, effectively unique string. That's called hashing.

Then you can just timestamp the hash instead of the large file, saving yourself hundreds of dollars in txfees.
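To make that concrete, here's a minimal sketch of hashing a file of any size down to a 64-character string. It assumes SHA-256 (the hash family Bitcoin itself uses); the function names `sha256_chunks` and `sha256_file` are made up for this example.

```python
import hashlib


def sha256_chunks(chunks):
    """Return the SHA-256 hex digest of data supplied as an iterable of byte chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file 1 MiB at a time so even huge files never sit fully in memory."""
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (end of file)
        return sha256_chunks(iter(lambda: f.read(chunk_size), b""))
```

Whether the file is a kilobyte or a terabyte, the result is always 32 bytes (64 hex characters), and that's all you'd need to put in a transaction to timestamp it.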

The secret to storing data efficiently on Bitcoin

But that's not all... just like hashing a big file into a unique string, you can hash hashes, creating further unique strings. This is something Bitcoin uses extensively internally.

What's kind of amazing about hashing hashes, or merklizing, is how much you can store in so little space.

When you hear people talking about SPV proofs in Bitcoin, they're talking about merklizing data—extremely compact ways to prove data integrity.
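Here's a rough sketch of what merklizing looks like in code. It's a simplified tree using single SHA-256 and is not Bitcoin's exact construction (Bitcoin double-hashes), and every function name here is invented for illustration. The point is that one 32-byte root commits to every item, and proving any single item takes only a handful of sibling hashes.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves):
    """Build every level of the tree, from the leaf hashes up to the single root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # odd count: pair the last node with itself
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels, index):
    """Collect the sibling hashes needed to rebuild the root from one leaf."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # last node of an odd level was paired with itself
        proof.append((level[sibling], index % 2 == 1))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the path from a leaf to the root and compare."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

The proof grows with the height of the tree rather than the number of items, so doubling the data adds just one more hash to each proof.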

What does this mean for storing data on-chain? You don't even need to store individual file hashes. Even that is more expensive than most people need.

Hash the hashes, and snapshot it every once in a while, say every couple of hours or days.

Instead of uploading hundreds of thousands of weather transactions, Weather SV could just upload 1 per day and get the same guarantees for significantly less money.
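As a sketch of the once-a-day idea: group records by day and boil each day down to a single digest. The `(day, payload)` record shape and the function name `daily_commitments` are assumptions for this example, not how Weather SV actually works. This version is a flat hash of hashes rather than a Merkle tree, which keeps it short but means checking one record requires all of that day's hashes.

```python
import hashlib
from collections import defaultdict


def daily_commitments(readings):
    """Map each day to one digest committing to all of that day's readings.

    `readings` is an iterable of (day, payload_bytes) pairs (hypothetical shape).
    Readings are committed in the order they arrive within each day.
    """
    per_day = defaultdict(list)
    for day, payload in readings:
        per_day[day].append(hashlib.sha256(payload).digest())
    # Hash the concatenated per-record hashes: one 32-byte value per day
    return {
        day: hashlib.sha256(b"".join(hashes)).hexdigest()
        for day, hashes in per_day.items()
    }
```

However many readings arrive in a day, the on-chain footprint stays at one 32-byte digest per day.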

As if that weren't cheap enough... if you're using Bitcoin to timestamp hashed data, you don't even need to pay anything. OpenTimestamps has been offering this as a free service since 2017.

The catch? It works and runs great on BTC.

Miners are being paid for a service they're not providing

Why would a customer of Bitcoin pay hundreds of dollars to timestamp their data into the blockchain, when they could do it for a penny with the hash of that data?

Are customers and developers stupid? Or are they showing market demand by paying for a service they hope/expect a for-profit company to step up and provide?

If miners expect customers to continue paying large txfees, why would customers do that without being able to reliably recover the file?

People who argue for putting hashes of data on-chain rather than the actual data are asking for a 1000x reduction in today's mining revenues.

Miners may not like that, but there's an even bigger problem.

Removing on-chain data reduces on-chain entrepreneurial activity

Removing on-chain data restricts exactly the kind of entrepreneurial activity Bitcoin needs to succeed.

You currently have a wave of developers building open networks on Bitcoin. If apps are the source of truth instead of Bitcoin, we're just recreating the same Facebooks and Twitters of today. Users will not own their data.

If you get de-platformed, you will not have the opportunity to export your data.

Expecting everyone to manage their own data is the same as expecting everyone to run their own node

The "consensus" is that non-miners will offer this service as data utility nodes, archives, libraries, etc.

Again, this brings more questions than answers.

How will those services get transaction data if they are not mining? At least one miner has indicated that only miners will send txs to other miners. Will these service providers have to pay for the data? Are miners storing the data? Are they not pruning it? Why aren't they serving it themselves?

Why wouldn't a miner vertically integrate? At least one miner seems to already be headed this way.

Would it be a change to the economic consensus protocol if miners changed the way getdata worked so it wasn't a free and open API call anymore?

Bitcoin is the source of truth, but you don't get that with just hashes of data on-chain. Somebody has to store the data. If you're saying that's Amazon, I've got news for you.

Bitcoin is the new cloud

Imagine if all of the existing clouds got together on the same infrastructure and started competing. They were incentivized to connect to and collaborate with each other like never before. The infrastructure itself formed a global CDN, and customers had a standardized way of interacting with their data where they could maintain control. The system was balanced in such a way that no one provider could get too far ahead without earning their spot fair and square.

That's Bitcoin. That's Metanet.

But neither of those is likely without universal data storage. If I can't rely on retrieving data from the chain (even for a high cost), Metanet won't work.

Miners have a decision to make:

If you like today's high fees and want even higher fees, start offering additional value-add services.

If you focus only on optimizing your cost and reducing utility, your revenue will drop even faster.

The solution to on-chain data is easier than we think

Some miner just has to go first. They just have to say this is what I will offer and this is what I will charge.

I appreciate current miners being upfront that they expect to prune data, though I wish it went the other way: miners talking about all the new services they will offer to add utility rather than remove it.

Confusion is worse than not knowing though. It'd be great to hear from more miners on this subject.

BSV touts a stable protocol, for good reason — it allows a stable foundation for long-term building and planning.

This is part of the stable protocol that BSV wants businesses and developers to feel confident building on: what does my txfee buy me?

[^1]: Yea I said it