Blockchainizing Existing Databases

Blockchain has been a buzzword for the past several years and it hasn’t lived to its promises (yet). The value proposition usually includes vague claims about trust and unmodifiability, but rarely that has brought demonstrable improvement to existing processes.

There are dozens of blockchain projects, networks, protocols, “standards”, and all of them can in some way help you solve either data integrity issues (guarantee that data has not been tampered with) or multi-party trust issues (several companies participating in one process shouldn’t have to trust each other in order to have automated cross-organization business processes).

However, deploying and integrating a separate blockchain solution is usually a large project in itself and especially in the COVID-19 crisis likely gets postponed because of the questionable return on investment.

But for the enterprise, blockchain is largely a shared database. Sharing data with other participants in a given business process in a secure way that doesn’t allow any of the participants to cheat. And this can be achieved not by adding a whole new blockchain infrastructure that would in turn integrate with existing systems (which in many cases can’t be integrated easily because they don’t have APIs), but by “blockchainizing” the existing database.

Ideally, what I’m describing should be a project itself, which is either deployed alongside the database, or as part of an application. And what it can do is as follows:

  • Select tables and columns to share with other participants – obviously only parts of the database should be shared with others
  • Define shared data model and data transformations – sometimes data has to be transformed, or masked, in order to meet regulatory or privacy requirements. Certainly, the databases of participants will differ and they have to be aligned to a common model.
  • Track inserts, updates and deletes, sign them and send them to the peers
  • Manage a PKI with a private key shared with the rest of the participants (or some of them)
  • Generate merkle trees based on the “transactions” (inserts, updates and deletes) and expose them to be verified regularly by the peers
  • Provide an admin dashboard to view various aspects of the system – activity, status of peers, configuration options

Effectively, that’s also an integration effort. Defining shared data models and transformations. It also involves setting up some piece of software that does the “blockchaining” and communication with peers. But since the goal is usually to have a shared database, it makes sense to go directly at the database level, rather than providing an append-only key-value store which is then queried in order to fill the actual database.

Can such an approach be just a configuration option for existing solutions like Openchain, Hyperledger, Corda? It could be – allowing them to stream changes directly to and from an existing database in a predefined fashion.

This post is in the “random ideas” category, things that I’ve thought about could be implemented, but never found the time to do so. I think blockchain should be taken to the ground and viewed as an infrastructure components. Much like enabling database encryption or adding another data source to an ESB for the sake of integrating two systems. Because the business case for blockchain is usually this – integrate several systems and don’t allow them to cheat. I think this can be achieved by plugging at the database level for the systems integrated.

The post Blockchainizing Existing Databases appeared first on Bozho's tech blog.