The Emergence of Data Over Oil
If oil dominated the last century, data is the leading candidate to dominate this one. An often-repeated phrase is that “data is the new oil.” But beyond the simple similarity of both being important, is oil a useful analogy for data? How do the two differ?
1. Scarce vs. Cumulative
Oil is a scarce resource. Technology innovations related to discovering and processing oil merely help to contain price increases. Data is not just abundant; it is a cumulative resource. Technology innovations lead to a collapse in the cost of collecting and manipulating data, and our new data builds on top of our existing data. Personalization is an obvious application of this idea. To date, companies have mostly used it for targeted content and commerce. Going forward, the bigger opportunity is in areas like personalized learning and medicine.
2. Rival vs. Non-rival
If oil is being used, the same oil cannot be used somewhere else, because it is a rival good. This results in a natural tension over who controls oil. If data is being used, the same data can be used elsewhere, because it is a non-rival good. It is up to us to appreciate this difference and embrace the potential. An obvious example is the power of open source. Our Internet would not exist in its current form without the positive impact of open source. It’s also clear that our intellectual property laws have not fully caught up with the implications of data being a non-rival good.
3. Tangible vs. Intangible
As a tangible product, oil faces high friction, transportation and storage costs. These costs place limits on the applications of oil. As an intangible product, data has much lower friction, transportation and storage costs. The result is a much wider range of applications due to fewer physical restrictions. The exponential growth in content and media is an obvious result of the fact that data is intangible. Less obvious is the transformational opportunity for a globally distributed manufacturing supply chain connected by data, rather than by the current system of shipping and flying physical goods everywhere.
4. Process vs. Relationships
The lifecycle of oil is defined by process: extraction, refining and distribution. This process is relatively stable and predictable. The lifecycle of data is defined by relationships: with other data, with context and with itself via feedback loops. These relationships are dynamic and uncertain, requiring an entirely different approach to building value. This highlights the difference between complicated and complex systems. Oil and industrial assets benefit from the use of ideas like Six Sigma to enhance efficiency. Data benefits from the use of technology like deep learning to learn from the data itself, a form of exploration.
5. Linear vs. Non-Linear
A fixed amount of oil results in a predictable amount of output. There is no possibility of a non-linear upside surprise, because the laws of physics really do limit the benefits we can derive from a given amount of oil. By contrast, a fixed amount of knowledge can create huge value in a non-linear way. The concept of zero, for example, is a core building block supporting our entire digital technology infrastructure: a completely non-linear benefit from a deceptively simple idea.
Each of the above differences is valid individually, but taken together they multiply in importance and reflect the emergent nature of systems based on data. While our current oil-based system has some features of emergence, the new data-based system has vastly more potential for new and unpredictable applications.
To be clear, the impact of these applications is not always going to be good for society: certainly not at the individual level, and sometimes not at the aggregate level either. Even with a much smaller level of emergence, we’ve already seen the damage that our oil-based system can do to our planet.
We should not follow data blindly, and we need to appreciate that the emergent properties of data make it much more powerful than oil. As data becomes increasingly important, it’s critical to understand the differences between oil and data. To fully enjoy the benefits of data, we need to fundamentally update all of our global ecosystems as soon as possible across business, government, and society. This is not something that can be done by any single person, company, or country. For those interested in building a stronger global ecosystem together, let’s connect and make it happen.