3 April 2014

Big Data for Big Business?


 March Paper

A Taxonomy of Data-driven Business Models used by Start-up Firms

The exponential growth of available and potentially valuable data, compounded by the Internet, social media, cloud computing and mobile devices – often referred to as big data – has an embedded value potential that must be commercialised. Correspondingly, the quote ‘Data is the new oil’ (WEF, 2011; Rotella, 2012) became widespread and established the analogy to natural resources that need to be exploited and refined to guarantee growth and profit.

Some studies estimate that the volume of data created, replicated and consumed annually will increase from around 1,200 exabytes in 2010 to 40,000 exabytes in 2020 (Gantz and Reinsel, 2012). In some industries, such as financial services, big data has spurred entirely new business models. The CEBR (2012) has projected that big data innovation opportunities will contribute £24 billion to the UK economy between 2012 and 2017, while the increased prospects for small start-up creation are projected to be worth £42 billion. New jobs related to big data are estimated to reach 58,000 over the same period.

To exploit data as a resource, data-related ventures need business models that allow them to capture value, subsequently called data-driven business models (DDBMs). Notably, scholars have published surprisingly little on this topic, so what business models relying on data look like remains an open research question. A recent study by the Cambridge Service Alliance contributes to closing this gap in the literature. It focuses on identifying the different types of data-driven business model in the start-up scene, paying attention to their commercialisation approach, irrespective of their current financial success. The study aimed to contribute to answering the overarching research question:

“What types of business model are present among companies relying on data as a resource of major importance for their business (key resource)?”

In order to answer this question:

Firstly, we have to define the term ‘data-driven business model’ (DDBM), which has not yet been defined in the scholarly literature. Therefore, this paper contributes a definition: a data-driven business model is a business model that relies on data as a key resource. This definition has three implications. First, a data-driven business model is not limited to companies conducting analytics, but also includes companies that are ‘merely’ aggregating or collecting data. Second, a company may sell not just data or information, but also any other product or service that relies on data as a key resource. An example is a company called Kinsa, which sells thermometers for the iPhone and provides a service to constantly monitor body temperature. Third, any company obviously uses data in some way to conduct business: even a small restaurant relies on the contact details of its suppliers and uses a reservation book. However, the focus lies on companies that use data as a key resource for their business model.

Secondly, we have to develop a framework that allows systematic analysis and comparison of data-driven business models. Therefore, this paper proposes a data-driven business model (DDBM) framework. The framework aims to provide a set of possible attributes for every business model dimension to comprehensively describe any DDBM. It was developed in two steps. First, the dimensions of the DDBM framework were derived from a systematic review of six of the most important existing business model frameworks, measured by the number of citations. Second, for each of the identified dimensions, a collectively exhaustive set of features was identified using literature from related disciplines, for example, data warehousing, business intelligence, data mining and cloud-based business models. Based on this review, the DDBM framework consists of six dimensions common to most of the business model frameworks, namely key resources, key activities, value proposition, customer segment, revenue model and cost structure, with 35 features in total.

DDBM Framework
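
The coding of a business model description against such a framework could be sketched as follows. This is a minimal illustration, not the study's instrument: the six dimension names come from the text above, but the feature lists are a small assumed subset (the full set of 35 features is not reproduced here), and the names `FRAMEWORK`, `feature_index` and `encode` are hypothetical.

```python
# Six dimensions as named in the text. The features listed per dimension are
# ILLUSTRATIVE ASSUMPTIONS, not the study's actual 35 features.
FRAMEWORK = {
    "key resources": ["freely available data", "customer-provided data", "self-generated data"],
    "key activities": ["data generation", "aggregation", "analytics", "distribution", "visualisation"],
    "value proposition": ["data", "information or knowledge", "non-data product or service"],
    "customer segment": ["business", "consumer"],
    "revenue model": ["subscription", "usage fee", "advertising"],
    "cost structure": ["cost advantage from data"],  # placeholder feature
}


def feature_index(framework=FRAMEWORK):
    """Return a fixed, ordered list of (dimension, feature) pairs."""
    return [(dim, feat) for dim, feats in framework.items() for feat in feats]


def encode(description, framework=FRAMEWORK):
    """Turn {dimension: set of features present} into a 0/1 vector.

    Raises ValueError if the description uses a feature the framework lacks.
    """
    for dim, feats in description.items():
        unknown = set(feats) - set(framework.get(dim, []))
        if unknown:
            raise ValueError(f"unknown features for {dim!r}: {sorted(unknown)}")
    return [int(feat in description.get(dim, ())) for dim, feat in feature_index(framework)]


# Hypothetical company that analyses customer data and sells by subscription.
analytics_service = {
    "key resources": {"customer-provided data"},
    "key activities": {"analytics", "distribution"},
    "customer segment": {"business"},
    "revenue model": {"subscription"},
}
vector = encode(analytics_service)
```

Encoding every company over the same ordered feature list makes the descriptions directly comparable, which is what a systematic comparison of business models requires.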

Thirdly, we have to develop a taxonomy to identify whether clusters of companies with similar business models exist in the identified sample. To develop this taxonomy, business model descriptions of a random sample of 100 start-up firms were coded using the DDBM framework. Clustering algorithms were then applied to the coded descriptions, and the following types of business model were identified:



DDBM matrix of centroids
  • Type A: ‘Free data collector and aggregator’: Companies of this type create value by collecting and aggregating data from a vast number of different, mostly freely available data sources. The other distinctive key activity is data distribution, for example, through an API or Web-based dashboard. Other key activities performed by companies of this type are data crawling (35%) and visualisation (24%). While companies of this type are characterised by the use of freely available data (100%) – mostly social media data (65%) – other data sources such as acquired proprietary data (12%) or crowdsourced data (12%) are also aggregated by some of the companies.
  • Type B: ‘Analytics-as-a-service’: The second cluster comprises companies providing analytics as a service. These companies are characterised by conducting analytics (100%) on data provided by their customers (100%). Further noteworthy activities include data distribution (36%), mainly by providing access to the analytics results via an API, and visualisation of the analytics results (36%). In addition to the data provided by customers, some companies of this type also include other data sources, mainly to improve the analytics. Sendify, for example, a company providing real-time inbound caller scoring, also joins external demographic data with inbound call data to improve the analysis.
  • Type C: ‘Data generation and analysis’: Companies in this cluster all share the characteristic that they generate data themselves rather than relying on existing data. Consequently, all companies in this cluster share the key activity ‘data generation’. Besides generating data, most of the companies also perform analytics on this data. Within the cluster, companies can be roughly subdivided into three groups: companies that generate data through crowdsourcing; Web analytics companies; and companies that generate data through smartphones or other physical sensors.
  • Type D: ‘Free data knowledge discovery’: The companies in this cluster are characterised by the use of freely available data and analytics performed on this data. Furthermore, as not all free data sources are available in a machine-readable format, some of these companies crawl data from the Web (data generation: 50%). An archetypical example of this type is Gild, which provides a service that helps companies to recruit developers. To identify talented programmers, Gild automatically evaluates the code they publish on open source sites like GitHub or Google Code, as well as their contributions on Q&A websites like Stack Overflow. Based on this evaluation, a score is created that expresses the strength of a developer and allows hidden talents to be identified (Gild, 2013).
  • Type E: ‘Data-aggregation-as-a-service’: Companies in this cluster create value neither by analysing nor by creating data, but by aggregating data from multiple internal sources for their customers. This cluster can therefore be labelled ‘aggregation-as-a-service’. After aggregating the data, the companies provide it through various interfaces (distribution: 83%) and/or visualise it (33%). The areas of application are focused mostly on aggregating customer data from different sources (e.g. Bluenose) or from individuals (e.g. Who@) within an organisation. Other companies focus on specific segments or problems. AlwaysPrepped, for example, helps teachers to monitor their students’ performance by aggregating data from multiple education programmes and websites. Similar to Type B (‘analytics-as-a-service’), the revenue models of such companies are primarily subscription-based, and mainly business customers are targeted.
  • Type F: ‘Multi-source data mash-up and analysis’: Cluster F contains companies that aggregate data provided by their customers with other external, mostly freely available data sources, and perform analytics on this data. The offering of companies in this cluster is characterised by using other external data sources to enrich or benchmark customer data. A typical example of a business model of this type is Welovroi, a Web-based digital marketing monitoring and analysis tool that allows tracking of a large number of different metrics based on data provided by customers. However, Welovroi also integrates external data and allows benchmarking of the success of marketing campaigns.
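
The percentages quoted for each type can be read as cluster centroids: when every company is coded as a 0/1 vector of features, the centroid of a cluster gives the share of its companies that have each feature. The sketch below illustrates this on invented toy data. It rests on assumptions: the text does not say which clustering algorithms the study used, so a plain k-means is swapped in as a stand-in, and the four features, the six company vectors and the function names are all hypothetical.

```python
# Toy feature order (assumed for illustration):
# [freely available data, customer-provided data, analytics, distribution]
companies = [
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 0],
]


def centroid(vectors):
    """Column-wise mean: the share of companies that have each feature."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, initial_centroids, max_iter=100):
    """Plain k-means with caller-supplied starting centroids (deterministic)."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(v, centroids[k]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = centroid(members)
    return labels, centroids


# Start from one company of each apparent kind (an arbitrary choice).
labels, centroids = kmeans(companies, [companies[0], companies[3]])
for k, c in enumerate(centroids):
    print(k, [f"{share:.0%}" for share in c])
```

In this toy example every company in the second cluster performs analytics on customer-provided data, so those two centroid entries are 100%, in the same way that the figures for Type B are reported above.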

To conclude, the study provides a series of implications that may be particularly helpful to companies already leveraging ‘big data’ for their businesses or planning to do so. The proposed data-driven business model (DDBM) framework represents a basis for the analysis and clustering of business models. For practitioners, the dimensions and various features may provide guidance on the options for forming a business model for their specific venture. The framework allows identification and assessment of potential data sources that can be used in a new DDBM. It also provides comprehensive sets of potential key activities and revenue models.


Cambridge Service Alliance