Justifying Data Discovery

According to recent Forrester research, 56% of decision makers are expanding their use of external data, and a further 20% plan to do so. Taking this leap is not easy. Exploration can be perceived as a luxury, and a challenging one:

  • Going from a concept or use case to a test sample of data from external sources carries high cost and friction. Many of our clients have estimated over $100,000 per analytical study per vendor, depending on the sensitivity of the data being shared.
  • There is a real chance of failure in pursuing any one data source. Data’s value is specific to its context (the model in which it is used), so what works for your peers may not add lift for you.
  • The timeline for external data access is highly unpredictable. Sometimes there are quick wins (e.g. rapidly solving a fraud problem); sometimes it takes material internal resources (e.g. replacing a customer workflow with pre-filled, high-precision data).

Many data scientists need systematic ways to justify budgets for accelerated and expanded data exploration and toolkits. Here are some of the most effective strategies we have found:

  1. Benchmark external data use: A few targeted conversations with consulting firms or marketplace providers like Demyst can usually identify whether you are well ahead of or behind the curve in data use. Although it is not always true, more data sources generally mean more columns, which mean better models, which almost always mean better economic outcomes (cost per customer acquisition, risk or loss rates, net margins, etc.). So a fundamental question to justify investment is: how many sources do we use versus our peers?
  2. Attach to a use case: Data about an entity (a business, a consumer, etc.) has myriad valuable uses across an enterprise, but budget owners have typically funded initiatives that map to only one. Align closely to that one. If mitigating fraud is a key initiative, then that alone can typically support infrastructure investment for consumer data access, and all the other use case benefits are captured for “free”.
  3. De-risk exploration: The market is shifting, and data can increasingly be tested at scale for negligible fees. Work with a specialist platform that allows this (obviously we’re biased). Recognize, though, that if you go wide enough, valuable data will almost always be found, so you’ll need to start thinking early about how to operationalize it. Data vendors will be very open to strategic partnering when there is a concrete sequence of steps to commercial use and value sharing. Once value is understood, budget justification is far easier.
  4. Understand and defray internal costs: Enterprises are already investing in costly data access and usage. This involves information security and vendor diligence processes, as well as hugely valuable data scientist time that would be better spent on analytics. Map this out (an example is below for a build vs buy estimate for Demyst, which can be re-framed for any initiative).
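
One way to map out the internal costs in strategy 4 is a small build vs buy calculation. The sketch below is illustrative only: the cost lines, hours, rates, vendor count, and platform fee are hypothetical placeholders rather than figures from this article or from Demyst, and the names `CostLine`, `build_cost`, and `compare` are invented for the example. Swap in your own estimates.

```python
from dataclasses import dataclass


@dataclass
class CostLine:
    """One internal cost item incurred when onboarding a single data vendor."""
    name: str
    hours: float          # estimated staff hours (placeholder value)
    hourly_rate: float    # loaded cost per hour in dollars (placeholder value)
    fees: float = 0.0     # direct spend, e.g. legal or sample fees (placeholder)

    @property
    def total(self) -> float:
        return self.hours * self.hourly_rate + self.fees


def build_cost(lines, vendors):
    """In-house cost when every cost line is incurred once per vendor."""
    return sum(line.total for line in lines) * vendors


def compare(lines, vendors, platform_fee):
    """Return (build, buy, difference) for testing `vendors` data sources."""
    build = build_cost(lines, vendors)
    buy = platform_fee
    return build, buy, build - buy


# Hypothetical inputs: replace every number with your own estimate.
PER_VENDOR_LINES = [
    CostLine("Information security review", hours=40, hourly_rate=150),
    CostLine("Vendor diligence and contracting", hours=60, hourly_rate=150, fees=5_000),
    CostLine("Data scientist time on access and prep", hours=120, hourly_rate=125),
]

if __name__ == "__main__":
    build, buy, diff = compare(PER_VENDOR_LINES, vendors=5, platform_fee=75_000)
    print(f"Build: ${build:,.0f}  Buy: ${buy:,.0f}  Difference: ${diff:,.0f}")
```

Keeping each cost as its own line makes it easy to show a budget owner which internal processes (security review, diligence, analyst time) drive the total, and to re-run the comparison as the number of vendors under evaluation changes.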

Looking for what to do next? Explore our data catalog or sign up for a new account! Feel free to reach out to us if you have any questions or feedback.

Mark Hookey

CEO and Founder