How to Increase Data Variety: Populating Demyst Catalog in third-party applications

Why Data Variety Matters

Data variety is essential for your internal data catalog: the broader the range of datasets you can draw on, the more accurate your data analysis can be. A varied catalog also helps your staff understand what information is currently available.

Demyst is your one-stop shop for data variety, and in this post, we will see how to populate data.world with our product catalog using our favorite tool, Python.

How to Get Data Variety in Your Internal Catalog

Before we get started, you will need to sign up for Demyst and data.world. It took me about a minute to sign up and update my profile on each site.

The first thing we are going to do is look at a sample of the product catalog available on Demyst, which is what will be populated into data.world. To do this, we use our Python API and call our, you guessed it, product_catalog() method. You can find more information on our catalog methods and data dictionaries by viewing our Demyst Analytics Python package here.
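
To give a feel for that preview step, here is a minimal sketch that prints the first few rows of a catalog as a text table. The real call needs Demyst credentials, so the rows below are hypothetical stand-ins, and the field names (`name`, `category`) are assumptions rather than the package's actual schema.

```python
# Hypothetical stand-in rows; in practice these would come from the
# product_catalog() call described above. Field names are assumptions.
SAMPLE_CATALOG = [
    {"name": "provider_a", "category": "firmographics"},
    {"name": "provider_b", "category": "identity"},
    {"name": "provider_c", "category": "property"},
]


def preview(rows, n=5):
    """Format the first n catalog rows as a plain-text table."""
    head = rows[:n]
    if not head:
        return ""
    columns = list(head[0].keys())
    # Each column is as wide as its longest value (or its own name).
    widths = {
        c: max([len(c)] + [len(str(r.get(c, ""))) for r in head])
        for c in columns
    }
    lines = ["  ".join(c.ljust(widths[c]) for c in columns).rstrip()]
    for r in head:
        lines.append(
            "  ".join(str(r.get(c, "")).ljust(widths[c]) for c in columns).rstrip()
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(preview(SAMPLE_CATALOG, n=2))
```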

Heading over to data.world, you will need to enable Python as a valid integration and create an API token. Just like with our Python API, installing the data.world package and configuring it to use the API token was super easy: each step is a one-liner.

We now create a dataset from a sample CSV file, which gives us the data.world URL for our dataset.
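
The original snippet for this step is not reproduced here, so the following is only a sketch of what the call might look like. It assumes the `datadotworld` package's REST client and its `create_dataset` method; the helper name, the default title, and the `my-username` owner are hypothetical. The client is passed in as an argument so the function can be tried without a live account.

```python
def create_catalog_dataset(client, owner_id, title="Demyst Product Catalog"):
    """Create a private dataset and return the client's response.

    `client` is assumed to behave like the object returned by
    datadotworld.api_client(); the article reports that the call
    gives back the dataset's data.world URL.
    """
    return client.create_dataset(owner_id, title=title, visibility="PRIVATE")


# Assumed usage, once the data.world package is installed and configured
# ("my-username" is a placeholder):
#   import datadotworld as dw
#   url = create_catalog_dataset(dw.api_client(), "my-username")
```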

We can now add each product from Demyst as a separate CSV file, or upload the entire product catalog as a single master CSV file in this dataset. These few lines can be turned into a script that a user runs once or periodically to add all the metadata from the Demyst product catalog to their own catalog.
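
Both options can be sketched with the standard library alone. The code below writes either one master CSV or one CSV per product, then hands the file paths to a data.world client. The `product_name` column and the function names are assumptions made for illustration, and `upload_files` is assumed to be the `datadotworld` client method for adding local files to a dataset.

```python
import csv
from pathlib import Path


def write_master_csv(rows, path):
    """Write all catalog rows (a list of dicts) to a single CSV file."""
    fieldnames = sorted({key for row in rows for key in row})
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return str(path)


def write_per_product_csvs(rows, out_dir, key="product_name"):
    """Write one CSV per product; `key` names the (assumed) product column."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[key], []).append(row)
    # File names come straight from the product name, so this assumes
    # names that are safe to use on the local file system.
    return [
        write_master_csv(group, Path(out_dir) / f"{name}.csv")
        for name, group in grouped.items()
    ]


def upload(client, dataset_key, paths):
    """Send local files to the dataset via the (assumed) upload_files call."""
    client.upload_files(dataset_key, list(paths))
```

Wrapping these calls in a small script is what makes the periodic refresh practical: regenerate the CSV files from the latest catalog, then upload them again under the same dataset key.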

Let’s see the dataset we just created.

We now have the complete catalog available to us as an external dataset. This lets us investigate the products in an environment we have already integrated with and share them across our organization. But let’s not stop at ingesting the catalog; let’s also create a project associated with this dataset.

Creating a project linked to our dataset gives us the opportunity to run queries and create insights that can be shared and discussed further. So let’s create a sample query that finds all the NAICS attributes and the Demyst products that provide them in their responses.
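
The query itself was not preserved in this post, so here is a sketch of the idea run locally against an in-memory SQLite table as a stand-in for the data.world query editor. The table name, the column names (`product_name`, `attribute_name`), and the sample rows are all hypothetical; the real catalog's schema may differ.

```python
import sqlite3

# Hypothetical (product, attribute) pairs standing in for the catalog.
SAMPLE_ROWS = [
    ("provider_a", "naics_code"),
    ("provider_a", "naics_description"),
    ("provider_b", "naics_code"),
    ("provider_c", "employee_count"),
]

# Match any attribute whose name mentions NAICS, regardless of case.
NAICS_QUERY = """
SELECT attribute_name, product_name
FROM catalog
WHERE LOWER(attribute_name) LIKE '%naics%'
ORDER BY attribute_name, product_name
"""


def find_naics(rows):
    """Return (attribute, product) pairs for every NAICS attribute."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE catalog (product_name TEXT, attribute_name TEXT)"
        )
        conn.executemany("INSERT INTO catalog VALUES (?, ?)", rows)
        return conn.execute(NAICS_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for attribute, product in find_naics(SAMPLE_ROWS):
        print(attribute, product)
```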

Lastly, let’s save that query as an insight in our project, with a visualization built in Chart Builder, which produces Vega-Lite visualizations. This helps us see how many products provide NAICS codes and how many of those provide a NAICS description. The visualization might be lite, but don’t let that inhibit you from building more advanced ones.
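
For readers who prefer to build the chart outside the UI, the sketch below tallies products per NAICS attribute and wraps the counts in a minimal Vega-Lite bar chart specification. The sample pairs are hypothetical, and whether Chart Builder accepts a spec in exactly this form is an assumption; the JSON can be checked in any Vega-Lite editor.

```python
import json
from collections import Counter

# Hypothetical (attribute, product) pairs, e.g. the output of a NAICS query.
PAIRS = [
    ("naics_code", "provider_a"),
    ("naics_code", "provider_b"),
    ("naics_description", "provider_a"),
]


def products_per_attribute(pairs):
    """Count distinct products for each attribute name."""
    return Counter(attribute for attribute, _product in set(pairs))


def bar_chart_spec(counts):
    """Build a minimal Vega-Lite bar chart spec from the counts."""
    values = [
        {"attribute": name, "products": total}
        for name, total in sorted(counts.items())
    ]
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": values},
        "mark": "bar",
        "encoding": {
            "x": {"field": "attribute", "type": "nominal"},
            "y": {"field": "products", "type": "quantitative"},
        },
    }


if __name__ == "__main__":
    print(json.dumps(bar_chart_spec(products_per_attribute(PAIRS)), indent=2))
```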

That was pretty simple, wasn’t it? You can rerun those few lines to update your dataset periodically and build complex insights across hundreds of attributes from Demyst. You can try these steps on our hosted notebook and contact us at client@demystdata.com to let us know how it went.

Harshit Singh

