API Reference
We are going to walk through all of the available methods in the Demyst Analytics Python package. This will give you a broad overview of the features and capabilities of the package.
Analytics
The Analytics class drives all of the methods that help you access external data. Generally, you want to instantiate a separate Analytics object for each data study.
Examples
Username & Password Authentication
The best way to get started is to let the toolkit prompt you for your username and password.
from demyst.analytics import Analytics
# If you don't pass in any parameters, you will be prompted
# for username and password.
analytics = Analytics()
Key-based Authorization
For non-interactive scripts, use the key parameter to pass in your API key.
from demyst.analytics import Analytics
# Pass in your API key with the key parameter.
analytics = Analytics(key="XXXXXXXXXXXXXXXXXXX")
More details on Analytics()
class Analytics(**kwargs)
Argument | Defaults | Notes |
---|---|---|
inputs | {} | Default input DataFrame to use |
region | "us" | Which of the global edges to use: us, sg, au |
username | None | If None provided, then prompted |
password | None | If None provided, then prompted |
sample_mode | True | Return test data, set to false for live mode |
config_file | None | Config file that stores these options |
key | None | For non-interactive use |
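For non-interactive scripts, one way to avoid hard-coding the key is to read it from the environment and build the keyword arguments listed in the table above. This is only a sketch: the environment variable name `DEMYST_API_KEY` and the helper `analytics_kwargs_from_env` are assumptions made for illustration and are not part of the package.

```python
import os


def analytics_kwargs_from_env(env=None, region="us", sample_mode=True):
    """Build keyword arguments for Analytics() from environment variables.

    DEMYST_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    kwargs = {"region": region, "sample_mode": sample_mode}
    key = env.get("DEMYST_API_KEY")
    if key:
        # Only pass `key` when one is set; otherwise Analytics() falls back
        # to prompting for username and password.
        kwargs["key"] = key
    return kwargs


kwargs = analytics_kwargs_from_env({"DEMYST_API_KEY": "XXXXXXXXXXXXXXXXXXX"})
print(kwargs)
# To use it: from demyst.analytics import Analytics; analytics = Analytics(**kwargs)
```

Keeping the key out of the script also makes it easier to share notebooks without leaking credentials.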
.input_files
Lists the sample input entities that are available. You can filter them or use them as-is for search or enrich. Use these input files to try out the methods in the package.
Examples
List and query sample input files
Listing the hosted input files and querying them on an attribute
from demyst.analytics import Analytics
analytics = Analytics()
# This will print the input files available
analytics.input_files()
# This will print a subset of input (in dataframe format)
analytics.input_file('us_business_entity', 100, {"post_code" : "94123"})
# The resulting dataframe looks like:
city post_code country street
0 San Francisco 94123 US 2953 Baker St
1 San Francisco 94123 US 1628 Union St
More details on analytics.input_file()
analytics.input_file(filename, row_limit=None, filters=None)
Argument | Defaults | Notes |
---|---|---|
filename | None | Required, sample input file name |
row_limit | None | Number of rows required in output |
filters | None | Attribute/Column header and value to filter on |
Results: Provides filtered input file that can be used for search or enrichment
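To make the `row_limit` and `filters` arguments concrete, here is a rough plain-Python sketch of the behavior they describe: keep the rows whose columns equal the given values, then cap the row count. It assumes exact-match filtering, which this reference does not confirm. The function `filter_rows` and the sample rows are hypothetical.

```python
def filter_rows(rows, row_limit=None, filters=None):
    """Keep rows matching every column/value pair in filters, up to row_limit rows."""
    filters = filters or {}
    matched = [
        row for row in rows
        if all(row.get(column) == value for column, value in filters.items())
    ]
    return matched if row_limit is None else matched[:row_limit]


# Made-up rows for illustration, shaped like the sample output above.
rows = [
    {"city": "San Francisco", "post_code": "94123", "country": "US"},
    {"city": "New York", "post_code": "10010", "country": "US"},
]
print(filter_rows(rows, row_limit=100, filters={"post_code": "94123"}))
```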
.sample_data
Returns sample data for a provider. You can also limit the number of rows and filter on column values.
Examples
Show sample data
This fetches some sample data for the google_latlon provider.
from demyst.analytics import Analytics
analytics = Analytics()
analytics.sample_data('google_latlon', 5, { "state": "CA" })
# The resulting dataframe looks like
street city state post_code country good
9 4910 Castana Ave Apt 8 California CA 90712 US 1
0 652 N. Marengo Ave. #202 California CA 91101 US 1
1 1303 W 168th St. Apt.9 California CA 90247 US 1
2 14666 Hiawatha St California CA 91345 US 1
3 9689 Saint George St California CA 91977 US 1
8 2739 1/2 E Monroe St Indiana CA 90810 US 1
More details on analytics.sample_data()
analytics.sample_data(provider, row_limit=None, filters=None)
Argument | Defaults | Notes |
---|---|---|
provider | None | Required, name of data provider |
row_limit | None | Number of rows required in output |
filters | None | Attribute/Column header and value to filter on |
Results: Returns sample data similar to what enrichment would return.
.validate
Checks whether the input dataframe's column names and values would be accepted by the Demyst system. You can run this as a quick preflight check before kicking off an enrichment job.
Examples
Validating CSVs
This example reads the inputs from a CSV file and validates them.
from demyst.analytics import Analytics
import pandas as pd
analytics = Analytics()
inputs = pd.read_csv('inputs.csv',
dtype = {'phone': object, 'post_code': object})
analytics.validate(inputs)
phone post_code
0 15555555555 10010
More details on analytics.validate()
analytics.validate(inputs, providers=None, notebook=True)
Argument | Defaults | Notes |
---|---|---|
inputs | None | Required, unless provided to Analytics() |
providers | [] | List of Data Products to validate against |
notebook | True | Produce HTML report, or Boolean if false |
Results: If `notebook` is true, returns an HTML object suitable for Jupyter notebook display. Otherwise returns a boolean indicating whether the validation succeeded.
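In a script, the boolean form of validate can gate an enrichment run. The sketch below assumes only the signatures documented in this reference. `enrich_if_valid` is a hypothetical helper, and `StubAnalytics` is a stand-in so the snippet runs without credentials; it is not the real Analytics class.

```python
def enrich_if_valid(analytics, providers, inputs):
    """Run enrichment only when the boolean preflight check passes."""
    # notebook=False makes validate() return a boolean instead of an HTML report.
    if not analytics.validate(inputs, providers=providers, notebook=False):
        return None
    # The inputs were just validated, so skip the built-in validation step.
    return analytics.enrich_and_download(providers, inputs, validate=False)


class StubAnalytics:
    """Stand-in for demonstration only; not the real Analytics class."""

    def validate(self, inputs, providers=None, notebook=True):
        return bool(inputs)

    def enrich_and_download(self, providers, inputs, validate=True):
        return {"providers": providers, "rows": len(inputs)}


inputs = [{"email_address": "foo@example.com"}]
print(enrich_if_valid(StubAnalytics(), ["domain_from_email"], inputs))
```

With the real package, pass an Analytics instance and a pandas dataframe instead of the stub and the list.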
.search
Looks for providers that are able to return data for the provided inputs. Use this when you have some data and want to see which of our data providers might be able to use it. The column headers of the input data must be Demyst types. You can also do an unguided search of products using this method.
Examples
Searching providers
This example searches for providers that accept the columns of a CSV file, then searches by category.
from demyst.analytics import Analytics
import pandas as pd
analytics = Analytics()
inputs = pd.read_csv('inputs.csv', dtype = {'phone': object,
'post_code': object})
analytics.search(inputs)
# This will output a nicely-formatted list of providers to the notebook
analytics.search("business")
# This will output providers for business category
More details on analytics.search()
Argument | Defaults | Notes |
---|---|---|
inputs | None | Required, unless provided to Analytics() |
tags | None | List of tags to search for |
view | html | Set to "json" to produce JSON output or "dataframe" for a table of products |
strict | False | If true, only return providers for which all inputs are present |
Results: If view=json, returns a list of result objects, otherwise returns an HTML object suitable for Jupyter notebook display
.attribute_search
Looks for data providers that return the given attribute. If you need a certain attribute and want to know which providers supply it, use attribute_search. It lists all providers that include that attribute in their response.
Examples
Searching for an Attribute
In this example, we look up which providers can supply the NAICS (North American Industry Classification System) code for a business.
from demyst.analytics import Analytics
analytics = Analytics()
analytics.attribute_search(name="naics")
# This will print a list (in dataframe format) of the providers
# The resulting dataframe containing providers and attribute names looks like:
attribute provider
0 naics_codes experian_business_facts
1 primary_naics equifax_austin_tetra_details
.enrich_and_download
Augments your input data with results from our data providers. This is the main entry point to the Demyst Data platform. The enrich_and_download method is actually a convenience wrapper around the more primitive functionality provided by enrich, enrich_wait, and enrich_download. We recommend that you use enrich_and_download to get started, and switch to those other methods later, e.g. when you have lots of data to process.
Examples
Enriching an input dataframe
This example uses enrich_and_download to augment an input dataframe containing some email addresses with our built-in domain_from_email data provider that simply splits the addresses into username and hostname, and returns them right back.
from demyst.analytics import Analytics
import pandas as pd
analytics = Analytics()
inputs = pd.DataFrame.from_dict([
{ "email_address": "foo@example.com" },
{ "email_address": "test@test.com" }
])
# Here we only use a single data provider, but you can pass in
# any number of data provider names to use.
results = analytics.enrich_and_download(["domain_from_email"], inputs)
print(results)
The resulting dataframe looks like this:
inputs.email_address domain_from_email.row_id domain_from_email.client_id \
0 foo@example.com 0
1 test@test.com 1
domain_from_email.host domain_from_email.user domain_from_email.error
0 example.com foo
1 test.com test
Note that your input column email_address was mirrored back in a prefixed form as inputs.email_address. The columns starting with domain_from_email were added by the data provider. While this example is somewhat contrived, it shows the basic workings of enrichment: you pass in a dataframe and the names of some providers to use, and get back a dataframe containing additional data from the providers.
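Because every output column carries its source as a prefix, you can group the columns by where they came from. The following is a minimal sketch using the column names from the output above; `group_columns_by_source` is a hypothetical helper, not part of the package.

```python
from collections import defaultdict


def group_columns_by_source(columns):
    """Map each prefix ('inputs' or a provider name) to its unprefixed column names."""
    grouped = defaultdict(list)
    for column in columns:
        # Split on the first dot only, so nested attribute names stay intact.
        source, _, attribute = column.partition(".")
        grouped[source].append(attribute)
    return dict(grouped)


columns = [
    "inputs.email_address",
    "domain_from_email.row_id",
    "domain_from_email.client_id",
    "domain_from_email.host",
    "domain_from_email.user",
    "domain_from_email.error",
]
print(group_columns_by_source(columns))
```

With a real result you would pass `results.columns` instead of the hard-coded list.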
To enrich through a channel instead of a list of providers, which allows deeper use of our config-driven Data API capabilities, pass the integer channel ID in place of the list. The channel ID is located to the right of the channel name on the platform.
The following example uses channel 6508, which is configured with 4 providers: experian_business_search_api, hosted_equifax_mds, hosted_experian_cpdb and hosted_experian_crdb_2. For brevity, not all columns are shown. As in the example above, columns prefixed with a data provider's name were added by that provider.
from demyst.analytics import Analytics
import pandas as pd
analytics = Analytics()
inputs = pd.DataFrame({
'business_name': ['Strava Inc', 'Demystdata'],
'street': ['208 Utah St', '28 W 25th St' ],
'city': ['San Francisco', 'New York'],
'state': ['CA', 'NY'],
'post_code': ['93104', '10010'],
'country': ['US', 'US'],
})
# Now instead of a list of data providers, pass the channel_id parameter.
results = analytics.enrich_and_download(6508, inputs)
# The resulting dataframe looks like this:
inputs.business_name inputs.city inputs.country inputs.post_code inputs.state inputs.street \
0 Strava Inc San Francisco US 93104 CA 208 Utah St
1 Demystdata New York US 10010 NY 28 W 25th St
experian_business_search_api.is_hit ... hosted_experian_cpdb.year_business_started ... hosted_experian_crdb_2.annual_sales_size_code ... \
0 True 2007 A
1 True 2010 B
More details on analytics.enrich_and_download()
analytics.enrich_and_download(providers_or_channel_id, inputs, validate=True, all_updates=False, hosted_input=None)
Argument | Defaults | Notes |
---|---|---|
providers_or_channel_id | [] or Integer | List of provider names to query, or Channel ID Integer |
inputs | None | Inputs to pass to providers |
validate | True | Perform validation before enrichment |
all_updates | False | Include historical data in results |
hosted_input | None | Use a sample input file instead of the provided inputs |
Results: Returns the enriched dataframe.
.enrich
enrich is the lower-level (compared to enrich_and_download) workhorse that lets you kick off an enrichment job asynchronously. It immediately returns a job ID, which you can use with our other methods:
- Manually check the status of the job with enrich_status.
- Wait for the job to finish with enrich_wait.
- Download the results with enrich_download. You can even download partial results while the job is still running.
Use enrich for long-running jobs with real data. If you're just getting started, we recommend using enrich_and_download, which runs synchronously and does all of that for you.
Examples
Manual control over enrichment
We're re-using the example from enrich_and_download, but use `enrich` which doesn't block the notebook and thus allows us to keep working while the enrichment is in progress.
from demyst.analytics import Analytics
import pandas as pd
analytics = Analytics()
inputs = pd.DataFrame.from_dict([
{ "email_address": "foo@example.com" },
{ "email_address": "test@test.com" }
])
# This kicks off the job... once it prints the job ID you can continue working.
# List of providers or channel ID integer value
job_id = analytics.enrich(["domain_from_email"], inputs)
# If you want to inquire about the status of the job, do the following.
# This will print some status information and return true if the job is finished.
finished = analytics.enrich_status(job_id)
# You can also wait for the job to finish:
analytics.enrich_wait(job_id)
# Now we're ready to download the data:
results = analytics.enrich_download(job_id)
More details on analytics.enrich()
analytics.enrich(providers_or_channel_id, inputs, validate=True, all_updates=False, hosted_input=None)
Argument | Defaults | Notes |
---|---|---|
providers_or_channel_id | [] or Integer | List of provider names to query, or Channel ID Integer |
inputs | None | Inputs to pass to providers |
validate | True | Perform validation before enrichment |
all_updates | False | Include historical data in results |
hosted_input | None | Use a sample input file instead of the provided inputs |
Results: Returns the ID of the started enrichment job.
.enrich_status
enrich_status returns true if an enrichment job created with enrich is complete, false if it's still running. It also prints some information about job progress.
Examples
See the example for enrich.
More details on analytics.enrich_status()
analytics.enrich_status(id)
Argument | Defaults | Notes |
---|---|---|
id | None | Job ID from enrich() |
Results: Returns true if the job is complete, false if it's still running.
.enrich_wait
enrich_wait blocks until an enrichment job created with enrich is complete. It's similar to calling enrich_status in a loop until the job finishes.
Examples
See the example for enrich.
More details on analytics.enrich_wait()
analytics.enrich_wait(id)
Argument | Defaults | Notes |
---|---|---|
id | None | Job ID from enrich() |
Results: None.
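If you want a wait that gives up after a while, you can write the polling loop yourself on top of enrich_status. This is a sketch under assumptions: `wait_for_job`, its timeout, and its poll interval are hypothetical and not part of the package, and `StubAnalytics` is a stand-in so the snippet runs without a real job.

```python
import time


def wait_for_job(analytics, job_id, timeout_s=600.0, poll_s=5.0):
    """Poll enrich_status until the job finishes or timeout_s elapses.

    Returns True if the job completed, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if analytics.enrich_status(job_id):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


class StubAnalytics:
    """Stand-in that reports completion on the third status check."""

    def __init__(self):
        self.checks = 0

    def enrich_status(self, job_id):
        self.checks += 1
        return self.checks >= 3


print(wait_for_job(StubAnalytics(), job_id=1, poll_s=0.01))
```

Note that the real enrich_status also prints progress information on every call, so a short poll interval produces a lot of output.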
.enrich_download
enrich_download downloads the augmented data of an enrichment job created with enrich and returns the resulting dataframe. By default, enrich_download will wait until the results are complete, but it also lets you download partial results while the job is still running. To do this, pass block_until_complete=False to enrich_download.
Examples
Manual control over enrichment
We're re-using the example from enrich_and_download, but use `enrich` which doesn't block the notebook and thus allows us to keep working while the enrichment is in progress. Once the enrichment is done, we use enrich_download to retrieve the results.
from demyst.analytics import Analytics
import pandas as pd
analytics = Analytics()
inputs = pd.DataFrame.from_dict([
{ "email_address": "foo@example.com" },
{ "email_address": "test@test.com" }
])
# This kicks off the job... once it prints the job ID you can continue working.
job_id = analytics.enrich(["domain_from_email"], inputs)
# If you want to inquire about the status of the job, do the following.
# This will print some status information and return true if the job is finished.
finished = analytics.enrich_status(job_id)
# You can also wait for the job to finish:
analytics.enrich_wait(job_id)
# Now we're ready to download the data:
results = analytics.enrich_download(job_id)
More details on analytics.enrich_download()
analytics.enrich_download(id, block_until_complete=True)
Argument | Defaults | Notes |
---|---|---|
id | None | Job ID from enrich() |
block_until_complete | True | Wait for all providers to finish if True, download partial results otherwise. |
Results: Returns the enriched dataframe.
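To peek at partial results while a job is still running, you can combine enrich_status with block_until_complete=False. This is a sketch: `download_snapshot` is a hypothetical helper and `StubAnalytics` is a stand-in so the snippet runs without a real job.

```python
def download_snapshot(analytics, job_id):
    """Return (results, is_final) without waiting for the job to finish."""
    # Check the status first: if the job was already finished before the
    # download started, the downloaded results are known to be complete.
    finished = analytics.enrich_status(job_id)
    results = analytics.enrich_download(job_id, block_until_complete=False)
    return results, finished


class StubAnalytics:
    """Stand-in for demonstration only; the job never finishes."""

    def enrich_status(self, job_id):
        return False

    def enrich_download(self, job_id, block_until_complete=True):
        return ["row 1"] if not block_until_complete else ["row 1", "row 2"]


results, is_final = download_snapshot(StubAnalytics(), job_id=1)
print(results, is_final)
```

Checking the status before downloading errs on the safe side: a snapshot may be reported as partial even if the job finished in between, but never the other way around.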
.enrich_with_hosted_inputs
Like enrich_and_download, but instead of enriching an input dataframe, it uses sample inputs (see input_files). Like with input_files, the sample input can be filtered and limited to a certain number of rows.
Examples
Enrich with sample input files
This example enriches the `us_business_entity` sample input file with the domain_from_email provider and returns the enriched result.
from demyst.analytics import Analytics
analytics = Analytics()
# This prints the enriched result using the `us_business_entity` inputs
analytics.enrich_with_hosted_inputs(['domain_from_email'], 'us_business_entity')
More details on analytics.enrich_with_hosted_inputs()
analytics.enrich_with_hosted_inputs(providers, hosted_input, row_limit=None, filters=None)
Argument | Defaults | Notes |
---|---|---|
providers | None | Required list of provider names |
hosted_input | None | Required sample input file name |
row_limit | None | Number of rows to use from sample input |
filters | None | Attribute/Column header and value to filter sample input on |
Results: Enriched dataframe
.enrich_download_to_disk
enrich_download_to_disk downloads the augmented data of an enrichment job created with enrich and saves it as a CSV file on disk. Use this instead of enrich_download if your outputs are very large.
Examples
Download enrichment to disk
We're re-using the example from enrich, which doesn't block the notebook and thus allows us to keep working while the enrichment is in progress.
from demyst.analytics import Analytics
import pandas as pd
analytics = Analytics()
inputs = pd.DataFrame.from_dict([
{ "email_address": "foo@example.com" },
{ "email_address": "test@test.com" }
])
# This kicks off the job... once it prints the job ID you can continue working.
job_id = analytics.enrich(["domain_from_email"], inputs)
# downloads the enriched dataset to output.csv on your disk
analytics.enrich_download_to_disk(job_id, "output.csv")
More details on analytics.enrich_download_to_disk()
analytics.enrich_download_to_disk(id, file_path, overwrite=False, block_until_complete=True)
Argument | Defaults | Notes |
---|---|---|
id | None | Job ID from enrich() |
file_path | None | Path of output CSV file. |
overwrite | False | If true, overwrites the output file if it exists. If false, aborts if file exists. |
block_until_complete | True | Wait for all providers to finish if True, download partial results otherwise. |
.enrich_credits
enrich_credits prints information about the cost of running an enrichment. Use this to see how many credits a job would take before running it. It has the same parameters as enrich.
Examples
Getting credit information
Here we're re-using the example from enrich, but instead of actually running the job, we just print how many credits it would take.
from demyst.analytics import Analytics
import pandas as pd
analytics = Analytics()
inputs = pd.DataFrame.from_dict([
{ "email_address": "foo@example.com" },
{ "email_address": "test@test.com" }
])
# Don't actually run the job, just print how many credits it would take.
print(analytics.enrich_credits(["domain_from_email"], inputs))
More details on analytics.enrich_credits()
analytics.enrich_credits(providers, inputs, validate=True)
Argument | Defaults | Notes |
---|---|---|
providers | [] | List of provider names to query |
inputs | None | Inputs to pass to providers |
validate | True | Perform validation before enrichment |
Results: Returns the number of credits that running the job would cost.
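A common use of enrich_credits is to guard a job with a budget. The sketch below assumes the return value can be compared as a number, as the Results line suggests. `enrich_within_budget` is a hypothetical helper, and `StubAnalytics` is a stand-in with made-up costs and job IDs.

```python
def enrich_within_budget(analytics, providers, inputs, max_credits):
    """Start an enrichment job only if its estimated cost fits the budget."""
    cost = analytics.enrich_credits(providers, inputs)
    if cost > max_credits:
        raise ValueError(f"Job would cost {cost} credits, budget is {max_credits}")
    return analytics.enrich(providers, inputs)


class StubAnalytics:
    """Stand-in for demonstration only; charges one credit per input row."""

    def enrich_credits(self, providers, inputs, validate=True):
        return len(inputs)

    def enrich(self, providers, inputs, validate=True):
        return 12345  # made-up job ID


inputs = [{"email_address": "foo@example.com"}, {"email_address": "test@test.com"}]
print(enrich_within_budget(StubAnalytics(), ["domain_from_email"], inputs, max_credits=10))
```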
.products
products returns information about each of our data providers as a dataframe.
Examples
Listing data providers
This example shows how to list all or some data providers.
from demyst.analytics import Analytics
a = Analytics()
# You can either get information about all providers...
a.products()
# ...or some providers, by specifying their names:
a.products(["domain_from_email", "email_age"])
More details on analytics.products()
analytics.products(product_names)
Argument | Defaults | Notes |
---|---|---|
provider_names | [] | A list of product names to return. |
Results: Returns a dataframe with information about data providers.
.product_catalog
product_catalog returns information about the inputs and outputs of a data provider as a dataframe. You can also get this information for all of our data providers.
Examples
Getting information about data providers
This example shows how to get input and output information for some or all data providers.
from demyst.analytics import Analytics
a = Analytics()
# Call it like this to get info about particular providers...
a.product_catalog(["domain_from_email", "email_age"])
# ...or like this to get info about all providers:
a.product_catalog(all_products=True)
More details on analytics.product_catalog()
analytics.product_catalog(provider_names=[], all_products=False)
Argument | Defaults | Notes |
---|---|---|
provider_names | [] | A list of product names to return. |
all_products | False | Set to true if you want info about all available products. |
Results: Returns information about inputs and outputs of providers as a dataframe.
.product_inputs
product_inputs is like product_catalog, but returns only the inputs of data providers.
Examples
Getting information about data providers
This example shows how to get information about the inputs of some data providers.
from demyst.analytics import Analytics
a = Analytics()
a.product_inputs(["domain_from_email", "email_age"])
More details on analytics.product_inputs()
analytics.product_inputs(provider_names=[], all_products=False)
Argument | Defaults | Notes |
---|---|---|
provider_names | [] | A list of product names to return. |
all_products | False | Set to true if you want info about all available products. |
Results: Returns information about inputs of providers as a dataframe.
.product_outputs
product_outputs is like product_catalog, but returns only the outputs of data providers.
Examples
Getting information about data providers
This example shows how to get information about the outputs of some data providers.
from demyst.analytics import Analytics
a = Analytics()
a.product_outputs(["domain_from_email", "email_age"])
More details on analytics.product_outputs()
analytics.product_outputs(provider_names=[], all_products=False)
Argument | Defaults | Notes |
---|---|---|
provider_names | [] | A list of product names to return. |
all_products | False | Set to true if you want info about all available products. |
Results: Returns information about outputs of providers as a dataframe.
.product_stats
product_stats accepts a list of data products and returns a dataframe of performance metrics and metadata for each of those products' fields.
Examples
Getting performance statistics for three products
This example shows how to get product stats on each output field for dnb_find_company, housecanary_property_details, and infutor_property_append.
from demyst.analytics import Analytics
analytics = Analytics()
providers = ["dnb_find_company", "housecanary_property_details", "infutor_property_append"]
stats = analytics.product_stats(providers)
print(stats)
# Alternatively you can return results for all providers:
analytics.product_stats(all_products=True)
The resulting dataframe looks like the following:
input_entity stats_updated_on product \
0 us_business_entity 2019-10-04 07:06:37 dnb_find_company
1 us_business_entity 2019-10-04 07:06:37 dnb_find_company
2 us_business_entity 2019-10-04 06:22:46 dnb_find_company
3 us_business_entity 2019-10-04 08:44:23 dnb_find_company
4 us_business_entity 2019-10-04 02:43:14 dnb_find_company
product_error_rate product_match_rate \
0 0.0 0.540441
1 0.0 0.521989
2 0.0 1.000000
3 0.0 1.000000
4 0.0 1.000000
attribute_flattened_name \
0 find_company_response_detail.candidate_matched...
1 find_company_response_detail.candidate_returne...
2 find_company_response_detail.find_candidate[0]...
3 find_company_response_detail.find_candidate[0]...
4 find_company_response_detail.find_candidate[0]...
attribute_fill_rate \
0 0.540441
1 0.521989
2 1.000000
3 1.000000
4 1.000000
attribute_consistency_rate attribute_unique_values \
0 0.494737 92
1 0.494737 23
2 0.000000 99
3 0.000000 1
4 0.960000 261
most_common_values \
0 {'1': 117, '3': 14, '2': 25, '4': 12}
1 {'1': 117, '25': 53, '3': 14, '2': 25, '4': 12}
2 {'10': 18, '1': 14, '3': 10, '2': 23, '5': 15,...
3 {'1': 267}
4 {}
std median \
0 187614.338032 323.0
1 7.353355 12.0
2 656121.473294 240.0
3 0.000000 1.0
4 NaN NaN
mean max_value variance \
0 53636.032609 950391.0 3.519914e+10
1 12.565217 25.0 5.407183e+01
2 96321.010101 6540000.0 4.304954e+11
3 1.000000 1.0 0.000000e+00
4 NaN NaN NaN
min_value attribute_onboarded_date \
0 1.0 2018-07-26T20:01:22.000Z
1 1.0 2018-07-26T20:01:22.000Z
2 1.0 2018-07-26T20:01:22.000Z
3 1.0 2018-07-26T20:01:22.000Z
4 NaN 2018-07-26T20:01:22.000Z
attribute_audited_date attribute_pii \
0 2019-10-04 07:06:37 0
1 2019-10-04 07:06:37 0
2 2019-10-04 06:22:46 0
3 2019-10-04 08:44:23 0
4 2019-10-04 02:43:14 0
attribute_use_case
0 Address Verification, Business Contact, Busine...
1 Address Verification, Business Contact, Busine...
2 Address Verification, Business Contact, Busine...
3 Address Verification, Business Contact, Busine...
4 Address Verification, Business Contact, Busine...
More details on analytics.product_stats()
analytics.product_stats(providers)
Argument | Defaults | Notes |
---|---|---|
provider_names | [] | List of provider names to view stats. |
Results: Returns the performance data and metadata of products' fields.
.report
report accepts an input dataframe and the response dataframe from the enrich methods, and returns statistics at the product and attribute level. Each row describes one response attribute, including its type, fill rate, and number of unique values. At the product level, the report includes the match rate.
Examples
Getting statistics from the enriched data
This example shows how to get stats on each attribute for enriched data from seon_email and neutrino_email_verify.
from demyst.analytics import Analytics
import pandas as pd
analytics = Analytics()
inputs = pd.DataFrame.from_dict([
{ "email_address": "foo@example.com" },
{ "email_address": "test@test.com" }
])
providers = ["seon_email", "neutrino_email_verify"]
result = analytics.enrich_and_download(providers, inputs)
stats = analytics.report(result)
print(stats.head(5))
The resulting dataframe looks like the following:
product_name product_match_rate attribute_name \
0 inputs 100.00 email_address
1 neutrino_email_verify 100.00 client_id
2 neutrino_email_verify 100.00 domain
3 neutrino_email_verify 100.00 domain_error
4 neutrino_email_verify 100.00 email_address
attribute_fill_rate attribute_type unique_values \
0 100.00 object 2
1 0.00 object 0
2 100.00 object 2
3 100.00 bool 1
4 100.00 object 2
most_common_values cardinality \
0 {"test@test.com": 1, "foo@example.com": 1} 100.00
1 {"": 2} nan
2 {"example.com": 1, "test.com": 1} 100.00
3 {"false": 2} 50.00
4 {"test@test.com": 1, "foo@example.com": 1} 100.00
std median mean max_value min_value variance
0 NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN
3 0.0 0.0 0.0 0.0 0.0 0.0
4 NaN NaN NaN NaN NaN NaN
Types
At the heart of the Demyst Platform is its type system. Types are associated with column names. For example, a column named post_code is expected to contain a postal code.
Data Type | Description | Example |
---|---|---|
blob | Base64-encoded binary data | RGVteXN0 |
business_name | The name of a company | Demyst Data Ltd. |
city | The name of a city | New York City |
country | A 2- or 3-character ISO 3166-1 code (https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2 or https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3) | US, AU, SG |
domain | An internet domain name | demyst.com |
email_address | An email address | support@demyst.com |
first_name | A first name | John |
full_name | A full name | John Doe |
gender | A gender or abbreviation | m, male, f, female |
ip4 | IP address (version 4) | 192.168.0.1 |
last_name | A last name | Smith |
latitude | Number between -90.0 and 90.0 | 40.7 |
longitude | Number between -180.0 and 180.0 | -73.9 |
marital_status | A marital status or abbreviation | m, married, s, single, ... |
middle_name | A middle name | Rupert |
number | A number. Supports integral and decimal numbers of arbitrary size and precision | 42 |
percentage | A number between 0.0 and 100.0 | 99%, 99 |
phone | Country dependent. For the US, must be 10 digits without a leading 1 or 11 digits with it, and the area code must be valid | 917-475-1881 |
post_code | For the US, a 5- or 9-digit postcode, with or without a separating dash. For other countries, must be non-empty | 10001 |
sic_code | A Standard Industrial Classification code. 4 digit character string | 2024 |
state | For the US, a valid 2-character state code or state name. Empty otherwise | NY, New York |
street | Non-empty. A street name | 100 Main St |
string | A character string | foo |
url | A Uniform Resource Locator. Starts with http: or https: | https://www.demyst.com |
us_ein | An Employer Identification Number. Dashes and spaces are stripped from the input; the result must be a 9-digit numeric string | 12-3456789 |
us_ssn | A Social Security Number. Dashes and spaces are stripped from the input; the result must be a 9-digit numeric string | 078-05-1120 |
us_ssn4 | The last four digits of a Social Security Number | 1120 |
year_month | A particular month of a year. In format yyyy-MM | 2019-01 |
year | A year | 2019 |
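Some of the rules in the table can be checked locally before you send data. The following is a rough sketch of a few of them, assuming US inputs. It is not the platform's validator (use validate for that), and the names `CHECKS` and `check` are hypothetical.

```python
import re


def _strip(value):
    """Remove dashes and spaces, as the table describes for us_ein and us_ssn."""
    return re.sub(r"[-\s]", "", value)


def _in_range(value, low, high):
    """True if value parses as a number between low and high, inclusive."""
    try:
        return low <= float(value) <= high
    except ValueError:
        return False


# One predicate per type; each takes the value as a string.
CHECKS = {
    "post_code": lambda v: re.fullmatch(r"\d{5}(-?\d{4})?", v) is not None,  # US only
    "us_ssn": lambda v: re.fullmatch(r"\d{9}", _strip(v)) is not None,
    "us_ein": lambda v: re.fullmatch(r"\d{9}", _strip(v)) is not None,
    "us_ssn4": lambda v: re.fullmatch(r"\d{4}", v) is not None,
    "sic_code": lambda v: re.fullmatch(r"\d{4}", v) is not None,
    "year_month": lambda v: re.fullmatch(r"\d{4}-(0[1-9]|1[0-2])", v) is not None,
    "latitude": lambda v: _in_range(v, -90.0, 90.0),
    "longitude": lambda v: _in_range(v, -180.0, 180.0),
}


def check(type_name, value):
    """Return True if value satisfies the sketched rule for type_name."""
    return CHECKS[type_name](str(value))


print(check("post_code", "10001"), check("year_month", "2019-13"))
```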