List of Transformations

This document describes the input and output schema for every Transformation in the Catalog. For every output field, Refuel also produces a confidence score.

Staffing, Recruiting and HRTech

Resume Parsing

Input:

resume_link (str): Either a publicly readable URL, or a path to S3 or GCS that can be read by Refuel through our integration.

Output:

candidate_name (str): Name of the candidate.
contact_info (json): A JSON object containing any physical addresses, email addresses, phone numbers or web addresses (LinkedIn, GitHub, personal websites, etc.) for the candidate.
education (list(json)): A list of JSON objects, where each JSON contains the school, major, degree, start year, end year and other information about a specific educational degree for the candidate.
work_history (list(json)): A list of JSON objects, where each JSON contains the job title, company, start month, start year, end month, end year and description about a specific job held by the candidate.
skills (str): List of skills demonstrated by the candidate based on evidence in their resume.
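
To make the output schema above concrete, here is a sketch of what one parsed record might look like as a Python dictionary. Only the five top-level field names come from this document. The nested keys (such as `emails`, `school` or `job_title`), the comma-separated format of `skills`, the helper `missing_fields` and all values are illustrative assumptions.

```python
# Hypothetical example record. Top-level keys follow the schema above;
# nested keys and every value are assumptions made for illustration.
example_resume_output = {
    "candidate_name": "Jane Doe",
    "contact_info": {
        "emails": ["jane.doe@example.com"],
        "phone_numbers": ["+1 555 0100"],
        "web_addresses": ["https://example.com/janedoe"],
    },
    "education": [
        {
            "school": "Example University",
            "major": "Computer Science",
            "degree": "BS",
            "start_year": 2014,
            "end_year": 2018,
        }
    ],
    "work_history": [
        {
            "job_title": "Software Engineer",
            "company": "Example Corp",
            "start_month": 7,
            "start_year": 2018,
            "end_month": 3,
            "end_year": 2022,
            "description": "Built internal data pipelines.",
        }
    ],
    # The schema types skills as a string; the separator is an assumption.
    "skills": "Python, SQL, Data Analysis",
}

EXPECTED_FIELDS = {
    "candidate_name",
    "contact_info",
    "education",
    "work_history",
    "skills",
}


def missing_fields(record: dict) -> set:
    """Return the documented top-level fields absent from a record."""
    return EXPECTED_FIELDS - set(record)
```

A check like `missing_fields(record)` can be run before loading results into a downstream system, so that incomplete records are caught early.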

Job Description Parsing

Input:

text (str): Raw text from a job description.

Output:

company (str): The company or organization offering this job.
title (str): The job title for this job.
location (str): Location where the job is based.
pay (json): A JSON object containing the pay period (hourly, weekly, monthly, etc.), minimum and maximum amounts, bonuses and any other pay details.
skills (str): List of skills required by the job description.

Skills Extraction and Mapping

Input:

link (str): Either a publicly readable URL, or a path to S3 or GCS, for a resume, job description or other document from which skills need to be extracted.

Output:

skills (str): List of skills found in the document (for a resume, the skills demonstrated by the candidate based on evidence in their resume), mapped against a taxonomy.

Job Title Normalization

Input:

title (str): Job title to be normalized.

Output:

normalized_title (str): Job title as a string, with typos corrected, short forms expanded (e.g. Sr to Senior), and unnecessary modifiers or adjectives removed.
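
As a rough illustration of one of the steps described above (expanding short forms), here is a toy sketch. It is not how Refuel performs normalization. The lookup table `SHORT_FORMS` and the function `expand_short_forms` are hypothetical, and the sketch does not attempt typo correction or modifier removal.

```python
# Hypothetical abbreviation table, for illustration only.
SHORT_FORMS = {
    "sr": "Senior",
    "jr": "Junior",
    "mgr": "Manager",
    "eng": "Engineer",
}


def expand_short_forms(title: str) -> str:
    """Replace known abbreviations in a job title with their long forms."""
    words = []
    for word in title.split():
        # Ignore trailing punctuation and case when looking up a word.
        key = word.strip(".,").lower()
        words.append(SHORT_FORMS.get(key, word))
    return " ".join(words)


print(expand_short_forms("Sr. Software Eng"))  # Senior Software Engineer
```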

Job Title Seniority Classification

Input:

title (str): Job title.

Output:

seniority (str): The job title will be categorized against the following taxonomy:

Owner
Founder
C Suite
Partner
VP
Head
Director
Manager
Senior
Entry
Intern
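
Because `seniority` is drawn from a fixed taxonomy, downstream code can validate it against the list above. The sketch below assumes the returned string matches a taxonomy entry exactly, including capitalization. The names `SENIORITY_LEVELS` and `is_valid_seniority` are hypothetical.

```python
# Taxonomy values copied from the list above, in the same order.
SENIORITY_LEVELS = (
    "Owner",
    "Founder",
    "C Suite",
    "Partner",
    "VP",
    "Head",
    "Director",
    "Manager",
    "Senior",
    "Entry",
    "Intern",
)


def is_valid_seniority(value: str) -> bool:
    """Check that a seniority value is one of the documented categories."""
    return value in SENIORITY_LEVELS
```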

Sales data and SalesTech

Headquarters or Physical Address for a Business

Input:

business_name (str): Name of the business.

Output:

address (str): The physical address or headquarters for the business name supplied. If no physical address is found, “Not Found” is returned.

Revenue Estimate

Input:

business_name (str): Name of the business.
website (str): Business website (domain).
address (str): Complete address of the Business HQ.

Output:

revenue (str): The latest estimated revenue of the business. If a revenue number cannot be extracted, “Not Found” is returned.
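
Both `address` and `revenue` use the string "Not Found" as a sentinel when no value is available. One way to handle this downstream is to convert the sentinel to `None`, as in the sketch below. The helper name `value_or_none` is hypothetical, and the sketch assumes the sentinel is returned with exactly this spelling and plain quotes.

```python
from typing import Optional

NOT_FOUND = "Not Found"


def value_or_none(value: str) -> Optional[str]:
    """Map the documented "Not Found" sentinel to None; pass other values through."""
    return None if value.strip() == NOT_FOUND else value
```

Converting the sentinel early keeps it from being mistaken for a real address or revenue figure in later processing.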

Lead Scoring

Input:

business_description (str): Description of the business that is qualifying leads.
icp_description (str): Description of the ideal customer profile for the business.
customer_title (str): Job title of the lead at the lead’s company.
customer_company (str): Company of the lead.
customer_name (str): Name of the lead.

Output:

lead_score (str): A score between 0 and 100, indicating the likelihood of the lead being a good fit for the business.
lead_score_rationale (str): A rationale for the lead score, explaining the reasoning behind the score.
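
Since `lead_score` is returned as a string, it usually needs to be converted to a number before sorting or thresholding. The sketch below assumes the string holds a plain integer. If scores can be fractional, `float` would be the safer conversion. The function name `parse_lead_score` is hypothetical.

```python
def parse_lead_score(raw: str) -> int:
    """Convert a lead_score string to an int and check the documented 0-100 range."""
    score = int(raw.strip())  # assumes an integer-valued string
    if not 0 <= score <= 100:
        raise ValueError(f"lead_score out of range: {score}")
    return score
```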

Get Phone Numbers for business

Input:

business_name (str): Name of the business to extract the phone number for.
website (str): Website of the business to extract the phone number for.
address (str): Address of the business to extract the phone number for.

Output:

phone_number (str): Phone number of the business.

Domain Name Extraction

Input:

business_name (str): The name of the business for which the domain name needs to be extracted.
address (str): The address of the business for which the domain name needs to be extracted.

Output:

domain_name (str): The domain name/website of the business.

ICP Fit Classification

Input:

business_name (str): The name of the business looking for potential customers.
business_description (str): A text description of the business looking for potential customers, and of its offerings.
icp_description (str): A detailed description of the ideal customer profile (ICP) of the business looking for potential customers. This description is matched against information extracted about the potential customer.
customer_company (str): The name of the company that is being evaluated for potential fit with the business.
customer_website (str): The website of the company that is being evaluated for potential fit with the business.

Output:

icp_fit (str): How good a fit the customer is for the business, based on the ICP description. This will be one of the following values: High, Medium, Low.

SIC Classification

Input:

business_name (str): The name of the business for which SIC code needs to be found.
website (str): The website of the business.
address (str): The address of the business.

Output:

sic_code (str): The relevant SIC codes of the business. The full list of possible codes can be found here.

NAICS Industry Classification

Input:

business_name (str): The name of the business.
website (str): The website of the business.
address (str): The address of the business.

Output:

naics_sector (str): The NAICS sector of the business. The sector is the first two digits of the 6-digit NAICS code.
naics_industry (str): One or more 6-digit NAICS codes under which the business is categorized. If the business has multiple industries, the codes will be returned as a semicolon-separated list.

The full NAICS taxonomy can be found here.
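
The two NAICS rules stated above (multiple codes are semicolon-separated, and the sector is the first two digits of a 6-digit code) can be sketched as follows. The function names `split_naics_codes` and `sector_of` are hypothetical, and the tolerance for whitespace around the separators is an assumption.

```python
from typing import List


def split_naics_codes(naics_industry: str) -> List[str]:
    """Split a semicolon-separated naics_industry string into individual codes."""
    return [code.strip() for code in naics_industry.split(";") if code.strip()]


def sector_of(code: str) -> str:
    """Return the NAICS sector: the first two digits of a 6-digit code."""
    return code[:2]


codes = split_naics_codes("541511; 541512")
print(codes)                          # ['541511', '541512']
print([sector_of(c) for c in codes])  # ['54', '54']
```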

MCC Industry Classification

Input:

business_name (str): The name of the business.
website (str): The website of the business.
address (str): The address of the business.

Output:

mcc_categories (str): One or more Merchant Category Codes (MCC) under which the business is categorized. If the business has multiple MCC codes, the codes will be returned as a semicolon-separated list.

The full list of possible codes can be found here.

Number of Employees

Input:

business name (str): The name of the business.
website (str): The website of the business.
address (str): The address of the business.

Output:

Number of employees (str): The number of employees who work at the business.

Address Cleaning and Normalization

Input:

address (str): The unformatted address to be cleaned and normalized.

Output:

clean addresses (str): The clean address in a standard format.
