Overview
A task in Refuel is a sequence of steps that defines how you want to transform your data with LLMs. Tasks can range from simple to complex and may involve multiple steps executed in a chained sequence to achieve the desired transformation.
Defining your first task
Conceptually, a task is a sequence of steps executed in a specific order to transform your data with LLMs. Every task in Refuel has the following components:
- Task name: A name for your task.
- Context: An overview of the dataset you are working with and the problem you are trying to solve. This is used to guide the LLM to take on a specific role/persona.
- Field(s): One or more output fields that will be generated by the LLM. Conceptually, a field is a new column you’re adding to your dataset - the result of the data transformation.
- Models: Provider and model to use for the task.
- Advanced settings: Additional settings to configure task behavior.
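Putting the components above together, a task definition can be sketched as a configuration object. This is an illustrative sketch only; the key names and values below are hypothetical and do not reflect Refuel's exact API schema.

```python
# Illustrative task definition mirroring the components above:
# name, context, output fields, model, and advanced settings.
# Key names are hypothetical, not the actual Refuel API schema.
task = {
    "name": "product_categorization",
    "context": (
        "You are an e-commerce analyst. The dataset contains product "
        "titles and descriptions; classify each product by category."
    ),
    "fields": [
        {
            "name": "category",
            "type": "single_category_classification",
            "categories": ["Electronics", "Apparel", "Home", "Other"],
        }
    ],
    "model": {"provider": "OpenAI", "name": "GPT-4o"},
    "advanced": {"cache_llm_responses": True},
}
```

Each entry in `fields` becomes a new column in the output dataset, populated by the LLM for every input row.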
Here’s a quick video overview of how to define a new task:
Output fields
Refuel supports a variety of output field types, depending on the task you are trying to solve:
- Single category classification: The LLM classifies each input into exactly one of a set of predefined categories.
- Multi category classification: The LLM classifies each input into one or more of a set of predefined categories.
- Extraction: The LLM extracts attributes from the input and returns them as structured JSON conforming to a defined schema.
- Generation: The LLM generates free-form text output based on the input.
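As a concrete illustration of the extraction field type, here is a hypothetical attribute schema and a sample structured completion being parsed. Both the schema and the sample output are invented for illustration and are not tied to Refuel's exact wire format.

```python
import json

# Hypothetical extraction field: pull structured attributes out of a
# free-text product description. Schema and output are illustrative.
extraction_schema = {
    "type": "object",
    "properties": {
        "brand": {"type": "string"},
        "price_usd": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
}

# A structured completion matching the schema above.
raw_completion = '{"brand": "Acme", "price_usd": 19.99, "in_stock": true}'
attributes = json.loads(raw_completion)
print(attributes["brand"])  # -> Acme
```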
Enrichment fields
Enrichments allow you to configure external data sources from which relevant context can be fetched and supplied to the LLM when producing an output. This is critical for tasks where it is safer to rely on an external knowledge base (rather than the LLM’s internal knowledge) to ensure accuracy and freshness of outputs.
Refuel currently supports enrichments from the following sources:
- Web search
- Maps search
- Website scraping
- Extracting text from images
- Custom enrichments (beta) - see custom enrichments for more details.
You can add enrichments to your task by clicking on “Add field” and selecting the enrichment source you want to use. When defining an enrichment field, you will typically select the input columns used to fetch the enrichment data for each row and, optionally, any other required configuration.
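For illustration, a web search enrichment field might look like the sketch below. The key names and the example input columns (`company_name`, `city`) are hypothetical and stand in for columns from your own dataset; this is not Refuel's exact configuration schema.

```python
# Hypothetical web-search enrichment field. For each row, the values of
# the selected input columns are used to build the search query, and the
# fetched results are supplied to the LLM as extra context.
enrichment_field = {
    "name": "company_background",
    "source": "web_search",
    # Example dataset columns used to construct the query per row.
    "input_columns": ["company_name", "city"],
    # Optional source-specific configuration, e.g. how many results to fetch.
    "options": {"max_results": 3},
}
```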
Supported Models
Refuel supports LLMs from a variety of providers:
| Provider | Name |
|---|---|
| OpenAI | GPT-4 Turbo |
| OpenAI | GPT-4o |
| OpenAI | GPT-4o mini |
| OpenAI | GPT-4 |
| OpenAI | GPT-3.5 Turbo |
| Anthropic | Claude 3.5 (Sonnet) |
| Anthropic | Claude 3 (Opus) |
| Anthropic | Claude 3 (Haiku) |
| Google | Gemini 1.5 (Pro) |
| Google | Gemini 1.5 (Flash) |
| Google | Gemini 2.0 (Flash) |
| Mistral | Mistral Small |
| Mistral | Mistral Large |
| Refuel | Refuel LLM-2 |
| Refuel | Refuel LLM-2-small |
You can select the model you want to use for your task from the dropdown in the task editor. In addition to the base models, any LLMs that you have finetuned for a specific task will also be available in the same dropdown.
Advanced Settings
In addition to the model and field settings, you can configure task behavior, such as how LLM responses are cached and how provided feedback is used for few shot prompting, by navigating to the Advanced section in the task editor.
- Few shot prompting: Few shot prompting is a technique where you provide examples of the desired output to the LLM to help it understand the task better. You can update the number of examples to provide to the LLM in the Few shot learning section. Refuel will dynamically select the most relevant examples for each row to be processed.
- Caching LLM responses: By default, Refuel caches LLM responses. If the LLM is called with the exact same prompt (model configuration + task guidelines + input data values to be processed), the completion is served from the cache instead of calling the LLM. In practice this improves latency and reduces costs. You can disable this by setting the Cache LLM responses toggle to off.
- Beam search: Beam search is a technique where the LLM generates multiple completions in parallel and then selects the most likely one. This behavior is enabled by default, but you can disable it by setting the Beam search toggle to off.
- Confidence: By default, Refuel will calculate a confidence score for each row based on the LLM response. You can disable this by setting the Compute Confidence Scores toggle to off.
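The caching behavior described above can be sketched conceptually: the cache key is derived from the model configuration, the task guidelines, and the row's input values, so an identical prompt is served from the cache instead of re-calling the LLM. This is an illustrative sketch, not Refuel's implementation.

```python
import hashlib
import json

# Conceptual sketch of LLM response caching. The cache key covers the
# model configuration, the task guidelines, and the input data values,
# so only an exact repeat of all three is a cache hit.
_cache: dict = {}

def cache_key(model_config: dict, guidelines: str, row: dict) -> str:
    payload = json.dumps(
        {"model": model_config, "guidelines": guidelines, "row": row},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def complete(model_config: dict, guidelines: str, row: dict, call_llm):
    """Return a completion, serving from the cache when possible."""
    key = cache_key(model_config, guidelines, row)
    if key not in _cache:
        _cache[key] = call_llm(model_config, guidelines, row)
    return _cache[key]
```

A second call with the same model configuration, guidelines, and row returns the cached completion without invoking `call_llm` again; changing any of the three produces a new key and a fresh LLM call.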