`refuel` is available as a library on PyPI. The code is open source and available on GitHub.
Option | Is Required | Default Value | Comments |
---|---|---|---|
api_key | Yes | None | Used to authenticate all requests to the API server |
project | Yes | None | The name of the project you plan to use for the current session |
timeout | No | 60 | Timeout in seconds |
max_retries | No | 3 | Max number of retries for failed requests |
max_workers | No | Num CPUs (os.cpu_count()) | Max number of concurrent requests to the API server |
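The options above can be collected into a dict and passed to the client at initialization. A minimal sketch, assuming the package's `refuel.init` entry point; the API key value is a placeholder:

```python
# Sketch: initializing the Refuel client with the options described above.
# `refuel.init` is assumed; the api_key value is a placeholder.
options = {
    "api_key": "<YOUR_API_KEY>",  # required: authenticates all API requests
    "project": "my_project",      # required: project for the current session
    "timeout": 60,                # optional: request timeout in seconds
    "max_retries": 3,             # optional: max retries for failed requests
}
# import refuel
# refuel_client = refuel.init(**options)
```

If `max_workers` is omitted, the client defaults to the number of CPUs reported by `os.cpu_count()`.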
Option | Is Required | Default Value | Comments |
---|---|---|---|
file_path | Yes | - | Path to the data you wish to upload |
dataset_name | Yes | - | Unique name of the dataset being uploaded |
source | Yes | file | Place where the file resides |
wait_for_completion | No | False | Whether to poll for the completion of the dataset ingestion |
If you use the `uri` source, the `file_path` should be publicly accessible (e.g. an S3 presigned URL) for Refuel to process it.
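Putting the parameters from the table together, a hedged sketch of a dataset upload; the `upload_dataset` method name and the example file path are assumptions:

```python
# Sketch: uploading a dataset. Parameter names follow the table above;
# the client object, method name, and file path are illustrative.
upload_params = {
    "file_path": "data/reviews.csv",   # or a publicly accessible URL for the "uri" source
    "dataset_name": "product_reviews", # must be unique
    "source": "file",                  # use "uri" for a hosted file
    "wait_for_completion": True,       # poll until ingestion finishes
}
# refuel_client.upload_dataset(**upload_params)
```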
Option | Is Required | Default Value | Comments |
---|---|---|---|
dataset | Yes | - | Name of the dataset you want to query and retrieve items (rows) from |
max_items | No | 20 | Max number of rows you want to fetch |
offset | No | 0 | If set to a positive number N, the first N rows are skipped and the API returns `max_items` rows starting after them. |
You can use the `order_by` param to sort by any other columns in the dataset, or by the label or confidence score from a labeling task, in which case `field` can be either 'label' or 'confidence'. `order_by` accepts a list if you would like to sort by multiple columns (used in the case of ties). Some details about the keys for each dict in the `order_by` list:
Key | Is Required | Default Value | Description | Comments |
---|---|---|---|---|
field | Yes | - | The name of the column in the dataset to sort by | In addition to the columns in the dataset, the field can also be 'label' or 'confidence', if the task and subtask names are specified. |
direction | No | ASC | The direction that you would like to sort the specified column by | Should be ASC or DESC |
subtask | No | null | The name of the subtask for which you would like to sort by label or confidence | This should only be provided if the field is ‘label’ or ‘confidence’ and requires a task name to be specified in the function params. |
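Combining these keys, a sketch of a query that sorts by confidence and breaks ties on a dataset column; the `get_items` method name, task name, and column names are assumptions:

```python
# Sketch: fetching dataset rows sorted by labeling confidence.
# Method and column names are illustrative.
query = {
    "dataset": "product_reviews",
    "max_items": 50,
    "offset": 0,
    "order_by": [
        # sort by the task's confidence score, lowest first
        {"field": "confidence", "direction": "ASC", "subtask": "predicted_sentiment"},
        # break ties on a column from the dataset
        {"field": "created_at", "direction": "DESC"},
    ],
}
# items = refuel_client.get_items(**query, task="Sentiment Analysis")
```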
Consider a labeling task named `Sentiment Analysis` in your Refuel account, which has two subtasks (output fields):

(i) `predicted_sentiment`: the predicted sentiment

(ii) `explanation`: a one-sentence explanation of why the LLM output the predicted sentiment as Positive or Negative for the item.

Here are a few filters we can define for this task:
Filters can compare a field against a value or against another column in the dataset (e.g. `ground_truth_sentiment`). The supported filter operators are:

Operator | Description |
---|---|
`>` | Greater than |
`>=` | Greater than or equal to |
`=` | Equals |
`<>` | Not equal to |
`<` | Less than |
`<=` | Less than or equal to |
`IS NULL` | True if field is undefined |
`IS NOT NULL` | True if field is defined |
`LIKE` | String matching: True if value is in field |
`ILIKE` | String matching (case insensitive): True if value is in field |
`NOT LIKE` | String does not match: True if value is not in field |
`NOT ILIKE` | String does not match (case insensitive): True if value is not in field |
The following parameters can be passed to the `create_task` function:
Parameter | Is Required | Default Value | Comments |
---|---|---|---|
task | Yes | None | Name of the new task you’re creating |
dataset | Yes | None | Dataset (in Refuel) for which you are defining this task |
context | Yes | None | Context is a high-level description of the problem domain and the dataset that the LLM will be working with. It typically starts with something like 'You are an expert at ...' |
fields | Yes | None | This is a list of dictionaries. Each entry in this list defines an output field generated in the task. See below for details about the schema of each field |
model | No | team default | LLM that will be used for this task. If not specified, we will use the default LLM set for your team, e.g. GPT-4 Turbo |
Schema for each dict in the `fields` list above:
Parameter | Is Required | Default Value | Comments |
---|---|---|---|
name | Yes | None | Name of the output field, e.g. llm_predicted_sentiment |
type | Yes | None | Type of output field. This is one of: `classification`, `multilabel_classification`, `attribute_extraction`, `webpage_transform`, `web_search` |
guidelines | Yes | None | Output guidelines for the LLM for this field. Note that if the field type is `web_search`, the guidelines are simply the query template |
labels | Yes (for classification field types) | None | List of valid labels. This field is only required for classification type tasks |
input_columns | Yes | None | Columns from the dataset to use as input when passing a “row” in the dataset to the LLM. |
ground_truth_column | No | None | A column in the dataset that contains ground truth value for this field, if one exists. Note this is an optional parameter. |
fallback_value | No | None | A fallback/default value that the LLM should generate for this field if a row cannot be processed successfully |
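Putting the two tables together, a hedged sketch of a `create_task` call with one classification field; task, dataset, and column names are illustrative, and the team-default model is used since `model` is omitted:

```python
# Sketch: creating a task with a single classification field.
# All names and guideline text below are illustrative.
task_params = {
    "task": "Sentiment Analysis",
    "dataset": "product_reviews",
    "context": "You are an expert at analyzing the sentiment of product reviews.",
    "fields": [
        {
            "name": "predicted_sentiment",
            "type": "classification",
            "guidelines": "Classify the sentiment of the review as Positive or Negative.",
            "labels": ["Positive", "Negative"],  # required for classification fields
            "input_columns": ["review_text"],
            "ground_truth_column": "ground_truth_sentiment",  # optional
            "fallback_value": "Negative",  # used if a row cannot be processed
        }
    ],
    # "model" omitted: the team default LLM is used
}
# refuel_client.create_task(**task_params)
```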
Provider | Name |
---|---|
OpenAI | GPT-4 Turbo |
OpenAI | GPT-4o |
OpenAI | GPT-4o mini |
OpenAI | GPT-4 |
OpenAI | GPT-3.5 Turbo |
Anthropic | Claude 3.5 (Sonnet) |
Anthropic | Claude 3 (Opus) |
Anthropic | Claude 3 (Haiku) |
Google | Gemini 1.5 (Pro) |
Mistral | Mistral Small |
Mistral | Mistral Large |
Refuel | Refuel LLM-2 |
Refuel | Refuel LLM-2-small |
If the `num_items` parameter is not specified, it will label the entire dataset.
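A sketch of kicking off a batch labeling run; the method name (`label` here) and exact parameters are assumptions based on this section:

```python
# Sketch: starting a batch labeling run over part of a dataset.
# The method name and parameter names are assumptions.
run_params = {
    "task": "Sentiment Analysis",
    "dataset": "product_reviews",
    "num_items": 100,  # omit to label the entire dataset
}
# task_run = refuel_client.label(**run_params)
```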
The returned `task_run` object has the following schema:
`status` is an enum showing the current task run status. It can be one of the following values:

  * `not_started`: this is the starting state before the batch run has been kicked off
  * `active`: a batch task run is ongoing
  * `paused`: the batch run was paused before the full dataset was labeled
  * `failed`: the batch run failed due to a platform error. This should ideally never happen
  * `completed`: the batch run was completed successfully

`metrics` is a list containing all metrics for the current task run. Currently the platform supports the following metrics:

  * `num_labeled`: number of rows from the dataset that have been labeled
  * `num_remaining`: number of rows from the dataset that are remaining
  * `time_elapsed_seconds`: time (in seconds) since the task run started. This is only populated when the task run is active (since this metric is not valid when there is no active run).
  * `time_remaining_seconds`: estimated time (in seconds) remaining for the task run to complete. This is only populated when the task run is active (since this metric is not valid when there is no active run).

`inputs` is a dictionary whose keys are the names of the input columns defined in the task. For example, consider an application for sentiment classification called `my_sentiment_classifier`, with two input fields, `source` and `text`. You can use it as follows:
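A hedged sketch of that call; how the application handle is obtained (`get_application` here) and the `label` method name are assumptions, while the input field names come from the example above:

```python
# Sketch: labeling inputs with the my_sentiment_classifier application.
# The client call shape is assumed; input keys match the fields above.
inputs = [
    {"source": "amazon", "text": "Great value for the price!"},
    {"source": "yelp", "text": "Service was slow and the food was cold."},
]
# app = refuel_client.get_application("my_sentiment_classifier")
# response = app.label(inputs)
```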
The `response` has the following schema: `refuel_output[i]` contains the output for `inputs[i]`, and `refuel_fields` is a list whose length is equal to the number of fields defined in the application. For example, if `my_sentiment_classifier` has just one field, `sentiment`, then `refuel_fields` will contain a single entry for it.

Set the `explain` parameter to `True` to get an explanation for why the provided label was returned. The explanation will be returned in the `explanation` field in the response, along with the `label` and `confidence`:
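The snippet below walks the schema described above with `explain=True`; the exact nesting and values are illustrative, not actual API output:

```python
# Illustrative response shape only (not real API output), following the
# schema described above, with `explanation` present because explain=True.
response = {
    "refuel_output": [
        {
            "refuel_uuid": "<uuid>",
            "refuel_fields": [
                {
                    "sentiment": {
                        "label": "Positive",
                        "confidence": 0.97,
                        "explanation": "The review praises the product's value.",
                    }
                }
            ],
        }
    ]
}
# refuel_output[0] corresponds to inputs[0]
first = response["refuel_output"][0]["refuel_fields"][0]["sentiment"]
print(first["label"], first["confidence"])  # prints: Positive 0.97
```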
You can also pass an `explain_fields` parameter listing the fields for which you want an explanation returned. If `explain_fields` is provided, explanations will be returned regardless of whether `explain` is set to `True` or `False`. Here's an example of how to get explanations for the `sentiment` field in the `my_sentiment_classifier` application:
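A minimal sketch of the keyword arguments involved; the call shape (`app.label`) is assumed, as elsewhere:

```python
# Sketch: requesting an explanation only for the `sentiment` field.
# Fields listed in explain_fields get explanations even with explain=False.
label_kwargs = {
    "explain": False,
    "explain_fields": ["sentiment"],
}
# response = app.label(inputs, **label_kwargs)
```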
Set the `telemetry` parameter to `True` to get additional info such as the model, provider, and number of tokens used (prompt, output, and total) in the request. The telemetry data will be returned in the `usage` field in the response.
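A sketch of enabling telemetry; the call shape is assumed, and the `usage` key names in the comment are illustrative rather than confirmed:

```python
# Sketch: requesting telemetry data with the labeling call.
label_kwargs = {"telemetry": True}
# response = app.label(inputs, **label_kwargs)
# The response's `usage` field would then include the model, provider,
# and prompt/output/total token counts (exact key names illustrative).
```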
You can call `alabel` with the exact same parameters as `label`. This will submit the inputs for labeling with Refuel and return the `refuel_uuid` needed to get the labeled item back.
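A sketch of driving the `alabel` coroutine with `asyncio` and collecting the returned identifiers; the response nesting (`refuel_output` entries carrying a `refuel_uuid`) follows this section, but the exact shape is assumed:

```python
# Sketch: async submission via alabel. The response nesting is assumed.
import asyncio

async def submit(app, inputs):
    """Submit inputs asynchronously and return their refuel_uuids."""
    response = await app.alabel(inputs)
    return [item["refuel_uuid"] for item in response["refuel_output"]]

# uuids = asyncio.run(submit(app, inputs))
```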
Use the `refuel_uuid` from the output to get the labeled item back with the `get_labeled_item` method. Here's an example using the `refuel_uuid` from the response above:
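A sketch of retrieving the labeled item; `get_labeled_item` is named in this section, but the polling loop, signature, and the assumption that an unfinished item returns `None` are illustrative:

```python
# Sketch: polling get_labeled_item with the refuel_uuid from alabel.
# The signature and None-while-pending behavior are assumptions.
import time

def wait_for_item(client, refuel_uuid, attempts=5, delay=1.0):
    """Fetch a labeled item, retrying until it is available."""
    for _ in range(attempts):
        item = client.get_labeled_item(refuel_uuid=refuel_uuid)
        if item is not None:
            return item
        time.sleep(delay)
    return None

# labeled = wait_for_item(refuel_client, refuel_uuid)
```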