Introduction

Autolabel is a Python library to label, clean and enrich datasets with Large Language Models (LLMs).

🌟 (New!) Access RefuelLLM through Autolabel

You can access RefuelLLM, our recently announced LLM purpose-built for data labeling, through Autolabel (read more about it in this blog post). RefuelLLM is a Llama-v2-13b base model, instruction-tuned on over 2500 unique labeling tasks (5.24B tokens) spanning categories such as classification, entity resolution, matching, reading comprehension and information extraction. You can experiment with the model in the playground here.

Refuel Performance

You can request access to RefuelLLM here. Read the docs about using RefuelLLM in Autolabel here.

Features

  • Autolabel data for NLP tasks such as classification, question answering, named entity recognition, entity matching and more.
  • Seamlessly use commercial and open source LLMs from providers such as OpenAI, Anthropic, HuggingFace, Google and more.
  • Leverage research-proven LLM techniques to boost label quality, such as few-shot learning and chain-of-thought prompting.
  • Confidence estimation and explanations out of the box for every single output label.
  • Caching and state management to minimize costs and experimentation time.

Getting Started

You can get started with Autolabel by simply bringing the dataset you want to label, picking your favorite LLM and writing a few lines of code.

  • Installation and your first labeling task: Steps to install Autolabel and run sentiment analysis for movie reviews using OpenAI's gpt-3.5-turbo.
  • Classification tutorial: A deeper dive into how Autolabel can be used to detect toxic comments at 95%+ accuracy.
  • Command Line Interface: Learn how to use Autolabel's CLI to intuitively create configs from the command line.
  • Here are more examples with sample notebooks that show how Autolabel can be used for different NLP tasks.
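
To give a feel for what "a few lines of code" looks like, here is a minimal sketch of the sentiment analysis task mentioned above. The config keys and the `LabelingAgent` / `AutolabelDataset` calls follow the pattern shown in the Autolabel README at the time of writing, but treat them as assumptions and check the installation guide for the current API. The helper names `build_sentiment_config` and `label_dataset`, the task name, and the `label` column are illustrative only.

```python
import json


def build_sentiment_config(model_name: str = "gpt-3.5-turbo") -> dict:
    """Build an Autolabel-style config for movie review sentiment (illustrative)."""
    return {
        "task_name": "MovieSentimentReview",  # hypothetical task name
        "task_type": "classification",
        "model": {"provider": "openai", "name": model_name},
        # Assumes a CSV dataset whose ground-truth column is called "label".
        "dataset": {"label_column": "label", "delimiter": ","},
        "prompt": {
            "task_guidelines": (
                "You are an expert at analyzing the sentiment of movie reviews. "
                "Classify the provided review into one of the following labels: {labels}"
            ),
            "labels": ["positive", "negative", "neutral"],
            "example_template": "Input: {example}\nOutput: {label}",
        },
    }


def label_dataset(csv_path: str, config: dict):
    """Run a labeling job. Needs `autolabel` installed and an OpenAI API key set."""
    # Imported lazily so the config helper above works without Autolabel installed.
    from autolabel import LabelingAgent, AutolabelDataset

    agent = LabelingAgent(config)
    dataset = AutolabelDataset(csv_path, config=config)
    agent.plan(dataset)        # dry run: previews the prompt and estimated cost
    return agent.run(dataset)  # sends the rows to the LLM for labeling


config = build_sentiment_config()
print(json.dumps(config["model"]))
```

Calling `label_dataset("reviews.csv", config)` (with a hypothetical `reviews.csv`) would then run the labeling job. Keeping the config as a plain dictionary means it can also be saved as JSON and reused across runs.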

Resources

  • Discord: Join our Discord community for conversations on LLMs, Autolabel and so much more!
  • GitHub: Create an issue to report any bugs or give us a star on GitHub.
  • Contribute: Share your feedback or add new features, and help us improve Autolabel!