# Introduction

![Encoderfile](/files/k4f2Fx010HvRxhaH4CH5)

<p align="center"><strong>Deploy Encoder Transformers as self-contained, single-binary executables.</strong><br><br><a href="https://github.com/mozilla-ai/encoderfile"><img src="https://img.shields.io/github/v/release/mozilla-ai/encoderfile?style=flat-square" alt=""> </a><a href="https://github.com/mozilla-ai/encoderfile/blob/main/LICENSE"><img src="https://img.shields.io/github/license/mozilla-ai/encoderfile?style=flat-square" alt=""></a></p>

***

**Encoderfile** packages transformer encoders—and their classification heads—into a single, self-contained executable.

Replace fragile, multi-gigabyte Python containers with lean, auditable binaries that have **zero runtime dependencies**. Written in Rust and built on ONNX Runtime, Encoderfile ensures strict determinism and high performance for financial platforms, content moderation pipelines, and search infrastructure.

## Why Encoderfile?

While **Llamafile** focuses on generative models, **Encoderfile** is purpose-built for encoder architectures. It is designed for environments where compliance, latency, and determinism are non-negotiable.

* **Zero Dependencies:** No Python, no PyTorch, no network calls. Just a fast, portable binary.
* **Smaller Footprint:** Binaries are measured in megabytes, not the gigabytes required for standard container deployments.
* **Protocol Agnostic:** Runs as a REST API, gRPC microservice, CLI tool, or MCP Server out of the box.
* **Compliance-Friendly:** Deterministic and offline-safe, making it ideal for strict security boundaries.

> **Note for Windows users:** Pre-built binaries are not available for Windows. To build Encoderfile yourself, see the guide on [building from source](https://mozilla-ai.github.io/encoderfile/reference/building/).

## Use Cases

| Scenario            | Application                                                                      |
| ------------------- | -------------------------------------------------------------------------------- |
| **Microservices**   | Run as a standalone gRPC or REST service on localhost or in production.          |
| **AI Agents**       | Register as an MCP Server to give agents reliable classification tools.          |
| **Batch Jobs**      | Use the CLI mode (`infer`) to process text pipelines without spinning up servers. |
| **Edge Deployment** | Deploy sentiment analysis, NER, or embeddings anywhere without Docker or Python. |

## Supported Models

Encoderfile supports encoder-only transformers for:

* **Token Embeddings** - Clustering, embeddings (BERT, DistilBERT, RoBERTa)
* **Sequence Classification** - Sentiment analysis, topic classification
* **Token Classification** - Named Entity Recognition, PII detection
* **Sentence Embeddings** - Semantic search, clustering

To build the CLI tool yourself, see the guide on [building from source](https://mozilla-ai.github.io/encoderfile/reference/building/).

Generative models (GPT, T5) are not supported. See the [CLI Reference](/encoderfile/reference/cli.md) for complete model type details.

## Quick Start

### 1. Install CLI

Download the pre-built CLI tool:

```bash
curl -fsSL https://raw.githubusercontent.com/mozilla-ai/encoderfile/main/install.sh | sh
```

Or build from source (see [Building Guide](/encoderfile/reference/building.md)).

### 2. Export Model & Build

Export a HuggingFace model and build it into a binary:

```bash
# Export to ONNX
optimum-cli export onnx --model <model-id> --task <task> ./model

# Build the encoderfile
encoderfile build -f config.yml
```

See the [Building Guide](/encoderfile/reference/building.md) for detailed export options and configuration.
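The `<task>` placeholder in the export command depends on the kind of model being exported. As a rough sketch, the helper below assembles the same `optimum-cli export onnx` command from Python. The mapping from model types to Optimum task names is an assumption based on Optimum's standard task names, and `TASKS`, `export_command`, and `export_model` are hypothetical names used only for illustration; check the Building Guide for the values Encoderfile actually expects.

```python
import subprocess

# Assumed mapping from the model types listed under "Supported Models" to
# Optimum export task names. The keys are illustrative labels, not
# confirmed Encoderfile identifiers.
TASKS = {
    "embedding": "feature-extraction",
    "sequence_classification": "text-classification",
    "token_classification": "token-classification",
}


def export_command(model_id: str, model_type: str, out_dir: str = "./model") -> list[str]:
    """Build the `optimum-cli export onnx` argument list used in the export step."""
    task = TASKS[model_type]  # raises KeyError for an unknown model type
    return ["optimum-cli", "export", "onnx", "--model", model_id, "--task", task, out_dir]


def export_model(model_id: str, model_type: str, out_dir: str = "./model") -> None:
    """Run the export. Requires `optimum-cli` to be installed and on PATH."""
    subprocess.run(export_command(model_id, model_type, out_dir), check=True)
```

Keeping the command construction separate from the `subprocess.run` call makes it easy to print or inspect the exact command before running it.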

### 3. Run & Test

Start the server and make predictions:

```bash
# Start server
./build/sentiment-analyzer.encoderfile serve

# Make a prediction
curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"inputs": ["Your text here"]}'
```

See the [API Reference](/encoderfile/reference/api-reference.md) for complete endpoint documentation.
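For callers that prefer Python over `curl`, the request can be sketched with the standard library alone. The URL and the `{"inputs": [...]}` payload are taken from the curl example; `build_request` and `predict` are hypothetical helper names, and because the response schema is not described on this page, the decoded JSON is returned unchanged.

```python
import json
import urllib.request

# Taken from the curl example; adjust if the server listens elsewhere.
PREDICT_URL = "http://localhost:8080/predict"


def build_request(texts: list[str], url: str = PREDICT_URL) -> urllib.request.Request:
    """Build the same POST request as the curl example: a JSON body with an "inputs" list."""
    body = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(texts: list[str], url: str = PREDICT_URL, timeout: float = 10.0):
    """Send the request and return the decoded JSON response as-is."""
    with urllib.request.urlopen(build_request(texts, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server from this step running, `predict(["Your text here"])` sends the same payload as the curl command.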

**Next Steps:** Try the [Token Classification Cookbook](/encoderfile/cookbooks/token-classification-ner.md) for a complete walkthrough.

## How It Works

Encoderfile compiles your model into a self-contained binary by embedding ONNX weights, tokenizer, and config directly into Rust code. The result is a portable executable with zero runtime dependencies.

![Encoderfile architecture diagram illustrating the build process: compiling ONNX models, tokenizers, and configs into a single binary executable that runs as a zero-dependency gRPC, HTTP, or MCP server.](/files/XbazqhvDNG87oAApFqeV)

## Documentation

### Getting Started

* [**Installation & Setup**](/encoderfile/getting-started.md) - Complete setup guide from installation to first deployment
* [**Building Guide**](/encoderfile/reference/building.md) - Export models and configure builds

### Tutorials

* [**Token Classification (NER)**](/encoderfile/cookbooks/token-classification-ner.md) - Build a Named Entity Recognition system
* [**Transforms Guide**](/encoderfile/transforms/index.md) - Custom post-processing with Lua scripts

### Python Library

* [**Building with Python**](/encoderfile/python-library/building-with-python.md) - Build encoderfiles programmatically with the Python package
* [**Python API Reference**](/encoderfile/python-library/api-reference.md) - Complete reference for all classes and functions

### Reference

* [**CLI Reference**](/encoderfile/reference/cli.md) - Full documentation for `build`, `serve`, and `infer` commands
* [**API Reference**](/encoderfile/reference/api-reference.md) - REST, gRPC, and MCP endpoint specifications

## Community & Support

* [**GitHub Issues**](https://github.com/mozilla-ai/encoderfile/issues) - Report bugs or request features
* [**Contributing Guide**](/encoderfile/community/contributing.md) - Learn how to contribute
* [**Code of Conduct**](/encoderfile/community/code_of_conduct.md) - Community guidelines

{% hint style="info" %}
Standard builds of Encoderfile require glibc at runtime because of ONNX Runtime. See [this issue](https://github.com/mozilla-ai/encoderfile/issues/69) for progress on building Encoderfile for musl Linux.
{% endhint %}

