
# Building Guide

This guide explains how to build custom encoderfile binaries from HuggingFace transformer models.

## Prerequisites

Before building encoderfiles, ensure you have:

* [Python 3.13+](https://www.python.org/downloads/) - For exporting models to ONNX
* [uv](https://docs.astral.sh/uv/getting-started/installation/) - Python package manager

If you are compiling the encoderfile CLI from source, make sure you also have:

* [Rust](https://rust-lang.org/tools/install/) - For building the CLI tool and binaries
* [protoc](https://protobuf.dev/installation/) - Protocol Buffer compiler

To compile encoderfile's Python bindings, you also need [Maturin](https://www.maturin.rs/). See the [Maturin installation instructions](https://www.maturin.rs/installation.html).

### Installing Prerequisites

**macOS:**

```bash
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install protoc
brew install protobuf
```

**Linux:**

```bash
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install protoc (Ubuntu/Debian)
sudo apt-get install protobuf-compiler

# Install protoc (Fedora)
sudo dnf install protobuf-compiler
```

**Windows:**

```powershell
# Install Rust - Download rustup-init.exe from https://rustup.rs

# Install uv
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# Install protoc - Download from https://github.com/protocolbuffers/protobuf/releases
```
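
On macOS or Linux, a short check like the following can confirm that the tools listed above are on your `PATH` before you start. This is a minimal sketch: `missing_tools` is a hypothetical helper name, not part of encoderfile, and the script assumes the Python interpreter is installed as `python3`.

```bash
# Print the name of every given command that is not found on PATH.
# missing_tools is a hypothetical helper, not part of encoderfile.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools python3 uv cargo protoc)
if [ -n "$missing" ]; then
  # $missing is left unquoted so the names are joined on one line
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisites found"
fi
```

If you only use a pre-built CLI, `cargo` and `protoc` can be dropped from the list.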

## Development Setup

If you're contributing to encoderfile or modifying the source:

```bash
# Clone the repository
git clone https://github.com/mozilla-ai/encoderfile.git
cd encoderfile

# Set up the development environment
just setup
```

This will:

* Install Rust dependencies
* Create a Python virtual environment
* Download model weights for integration tests

## Building the CLI Tool

First, build the encoderfile CLI tool:

```bash
cargo build --bin encoderfile --release
```

The CLI binary will be created at `./target/release/encoderfile`.

Optionally, install it to your system:

```bash
cargo install --path encoderfile --bin encoderfile
```

## Step-by-Step: Building an Encoderfile

### Step 1: Prepare Your Model

You need a HuggingFace model with ONNX weights. You can either export a model or use one with existing ONNX weights.

#### Option A: Export a Model to ONNX

Use `optimum-cli` to export any HuggingFace model:

```bash
optimum-cli export onnx \
  --model <model_id> \
  --task <task_type> \
  <output_directory>
```

**Examples:**

**Embedding model:**

```bash
optimum-cli export onnx \
  --model sentence-transformers/all-MiniLM-L6-v2 \
  --task feature-extraction \
  ./models/embedder
```

**Sentiment classifier:**

```bash
optimum-cli export onnx \
  --model distilbert-base-uncased-finetuned-sst-2-english \
  --task text-classification \
  ./models/sentiment
```

**NER model:**

```bash
optimum-cli export onnx \
  --model dslim/bert-base-NER \
  --task token-classification \
  ./models/ner
```

**Available task types:**

* `feature-extraction` - For embedding models
* `text-classification` - For sequence classification
* `token-classification` - For token classification (NER, POS tagging, etc.)

See the [HuggingFace task guide](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model) for more options.

#### Option B: Use a Pre-Exported Model

Some models on HuggingFace already have ONNX weights:

```bash
git clone https://huggingface.co/optimum/distilbert-base-uncased-finetuned-sst-2-english
```

#### Verify Model Structure

Your model directory should contain:

```
my_model/
├── config.json          # Model configuration
├── model.onnx           # ONNX weights (required)
├── tokenizer.json       # Tokenizer (required)
├── special_tokens_map.json
├── tokenizer_config.json
└── vocab.txt
```
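
To check a directory against this layout before building, a sketch along these lines may help. The `check_model_dir` function is a hypothetical helper, and it treats `config.json`, `model.onnx`, and `tokenizer.json` as the required set, following the file list above and the Troubleshooting section.

```bash
# Report each required file that is absent from a model directory.
# Returns 0 when all required files exist, 1 otherwise.
# check_model_dir is a hypothetical helper, not part of encoderfile.
check_model_dir() {
  local dir="$1" status=0 file
  for file in config.json model.onnx tokenizer.json; do
    if [ ! -f "$dir/$file" ]; then
      echo "missing: $file"
      status=1
    fi
  done
  return "$status"
}

# Example: check_model_dir ./models/embedder && echo "model directory looks complete"
```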

### Step 2: Create Configuration File

Create a YAML configuration file (e.g., `config.yml`):

```yaml
encoderfile:
  # Model identifier (used in API responses)
  name: my-model

  # Model version (optional, defaults to "0.1.0")
  version: "1.0.0"

  # Path to model directory
  path: ./models/my-model

  # Model type: embedding, sequence_classification, or token_classification
  model_type: embedding

  # Output path (optional, defaults to ./<name>.encoderfile in current directory)
  output_path: ./build/my-model.encoderfile

  # Cache directory (optional, defaults to system cache)
  cache_dir: ~/.cache/encoderfile

  # Optional: Lua transform for post-processing
  # transform:
  #   path: ./transforms/normalize.lua
```

**Alternative: Specify file paths explicitly:**

```yaml
encoderfile:
  name: my-model
  model_type: embedding
  output_path: ./build/my-model.encoderfile
  path:
    model_config_path: ./models/config.json
    model_weights_path: ./models/model.onnx
    tokenizer_path: ./models/tokenizer.json
```

### Step 3: Build the Encoderfile

Build your encoderfile binary:

```bash
./target/release/encoderfile build -f config.yml
```

Or, if you installed the CLI:

```bash
encoderfile build -f config.yml
```

The build process will:

1. Detect your system platform and download the base runtime binary
2. Load and validate the configuration
3. Check for required model files
4. Validate the ONNX model structure
5. Format assets and append to the base binary
6. Output the binary to the specified path (or `./<name>.encoderfile` if not specified)

For more information on the encoderfile file format and build process, see [Encoderfile File Format](/encoderfile/reference/file_format.md).

**Build output:**

```
./build/my-model.encoderfile
```
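
In scripts, it can be useful to fail fast when the build does not produce the expected file. The sketch below wraps the `encoderfile build -f` command shown above. `build_and_verify` and the `ENCODERFILE_BIN` variable are hypothetical names introduced for illustration, and the output path you pass is assumed to match `output_path` in your config.

```bash
# Run the build, then confirm the expected artifact exists and is non-empty.
# build_and_verify and ENCODERFILE_BIN are hypothetical names; set
# ENCODERFILE_BIN to ./target/release/encoderfile if the CLI is not installed.
build_and_verify() {
  local config="$1" output="$2"
  if ! "${ENCODERFILE_BIN:-encoderfile}" build -f "$config"; then
    echo "build failed for $config" >&2
    return 1
  fi
  if [ ! -s "$output" ]; then
    echo "expected output not found: $output" >&2
    return 1
  fi
  echo "built $output"
}

# Example: build_and_verify config.yml ./build/my-model.encoderfile
```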

### Step 4: Test Your Encoderfile

Make the binary executable and test it:

```bash
chmod +x ./build/my-model.encoderfile

# Test with CLI inference
./build/my-model.encoderfile infer "Test input"

# Or start the server
./build/my-model.encoderfile serve
```

## Configuration Options

> For the complete set of configuration options, see the [CLI Reference](/encoderfile/reference/cli.md).

## Model Types

### Embedding Models

For models using `AutoModel` or `AutoModelForMaskedLM`:

```yaml
encoderfile:
  name: my-embedder
  path: ./models/embedding-model
  model_type: embedding
  output_path: ./build/my-embedder.encoderfile
```

**Examples:**

* `bert-base-uncased`
* `distilbert-base-uncased`
* `sentence-transformers/all-MiniLM-L6-v2`

### Sequence Classification Models

For models using `AutoModelForSequenceClassification`:

```yaml
encoderfile:
  name: my-classifier
  path: ./models/classifier-model
  model_type: sequence_classification
  output_path: ./build/my-classifier.encoderfile
```

**Examples:**

* `distilbert-base-uncased-finetuned-sst-2-english` (sentiment)
* `roberta-large-mnli` (natural language inference)
* `facebook/bart-large-mnli` (entailment)

### Token Classification Models

For models using `AutoModelForTokenClassification`:

```yaml
encoderfile:
  name: my-ner
  path: ./models/ner-model
  model_type: token_classification
  output_path: ./build/my-ner.encoderfile
```

**Examples:**

* `dslim/bert-base-NER`
* `bert-base-cased-finetuned-conll03-english`
* `dbmdz/bert-large-cased-finetuned-conll03-english`

## Advanced Features

### Cross-compilation

Specify a target architecture for your encoderfile by using the `--platform` argument:

```bash
encoderfile build -f encoderfile.yml --platform <target_triple>
```

Encoderfile releases pre-built base binaries for the following architectures:

* `x86_64-unknown-linux-gnu`
* `aarch64-unknown-linux-gnu`
* `x86_64-apple-darwin`
* `aarch64-apple-darwin`
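
When building for another machine, you can derive the matching triple from that machine's `uname -s` and `uname -m` output. The helper below is a sketch only: `detect_target` is a hypothetical name, and it covers just the four released targets listed above.

```bash
# Map an OS name and CPU architecture (as printed by `uname -s` and
# `uname -m`) to one of the released target triples.
# With no arguments, it inspects the current machine.
# detect_target is a hypothetical helper, not part of encoderfile.
detect_target() {
  local os="${1:-$(uname -s)}"
  local arch="${2:-$(uname -m)}"
  case "$arch" in
    x86_64|amd64) arch=x86_64 ;;
    aarch64|arm64) arch=aarch64 ;;
    *) echo "unsupported architecture: $arch" >&2; return 1 ;;
  esac
  case "$os" in
    Linux) echo "$arch-unknown-linux-gnu" ;;
    Darwin) echo "$arch-apple-darwin" ;;
    *) echo "unsupported OS: $os" >&2; return 1 ;;
  esac
}

# Example: encoderfile build -f encoderfile.yml --platform "$(detect_target Linux aarch64)"
```

Apple Silicon Macs report `arm64` from `uname -m`, which is why the helper normalizes it to `aarch64`.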

To use a base binary that you built locally, pass its path with `--base-binary-path`. For example:

```bash
# build encoderfile base binary from source (will be at ./target/release/encoderfile-runtime)
cargo build -p encoderfile-runtime --release

# create encoderfile
encoderfile build \
  -f encoderfile.yml \
  --base-binary-path ./target/release/encoderfile-runtime
```

### Lua Transforms

Add custom post-processing with Lua scripts:

```yaml
encoderfile:
  name: my-model
  path: ./models/my-model
  model_type: token_classification
  transform:
    path: ./transforms/softmax_logits.lua
```

**Inline transform:**

```yaml
encoderfile:
  name: my-model
  path: ./models/my-model
  model_type: embedding
  transform: "return lp_normalize(output)"
```

If the `lua_libs` property is absent, the `table`, `string`, and `math` libraries are enabled by default. Set `lua_libs` to a list of library names to load a different set, chosen from:

* `coroutine`
* `table`
* `io`
* `os`
* `string`
* `utf8`
* `math`
* `package`

Note that when `lua_libs` is present, no libraries are loaded by default, so the list must include every library the transform uses.

**Inline transform with custom libraries:**

```yaml
encoderfile:
  name: my-model
  path: ./models/my-model
  model_type: embedding
  lua_libs:
    - table
    - string
    - math
    - os
  transform: |
    t = os.time()
    return lp_normalize(output)
```

### Custom Cache Directory

Specify a custom cache location:

```yaml
encoderfile:
  name: my-model
  path: ./models/my-model
  model_type: embedding
  cache_dir: /tmp/encoderfile-cache
```

## Troubleshooting

### Error: "No such file: model.onnx"

**Solution:** Ensure your model directory contains ONNX weights.

```bash
# Export with optimum-cli
optimum-cli export onnx --model <model_id> --task <task> <output_dir>
```

### Error: "Could not locate model config at path"

**Solution:** The model directory is missing one or more required files (`config.json`, `tokenizer.json`, `model.onnx`). List its contents and add whichever file is absent.

```bash
# Check directory contents
ls -la ./path/to/model
```

### Error: "cargo build failed"

**Solution:** Check that Rust and dependencies are installed.

```bash
rustc --version
cargo --version
protoc --version
```

### Build is very slow

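**Solution:** The first build compiles many dependencies, so it takes the longest. Subsequent builds reuse the compiled artifacts and are faster. Debug builds compile more quickly but produce slower binaries, so use release mode for production:

```bash
# Debug build: quicker to compile, slower at runtime
cargo build --bin encoderfile

# Release build: slower to compile, optimized at runtime
cargo build --bin encoderfile --release
```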

## CI/CD Integration

### GitHub Actions Example

```yaml
name: Build Encoderfile

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Rust
        uses: dtolnay/rust-toolchain@stable

      - name: Install protoc
        run: sudo apt-get install -y protobuf-compiler

      - name: Export model to ONNX
        run: |
          pip install "optimum[exporters]"
          optimum-cli export onnx \
            --model distilbert-base-uncased \
            --task feature-extraction \
            ./model

      - name: Create config
        run: |
          cat > config.yml <<EOF
          encoderfile:
            name: my-model
            path: ./model
            model_type: embedding
            output_path: ./build/my-model.encoderfile
          EOF

      - name: Build encoderfile
        run: |
          cargo build --bin encoderfile --release
          ./target/release/encoderfile build -f config.yml

      - uses: actions/upload-artifact@v4
        with:
          name: encoderfile
          path: ./build/*.encoderfile
```

## Binary Distribution

After building, your encoderfile binary is completely self-contained:

* No Python runtime required
* No external dependencies
* No network calls needed
* Portable across systems with the same architecture

You can distribute the binary by:

1. Copying it to the target system
2. Making it executable: `chmod +x my-model.encoderfile`
3. Running it: `./my-model.encoderfile serve`

## Next Steps

* [CLI Reference](https://mozilla-ai.github.io/encoderfile/reference/cli/) - Complete command-line documentation
* [API Reference](https://mozilla-ai.github.io/encoderfile/reference/api-reference/) - REST, gRPC, and MCP APIs
* [Getting Started Guide](https://mozilla-ai.github.io/encoderfile/getting-started/) - Step-by-step tutorial
* [Contributing](/encoderfile/community/contributing.md) - Help improve encoderfile

