> For the complete documentation index, see [llms.txt](https://docs.mozilla.ai/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.mozilla.ai/encoderfile/cookbooks/token-classification-ner.md).

# Token Classification (NER)

This cookbook walks through building, deploying, and using a Named Entity Recognition (NER) model with Encoderfile. We'll use BERT fine-tuned for NER to identify people, organizations, and locations in text.

## What You'll Learn

* Export a token classification model to ONNX
* Build a self-contained encoderfile binary
* Deploy as a REST API server
* Make predictions via HTTP
* Use CLI for batch processing

## Prerequisites

* `encoderfile` CLI tool installed ([Installation Guide](/encoderfile/index.md#1-install-cli))
* Python with `optimum[exporters]` for ONNX export
* `curl` for testing the API

***

## Step 1: Export the Model

We'll use `dslim/bert-base-NER`, a BERT model fine-tuned for named entity recognition.

{% hint style="info" %}
**About the Model**

This model recognizes 4 entity types:

* **PER** - Person names
* **ORG** - Organizations
* **LOC** - Locations
* **MISC** - Miscellaneous entities
  {% endhint %}

### Export to ONNX

```bash
# Install optimum if you haven't already
pip install optimum[exporters]

# Export the model
optimum-cli export onnx \
  --model dslim/bert-base-NER \
  --task token-classification \
  ./ner-model
```

{% hint style="info" %}
**What files are created?**

The export creates:

```
ner-model/
├── config.json          # Model configuration
├── model.onnx          # ONNX weights
├── tokenizer.json      # Fast tokenizer
├── tokenizer_config.json
└── special_tokens_map.json
```

{% endhint %}

***

## Step 2: Create Configuration

Create a YAML configuration file for building the encoderfile.

{% tabs %}
{% tab title="ner-config.yml" %}

```yaml
encoderfile:
  name: ner-tagger
  version: "1.0.0"
  path: ./ner-model
  model_type: token_classification
  output_path: ./build/ner-tagger.encoderfile
```

{% endtab %}

{% tab title="With Optional Transform" %}

```yaml
encoderfile:
  name: ner-tagger
  version: "1.0.0"
  path: ./ner-model
  model_type: token_classification
  output_path: ./build/ner-tagger.encoderfile
  transform: |
    --- Apply softmax to normalize logits
    function Postprocess(arr)
        return arr:softmax(3)
    end
```

{% endtab %}
{% endtabs %}

{% hint style="success" %}
**Configuration Options**

* `name` - Model identifier used in API responses
* `path` - Directory containing ONNX model files
* `model_type` - Must be `token_classification` for NER
* `output_path` - Where to save the binary (optional)
* `transform` - Optional Lua script for post-processing
  {% endhint %}

***

## Step 3: Build the Binary

Build your self-contained encoderfile binary:

```bash
# Create output directory
mkdir -p build

# Build the encoderfile
encoderfile build -f ner-config.yml
```

{% hint style="success" %}
**Build Output**

You should see output like:

```
Validating model...
Generating project...
Compiling binary...
✓ Build complete: ./build/ner-tagger.encoderfile
```

{% endhint %}

The resulting binary is **completely self-contained** - it includes:

* ONNX model weights
* Tokenizer
* Full inference runtime
* REST and gRPC servers

***

## Step 4: Start the Server

Launch the encoderfile server:

```bash
# Make executable (if needed)
chmod +x ./build/ner-tagger.encoderfile

# Start server
./build/ner-tagger.encoderfile serve
```

{% hint style="info" %}
**Server Startup**

```
Starting HTTP server on 0.0.0.0:8080
Starting gRPC server on [::]:50051
Model: ner-tagger v1.0.0
```

{% endhint %}

The server is now running with both HTTP and gRPC endpoints.

***

## Step 5: Make Predictions

Now let's test the NER model with different types of text.

### Example 1: Basic Entity Recognition

{% tabs %}
{% tab title="Request" %}

```bash
curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": ["Mozilla is headquartered in San Fancisco, CA"]
  }'
```

{% endtab %}

{% tab title="Expected Response" %}

```json
{
  "results": [
    {
      tokens: [{
       "token_info": {
                    "token": "Mozilla",
                    "token_id": 12556,
                    "start": 0,
                    "end": 2
                },
                "scores": [
                    -0.48987845,
                    2.912971,
                    -1.6960273,
                    2.2318482,
                    -3.2153757
                  ]
              .....
      "label": "B-ORG",
      "score": 4.5583587
    }
    ]
    }
  ],
  "model_id": "ner-tagger"
}
```

{% endtab %}

{% tab title="Interpretation" %}
**Entities Found:**

* **Mozilla** → `B-ORG`, `I-ORG` (Organization)
* **San Francisco** → `B-LOC` (Location)
* **CA** → `B-LOC` (Location)

The `B-` prefix indicates the beginning of an entity, `I-` indicates inside/continuation, and `O` means outside any entity.
{% endtab %}
{% endtabs %}

### Example 2: Multiple Sentences

{% tabs %}
{% tab title="Request" %}

```bash
curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [
      "Yvon Chouinard founded Patagonia in 1957.",
      "The Eiffel Tower is located in Paris, France."
    ]
  }'
```

{% endtab %}

{% tab title="Expected Entities" %}
**Sentence 1:**

* **Yvon** → Person (PER)
* **Patagonia** → Organization (ORG)

**Sentence 2:**

* **Eiffel Tower** → Miscellaneous (MISC)
* **Paris** → Location (LOC)
* **France** → Location (LOC)
  {% endtab %}
  {% endtabs %}

## Step 6: CLI Inference

For batch processing or one-off predictions, use the CLI directly:

### Single Input

```bash
./build/ner-tagger.encoderfile infer \
  "Tim Cook presented the new iPhone at Apple Park in California."
```

### Batch Processing

```bash
./build/ner-tagger.encoderfile infer \
  "Amazon was founded by Jeff Bezos in Seattle." \
  "Mozilla's headquarters are in San Francisco, California." \
  "Marie Curie won the Nobel Prize in Physics." \
  -o results.json
```

This saves all results to `results.json` for further processing.

***

## Advanced Usage

### Custom Ports

```bash
./build/ner-tagger.encoderfile serve \
  --http-port 3000 \
  --grpc-port 50052
```

### HTTP Only (Disable gRPC)

```bash
./build/ner-tagger.encoderfile serve --disable-grpc
```

### Production Deployment

```bash
# Copy to system location
sudo cp ./build/ner-tagger.encoderfile /usr/local/bin/

# Run as a service (example with systemd)
/usr/local/bin/ner-tagger.encoderfile serve \
  --http-hostname 0.0.0.0 \
  --http-port 8080
```

***

## Understanding the Output

### Token Classification Labels

The model uses the IOB (Inside-Outside-Beginning) tagging scheme:

| Prefix | Meaning             | Example                                |
| ------ | ------------------- | -------------------------------------- |
| `B-`   | Beginning of entity | `B-PER` for "Barack" in "Barack Obama" |
| `I-`   | Inside/continuation | `I-PER` for "Obama" in "Barack Obama"  |
| `O`    | Outside any entity  | `O` for "is" or "the"                  |

### Entity Types

| Label  | Description   | Examples                               |
| ------ | ------------- | -------------------------------------- |
| `PER`  | Person names  | "John Smith", "Marie Curie"            |
| `ORG`  | Organizations | "Apple Inc.", "United Nations"         |
| `LOC`  | Locations     | "Paris", "California", "Mount Everest" |
| `MISC` | Miscellaneous | "iPhone", "Nobel Prize"                |

### Response Format

```json
{
  "results": [
    {
      "tokens": ["word1", "word2", ...],          // Tokenized input
      "logits": [[...], [...], ...],              // Raw model outputs
      "predicted_labels": ["B-PER", "O", ...]     // Predicted entity tags
    }
  ],
  "model_id": "ner-tagger"
}
```

***

## Troubleshooting

### Unexpected Entity Recognition

{% hint style="warning" %}
**Model Limitations**

The model may struggle with:

* Rare or domain-specific entities
* Ambiguous contexts (e.g., "Washington" as person vs. location)
* Non-English text
* Very long sequences (>512 tokens)
  {% endhint %}

**Solution:** Fine-tune on domain-specific data or use a specialized model.

### Performance Optimization

If inference is slow:

```yaml
# Consider adding a transform to reduce output size
transform: |
  function Postprocess(arr)
    -- Only return top prediction per token
    return arr:argmax(3)
  end
```

### Server Connection Issues

```bash
# Check if server is running
curl http://localhost:8080/health

# Try different port
./build/ner-tagger.encoderfile serve --http-port 8081
```

***

## Next Steps

* [**Transforms Guide**](/encoderfile/transforms/index.md) - Learn about custom post-processing with Lua scripts
* [**Transforms Reference**](/encoderfile/transforms/reference.md) - Complete transforms API documentation
* [**API Reference**](/encoderfile/reference/api-reference.md) - REST, gRPC, and MCP endpoint specifications
* [**CLI Reference**](/encoderfile/reference/cli.md) - Full documentation for build, serve, and infer commands

***

## Summary

You've learned to:

* ✅ Export a token classification model to ONNX
* ✅ Build a self-contained encoderfile binary
* ✅ Deploy as a REST API server
* ✅ Make predictions via HTTP and CLI
* ✅ Understand NER output format

The encoderfile you built is production-ready and can be deployed anywhere without dependencies!


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.mozilla.ai/encoderfile/cookbooks/token-classification-ner.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
