Use GLiNER for NER annotation

GLiNER is a BERT-family model for generalist named entity recognition (NER). This example downloads the model from Hugging Face; the original model is available on GitHub.
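To make the annotation flow concrete, here is a minimal sketch of how GLiNER-style entity spans could be mapped to Label Studio prediction results. It assumes each entity is a dict with `start`, `end`, `text`, `label`, and `score` keys (the shape returned by GLiNER's `predict_entities`), and that the labeling config uses a `Labels` tag named `label` attached to a `Text` tag named `text`; both tag names are assumptions, and the example's own model.py may do this differently.

```python
def to_label_studio_results(entities, from_name="label", to_name="text"):
    """Map GLiNER-style entity dicts to Label Studio NER result items.

    from_name and to_name are assumed tag names; they must match the
    names used in your project's labeling config.
    """
    results = []
    for ent in entities:
        results.append({
            "from_name": from_name,
            "to_name": to_name,
            "type": "labels",
            "value": {
                "start": ent["start"],      # character offset, inclusive
                "end": ent["end"],          # character offset, exclusive
                "text": ent["text"],
                "labels": [ent["label"]],   # Label Studio expects a list
            },
            "score": float(ent["score"]),
        })
    return results


# Illustrative input only; the values are made up, not model output.
example = [{"start": 0, "end": 5, "text": "Alice", "label": "person", "score": 0.97}]
print(to_label_studio_results(example))
```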

Before you begin

You must first install the Label Studio ML backend.

This tutorial uses the gliner example.

Running with Docker (recommended)

  1. Start the Machine Learning backend on http://localhost:9090 with the prebuilt image:
docker-compose up
  2. Validate that the backend is running:
$ curl http://localhost:9090/
{"status":"UP"}
  3. Create a project in Label Studio. Then, from the Model page in the project settings, connect the model. The default URL is http://localhost:9090.
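If you prefer to script the health check rather than run curl by hand, the following sketch performs the same request from Python using only the standard library. The function names are illustrative, and the check relies only on the `{"status":"UP"}` response shown above.

```python
import http.client
import json
import urllib.request


def is_up(body: bytes) -> bool:
    """Return True if the response body is JSON with status == "UP"."""
    try:
        payload = json.loads(body)
    except ValueError:  # not valid JSON (or not decodable text)
        return False
    return isinstance(payload, dict) and payload.get("status") == "UP"


def check_backend(url: str = "http://localhost:9090/", timeout: float = 5.0) -> bool:
    """Request the backend root URL and report whether it answers as UP."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_up(resp.read())
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or malformed response.
        return False


if __name__ == "__main__":
    print("backend up:", check_backend())
```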

Building from source (advanced)

To build the ML backend from source, you have to clone the repository and build the Docker image:

docker-compose build

Running without Docker (advanced)

To run the ML backend without Docker, you have to clone the repository and install all dependencies using pip:

python -m venv ml-backend
source ml-backend/bin/activate
pip install -r requirements.txt

Then you can start the ML backend:

label-studio-ml start ./dir_with_your_model

Configuration

Parameters can be set in docker-compose.yml before running the container.

The following common parameters are available:

  • BASIC_AUTH_USER - Specify the basic auth user for the model server.
  • BASIC_AUTH_PASS - Specify the basic auth password for the model server.
  • LOG_LEVEL - Set the log level for the model server.
  • WORKERS - Specify the number of workers for the model server.
  • THREADS - Specify the number of threads for the model server.
  • LABEL_STUDIO_URL - Specify the URL of your Label Studio instance. Note that this might need to be http://host.docker.internal:8080 if you are running Label Studio in another Docker container.
  • LABEL_STUDIO_API_KEY - Specify the API key for authenticating with your Label Studio instance. You can find this by logging into Label Studio and going to the Account & Settings page.
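As a quick sanity check, the sketch below reports which of these parameters are set, hiding the secret values. It assumes the parameters reach the model server as environment variables (the usual effect of an `environment:` entry in docker-compose.yml); the helper name and the `<set>`/`<unset>` placeholders are illustrative and not part of the backend.

```python
import os

# Parameter names taken from the list above.
COMMON_PARAMS = [
    "BASIC_AUTH_USER",
    "BASIC_AUTH_PASS",
    "LOG_LEVEL",
    "WORKERS",
    "THREADS",
    "LABEL_STUDIO_URL",
    "LABEL_STUDIO_API_KEY",
]
# Values that should never be printed in full.
SECRET_PARAMS = {"BASIC_AUTH_PASS", "LABEL_STUDIO_API_KEY"}


def summarize_config(env):
    """Return a printable summary of the common parameters found in env."""
    summary = {}
    for name in COMMON_PARAMS:
        value = env.get(name)
        if not value:
            summary[name] = "<unset>"
        elif name in SECRET_PARAMS:
            summary[name] = "<set>"  # mask secrets
        else:
            summary[name] = value
    return summary


if __name__ == "__main__":
    for name, shown in summarize_config(os.environ).items():
        print(f"{name}={shown}")
```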

A Note on Model Training

If you plan to train this model with the "Start Training" action, note that you do not need to configure a separate webhook. Instead, click the three dots next to your model on the Model tab in your project settings and select "Start Training".

Additionally, note that this container is configured for a VERY SMALL demo set with only 1 non-eval sample (the first 10 data samples are expected to be used for evaluation).
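The split described above can be pictured with the following sketch. It is an illustration of the "first 10 samples are for evaluation" convention, not the code in model.py; the function name and the 11-item demo list are assumptions.

```python
def split_eval_train(samples, n_eval=10):
    """Reserve the first n_eval samples for evaluation; train on the rest."""
    return samples[:n_eval], samples[n_eval:]


# A hypothetical 11-task demo set leaves a single training sample.
demo = [f"task-{i}" for i in range(11)]
eval_set, train_set = split_eval_train(demo)
print(len(eval_set), len(train_set))  # 10 1
```

With 10 or fewer samples, the training list comes back empty, which is why a larger dataset is needed for meaningful fine-tuning.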

If you're working with a larger dataset, be sure to:

  1. Update num_steps and the batch size to the number of training steps you want and a batch size that works for your dataset.
  2. Change the model that is loaded after training (line 239 of model.py) to the highest checkpoint that you have.
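The two adjustments above can be sketched as small helpers. The first derives a step count from the dataset size, batch size, and a desired number of passes over the data; the second picks the highest-numbered checkpoint directory. Both function names are hypothetical, and the `<prefix>_<step>` directory naming (with the default prefix `finetuned`) is an assumption, so check the names your training run actually writes before relying on it.

```python
import math
import re
from pathlib import Path


def steps_for_epochs(n_train, batch_size, epochs=1):
    """Number of optimizer steps needed for `epochs` passes over n_train samples."""
    if n_train <= 0 or batch_size <= 0 or epochs <= 0:
        raise ValueError("n_train, batch_size, and epochs must be positive")
    return math.ceil(n_train / batch_size) * epochs


def highest_checkpoint(log_dir, prefix="finetuned"):
    """Return the checkpoint directory with the largest step number, or None.

    Assumes checkpoints are directories named '<prefix>_<step>'; this naming
    is an assumption, not something confirmed by model.py.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)$")
    best, best_step = None, -1
    for child in Path(log_dir).iterdir():
        match = pattern.match(child.name)
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best, best_step = child, int(match.group(1))
    return best


print(steps_for_epochs(n_train=1000, batch_size=8, epochs=3))  # 375
```

Comparing step numbers as integers matters here: a plain string sort would rank `finetuned_20` above `finetuned_100`.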