GLiNER2: Unified Schema-Based Information Extraction

Extract entities, classify text, parse structured data, and extract relations—all in one efficient model.

GLiNER2 unifies Named Entity Recognition, Text Classification, Structured Data Extraction, and Relation Extraction into a single 205M parameter model. It provides efficient CPU-based inference without requiring complex pipelines or external API dependencies.

Fine-tune via Pioneer. Additional documentation via Pioneer docs. Join discussions on Discord and Reddit.

🎯 One Model, Four Tasks: Entities, classification, structured data, and relations in a single forward pass

💻 CPU First: Lightning-fast inference on standard hardware—no GPU required

🛡️ Privacy: 100% local processing, zero external dependencies

🚀 Installation & Quick Start

The base install gives you Schema, SchemaInput, RegexValidator, GLiNER2API, InputExample, TrainingDataset, and all JSONL validation tooling — everything needed to build schemas, validate data, and call the cloud API without pulling in PyTorch.

To load and run models locally, install the [local] extra:

Enable fp16 and/or torch.compile for faster inference — no extra dependencies required.

🌐 API Access: GLiNER XL 1B

Our biggest and most powerful model—GLiNER XL 1B—is available exclusively via API. No GPU required, no model downloads, just instant access to state-of-the-art extraction. Get your API key at gliner.pioneer.ai.

The models are available on Hugging Face.

📚 Documentation & Tutorials

Comprehensive guides for all GLiNER2 features:

Text Classification - Single and multi-label classification with confidence scores

Entity Extraction - Named entity recognition with descriptions and spans

Structured Data Extraction - Parse complex JSON structures from text

Combined Schemas - Multi-task extraction in a single pass

Regex Validators - Filter and validate extracted spans

Relation Extraction - Extract relationships between entities

API Access - Use GLiNER2 via cloud API

Training Data Format - Complete guide to preparing training data

Model Training - Train custom models for your domain

LoRA Adapters - Parameter-efficient fine-tuning

Adapter Switching - Switch between domain adapters

1. Entity Extraction

Extract named entities, optionally supplying a description for each entity type to improve precision:

2. Text Classification

Run single-label or multi-label classification with a configurable confidence threshold:
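To make the single-label versus multi-label distinction concrete, here is a minimal sketch in plain Python. It does not call GLiNER2: the function names, the `threshold` parameter, and the scores are hypothetical and hand-written for illustration only.

```python
from typing import Dict, List


def pick_single_label(scores: Dict[str, float]) -> str:
    """Single-label: return the one label with the highest score."""
    return max(scores, key=lambda label: scores[label])


def pick_multi_label(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Multi-label: return every label at or above the threshold, best first."""
    kept = [label for label, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda label: scores[label], reverse=True)


# Hand-written scores (assumption), not real model output.
scores = {"positive": 0.91, "neutral": 0.62, "negative": 0.04}

print(pick_single_label(scores))
print(pick_multi_label(scores, threshold=0.5))
print(pick_multi_label(scores, threshold=0.95))
```

Raising the threshold trades recall for precision: a stricter cutoff returns fewer labels and may return none.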

3. Structured Data Extraction

Parse complex structured information with field-level control:

4. Relation Extraction

Extract relationships between entities as directional tuples:
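As an illustration of what "directional" means here, the sketch below models a relation as a (head, relation, tail) tuple in plain Python. The `Relation` type, the `outgoing` helper, and the example values are assumptions made for this sketch; they are not GLiNER2's actual return type.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Relation(NamedTuple):
    head: str      # entity the relation starts from
    relation: str  # relation label
    tail: str      # entity the relation points to


def outgoing(relations: List[Relation]) -> Dict[str, List[Relation]]:
    """Index relations by their head entity."""
    index: Dict[str, List[Relation]] = defaultdict(list)
    for rel in relations:
        index[rel.head].append(rel)
    return dict(index)


# Hand-written example relations (assumption), not real model output.
relations = [
    Relation("Tim Cook", "works_for", "Apple"),
    Relation("Apple", "located_in", "Cupertino"),
]

index = outgoing(relations)
print(index["Apple"])

# Direction matters: swapping head and tail gives a different relation.
print(relations[0] == Relation("Apple", "works_for", "Tim Cook"))
```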

5. Multi-Task Schema Composition

Combine all extraction types when you need comprehensive analysis:

🏭 Example Usage Scenarios

Field Types and Constraints

Filter extracted spans to ensure they match expected patterns, improving extraction quality and reducing false positives.
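The idea of filtering spans by pattern can be sketched with the standard library alone. This is not the `RegexValidator` API; `filter_spans`, the candidate spans, and the simplified email pattern are hypothetical and exist only to show the technique.

```python
import re
from typing import Iterable, List


def filter_spans(spans: Iterable[str], pattern: str) -> List[str]:
    """Keep only the spans that match `pattern` from start to end."""
    compiled = re.compile(pattern)
    return [span for span in spans if compiled.fullmatch(span)]


# Hand-written candidate spans (assumption), standing in for extractor output.
candidates = ["john@example.com", "contact us", "sales@example.org"]

# Deliberately simplified email pattern, for illustration only.
emails = filter_spans(candidates, r"[^@\s]+@[^@\s]+\.[a-zA-Z]{2,}")
print(emails)
```

Matching the whole span with `fullmatch` rather than searching inside it is what rejects partially correct extractions.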

FlashDeberta (Optional GPU Acceleration)

For DebertaV2-based models, you can use FlashDeberta to accelerate inference on GPU via flash attention kernels.

The flag only takes effect when the model uses a DebertaV2 encoder and the flashdeberta package is installed. Otherwise, the standard Hugging Face AutoModel is used automatically.

A benchmark script is included to compare the two backends:

Process multiple texts efficiently in a single call:
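When the input is larger than one call should handle, a common pattern is to split it into fixed-size batches and concatenate the results. The sketch below shows that pattern in plain Python. `chunked`, `run_in_batches`, and `fake_extract_batch` are hypothetical names; the last one is a stand-in for whatever batch call you use and is not part of GLiNER2.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_in_batches(
    texts: Sequence[str],
    extract_batch: Callable[[List[str]], List[dict]],
    batch_size: int = 8,
) -> List[dict]:
    """Call `extract_batch` once per batch and keep results in input order."""
    results: List[dict] = []
    for batch in chunked(texts, batch_size):
        results.extend(extract_batch(batch))
    return results


def fake_extract_batch(batch: List[str]) -> List[dict]:
    """Stand-in extractor (assumption): returns one dict per input text."""
    return [{"text": text, "n_words": len(text.split())} for text in batch]


texts = ["Apple hired Tim Cook.", "Great battery life.", "Ships from Cupertino."]
results = run_in_batches(texts, fake_extract_batch, batch_size=2)
print(results)
```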

🎓 Training Custom Models

Train GLiNER2 on your own data to specialize for your domain or use case.

Training Data Format (JSONL)

GLiNER2 uses JSONL format where each line contains an input and output field:
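A minimal sketch of reading such a file and checking that every line carries both fields, using only the standard library. The `parse_jsonl` helper is hypothetical, and the contents of each `output` object in the sample are an assumption for illustration; consult the Data Format Guide for the actual schema.

```python
import json
from typing import Any, Dict, List

REQUIRED_KEYS = ("input", "output")


def parse_jsonl(text: str) -> List[Dict[str, Any]]:
    """Parse JSONL text and require `input` and `output` on every record."""
    records: List[Dict[str, Any]] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_no}: expected a JSON object")
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing {missing}")
        records.append(record)
    return records


# Sample records; the shape inside "output" is assumed, not taken from the docs.
sample = "\n".join([
    json.dumps({"input": "Tim Cook leads Apple.",
                "output": {"entities": {"person": ["Tim Cook"], "company": ["Apple"]}}}),
    json.dumps({"input": "Great battery life.",
                "output": {"entities": {}}}),
])

records = parse_jsonl(sample)
print(len(records), records[0]["input"])
```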

Training from JSONL File

LoRA Training (Parameter-Efficient Fine-Tuning)

Train lightweight adapters for domain-specific tasks:

Smaller size: Adapters are ~2-10 MB vs ~450 MB for full models

Faster training: 2-3x faster than full fine-tuning

Easy switching: Swap adapters in milliseconds for different domains
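A rough storage comparison using the sizes quoted above. The sketch assumes that all adapters share a single base model of about 450 MB and takes the upper end of the adapter range (10 MB); both the helper names and that deployment layout are assumptions, not measured results.

```python
FULL_MODEL_MB = 450   # approximate full-model size quoted above
ADAPTER_MB = 10       # upper end of the quoted 2-10 MB adapter range


def storage_full_models(n_domains: int) -> int:
    """One fully fine-tuned model per domain."""
    return n_domains * FULL_MODEL_MB


def storage_with_adapters(n_domains: int, adapter_mb: int = ADAPTER_MB) -> int:
    """One shared base model plus one adapter per domain (assumed layout)."""
    return FULL_MODEL_MB + n_domains * adapter_mb


for n in (1, 5, 20):
    print(n, storage_full_models(n), storage_with_adapters(n))
```

Under these assumptions the gap widens with each added domain, since every new domain costs one adapter rather than one full model.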

For more details, see the Training Tutorial and Data Format Guide.

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

If you use GLiNER2 in your research, please cite:

Built upon the original GLiNER architecture by the team at Fastino AI.

Source: https://github.com/fastino-ai/GLiNER2