GitHub - fastino-ai/GLiNER2: Unified Schema-Based Information Extraction
Extract entities, classify text, parse structured data, and extract relations—all in one efficient model.
GLiNER2 unifies Named Entity Recognition, Text Classification, Structured Data Extraction, and Relation Extraction into a single 205M parameter model. It provides efficient CPU-based inference without requiring complex pipelines or external API dependencies.
Fine-tune models via Pioneer; additional documentation is available in the Pioneer docs. Join the discussion on Discord and Reddit.
🎯 One Model, Four Tasks: Entities, classification, structured data, and relations in a single forward pass
💻 CPU First: Lightning-fast inference on standard hardware—no GPU required
🛡️ Privacy: 100% local processing, with no external API calls required
🚀 Installation & Quick Start
The base install gives you Schema, SchemaInput, RegexValidator, GLiNER2API, InputExample, TrainingDataset, and all JSONL validation tooling — everything needed to build schemas, validate data, and call the cloud API without pulling in PyTorch.
To load and run models locally, install the [local] extra:
Enable fp16 and/or torch.compile for faster inference — no extra dependencies required.
🌐 API Access: GLiNER XL 1B
Our biggest and most powerful model—GLiNER XL 1B—is available exclusively via API. No GPU required, no model downloads, just instant access to state-of-the-art extraction. Get your API key at gliner.pioneer.ai.
The models are available on Hugging Face.
📚 Documentation & Tutorials
Comprehensive guides for all GLiNER2 features:
Text Classification - Single and multi-label classification with confidence scores
Entity Extraction - Named entity recognition with descriptions and spans
Structured Data Extraction - Parse complex JSON structures from text
Combined Schemas - Multi-task extraction in a single pass
Regex Validators - Filter and validate extracted spans
Relation Extraction - Extract relationships between entities
API Access - Use GLiNER2 via cloud API
Training Data Format - Complete guide to preparing training data
Model Training - Train custom models for your domain
LoRA Adapters - Parameter-efficient fine-tuning
Adapter Switching - Switch between domain adapters
1. Entity Extraction
Extract named entities with optional descriptions for precision:
2. Text Classification
Single or multi-label classification with configurable confidence:
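As a rough illustration of what "configurable confidence" can mean, the sketch below post-processes a set of label scores in plain Python. It does not call GLiNER2: the `select_labels` helper, the score dictionary, and the 0.5 default threshold are all hypothetical. It only shows the usual difference between a single-label decision (take the best label) and a multi-label decision (keep every label above a threshold).

```python
from typing import Dict, List


def select_labels(scores: Dict[str, float], multi_label: bool = False,
                  threshold: float = 0.5) -> List[str]:
    """Pick labels from a {label: confidence} mapping (hypothetical helper)."""
    if not scores:
        return []
    if multi_label:
        # Multi-label: keep every label whose confidence clears the threshold,
        # ordered from most to least confident.
        kept = [label for label, score in scores.items() if score >= threshold]
        return sorted(kept, key=lambda label: scores[label], reverse=True)
    # Single-label: return only the most confident label, if it clears the threshold.
    best = max(scores, key=scores.get)
    return [best] if scores[best] >= threshold else []


# Made-up scores, for illustration only.
example_scores = {"positive": 0.91, "negative": 0.04, "urgent": 0.62}
single = select_labels(example_scores)
multi = select_labels(example_scores, multi_label=True)
```

Raising the threshold trades recall for precision: fewer labels are returned, but each one is backed by a higher score.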
3. Structured Data Extraction
Parse complex structured information with field-level control:
4. Relation Extraction
Extract relationships between entities as directional tuples:
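"Directional" means that the order of the two entities carries meaning. The sketch below is not GLiNER2's output format, which this section does not show. The `Relation` type, the `relations_from` helper, and the example values are hypothetical and only illustrate why (head, relation, tail) is not the same as (tail, relation, head).

```python
from typing import List, NamedTuple


class Relation(NamedTuple):
    """A directed relation: head --relation--> tail (hypothetical type)."""
    head: str
    relation: str
    tail: str


def relations_from(entity: str, relations: List[Relation]) -> List[Relation]:
    """Return only the relations in which `entity` is the source (head)."""
    return [r for r in relations if r.head == entity]


# Made-up example data.
extracted = [
    Relation("Alice", "works_for", "Acme"),
    Relation("Acme", "located_in", "Berlin"),
]

# Direction matters: swapping head and tail yields a different tuple.
forward = Relation("Alice", "works_for", "Acme")
reverse = Relation("Acme", "works_for", "Alice")
outgoing = relations_from("Acme", extracted)
```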
5. Multi-Task Schema Composition
Combine all extraction types when you need comprehensive analysis:
🏭 Example Usage Scenarios
Field Types and Constraints
Filter extracted spans to ensure they match expected patterns, improving extraction quality and reducing false positives.
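A minimal sketch of the idea in plain Python, using only the standard `re` module rather than the library's own `RegexValidator`, whose signature is not shown in this section. The `filter_spans` helper, the email pattern, and the candidate spans are hypothetical.

```python
import re
from typing import Iterable, List

# Deliberately simple email pattern, for illustration only.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")


def filter_spans(spans: Iterable[str], pattern: "re.Pattern[str]") -> List[str]:
    """Keep only spans that match `pattern` in full (hypothetical helper)."""
    return [span for span in spans if pattern.fullmatch(span.strip())]


# Made-up candidate spans, as an extractor might return for an "email" field.
candidates = ["jane.doe@example.com", "contact us", "sales@example", "ops@example.org"]
valid = filter_spans(candidates, EMAIL_PATTERN)
```

Requiring a full match, rather than a partial one, is the stricter choice: a span that merely contains something email-like is rejected.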
FlashDeberta (Optional GPU Acceleration)
For DebertaV2-based models, you can use FlashDeberta to accelerate inference on GPU via flash attention kernels.
The flag is only effective when the model uses a DebertaV2 encoder and the flashdeberta package is installed. Otherwise standard HuggingFace AutoModel is used automatically.
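The fallback rule described above can be sketched as a small decision function. This is not the library's code, and the actual flag name is not given in this section: `use_flash`, `choose_backend`, and the returned strings are hypothetical. The one borrowed detail is `"deberta-v2"`, the model type string that Hugging Face configurations use for DebertaV2.

```python
import importlib.util


def flashdeberta_installed() -> bool:
    """True if the `flashdeberta` package can be imported in this environment."""
    return importlib.util.find_spec("flashdeberta") is not None


def choose_backend(use_flash: bool, encoder_type: str, flash_available: bool) -> str:
    """Return which encoder backend would be used (hypothetical helper).

    The flash path requires all three conditions; anything else falls back
    to the standard Hugging Face AutoModel path.
    """
    if use_flash and encoder_type == "deberta-v2" and flash_available:
        return "flashdeberta"
    return "automodel"


available = flashdeberta_installed()
backend = choose_backend(use_flash=True, encoder_type="deberta-v2",
                         flash_available=available)
```

Because the fallback is silent, checking `flashdeberta_installed()` up front is one way to confirm that the accelerated path is actually in use.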
A benchmark script is included to compare the two backends:
Process multiple texts efficiently in a single call:
🎓 Training Custom Models
Train GLiNER2 on your own data to specialize for your domain or use case.
Training Data Format (JSONL)
GLiNER2 training data uses the JSONL format: each line is a JSON object with an input field and an output field.
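Before training, it can help to confirm that every line parses and carries both fields. The sketch below does this with the standard library only and is independent of the JSONL validation tooling that ships with the package. The `check_jsonl` helper is hypothetical, and the shape of the `output` value in the sample records is an assumption, since this section only states that the field exists.

```python
import json
from typing import Iterable, List, Tuple


def check_jsonl(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs for records that look malformed."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        for field in ("input", "output"):
            if field not in record:
                problems.append((number, f"missing '{field}' field"))
    return problems


# Made-up records; the shape of "output" here is an assumption.
sample = [
    '{"input": "Alice works at Acme.", "output": {"entities": {"person": ["Alice"]}}}',
    '{"input": "No output field here."}',
    'not json at all',
]
issues = check_jsonl(sample)
```

For a file on disk, passing the open file handle works the same way, since iterating over a text file yields one line at a time.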
Training from JSONL File
LoRA Training (Parameter-Efficient Fine-Tuning)
Train lightweight adapters for domain-specific tasks:
Smaller size: Adapters are ~2-10 MB vs ~450 MB for full models
Faster training: 2-3x faster than full fine-tuning
Easy switching: Swap adapters in milliseconds for different domains
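To make the size comparison concrete, the back-of-the-envelope calculation below uses the approximate figures quoted above (about 450 MB per full model, 2-10 MB per adapter). The number of domains and the helper names are hypothetical, and the sketch assumes the shared base model is about as large as a fully fine-tuned one; actual sizes depend on the model and adapter configuration.

```python
FULL_MODEL_MB = 450          # approximate full-model size quoted above
ADAPTER_MB_RANGE = (2, 10)   # approximate adapter size range quoted above


def storage_full_models(num_domains: int) -> int:
    """Disk space (MB) when every domain gets its own fully fine-tuned model."""
    return num_domains * FULL_MODEL_MB


def storage_with_adapters(num_domains: int, adapter_mb: int) -> int:
    """Disk space (MB) for one shared base model plus one adapter per domain."""
    return FULL_MODEL_MB + num_domains * adapter_mb


domains = 5  # hypothetical number of domains
full = storage_full_models(domains)
adapters_worst = storage_with_adapters(domains, ADAPTER_MB_RANGE[1])
adapters_best = storage_with_adapters(domains, ADAPTER_MB_RANGE[0])
```

With five domains, this works out to about 2,250 MB for separate full models versus roughly 460 to 500 MB for one base model plus five adapters.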
For more details, see the Training Tutorial and Data Format Guide.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
If you use GLiNER2 in your research, please cite:
Built upon the original GLiNER architecture by the team at Fastino AI.