Flair is a Python NLP library developed by Zalando Research, known for being user-friendly. Flair NLP provides intuitive interfaces to a wide range of embeddings with strong multilingual support, from classic word embeddings such as GloVe to transformer-based models hosted on Hugging Face.
flaiR Installation
System Requirements:
- Python >= 3.9
- R >= 4.2.0
- RStudio
Option 1: Fast Install of flaiR and Python with reticulate
- Install Python automatically from R through reticulate:
# Install and load reticulate package
install.packages("reticulate")
library(reticulate)
- Install Python 3.10 using reticulate:
install_python(version = "3.10")
- Verify the Python installation and see which version is being used:
py_config() # Shows Python configuration details
- To install flaiR package:
install.packages("remotes")
remotes::install_github("davidycliao/flaiR", force = TRUE)
- Load the flaiR package. A startup message showing the Flair NLP Python version confirms that Flair NLP has been attached to your R session:
library(flaiR)
#> flaiR: An R Wrapper for Accessing Flair NLP 0.13.1
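If the startup message does not appear, it can help to check each layer separately. The following is a minimal sketch, not part of flaiR itself: `check_flair_setup()` is a hypothetical helper name, and it relies only on the reticulate functions `py_available()` and `py_module_available()`.

```r
# Hypothetical helper (not part of flaiR): reports whether reticulate,
# a usable Python, and the Python module "flair" can be found from R.
check_flair_setup <- function() {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return(c(reticulate = FALSE, python = FALSE, flair = FALSE))
  }
  # initialize = TRUE asks reticulate to locate and start Python if needed
  python_ok <- reticulate::py_available(initialize = TRUE)
  # Only look for the module when Python itself is usable
  flair_ok <- python_ok && reticulate::py_module_available("flair")
  c(reticulate = TRUE, python = python_ok, flair = flair_ok)
}

status <- check_flair_setup()
print(status)
```

A `FALSE` in the first or second position points to the R or Python side of the setup, while a `FALSE` only in the last position suggests that the Flair Python package has not been installed into the environment reticulate is using.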
Option 2: Docker Setup
The Docker image (12.5GB) provides a complete R/Python development environment: RStudio Server with R, plus Python 3.9 with Flair NLP and its dependencies (PyTorch, models).
Please ensure your system meets these requirements:
- At least 15GB of free disk space
- Minimum 16GB RAM recommended
- Stable internet connection for initial download
- Docker installed and running
Intel/AMD Processors:
# Pull and run
docker pull ghcr.io/davidycliao/flair-rstudio:latest
docker run -d \
-p 8787:8787 \
--user root \
-e USER=rstudio \
-e PASSWORD=rstudio123 \
--name flair-rstudio \
ghcr.io/davidycliao/flair-rstudio:latest
Apple Silicon (M1/M2/M3 Macs):
# Pull and run with platform specification
docker pull --platform linux/amd64 ghcr.io/davidycliao/flair-rstudio:latest
docker run -d \
-p 8787:8787 \
--platform linux/amd64 \
--user root \
-e USER=rstudio \
-e PASSWORD=rstudio123 \
--name flair-rstudio \
ghcr.io/davidycliao/flair-rstudio:latest
Access RStudio Server:
- Open browser: http://localhost:8787
- Username: rstudio
- Password: rstudio123
Troubleshooting Guide
The flaiR package will:
- Automatically check Python environment
- Install required Python packages (flair and its dependencies)
- Configure the Python-R connection
install.packages("remotes")
remotes::install_github("davidycliao/flaiR", force = TRUE)
Installation typically proceeds automatically with RStudio and R handling dependencies. However, users (particularly those with Mac ARM64 systems) may encounter compilation issues. Here’s a comprehensive guide to resolve common problems:
Mac M1/M2/M3 (Apple Silicon) Specific Setup
- Architecture Options
- Native ARM64 Version
- Requires specific compiler setup
- May encounter additional compatibility issues
Automatic Dependencies (Python)
The flaiR R package automatically installs the following Python packages during setup:
Core Dependencies
- Python (3.9 or 3.10 recommended)
- flair[word-embeddings] (>=0.11.3)
- numpy (>=1.22.4, <1.29.0)
- PyTorch (>=2.0.0, <2.6.0)
- scipy (==1.12.0)
Additional Dependencies
- gensim (>=4.3.2)
- transformers (>=4.30.0)
- sentencepiece (>=0.1.99)
- bpemb (>=0.3.5)
All other required dependencies are handled automatically during the installation process. You can manage Python settings in RStudio through Tools -> Global Options -> Python.
To check your Python configuration, use:
reticulate::py_config()
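When a version conflict is suspected, the ranges in the dependency list can be checked by hand. The sketch below uses only base R; `requirements`, `in_range()` and `check_requirement()` are hypothetical names introduced for illustration, and only the two packages with both a lower and an upper bound are covered.

```r
# Version ranges copied from the dependency list above
# (lower bound inclusive, upper bound exclusive).
requirements <- list(
  numpy = c(min = "1.22.4", below = "2.0.0"),
  torch = c(min = "2.0.0", below = "2.6.0")
)
requirements$numpy[["below"]] <- "1.29.0"  # numpy upper bound as listed

# TRUE when min <= version < below, compared component by component
in_range <- function(version, min, below) {
  v <- numeric_version(version)
  v >= numeric_version(min) && v < numeric_version(below)
}

# Hypothetical helper: look up the range for a package and test a version
check_requirement <- function(package, version) {
  req <- requirements[[package]]
  in_range(version, req[["min"]], req[["below"]])
}

check_requirement("numpy", "1.26.4")  # TRUE
check_requirement("torch", "2.6.0")   # FALSE: upper bound is exclusive
```

`numeric_version()` accepts only plain dotted versions, so a PyTorch build string with a local suffix such as `+cpu` would need that suffix stripped before the comparison.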
Recommended Configurations
For stable usage, we strongly recommend installing these specific versions.
OS | R Versions | Python Version |
---|---|---|
Mac | 4.3.2, 4.2.0 | 3.10.x, 3.9 |
Windows | 4.0.5, Latest | 3.10.x, 3.9 |
Ubuntu | 4.3.2, 4.2.0, 4.2.1 | 3.10.x, 3.9 |
R Version Requirements
The flaiR package has specific R version requirements based on Matrix package compatibility:
- R 4.5.0+: Matrix 1.7-0 or newer
- R 4.4.x: Matrix 1.6-x
- R 4.3.x: Matrix 1.5-1
- R 4.2.x: Matrix 1.4-1
Note: On R 4.2.1, using the Matrix package on ARM64 Macs (M1/M2) may cause compatibility issues with gfortran. It is recommended to avoid this combination.
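The R-to-Matrix mapping listed above can be expressed as a small lookup. This is a sketch in base R only; `recommended_matrix()` is a hypothetical helper, not a flaiR function, and it returns `NA` for R versions older than those listed.

```r
# Hypothetical helper: map an R version to the Matrix version
# listed in the R Version Requirements above.
recommended_matrix <- function(r_version = getRversion()) {
  r <- as.numeric_version(r_version)
  if (r >= "4.5.0") {
    "Matrix 1.7-0 or newer"
  } else if (r >= "4.4.0") {
    "Matrix 1.6-x"
  } else if (r >= "4.3.0") {
    "Matrix 1.5-1"
  } else if (r >= "4.2.0") {
    "Matrix 1.4-1"
  } else {
    NA_character_  # not covered by the list above
  }
}

recommended_matrix("4.3.2")  # "Matrix 1.5-1"
recommended_matrix()         # recommendation for the running R session
```

Comparing the result with `packageVersion("Matrix")` in your own session shows whether the installed Matrix matches the recommendation.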
If you encounter any problems or have questions:
- Check the documentation
- Visit our Issues page
- Join our Discussion forum
Troubleshooting for Docker
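- Port 8787 already in use:
# Try using a different host port (e.g., 8788)
# (--platform linux/amd64 is only needed on Apple Silicon)
docker run -d \
-p 8788:8787 \
--platform linux/amd64 \
--user root \
-e USER=rstudio \
-e PASSWORD=rstudio123 \
--name flair-rstudio \
ghcr.io/davidycliao/flair-rstudio:latest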
Then access via http://localhost:8788
- Container already exists:
# Remove existing container
docker stop flair-rstudio
docker rm flair-rstudio
Then retry the run command.
- Check container status:
# View running containers
docker ps
# View all containers including stopped ones
docker ps -a
# View container logs
docker logs flair-rstudio
NLP Tasks
For R users, flaiR extends Flair NLP with three NLP task functions that return extracted features in a tidy data.table format, so you don't have to write loops to format the parsed output yourself. The main features are part-of-speech tagging, named entity recognition, and sentiment analysis. To limit RAM usage on larger corpora, flaiR also supports batch processing, which handles texts in batches to optimize memory usage and performance.
Core Featured Functions | Loader | Supported Models from Flair NLP |
---|---|---|
get_entities() | load_tagger_ner() | en (English), fr (French), da (Danish), nl (Dutch), and more. |
get_pos() | load_tagger_pos() | pos (English POS), fr-pos (French POS), de-pos (German POS), nl-pos (Dutch POS), and more. |
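The idea behind batch processing can be illustrated without Flair at all. The sketch below is not flaiR's implementation: `process_in_batches()` and `count_words()` are hypothetical names, and `count_words()` is a stand-in for a real tagging call. It uses a base R data frame rather than data.table so that it runs with no extra packages.

```r
# Hypothetical helper: split texts into chunks of at most batch_size,
# process one chunk at a time, and bind the per-chunk tables together.
process_in_batches <- function(texts, batch_size, process_batch) {
  stopifnot(batch_size >= 1)
  batch_id <- ceiling(seq_along(texts) / batch_size)
  batches <- split(texts, batch_id)
  results <- lapply(batches, process_batch)
  out <- do.call(rbind, results)
  rownames(out) <- NULL
  out
}

# Stand-in for a tagger call: one row per text with a word count
count_words <- function(batch) {
  data.frame(
    text = batch,
    n_words = lengths(strsplit(batch, "\\s+")),
    stringsAsFactors = FALSE
  )
}

texts <- c(
  "Flair is developed in Berlin.",
  "R users can call it through flaiR.",
  "Batches keep memory use bounded."
)
out <- process_in_batches(texts, batch_size = 2, process_batch = count_words)
print(out)
```

Only one batch is handed to the processing function at a time, so peak memory depends on `batch_size` rather than on the size of the whole corpus.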
Training and Fine-tuning
In flaiR, the major Flair NLP modules are wrapped with simple S3 methods. Once loaded from Flair NLP, each module behaves much like an R6 object in the R environment: functions and class methods that Python reaches with dot notation are accessed in R with the $ operator. For example, from flair.trainers import ModelTrainer in Python is equivalent to ModelTrainer <- flair_trainers()$ModelTrainer in R with flaiR.
Wrapped Flair NLP Modules with S3 | Corresponding Code Practices When Loading Modules from FlairNLP |
---|---|
flair_datasets() | from flair.datasets import * |
flair_nn() | from flair.nn import * |
flair_splitter() | from flair.splitter import * |
flair_trainers() | from flair.trainers import * |
flair_models() | from flair.models import * |
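The `$` access pattern is the same one reticulate provides for any Python module. As a rough sketch of what such a lookup involves, assuming (not confirmed by this document) that the flaiR wrappers behave like a plain `reticulate::import()` of the corresponding module, the hypothetical helper `load_model_trainer()` below fetches `ModelTrainer` directly and returns `NULL` when reticulate or Flair is unavailable.

```r
# Hypothetical helper (not part of flaiR): fetch the Python class
# flair.trainers.ModelTrainer through reticulate, or NULL if unavailable.
load_model_trainer <- function() {
  if (!requireNamespace("reticulate", quietly = TRUE) ||
      !reticulate::py_module_available("flair")) {
    return(NULL)
  }
  # Python: from flair.trainers import ModelTrainer
  trainers <- reticulate::import("flair.trainers")
  trainers$ModelTrainer
}

ModelTrainer <- load_model_trainer()
if (is.null(ModelTrainer)) {
  message("Flair NLP is not available in this R session.")
}
```

With flaiR loaded, `flair_trainers()$ModelTrainer` is the documented way to reach the same class.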