Process Token Embeddings from Flair Sentence Object
Source: R/process_embeddings.R
Extracts the embedding vector of every token in a Flair sentence object and returns them as a single matrix, with the token texts as row names. The sentence must already have been embedded (for example with the embed() method of an embeddings object) before it is passed to this function.
Value
A matrix where:
Each row represents a token's embedding
Row names are the corresponding token texts
Columns represent the dimensions of the embedding vectors
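To make the layout of the return value concrete, the following sketch builds a matrix of the same shape from hand-written vectors in base R. The token texts and numbers are made up for illustration, no Flair model is involved, and real embeddings have far more than three dimensions.

```r
# Hypothetical token texts and 3-dimensional embeddings (illustrative values only)
token_texts <- c("example", "text")
token_vectors <- list(c(0.1, 0.2, 0.3), c(0.4, 0.5, 0.6))

# Stack one vector per row and label the rows with the token texts
embedding_matrix <- do.call(rbind, token_vectors)
rownames(embedding_matrix) <- token_texts

dim(embedding_matrix)        # rows = tokens, columns = embedding dimensions
embedding_matrix["text", ]   # look up a token's vector by its text
```

Because the row names are token texts, a sentence that repeats a word yields repeated row names, so name-based lookup returns only the first match in that case.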
Details
The function stops with an error in any of the following cases:
If sentence is NULL or has no tokens
If any token is missing an embedding
If any token is missing text
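The checks listed above can be sketched in base R as follows. This is a stand-in for illustration only, not the package's actual implementation: check_tokens() is a hypothetical helper, the error messages are invented, and tokens are modelled as plain R lists rather than Flair token objects.

```r
# Hypothetical stand-in for the checks described above; not flaiR's actual code.
# `tokens` is a plain list of list(text = ..., embedding = ...) records.
check_tokens <- function(tokens) {
  if (is.null(tokens) || length(tokens) == 0) {
    stop("sentence is NULL or has no tokens")
  }
  for (token in tokens) {
    if (is.null(token$embedding) || length(token$embedding) == 0) {
      stop("a token is missing an embedding")
    }
    if (is.null(token$text) || !nzchar(token$text)) {
      stop("a token is missing text")
    }
  }
  invisible(TRUE)
}

# tryCatch() captures the error message instead of aborting the script
bad_tokens <- list(list(text = "example", embedding = NULL))
result <- tryCatch(check_tokens(bad_tokens),
                   error = function(e) conditionMessage(e))
```

Wrapping a call to process_embeddings() in tryCatch() in the same way lets a batch job skip a problematic sentence and continue with the rest.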
Examples
if (FALSE) { # \dontrun{
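# Load flaiR and obtain the Sentence class
# (assumes the flair_data() accessor, analogous to flair_embeddings() below)
library(flaiR)
Sentence <- flair_data()$Sentence
# Create a Flair sentence
sentence <- Sentence("example text")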
WordEmbeddings <- flair_embeddings()$WordEmbeddings
# Initialize FastText embeddings trained on Common Crawl
fasttext_embeddings <- WordEmbeddings('en-crawl')
# Apply embeddings
fasttext_embeddings$embed(sentence)
# Process embeddings with timing and messages
embedding_matrix <- process_embeddings(sentence, verbose = TRUE)
} # }