Tidy emoji tokens, one row per occurrence with metadata
Source: R/emoji-extraction.R, emoji_tokens.Rd
emoji_tokens() expands data to one row per emoji occurrence (in reading
order), keeping the original columns and adding the glyph together with its
name, category and sentiment score. This mirrors the one-token-per-row shape
familiar from tidy text mining and is convenient for counting, joining and
plotting.
Value
A tibble with the original columns plus .emoji, .emoji_name,
.emoji_category and .emoji_sentiment. Rows without emoji are dropped.
See also
emoji_frequency() for corpus-level counts and emoji_sentiment()
for per-row sentiment.
Examples
df <- data.frame(id = 1:2, text = c("great \U0001f600", "bad \U0001f621"))
emoji_tokens(df, text)
#> # A tibble: 2 × 6
#>      id text     .emoji .emoji_name   .emoji_category   .emoji_sentiment
#>   <int> <chr>    <chr>  <chr>         <chr>                        <dbl>
#> 1     1 great 😀 😀     grinning face Smileys & Emotion            0.572
#> 2     2 bad 😡   😡     enraged face  Smileys & Emotion           -0.173
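
Because the result has one row per emoji occurrence, counting and per-row summaries reduce to ordinary column operations. The following is a minimal base R sketch of that idea, not part of the package's documented examples. To stay self-contained it uses a hand-built data frame (the name `tokens` is hypothetical) that copies the two rows shown above; in practice this object would come from emoji_tokens(df, text).

```r
# Stand-in for the output of emoji_tokens(df, text), copied by hand from the
# example output above (assumption: real output has these same columns).
tokens <- data.frame(
  id = 1:2,
  text = c("great \U0001f600", "bad \U0001f621"),
  .emoji = c("\U0001f600", "\U0001f621"),
  .emoji_name = c("grinning face", "enraged face"),
  .emoji_category = c("Smileys & Emotion", "Smileys & Emotion"),
  .emoji_sentiment = c(0.572, -0.173),
  stringsAsFactors = FALSE
)

# Count emoji occurrences per category: one row per token means a plain
# tabulation of the category column is enough.
category_counts <- as.data.frame(
  table(category = tokens$.emoji_category),
  responseName = "n"
)

# Mean sentiment of the emoji found in each original row, keyed by id.
mean_by_id <- tapply(tokens$.emoji_sentiment, tokens$id, mean)

print(category_counts)
print(mean_by_id)
```

The same pattern extends to joining (merge the token table with another table keyed on .emoji or id) and to plotting (pass category_counts to a bar chart function).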