I have started working on a sentiment analysis, but I am having trouble transforming the lexicon into the required format.

My data looks something like this:

word alternativeform1 alternativeform2 value
abmachen abgemacht abmachst 0.4
Aktualisierung Aktualisierungen NA 0.2

I need it to look like this:

word value
abmachen 0.4
abgemacht 0.4
abmachst 0.4
Aktualisierung 0.2
Aktualisierungen 0.2

Can you help me find an easy way to do this? Thank you very much :)

Honza88
  • From the tidyverse, try pivot_longer, keeping the "value" column as the ID: df_long <- pivot_longer(data = df, cols = 1:3, names_to = "form", values_to = "word") – BPeif Nov 03 '21 at 15:16
  • Does this answer your question? [Reshaping data.frame from wide to long format](https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format) – Martin Gal Nov 03 '21 at 15:18

1 Answer

You could use

library(dplyr)
library(tidyr)

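df %>%
  # stack every column except value into name/word pairs
  pivot_longer(-value, values_to = "word") %>%
  # drop the rows that came from missing alternative forms
  drop_na(word) %>%
  # keep only the two columns the lexicon needs
  select(word, value)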

This returns

# A tibble: 5 x 2
  word             value
  <chr>            <dbl>
1 abmachen           0.4
2 abgemacht          0.4
3 abmachst           0.4
4 Aktualisierung     0.2
5 Aktualisierungen   0.2
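
For anyone who wants to try this without the original file, here is a minimal reproducible sketch. It assumes the lexicon is already a data frame called df and that the missing alternative form is a real NA rather than the text "NA"; the names_to label "form" and the result name lexicon are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question (assumption: the gap is a real NA).
df <- tibble(
  word             = c("abmachen", "Aktualisierung"),
  alternativeform1 = c("abgemacht", "Aktualisierungen"),
  alternativeform2 = c("abmachst", NA_character_),
  value            = c(0.4, 0.2)
)

lexicon <- df %>%
  # stack the three word columns; "form" only records the source column
  pivot_longer(-value, names_to = "form", values_to = "word") %>%
  # remove rows created by missing alternative forms
  drop_na(word) %>%
  # keep the two columns the lexicon needs
  select(word, value)

print(lexicon)
```

If the file was imported in a way that left the literal string "NA" in the column, drop_na() would not remove it; in that case converting it first, for example with dplyr's na_if(word, "NA") after the pivot, should be enough.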
Martin Gal