
I have a dataset, Unifreq. Indexing it with Unifreq[2:6] gives:

> Unifreq[2:6]
   and    you    for   that   with 
343668 171744 165788 153540 103160

I looked at the solution here:

https://stackoverflow.com/questions/23167827/using-reshape-from-wide-to-long-in-r

and then tried this:

data.frame(frequency = Unifreq[1:20])

I was not sure how to proceed, but I made some progress and now get this:

> data.frame(frequency = Unifreq[1:20])
     frequency
the     646772
and     343668
you     171744
for     165788
that    153540
with    103160
this     89900
was      88608
have     83172
are      77528
but      72908
not      64128
your     54936
all      54684
from     52880
just     52052
out      47504
they     47044
like     46660
will     46572

The recommendation to use `stack` works well, and the result now looks like this:

> df1 <- stack(Unifreq[1:20], index=F)
> names(df1) <- c("Frequency", "Word")
> head(df1, 10)
   Frequency Word
1     646772  the
2     343668  and
3     171744  you
4     165788  for
5     153540 that
6     103160 with
7      89900 this
8      88608  was
9      83172 have
10     77528  are

However, I would like to exclude the row index, so that the output looks like this:

Word   Frequency
and       343668
you       171744
...

I tried the link that you provided, but it does not seem to help me. I am fairly new to this and did not understand how to reshape the data into two separate columns and display it as a table.

How would I reshape this data in R?

Johnny
  • Hi Johnny, can you clarify how what you're looking for is different than what @akrun suggests without a trivial change in column names? Stated differently, the solution results in a 2 column `data.frame` with integer row names and column names of `ind` and `values`. What are you looking for if not that? – Ian Campbell Jun 25 '20 at 22:20
  • You can use `print(df1, row.names = FALSE)` to remove the row names from printing. – Ronak Shah Jun 26 '20 at 00:24

1 Answer

This can be achieved with `stack` from base R:

# stack() returns the columns `values` and `ind`; [2:1] puts the words first
out <- stack(Unifreq)[2:1]
names(out) <- c("Word", "Frequency")
out
#   Word Frequency
# 1  and    343668
# 2  you    171744
# 3  for    165788
# 4 that    153540
# 5 with    103160
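
The row numbers on the left are the data frame's row names, which R always shows by default. To hide them when displaying the table, the `print(..., row.names = FALSE)` approach mentioned in the comments can be combined with `stack`. A minimal sketch, assuming `Unifreq` is a one-row data frame holding the same five values as the sample data in this answer (it is recreated here only so the snippet runs on its own):

```r
# Recreate the sample data; check.names = FALSE keeps the column name "for"
# (otherwise data.frame() would alter it, since `for` is a reserved word)
Unifreq <- data.frame(and = 343668L, you = 171744L, `for` = 165788L,
                      that = 153540L, with = 103160L, check.names = FALSE)

# Reshape from wide to long and put the word column first
out <- stack(Unifreq)[2:1]
names(out) <- c("Word", "Frequency")

# The row names still exist in `out`; this only hides them in the printout
print(out, row.names = FALSE)
```

Note that `stack` returns the word column as a factor; `as.character(out$Word)` converts it if plain strings are needed.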

data

Unifreq <- structure(list(and = 343668L, you = 171744L, `for` = 165788L, 
    that = 153540L, with = 103160L), class = "data.frame", row.names = c(NA, 
-1L))
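
The printout in the question looks like a named vector, not a data frame. If that is the case, the two columns can also be built directly from the names and the values. This is only a sketch: `freq` is a hypothetical named integer vector standing in for `Unifreq[2:6]`, and `df` is an illustrative name.

```r
# Hypothetical named vector with the same values as Unifreq[2:6]
freq <- c(and = 343668L, you = 171744L, `for` = 165788L,
          that = 153540L, with = 103160L)

# The names become the Word column, the counts the Frequency column
df <- data.frame(Word = names(freq),
                 Frequency = unname(freq),
                 stringsAsFactors = FALSE)

# Print without the row names
print(df, row.names = FALSE)
```

Here `Word` is a character column, not a factor, which may be more convenient for later text processing.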
akrun