user_a - 3
user_b - 4
user_c - 1
user_d - 4
I want to show the distribution of the number of tweets per author in R using a histogram. The original file has 1048575 such rows.
I tried `hist(df$twitter_count, nrow(df))`
but I don't think it's correct.

Mehru
- Please include your data as editable text instead of a link to an image. – Imran Ali Oct 22 '17 at 04:37
- Hi Mehru, welcome to SO. It would help me help you if I knew a little more about your data; see https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. Your `nrow(df)` is specifying the breaks in the histogram. If you are looking at doing some conditional histograms (e.g. number of tweets per day/week/month/year per author) you might consider using lattice or ggplot2. – James Thomas Durant Oct 22 '17 at 04:40
- If you want the histogram of twitter counts, just use `hist(df$twitter_count)` – kangaroo_cliff Oct 22 '17 at 04:44
- See [here](https://stackoverflow.com/questions/46860454/constructing-histogram-from-2-variables-in-1-column-in-r/46860693#46860693) – vaettchen Oct 22 '17 at 05:19
- Possible duplicate of [Constructing histogram from 2 variables in 1 column in R](https://stackoverflow.com/questions/46860454/constructing-histogram-from-2-variables-in-1-column-in-r) – vaettchen Oct 22 '17 at 05:20
3 Answers
3
It seems I have misunderstood the question. I think the following could be what the OP is looking for.
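library(ggplot2)  # needed for ggplot() and geom_col()
df <- data.frame(user = letters,
                 twitter_count = sample.int(200, 26))
# One bar per user, bar height = that user's tweet count
ggplot(df, aes(user, twitter_count)) +
  geom_col()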
Assuming you are looking for multiple histograms, replace `user` with the respective variable name in your data.frame.
# Example data
df <- data.frame(user = iris$Species,
                 twitter_count = round(iris[, 1] * 10))
# Histograms using ggplot2 package
library(ggplot2)
ggplot(df, aes(x = twitter_count)) +
  geom_histogram() +
  facet_grid(. ~ user)
If your data contain many Twitter users, one panel per user quickly becomes unreadable, so it is best to use an alternative method to see the distribution of twitter counts.
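One possible alternative, sketched here under the assumption that each row is one author and `twitter_count` is a whole number: tabulate how many authors share each tweet count and draw one bar per distinct count. The simulated data frame and the name `count_freq` are illustrative only and not taken from the question.

```r
# Simulated stand-in for the real data: one row per author (assumption).
set.seed(42)
df <- data.frame(
  user = paste0("user_", seq_len(10000)),
  twitter_count = rpois(10000, lambda = 3) + 1
)

# Count how many authors have each distinct tweet count.
count_freq <- table(df$twitter_count)

# One bar per distinct tweet count, regardless of how many authors there are.
barplot(count_freq,
        xlab = "Tweets per author",
        ylab = "Number of authors",
        main = "Distribution of tweets per author")
```

The plot size here depends on the number of distinct counts rather than the number of users, which may suit a file with around a million rows better than per-user panels.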

kangaroo_cliff
1
If each row of the data.frame represents a user:
set.seed(1)
df <- data.frame(user = letters, twitter_count = rpois(26, lambda = 4) + 1)
hist(df$twitter_count)
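For context on the call in the question: the second positional argument of `hist()` is `breaks`, and a single number there is treated as a suggested number of cells, so `nrow(df)` asks for roughly one bin per row. A minimal sketch of setting the bins explicitly instead, assuming the counts are whole numbers (the names `edges` and `h` are just for illustration):

```r
set.seed(1)
df <- data.frame(user = letters, twitter_count = rpois(26, lambda = 4) + 1)

# Bin edges at half-integers so each integer count gets its own bar.
edges <- seq(min(df$twitter_count) - 0.5, max(df$twitter_count) + 0.5, by = 1)

# hist() returns the binned data invisibly; keep it to inspect the counts.
h <- hist(df$twitter_count, breaks = edges,
          xlab = "Tweets per author", main = "Tweets per author")
```

`h$counts` then holds the number of users falling in each bin, which can be checked against the plot.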

James Thomas Durant
0
Since you said distribution for 'each user', I think it should be a bar plot:
require(data.table)
dat <- fread("
user_a - 3
user_b - 4
user_c - 1
user_d - 4"
)
# fread splits on the spaces here, so V1 holds the user and V3 the count
barplot(as.numeric(dat$V3), names.arg = dat$V1)
or if you are looking for histograms, then:
hist(as.numeric(dat$V3), xlab = "", main="Histogram")

LeMarque