How to get distribution of a column in R

Question

I'm working on a Stack Overflow data dump .csv file and I need to find the distribution of scores for questions.

I opened the file in R and extracted the two columns that I need which are the PostTypeID and Score.

Example:

I need to find:

3 rows in the Score column that have the score 11.

2 rows in the Score column that have the score 3, and so on.

The problem is that the data is large: it has 3 million rows, and I don't know how to get the distribution.

Note: I'm a beginner in R, so I need the simplest way to do that.

You mention *"filter"* and *"get the distribution"*, the two are not the same. Please read about how to ask good questions (refs https://stackoverflow.com/help/mcve and https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), and then edit your question. Some pointers: consumable data (e.g., `dput`) and desired output. — r2evans, Mar 02 '18 at 04:15

score 2 · Accepted Answer · answered Mar 02 '18 at 04:29


You are looking for the table function.

If d is your data structure, then you want

table(d$Score)
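As a minimal sketch of how this might look end to end: the data frame below is a made-up stand-in for the real dump, and treating PostTypeID 1 as "question" is an assumption that should be checked against your file. The name questions is only illustrative.

```r
# Toy stand-in for the real dump (values invented for illustration).
d <- data.frame(
  PostTypeID = c(1, 1, 1, 2, 1, 1, 2),
  Score      = c(11, 3, 11, 5, 3, 11, 3)
)

# Keep only the questions (assumption: PostTypeID 1 marks a question).
questions <- d[d$PostTypeID == 1, ]

# Count how many rows carry each distinct score.
counts <- table(questions$Score)
print(counts)

# The same counts as a data frame, most frequent score first.
dist <- as.data.frame(counts, stringsAsFactors = FALSE)
names(dist) <- c("Score", "Count")
dist <- dist[order(-dist$Count), ]
print(dist)
```

table() makes a single pass over the column, so it should stay practical at a few million rows.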


Daniel V

How can I plot that? – user8863554 Mar 02 '18 at 04:45

Do you mean `hist(x$Score)`? – r2evans Mar 02 '18 at 04:51
I want to plot the result I got from `table(d$Score)`. – user8863554 Mar 02 '18 at 04:52
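
One possible way to plot the counts from `table(d$Score)` is base R's `barplot()`, sketched below. The data frame is a made-up stand-in, and the axis labels and title are only illustrative.

```r
# Toy stand-in for the real data (values invented for illustration).
d <- data.frame(Score = c(11, 3, 11, 5, 3, 11, 3))

# Frequency of each distinct score.
counts <- table(d$Score)

# One bar per distinct score; bar height is the number of rows with that score.
barplot(counts,
        xlab = "Score",
        ylab = "Number of posts",
        main = "Distribution of scores")
```

`plot(counts)` also works on a table and draws one vertical line per score. If the column has very many distinct scores, `hist(d$Score)` may be easier to read because it groups scores into bins rather than drawing one bar per value.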

score 1 · Answer 2 · answered Mar 02 '18 at 04:09


x=data[, score==3] to get rows with score 3


Ronak Bokaria

There are millions of scores, 3 was an example – user8863554 Mar 02 '18 at 04:13

Do you mean `data[data$score==3,]`? You are filtering on columns (not rows), and `score` is likely not defined by itself. – r2evans Mar 02 '18 at 04:13
All the rows will be returned, but if you want to filter further then use `data[, score==3|posttypeid=2 ]`, or use the `head()` and `tail()` functions to specify the number of rows you want in the result. – Ronak Bokaria Mar 02 '18 at 04:17
I guess fetching rows is more beneficial, so I wrote that for rows. – Ronak Bokaria Mar 02 '18 at 04:19
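
To make the row-versus-column point from the comments concrete, here is a small sketch of row filtering in base R. The data frame and the names rows_with_3, questions_with_3 and same are made up for illustration; the lowercase column names follow this answer and may differ in the real file.

```r
# Toy data frame (values invented for illustration).
data <- data.frame(
  posttypeid = c(1, 1, 2, 1),
  score      = c(3, 11, 3, 3)
)

# Row filter: the condition goes BEFORE the comma and names the column via data$.
rows_with_3 <- data[data$score == 3, ]

# Combine conditions with & (and) or | (or); comparison needs ==, not =.
questions_with_3 <- data[data$score == 3 & data$posttypeid == 1, ]

# subset() is an alternative that lets you refer to columns by bare name.
same <- subset(data, score == 3 & posttypeid == 1)

print(rows_with_3)
print(questions_with_3)
```

Anything placed after the comma selects columns, which is why `data[, score==3]` does not filter rows.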
