I am new to R and need to generate some graphs. I imported an Excel file and need to create a histogram of one column. My import code is:

file=read.xlsx('femalecommentcount.xlsx',1,header=FALSE)
col=file[2]

col looks like this (partial output):

36961     1
36962     1
36963     7
36964     1
36965     2
36966     1
36967     1
36968     4
36969     1
36970     6
36971     3
36972     1
36973     6
36974     6
36975     2
36976     2
36977     8
36978     2
36979     1
36980     1
36981     1

The first column is the row number, and I'm not sure how to remove it. The second column is the data I want a histogram of. The hist() function requires a vector, and I'm not sure how to convert my column to one.

If I simply call:

hist(col)

it gives-

Error in hist.default(col) : 'x' must be numeric

I have tried a few commands found on the internet, but they didn't work.

My eventual goal is just to generate a good histogram (and maybe other charts) of that column, to get a good understanding of the spread of my data.

Nasif Imtiaz Ohi
  • Try `hist(col$second_col)` or whatever that second column is called. – Ronak Shah Jan 08 '18 at 05:07
  • It should be `col=file[[2]]` or `col=file[, 2]` – MrFlick Jan 08 '18 at 05:10
  • Actually, my column doesn't have a header. hist(col$1) / hist(file$2) don't work. @RonakShah – Nasif Imtiaz Ohi Jan 08 '18 at 05:11
  • @MrFlick col=file[[2]] shows a different representation, the same as when I tried to convert it into a vector. typeof(col) now says integer, though. However, hist() gives the same error. – Nasif Imtiaz Ohi Jan 08 '18 at 05:15
  • Well, thanks @MrFlick, after that col=as.numeric(col) did the trick. But I think the histogram is incomplete. I'll look into that, though; it's a separate issue. Thanks – Nasif Imtiaz Ohi Jan 08 '18 at 05:18
  • `typeof()` isn't really that useful; use `class()` instead. I'm guessing your data is a "factor", which means there are probably non-numeric values in there or your import was incorrect. You don't want to use `as.numeric` on factors. It's better to properly import your data first and only use `as.numeric(as.character())` if absolutely necessary. – MrFlick Jan 08 '18 at 05:25

1 Answer

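  1. Extract the column as a vector with col=file[[2]] or col=file[, 2] (the solution given in the comments). file[2] returns a one-column data frame, which hist() rejects.
  2. Import the data correctly in the first place, so that the column is numeric and no conversion is needed.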
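
To illustrate both points, here is a minimal sketch. The original spreadsheet isn't available, so it uses a small made-up data frame (`file`, with hypothetical columns `V1` and `V2`) whose second column is deliberately a factor. That the real import produced a factor is an assumption based on the comments above, not something confirmed.

```r
# Hypothetical stand-in for the imported sheet (the real file is not available).
# The second column is deliberately a factor, to mimic a faulty import.
file <- data.frame(
  V1 = 1:6,
  V2 = factor(c("1", "7", "1", "2", "10", "4"))
)

col_df <- file[2]    # single brackets: still a data frame with one column
col    <- file[[2]]  # double brackets: the column itself

class(col_df)  # "data.frame"
class(col)     # "factor"
typeof(col)    # "integer", which is why typeof() is misleading here

# as.numeric() on a factor returns the internal level codes, not the values
codes <- as.numeric(col)

# Going through character first recovers the actual values
values <- as.numeric(as.character(col))

# hist() accepts the numeric vector
hist(values, main = "Comment counts", xlab = "Comments per row")
```

If the histogram looks incomplete after a plain `as.numeric(col)`, comparing `codes` and `values` as above is one way to check whether level codes were plotted instead of the real counts.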
Nasif Imtiaz Ohi