
I'm very new to programming, so I'm sorry if this is a basic question that has been answered many times. I'm trying to plot a histogram that has months on the X-axis and the number of sunspots on the Y-axis. The data comes from the sunspot.month dataset in R's datasets package.

x <- datasets::sunspot.month
h <- hist(x, breaks=12, col="red", xlab="Month", main="Histogram with Normal Curve")

This is the code that I currently have, and I'm getting some weird (probably wrong) results. Any advice on what I should try?

P.S. You can ignore the normal curve; I'll try to do that on my own once I get this right.

Results of the plot

How the data looks like

rawr
hmm
  • Does it really make sense to have a histogram where the x axis is not a continuous variable? Would a bar plot not be better suited for this data? – user438383 Oct 08 '20 at 18:46
  • What does sunspot.month look like? – NotThatKindODr Oct 08 '20 at 18:46
  • @user438383 I'm supposed to make a histogram, tbh I'm not sure why either – hmm Oct 08 '20 at 18:49
  • @NotThatKindODr I posted picture of the data – hmm Oct 08 '20 at 18:49
  • @NotThatKindODr, it's in base R, `data("sunspot.month", package="datasets")`. – r2evans Oct 08 '20 at 18:50
  • hmm, in general, please do not post pictures of data. In this case, you can be "clear" that you're using `datasets::sunspot.month`, since many may not be familiar with its presence in the base R package `datasets`. – r2evans Oct 08 '20 at 18:50
  • @r2evans true my description was vague, will try to be more precise next time – hmm Oct 08 '20 at 18:52
  • Some datasets are "well known", including `mtcars` and `iris`, and perhaps `diamonds`. Others are less-well-known, so it just helps to be clear from where you got the data. If it isn't in base R (`datasets` or similar) and is not in a package relevant to the question, then it's much better to provide a usable, unambiguous sample of the data, such as `dput(head(x))` (some data does not do well just copying from the console). Images are bad for several reasons, including that I cannot copy an image and use data in my R console. – r2evans Oct 08 '20 at 18:55
  • @r2evans Fair enough, all valid points especially about copy pasting, again I'm new to all this so I appreciate the tips – hmm Oct 08 '20 at 19:05
  • no issues, some of these are perhaps unique (or more-enforced) on SO where copying code over to a console makes testing/playing a *lot* easier. There is a "meta" discussion on it as well (https://meta.stackoverflow.com/a/285557) with many salient points, including: screen-readers, google-searching, and mobile devices. (My preferred way to refer to this format is as the [`.NORM`](https://xkcd.com/2116/) format, thanks to XKCD.) – r2evans Oct 08 '20 at 19:16
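
The `dput(head(x))` suggestion from the comments could look roughly like this for the dataset in the question. This is only a minimal sketch; `x` is an illustrative name, and the exact text printed depends on the R version.

```r
# Load the dataset used in the question (ships with base R's datasets package)
x <- datasets::sunspot.month

# Print an R expression that recreates the first few values;
# the printed text can be pasted into a question instead of a screenshot
dput(head(x))
```

Anyone reading the question can paste that printed expression straight into a console, which is not possible with an image.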

1 Answer


Because it's a time-series, we can extract the "time" component with the (wait for it) time function :-)

time(sunspot.month)
#       Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
# 1749 1749 1749 1749 1749 1749 1749 1750 1750 1750 1750 1750 1750
# 1750 1750 1750 1750 1750 1750 1750 1750 1751 1751 1751 1751 1751
# 1751 1751 1751 1751 1751 1751 1751 1752 1752 1752 1752 1752 1752
# 1752 1752 1752 1752 1752 1752 1752 1752 1753 1753 1753 1753 1753
# ...

If you look at the data a little more closely, the "time" values are decimal years:

options(digits=9)
head(time(sunspot.month))
# [1] 1749.00000 1749.08333 1749.16667 1749.25000 1749.33333 1749.41667

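We can extract the "month only" part from this by using a modulus of 1 (%% 1), which leaves the fraction of the year in [0, 1). Multiplying that by 12 gives a 0-based month index, and adding 1 makes January 1 and December 12. From there, we can table it to get the number of observations in each month and then, as @user438383 suggested, present a bar plot.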
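
The steps just described can be sketched one at a time. The variable names (`frac`, `month_num`, `counts`) are only illustrative. Rounding before the integer conversion is my own assumption here, meant to guard against small floating-point error in the decimal years, so this is a variation on the approach rather than the exact code.

```r
# Decimal years for every observation, e.g. 1749.000, 1749.083, ...
tt <- time(sunspot.month)

# Fraction of the year that has elapsed, in [0, 1)
frac <- as.numeric(tt) %% 1

# 0-based month index, rounded to absorb floating-point error;
# the extra %% 12 wraps a value just below a whole year back to January
month_num <- as.integer(round(12 * frac) %% 12) + 1L

# Number of observations that fall in each calendar month
counts <- table(month_num)
counts
```

Inspecting `head(frac)` and `head(month_num)` side by side is a quick way to see what each step contributes.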

Also, for labeling the months, we can use the base::month.abb constant.

barplot(table(as.integer(1+12*(time(sunspot.month) %% 1))),
        names.arg=month.abb)

sunspots bar plot, by month
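
Note that the bars above show how many observations fall in each month, not how many sunspots were recorded. If the goal is sunspot numbers per calendar month, one possible variation uses `cycle()`, which returns the position of each observation within the year, together with `tapply()`. Taking the mean is an assumption about what is wanted (a sum would work the same way), and `month_idx` and `mean_spots` are illustrative names.

```r
# Position of each observation within its year: 1 = Jan, ..., 12 = Dec
month_idx <- as.integer(cycle(sunspot.month))

# Average sunspot number for each calendar month across all years
mean_spots <- tapply(as.numeric(sunspot.month), month_idx, mean)

# One bar per calendar month, labelled with the built-in abbreviations
barplot(mean_spots,
        names.arg = month.abb,
        xlab = "Month",
        ylab = "Mean monthly sunspot number")
```

Because `cycle()` works from the series' frequency and start rather than from the decimal years, it also sidesteps the floating-point arithmetic entirely.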

r2evans
  • This is fairly advanced for me, but I'll check everything out and make sure to understand it. I appreciate the effort a lot. – hmm Oct 08 '20 at 19:05
  • I understand. Step through it one-by-one, starting with `time(sunspot.month) %% 1` (seeing all fractions on `[0,1)`), then understand what `1+12*` is doing to that. After that, `table` just gives counts per occurrence ... and I use `as.integer` for many reasons (including https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal). After that, I think `barplot(..., names.arg=)` would just fall into place :-) – r2evans Oct 08 '20 at 19:07
  • Will do, I promise I'll understand all of this, just want to say that I didn't ask just to get it solved, I was just stuck here for hours, with no idea what to do next. Ty once again – hmm Oct 08 '20 at 19:14