I have a data.frame, irv, with a column of interest. class(irv) returns 'data.frame'.

is.recursive(irv) returns TRUE, is.atomic(irv) returns FALSE. In the console, irv$x returns the column of interest. max(irv$x) also returns the appropriate max value.

Inside a plot call I am attempting to set the xlim using the max value of this column so I have

plot(y~x,data = subsetirv,xlab = '', ylab = '', ylim = c(0,20), 
     xlim = c(0, max(irv$x)),
     cex = 0.5, pch = 19) 

Those are all the arguments I have, in case there is some weird argument interaction.

Yet every time it throws the following error:

Error in irv$x : $ operator is invalid for atomic vectors

Why would the plot() call think that irv is atomic when everything else claims that it is a data frame?

Normally I would try to provide reproducible data, but I can't reproduce the problem other than with my actual data and I'm not sure of how to share the real data in a reasonable way.

Is there some weird interaction I'm not thinking of?

Btw, the data being plotted is a subset of irv, if that matters.

-edit- Something I just tried was saving the data frame under a different object name. It was originally called irv and I saved it into a new object called testdf. This resolved the issue. Is irv something in the plot or max function environments? The name was clearly the problem, but I don't know why.

-edit2- After a suggestion, here is a pastebin of the output of dput(head(irv)): pastebin. And here is the output of str(irv):

'data.frame':   16198 obs. of  17 variables:
 $ reader     : chr  "MG" "MG" "MG" "MG" ...
 $ read       : int  1 1 1 1 1 1 1 1 1 1 ...
 $ age        : num  2 3 4 5 6 7 8 9 10 11 ...
 $ fishid     : Factor w/ 2118 levels "2010_TNS_0135",..: 7 7 7 7 7 7 7 7 7 7 ...
 $ otorad     : num  6.15 9.52 13.47 17.32 22.28 ...
 $ year       : chr  "2010" "2010" "2010" "2010" ...
 $ readid     : chr  "2010_TNS_0153_MG_1" "2010_TNS_0153_MG_1" "2010_TNS_0153_MG_1" "2010_TNS_0153_MG_1" ...
 $ incwidth   : num  3.94 3.37 3.94 3.85 4.96 ...
 $ profflag   : chr  "good" "good" "good" "good" ...
 $ median     : num  3.85 3.82 3.78 3.77 3.78 ...
 $ upper75prob: num  4.44 4.19 3.94 3.94 4.03 ...
 $ lower25prob: num  3.58 3.65 3.67 3.5 3.56 ...
 $ IQR        : num  0.859 0.543 0.269 0.437 0.465 ...
 $ diff_flag  : num  0.0954 -0.8162 0.6171 0.1933 2.5376 ...
 $ roll_flag  : chr  "good" "good" "good" "good" ...
 $ irv        : num  3.936 -0.563 0.571 -0.09 1.104 ...
 $ irvf       : chr  "good" "good" "good" "good" ...
C. Denney
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. I'm confused by the edit, is the problem when the variable is named `df` as in your code or was it with `irv`? From your description it just sounds like a typo at some point. – MrFlick Mar 20 '19 at 20:48
  • Like I said, I wasn't able to reproduce the problem except with my real data, and don't know of a good/easy way to share the real data. I was calling it df throughout, because I didn't think/know that the actual name mattered, and I was trying to explain the problem in as straightforward a way as possible and I thought real object names would just unnecessarily clutter the text. However, when I tried changing the name I realized that apparently the name matters, and the problem only ever manifested when the object was called `irv`. I will update to reflect that – C. Denney Mar 20 '19 at 20:55
  • You can use `dput(head(irv))` to share some of the data. Also important to share is the output of `str(irv)` – Brian Mar 20 '19 at 21:01
  • @Brian edited to add those as requested. – C. Denney Mar 20 '19 at 21:17

1 Answer

The problem isn't that you have a data.frame named irv, it's that you have a column named irv in a data.frame named irv.

When you use the formula syntax with plot(), all parameters are evaluated in the context of the data.frame you pass in the data= parameter. So your xlim = c(0, max(irv$x)) parameter is basically running

xlim = with(subsetirv, c(0, max(irv$x)))

so it's finding the column named "irv", and that column is an atomic vector, so the $ operator doesn't work on it.
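
The lookup described above can be reproduced with a small sketch. The toy data below is hypothetical (it is not the asker's data); the only thing it shares with the real case is a data frame named `irv` that also contains a column named `irv`. `pdf(NULL)` is used only so that no plot file is written.

```r
# Hypothetical toy data: a data frame named `irv` with a column also named `irv`.
irv <- data.frame(x = 1:10, y = (1:10) / 2, irv = seq(-1, 1, length.out = 10))
subsetirv <- irv[irv$x <= 5, ]

# Inside with(), `irv` resolves to the numeric column, not the data frame.
col_is_atomic <- with(subsetirv, is.atomic(irv))

# Evaluating the xlim expression in that context reproduces the error message.
msg <- tryCatch(
  with(subsetirv, c(0, max(irv$x))),
  error = function(e) conditionMessage(e)
)

# The formula interface of plot() fails the same way.
pdf(NULL)  # null graphics device, nothing is written to disk
plot_msg <- tryCatch(
  {
    plot(y ~ x, data = subsetirv, xlim = c(0, max(irv$x)))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
invisible(dev.off())

print(col_is_atomic)
print(msg)
print(plot_msg)
```

Because names are looked up in the data frame first and in the calling environment only afterwards, the column shadows the data frame of the same name.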

To avoid confusion, pre-calculate that value outside the function (and make sure you don't have a column that shares the name of the variable you use):

maxx <- max(irv$x)
plot(y~x,data = subsetirv,xlab = '', ylab = '', ylim = c(0,20), 
     xlim = c(0, maxx),
     cex = 0.5, pch = 19) 
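
Two other ways around the clash, sketched on hypothetical toy data rather than the real data set: call plot() with plain x and y vectors so that xlim is evaluated in the calling environment, or rename the clashing column. The new column name `irv_value` is an assumption made up for this example.

```r
# Hypothetical toy data with the same name clash as in the question.
irv <- data.frame(x = 1:10, y = (1:10) / 2, irv = seq(-1, 1, length.out = 10))
subsetirv <- irv[irv$x <= 5, ]

pdf(NULL)  # null graphics device, nothing is written to disk

# Option 1: non-formula interface. Arguments are evaluated where plot() is
# called, so `irv` refers to the data frame.
plot(subsetirv$x, subsetirv$y, xlab = "", ylab = "", ylim = c(0, 20),
     xlim = c(0, max(irv$x)), cex = 0.5, pch = 19)

# Option 2: rename the clashing column (hypothetical new name), after which
# the formula interface no longer finds a column called `irv`.
names(subsetirv)[names(subsetirv) == "irv"] <- "irv_value"
plot(y ~ x, data = subsetirv, xlab = "", ylab = "", ylim = c(0, 20),
     xlim = c(0, max(irv$x)), cex = 0.5, pch = 19)

invisible(dev.off())
both_ran <- TRUE
```

Pre-computing the value, as in the answer's code, remains the simplest fix; these variants only matter if the expression has to stay inside the plot() call.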
MrFlick