I am a very new R user, and I am trying to use R to create a box plot comparing prices at Target and Walmart. I also want to create a histogram of the prices at each store, as well as Q-Q plots. I keep getting various errors, including "Error in hist.default(mydata) : 'x' must be numeric" from hist(mydata) and "Error in x[floor(d)] + x[ceiling(d)] : non-numeric argument to binary operator" from boxplot(mydata). I have loaded my CSV file correctly, and I have attached my data for clarity. I have also pasted some of my code directly from the console. I have tried hist(mydata), boxplot(mydata), and qqplot(mydata), all of which failed with errors like the ones above. I'm sorry if any of this is dumb; I am extremely new to R, not to mention extremely bad at it. Thank you all for your help!

#[Workspace loaded from ~/.RData]
mydata <- read.csv(file.choose(), header = T) names(mydata)
#Error: unexpected symbol in "  mydata <- read.csv(file.choose(), header = T) names"
mydata <- read.csv(file.choose(), header = T)
names(mydata)
#[1] "Product" "Walmart" "Target" 
mydata
                                                   Product
1  Sara lee artesano bread
2  Store brand dozen large eggs
3  Store brand 2% milk 1 gallon (128 fl oz)
4   12.4 oz cheez its
5   Ritz cracker fresh stacks 8ct, 11.8 oz
6  Sabra classic hummus 10 oz
7   Oreo chocolate sandwich cookies 14.3 oz
8   Motts applesauce 6 ct/4oz cups
9   Bananas (each)
10  Hass Avocado (each)
11  Chips ahoy original family size, 18.2 oz
12  Lays potato chips party size, 13 oz
13  Amy’s frozen mexican casserole, 9.5 oz
14  Jack’s frozen pizza original thin crust, 13.8 oz
15 Store brand sweet cream unsalted butter, 4 count, 16 oz
16 Sour cream and onion pringles, 5.5 oz
17 Philadelphia original cream cheese spread, 8 oz
18 Daisy sour cream, regular, 16 oz: 
19 Kraft singles, 24 ct/16 oz: 
20 Doritos nacho cheese, party size, 14.5 oz
21 Tyson Fun Chicken nuggets, 1.81 lb (29 oz), frozen
22 Kraft mac n cheese original, 7.25 oz
23 appleapple gogo squeeze, 12ct, 3.2 oz each 
24 Yoplait original french vanilla yogurt, 6oz
25 Essentia bottled water, 1 liter
26 Premium oyster crackers, 9oz
27 Aunt Jemima buttermilk pancake miz, 32 oz
28 Eggo frozen homestyle waffles, 10ct/12.3 oz
29  Kellogg's Froot Loops, 10.1 oz
30 Tostitos scoops tortilla chips, 10 oz
   Walmart Target
1     2.98   2.99
2     1.93   1.99
3     2.92   2.99
4     3.14   3.19
5     3.28   3.29
6     3.68   3.69
7     3.48   3.39
8     2.26   2.29
9     0.17   0.25
10    1.18   1.19
11    3.98   4.49
12    4.48   4.79
13    4.58   4.59
14    3.42   3.59
15    3.18   2.99
16    1.78   1.79
17    3.24   3.39
18    1.94   2.29
19    4.18   4.39
20    4.48   4.79
21    6.42   6.69
22    1.00   0.99
23    5.98   6.49
24    0.56   0.69
25    1.88   1.99
26    3.12   2.99
27    2.64   2.79
28    2.63   2.69
29    2.98   2.99
30    3.48   3.99
hist(mydata)
#Error in hist.default(mydata) : 'x' must be numeric
x<-sample(LETTERS[1:5],20,replace=TRUE)
df<-data.frame(x)
df
   x
1  E
2  B
3  A
4  B
5  E
6  B
7  A
8  A
9  C
10 E
11 A
12 B
13 A
14 B
15 C
16 D
17 C
18 E
19 A
20 D
x<-sample(LETTERS[1:5],20,replace=TRUE)
df<-data.frame(x)
hist(df$x)
#Error in hist.default(df$x) : 'x' must be numeric
x<-sample(LETTERS[1:5],20,replace=TRUE)
df<-data.frame(x)
barplot(table(df$x))
boxplot(mydata)
#Error in x[floor(d)] + x[ceiling(d)] :
#   non-numeric argument to binary operator
qqplot("Walmart")
#Error in sort(y) : argument "y" is missing, with no default
qqplot(mydata)
#Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)) : 
#  undefined columns selected
#In addition: Warning message:
#In xtfrm.data.frame(x) : cannot xtfrm data frames

Ian Campbell
Olivia A
  • Welcome to SO! It would be easier to help you if you provide [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) including the code you have tried and a snippet of your data or some fake data. – stefan Apr 30 '22 at 06:13
  • I added some of my code. Let me know if that helps! Thank you so much for the tip! – Olivia A Apr 30 '22 at 06:25
  • Well, for the future I would suggest going through the link I referenced to learn how to format code and how to provide your data via `dput()`. But from what you posted: you are simply throwing your whole data frame into functions. If you want a histogram, try e.g. `hist(mydata$Walmart)` instead of `hist(mydata)`. – stefan Apr 30 '22 at 06:39

1 Answer

There seems to be a problem with the data you uploaded, but no matter. I will create data resembling yours and show you how to make these plots with some simple code. (Some may offer alternatives like ggplot2, but I think my example uses shorter code and is more intuitive.)
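As for the errors themselves, my best guess is that they come from passing the whole data frame, which includes the text column Product, to functions that expect a single numeric vector. Here is a minimal sketch of that. The three rows are copied from your printout, but the object is my own hand-built stand-in, not your actual file:

```r
# Hypothetical stand-in for the asker's data: one text column, two numeric columns.
mydata <- data.frame(
  Product = c("Bread", "Eggs", "Milk"),
  Walmart = c(2.98, 1.93, 2.92),
  Target  = c(2.99, 1.99, 2.99)
)

# Check which columns are numeric; Product is not.
sapply(mydata, is.numeric)

# hist() needs a numeric vector, so passing the whole data frame fails.
# tryCatch() captures the error message instead of stopping the script.
result <- tryCatch(hist(mydata), error = function(e) conditionMessage(e))
result

# Selecting a single numeric column works.
h <- hist(mydata$Walmart, main = "Walmart prices", xlab = "Price")
```

Running str(mydata) on your real data is a quick way to confirm that Walmart and Target were actually read in as numbers.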

First, we can load ggpubr for plotting functions:

# Load ggpubr for plotting functions:
library(ggpubr)

Then we can create the data: first vectors of prices and store names, then a data frame that combines them:

# Create price values and store values:
prices.1 <- c(1,2,3,4,5,3)
prices.2 <- c(8,6,4,2,0,1)
store <- c("walmart", "walmart", "walmart",
           "target", "target", "target")

# Create a data frame from these values:
store.data <- data.frame(prices.1,
                         prices.2,
                         store)
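Your own file is laid out differently, with one price column per store, so it would need reshaping before a store column like the one above exists. One possible way, sketched here with base R's stack() on a hand-built stand-in for your data (the first three rows of your printout; the names long.data, price, and store are my own choices):

```r
# Hypothetical wide data shaped like the question's file (first three rows).
mydata <- data.frame(
  Product = c("Sara lee artesano bread",
              "Store brand dozen large eggs",
              "Store brand 2% milk 1 gallon (128 fl oz)"),
  Walmart = c(2.98, 1.93, 2.92),
  Target  = c(2.99, 1.99, 2.99)
)

# stack() puts both price columns into one "values" column and adds an
# "ind" column recording which original column each value came from.
long.data <- stack(mydata[, c("Walmart", "Target")])

# Rename the columns to something more descriptive.
names(long.data) <- c("price", "store")
long.data
```

With the data in this shape, a call such as ggboxplot(data = long.data, x = "store", y = "price") should follow the same pattern as the examples in this answer.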

Now we can plug our data into each of these plots in nearly the same way every time. The first part of the code is the name of the plot function, the data argument is our stored data frame, and the x and y arguments are the names of the columns we want to plot:

# Scatterplot:
ggscatter(data = store.data,
          x="prices.1",
          y="prices.2")

# Boxplot:
ggboxplot(data = store.data,
          x="store",
          y="prices.1")

# Histogram:
gghistogram(data = store.data,
            x="prices.1")

# QQ Plot:
ggqqplot(data = store.data,
         x="prices.1")

There are simpler alternatives, such as base R plotting functions like this one, but I find them much harder to customize than ggpubr and ggplot2:

plot(store.data$prices.1, store.data$prices.2)
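
If you would rather stay in base R with the data from the question, something like the following sketch might work. It assumes mydata has numeric Walmart and Target columns, as in your printout. I have rebuilt only the first six rows by hand here, so treat it as an illustration rather than your actual file:

```r
# Hypothetical stand-in: first six price rows from the question's printout.
mydata <- data.frame(
  Walmart = c(2.98, 1.93, 2.92, 3.14, 3.28, 3.68),
  Target  = c(2.99, 1.99, 2.99, 3.19, 3.29, 3.69)
)

# Box plot: pass only the numeric price columns, one box per store.
bp <- boxplot(mydata[, c("Walmart", "Target")], ylab = "Price")

# Histograms: one numeric column at a time.
hist(mydata$Walmart, main = "Walmart", xlab = "Price")
hist(mydata$Target, main = "Target", xlab = "Price")

# Normal Q-Q plots: qqnorm() takes a single numeric vector.
qqnorm(mydata$Walmart, main = "Walmart")
qqline(mydata$Walmart)
qqnorm(mydata$Target, main = "Target")
qqline(mydata$Target)

# qqplot() compares two samples, so it needs two numeric vectors.
qq <- qqplot(mydata$Walmart, mydata$Target,
             xlab = "Walmart", ylab = "Target")
```

The qqplot("Walmart") call in the question fails because it receives a single character string, whereas qqplot() expects two numeric vectors to compare.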

Of course, you can customize the ggpubr and ggplot2 output to look much better, but that's up to you and what you want to learn:

ggboxplot(data = store.data,
          x="store",
          y="prices.1",
          fill = "store",
          title = "Prices of Merchandise by Store",
          caption = "*Data obtained from Stack Overflow",
          palette = "jco",
          legend = "none",
          xlab ="Store Name",
          ylab = "Prices of Merchandise",
          ggtheme = theme_pubclean())

Hope that's helpful. Let me know if you have questions!

Shawn Hemelstrand