
I have spent all day trying to get the average cost of the items in the Price column of my data, and I just cannot figure it out. I am new to this, so please excuse me if I have not provided all the information correctly. Here are the details:

  1. Imported a CSV file.
  2. Created a dataset to reveal only the columns that I needed data for.
  3. Attempted to find the average price of fitness items.
#clean names
clean_names(wearables)

> #clean names
> clean_names(wearables)
# A tibble: 215 × 15
   name        price body_location category company_name company_url company_mapping… company_city
   <chr>       <chr> <chr>         <chr>    <chr>        <chr>       <chr>            <chr>       
 1 "Barska GB… $49.… Wrist         Fitness  Barska       http://www… Pomona, Califor… Pomona      
 2 "Belkin GS… $24.… Arms          Fitness  Belkin       http://www… Playa Vista, Ca… Playa Vista 
 3 "Ekho Fit-… $105… Wrist         Fitness  Ekho         http://www… Dallas, Texas, … Dallas      
 4 "Fitbit Fl… $94.… Wrist         Fitness  Fitbit       http://www… San Francisco, … San Francis…
 5 "Garmin Fo… $249… Wrist         Fitness  Garmin       http://www… Olathe, Kansas,… Olathe      
 6 "Garmin In… $169… Wrist         Fitness  Garmin       http://www… Olathe, Kansas,… Olathe      
 7 "Garmin Vi… $79.… Wrist         Fitness  Garmin       http://www… Olathe, Kansas,… Olathe      
 8 "Garmin Vi… $129… Wrist         Fitness  Garmin       http://www… Olathe, Kansas,… Olathe      
 9 "Jawbone -… $112… Wrist         Fitness  Jawbone      https://ww… San Francisco, … San Francis…
10 "Jawbone U… $52.… Wrist         Fitness  Jawbone      https://ww… San Francisco, … San Francis…
# … with 205 more rows, and 7 more variables: company_u_s_state <chr>, company_country <chr>,
#   source <chr>, link <chr>, duplicates_note_1 <lgl>, id <dbl>, image <chr>


#Filter for Fitness Category
wearables <- Wearables_DFE %>% filter(Category == "Fitness")

#Verify table
view(wearables)


#Show the applicable columns
wearables %>% select(Category, Name, Body.Location, Price)

> #Filter for Fitness Category
> wearables <- Wearables_DFE %>% filter(Category == "Fitness")

> #Verify table. Please see image attached. I do not know how to save a dataset of a table.
> view(wearables)

#Find the average price for a wearable fitness item, excluding the NA's.
wearables %>% group_by (Body.Location) %>% drop_na %>% summarize(average_cost = mean(Price))

> #Find the average price for a wearable fitness item, excluding the NA's.
> wearables %>% group_by (Body.Location) %>% drop_na %>% summarize(average_cost = mean(Price))
# A tibble: 0 × 2
# … with 2 variables: Body.Location <chr>, average_cost <dbl>
Warning message:
In mean.default(Price) : argument is not numeric or logical: returning NA
> 
  • (1) You're referencing `Price` but your data has `price`, names are case-sensitive. (2) Your `price` column is full of strings, you need to convert to numbers. Use `sub` to remove the leading `$`-sign (note, `$` is special in regex, see https://stackoverflow.com/a/22944075/3358272) (or use `substr`/`substring`), remove "thousands" indicators (comma?), then use `as.numeric`. Once you've done all that, you can calculate averages. (You don't have to do this within the `price` column, you can create a `price_num` and do that so you can keep the formatting of `price` as-is. Over to you.) – r2evans May 16 '22 at 02:06
  • If you want concrete help with this data, please post your data in an unambiguous format by pasting the output from `dput(head(wearables,10))` into a code block in your question. – r2evans May 16 '22 at 02:07
  • pay attention to when and where you got warning: it clearly says that **the argument you passed is not numeric or logical so it is returning NA.** – tushaR May 16 '22 at 03:04
  • This will be a great help. I appreciate the answers. – EpicMe May 18 '22 at 13:39
  • I am in a class taking this right now. I DO remember a mention of changing data into numeric data but couldn't figure out how for the life of me. I really appreciate this. – EpicMe May 18 '22 at 13:40
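
The cleaning steps described in the comments above could be sketched in base R roughly as follows. The `prices` vector is made-up sample data, not the asker's file, and `price_num` is just the illustrative name suggested in the comment.

```r
# Hypothetical sample values; the real column comes from the imported CSV.
prices <- c("$49.99", "$1,249.00", NA, "$105.50")

# "$" on its own is a regex anchor (end of string), so this removes nothing.
unchanged <- sub("$", "", prices)

# Escape the "$" (or pass fixed = TRUE) to strip the literal dollar sign,
# then drop thousands separators and convert to numeric.
no_dollar <- sub("\\$", "", prices)
price_num <- as.numeric(gsub(",", "", no_dollar, fixed = TRUE))

price_num
mean(price_num, na.rm = TRUE)  # na.rm = TRUE skips the missing price
```

Keeping the result in a separate `price_num` vector (or column) leaves the original formatted strings untouched, as the comment suggests.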

1 Answer


First, convert your price column from character to numeric. Then group by category and summarise with `mean`.

Try this:

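    library(dplyr)

    # Assumes the cleaned names were saved, e.g. wearables <- clean_names(wearables),
    # so the columns are `price` and `category`.
    mean_prices <- wearables %>%
      # A bare "$" is a regex anchor; inside [ ] it matches a literal "$" (and ",").
      mutate(price = as.numeric(gsub(x = price,
                                     pattern = "[$,]",
                                     replacement = ""))) %>%
      group_by(category) %>%
      summarise(mean = mean(price, na.rm = TRUE))

    View(mean_prices)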
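
For the per-body-location average the question was originally after, a minimal sketch on made-up data might look like this. The `wearables_demo` tibble is hypothetical; it only mimics the shape of the asker's data, including a column that is `NA` in every row, as `duplicates_note_1 <lgl>` may well be. A bare `drop_na()` drops every row that has an `NA` in any column, which is one plausible reason the original attempt returned zero rows, so the sketch restricts it to the price column.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the cleaned wearables data.
wearables_demo <- tibble(
  body_location     = c("Wrist", "Wrist", "Arms", "Arms"),
  price             = c("$49.99", "$150.01", "$24.99", NA),
  duplicates_note_1 = NA   # logical column that is NA in every row
)

# A bare drop_na() checks every column, so every row is removed here.
nrow(drop_na(wearables_demo))

avg_by_location <- wearables_demo %>%
  mutate(price_num = as.numeric(gsub("[$,]", "", price))) %>%
  drop_na(price_num) %>%              # only drop rows with a missing price
  group_by(body_location) %>%
  summarise(average_cost = mean(price_num))

avg_by_location
```

Passing `na.rm = TRUE` to `mean()` instead of calling `drop_na(price_num)` should give the same averages here; which one reads better is a matter of taste.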
Lucca Nielsen