0

I'm just starting to use R and imported an Excel file named tblMainXmasFund.xlsx. I can read it using

readxl::read_excel("C:/Excel/For R/tblMainXmasFund.xlsx"). 

However, when I try to do calculations like sum(Amount) I get:

Error: object 'Amount' not found. If I use sum("Amount") I get: invalid 'type' (character) of argument.

R recognizes the Amount column as being numeric (it's actually double) so I'm guessing I need to change my syntax but don't know to what.

There are only two columns of data (CheckDate and Amount) I will be using so would like to know the code for doing a calculation like sum(Amount) where CheckDate >=10/1/2022 and <=12/31/2022. All suggestions are appreciated.

Martin Gal
  • 16,640
  • 5
  • 21
  • 39
  • 2
    Please make a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) or [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) with a sample input. – Martin Gal Mar 18 '22 at 23:08
  • 5
    Typically you would read the data and save it to an object, e.g. `my_data <- readxl::read_excel("C:/Excel/For R/tblMainXmasFund.xlsx")` and then access it with `sum(my_data$Amount)`. If you just read the data it will show what it looks like but it won't remain available to access. – Jon Spring Mar 18 '22 at 23:11
  • Jon: Your code worked - thank you! What I need now is the syntax to sum results by time period and display them in a barplot. My file contains four years worth of data so I need the equivalent of a WHERE statement that specifies the date range for each year. I tried adding different variations of
    Where(XmasFund$CheckDate >=10/1/2022 and XmasFund$CheckDate <=12/31/2022) but get Error: unexpected symbol in "Where(XmasFund$CheckDate >=10/1/2022 and" and don't know what to try next.
    – Bryan Schmidt Mar 19 '22 at 20:29

1 Answers1

0

Hard to know exactly what is going on, but here is my best guess. I have created this 3x2 sheet in Excel to mimic yours:

enter image description here

I then saved it into R by loading the readxl package to load the sheet, saved it as a data frame, then ran the sum on the Amount variable:

# Load library for excel functions:
library(readxl)

# Save excel sheet into data frame:
df <- read_excel("slack.xlsx")

# Run sum of Amount:
sum(df$Amount)

Which gives you the sum below:

> sum(df$Amount)
[1] 352.84

Edit

Based off your comment below, you can summarize your CheckDate values with the following code:

library(tidyverse)

reconstruct %>% 
  mutate(CheckDate = as.character(CheckDate)) %>% 
  group_by(CheckDate) %>% 
  summarise(Total = sum(Amount))

Which gives you these summarized values by date:

# A tibble: 6 x 2
  CheckDate  Total
  <chr>      <dbl>
1 2020-10-15  2800
2 2020-10-16  1500
3 2020-10-17  1000
4 2020-10-19  3600
5 2020-10-20  2450
6 2020-10-21  1700

To graph these, you can use the following basic code, which just adds the ggplot and geombar commands:

reconstruct %>% 
  mutate(CheckDate = as.character(CheckDate)) %>% 
  group_by(CheckDate) %>% 
  summarise(Total = sum(Amount)) %>%
  ggplot(aes(x=CheckDate,
             weight=Total))+
  geom_bar()

enter image description here

If you want it to be a little fancier, you can also use this:

reconstruct %>% 
  mutate(CheckDate = as.character(CheckDate)) %>% 
  group_by(CheckDate) %>% 
  summarise(Total = sum(Amount)) %>%
  ggplot(aes(x=CheckDate,
             weight=Total,
             fill = CheckDate))+
  geom_bar()+
  theme_bw()+
  labs(title = "Check by Date Bar Graph",
       x = "Date of Check",
       y = "Total Amount",
       caption = "*Data obtained from StackOverflow.")+
  theme(legend.position = "none",
        plot.title = element_text(face = "bold"),
        plot.caption = element_text(face = "italic"))+
  scale_fill_hue(l=40, c=35)

Which gives you something a little more colorful:

enter image description here

Shawn Hemelstrand
  • 2,676
  • 4
  • 17
  • 30
  • Shawn: Your code worked when I entered the path for my source file. Thank you for creating a sample file. What I need now is the syntax to sum results by date range and display them in a barplot. My file contains four years worth of data so I need the equivalent of a WHERE statement that specifies the date range for each year. I tried adding different variations of
    Where(CheckDate >=10/1/2022 and CheckDate <=12/31/2022) but get Error: unexpected symbol in "Where(CheckDate >=10/1/2022 and" and don't know what to try next
    – Bryan Schmidt Mar 19 '22 at 20:52
  • Its hard to help without having some kind of example from your data. Any way you can share what this looks like? – Shawn Hemelstrand Mar 19 '22 at 23:54
  • I would upload the source file if I knew how to do it. I'm new to Stack Overflow. Can you please describe how to attach a file and upload it? If it's not possible to upload files, just use the column names in your image above which are identical to mine. I just need to know how to write the code for the WHERE statement for a bar plot where each bar represents data for a specific time period. Use whatever dates you want such as >=10/1/2021 and <=12/31/2021, >=9/1/2020 and <=11/30/2020, >= 10/1/2019 and <=12/15/2019, and >=10/15/2018 and <=11/30/2018. There will be four years displayed. – Bryan Schmidt Mar 20 '22 at 04:30
  • Hey there, sorry I only check here periodically. Use the `dput` command in R to reproduce what you have. You can find out how here: https://www.youtube.com/watch?v=3EID3P1oisg&t=9s&ab_channel=MarkGingrass – Shawn Hemelstrand Mar 24 '22 at 04:18
  • Shawn: I hope this works – Bryan Schmidt Mar 25 '22 at 23:06
  • reconstruct <-structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15), CheckDate = structure(c(1603065600, 1602720000, 1603238400, 1602806400, 1603065600, 1603152000, 1602720000, 1603152000, 1603065600, 1603238400, 1603152000, 1602892800, 1602806400, 1602892800, 1603152000), tzone = "UTC", class = c("POSIXct", "POSIXt")), Amount = c(2000, 2000, 1200, 1000, 1000, 900, 800, 700, 600, 500, 500, 500, 500, 500, 350)), row.names = c(NA, -15L), class = c("tbl_df", "tbl", "data.frame")) – Bryan Schmidt Mar 25 '22 at 23:40
  • Shawn: I don't know if what I posted above will work. The CheckDate values appear to be in universal time and they are all for the year 2020. The text box only has space for 15 records. If you can work with it use the same date range for each of the four years. I just need the code to create a bar plot that shows one bar for each year of data. For this purpose each bar will be identical. – Bryan Schmidt Mar 25 '22 at 23:48
  • I have edited my answer. Let me know if it fits what you are looking for. – Shawn Hemelstrand Mar 26 '22 at 01:36
  • Stack overflow is not really the place to learn all the basics of programming. I would recommend a good tutorial book. – qwr Mar 26 '22 at 02:50
  • If that is the case, I recommend R for Data Science if you are struggling with the basics. – Shawn Hemelstrand Mar 26 '22 at 04:53