0

I have an Excel file with data (5 columns and 200 rows).

Rows are the next:

TransactionId, UserId, transaction_date, transaction_status, amount_USD.

The data.table is like that (it is not full, just to catch the idea):

Screenshot of datatable1

enter image description here

The task is rearrange the table in way it will look like 3 columns (transaction_date, CHARGED, DECLINED), where CHARGED and DECKLINED - are the sum of all amount_USD by the same date with specific status (Charged or Declined).

So, it should look like that:

enter image description here

And also, when I use such code for importing from Excel - is it correct?

> library("xlsx")

> data1 <- read.xlsx("d:/R/1.xlsx", 1)
zx8754
  • 52,746
  • 12
  • 114
  • 209
P. Dan
  • 3
  • 1
  • 3
  • 2
    Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). This will make it much easier for others to help you. – Jaap Jul 23 '16 at 16:24
  • If you're able to read in the data from Excel then paste a representation sample of the data using `dput` and we can help you rearrange it. If you can't import the data from Excel then focus on that problem first and let us know what error message you get. – Hack-R Jul 23 '16 at 17:00

2 Answers2

0

These questions have been answered before. Here is a post I found in 2 seconds regarding issues with reading .xlsx files into R.

I recommend saving the xlsx file as a .csv and using read.csv. Read the file in like this:

data1 <- read.csv("d:/R/1.csv", stringsAsFactors=FALSE)

R is not going to recognize the appropriate formats for your date and USD columns. You can make sure to change your USD column by having . instead of ,.

Here is what you want to accomplish using reshape() with sample data since you did not provide any in R.

data1 <- data.frame(
  date=c("22.08.2015","11.08.2015","24.08.2015","28.07.2015"),
  transaction_status=c("CHARGED","DECLINED","CHARGED","DECLINED"),
  amount=c(10.96,14.7,10.61,10.96)
)

#convert your date from character to date
data1$date <- as.Date(data1$date, "%d.%m.%Y") 

#put your data in wide format
data1 <- reshape(data1, idvar="date", timevar="transaction_status", direction="wide")

#sort by date
data1 <- data1[order(data1$date),]

Is this output what you are looking for?

> data1
        date amount.CHARGED amount.DECLINED
4 2015-07-28             NA           10.96
2 2015-08-11             NA           14.70
1 2015-08-22          10.96              NA
3 2015-08-24          10.61              NA
Community
  • 1
  • 1
Warner
  • 1,353
  • 9
  • 23
  • Thank you! The screenshot I provided is not the whole data table... The whole is 200 rows. Am i right that your method is mile manual creating a table? And what about outcome - that looks correct. =) – P. Dan Jul 23 '16 at 17:26
  • @P.Dan I'm not sure if I understand your question. The data I created was just an example with 3 columns and 4 rows. One thing I should add is that since you have those two other columns you'll want to set `idvar=c("TransactionId","UserId","transaction_date")` when you use `reshape` . If I answered your question adequately please check it :) – Warner Jul 23 '16 at 17:35
  • Sorry it was missprint ("mile"="like"). I really don't understand this step : "data1 <- data.frame(...." - is this one creation of your own sample? I mean, when I import .csv - it is NOT data.frame from the beginning, should I convert imported .csv to data.frame or smth? Because when I tried without this step it writes me that it doesn't know how to convert "data$transaction_date" to class "date"... – P. Dan Jul 23 '16 at 22:23
0

It sounds like you want to group by date and transaction type (Charged/Declined).

I'm not sure if you were planning to use R for a specific reason, so I'll show you one way to do it in R and also how to do it in Excel, in case you want to save yourself some effort.

  • Using R

If you plan to use R for this task, you may want to save the current Excel file 'filename.xlsx' as 'filename.csv' because R requires some additional libraries to read in Excel native files. They've never worked for me, so I just use .csv files instead.
The dummy_csv file I created is formatted like yours, so just change the code as necessary:

> dummy_csv
  transaction_ID      Date Transaction_type    Amount
1            101 12.1.2016          Charged  $270.29 
2            102  1.6.2015         Declined  $840.25 
3            103 12.1.2016          Charged  $233.26 
4            104  1.7.2016         Declined  $406.04 
5            105  1.7.2016          Charged  $886.73 
6            106 12.1.2016         Declined  $348.44 
7            107  3.8.1015          Charged  $430.09 
8            108  3.7.2015         Declined  $655.17 

Here's the code:

install.packages('dplyr') # Install dplyr if you don't have it.
library(dplyr) # Load dplyr


dummy_csv <- read.csv("~/dummy_csv.csv") # Import dataset 

dummy_csv <- tbl_df(dummy_csv) # Convert to table dataframe via dplyr

dummy_csv$Amount <- sub(pattern = "$", # Get rid of dollar signs
                    replacement = "",
                    x=as.character(dummy_csv$Amount),
                    fixed=TRUE)

dummy_csv <- mutate(dummy_csv,Amount = as.numeric(Amount)) # Make the values numeric

dummy_csv <- dummy_csv %>%
            group_by(Date,Transaction_type) %>%
            summarise('Daily_Total' = sum(Amount))

The output in R will be:

       Date Transaction_type Daily_Total
 (fctr)           (fctr)       (dbl)
1  1.6.2015         Declined      840.25
2  1.7.2016          Charged      886.73
3  1.7.2016         Declined      406.04
4 12.1.2016          Charged      503.55
5 12.1.2016         Declined      348.44
6  3.7.2015         Declined      655.17
7  3.8.1015          Charged      430.09
  • Using Excel

You can accomplish the same task in Excel without using R.

  1. Highlight the range of data in Excel and click on the Insert menu -> pivot table.

  2. In the Pivot Table dialogue box, Drag 'Date' and 'Transaction_Type' into the Rows box. Drag 'Amount' into the Values box, and set the calculation as SUM().

The output will be the same.

Hope this helps!

  • Thank you! The code seems to be ok, hope the task will be ok too. And yep, the variant with Excel is much better, but the task was exactly in R =( – P. Dan Jul 23 '16 at 22:12