
I have a large dataframe sorted by fiscal year and fiscal period. I am trying to create a time plot starting at fiscal period 1 of 2015, ending at fiscal period 13 of 2019. I have two columns, one for FY, one for FP. They look like this.

enter image description here

I merged the two columns together separated by a 0 in a new column (C) using the code:

MarkP$C = paste(MarkP$FY, MarkP$FP, sep="0")

My intention was for the new column to read like a number, although paste() actually returns a character string.

It looks like this (check column C)

enter image description here

Then since I want to plot a time plot of total sales per period, I aggregated all sales to the level of C, so all rows ending with the same C aggregate together. I used this code for the aggregation.

MarkP11 <- MarkP %>% 
  group_by(C) %>% 
  summarise(Sales=sum(Sales))

This is what MarkP11 looks like.

enter image description here

The problem I'm having is that the rows are out of order, so when I plot them I get an incorrect plot: period 10 comes right after period 1.

I've done some research and discovered that the sprintf function may work, but I'm not sure how to incorporate it into the code for my data frame.

The code below is how my C column is created by merging the two columns. I believe I need to edit this line to use sprintf, but I'm not sure how to get that to work.


MarkP$C = paste(MarkP$FY, MarkP$FP, sep="0")

I expect the ordering of the MarkP dataframe to look something like this:

Chris
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. It's not helpful to post pictures of data. See the provided link for better alternatives. – MrFlick Aug 29 '19 at 15:47
  • Please do not post an image of code/data/errors: it cannot be copied or searched (SEO), it breaks screen-readers, and it may not fit well on some mobile devices. Ref: https://meta.stackoverflow.com/a/285557/3358272 (and https://xkcd.com/2116/). Please just include the code or data (e.g., `dput(head(x))` or `data.frame(...)`) directly. – r2evans Aug 29 '19 at 15:49
  • Thank you for the feedback, I won't post images anymore. – Chris Aug 29 '19 at 19:16

2 Answers

sprintf is indeed what you want:

sprintf("%0.0f%02.0f", 2019, c(1,10))
# [1] "201901" "201910"

This assumes that FP's range is 0-99. It would not be incorrect to use sprintf("%d%02d", 2019, c(1,10)) since you're intending to use integers, but sometimes I find that seemingly-integer values can trigger Error ... invalid format '%02d', so I just strong-arm it. You could also use as.integer on each set of values ... another workaround.
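Applied to whole columns rather than single values, the same call might look like the sketch below. The data frame here is a made-up stand-in for MarkP (the real data was only posted as images), so the values are assumptions; only the column names FY, FP, Sales and C come from the question.

```r
# Hypothetical stand-in for the asker's MarkP data frame (made-up values)
MarkP <- data.frame(
  FY    = c(2015, 2015, 2015, 2016),
  FP    = c(1, 2, 10, 1),
  Sales = c(100, 150, 120, 90)
)

# Zero-pad FP to two digits so every key has the same width
MarkP$C <- sprintf("%0.0f%02.0f", MarkP$FY, MarkP$FP)

# The as.integer workaround mentioned above: coerce first, then use %d
C_int <- sprintf("%d%02d", as.integer(MarkP$FY), as.integer(MarkP$FP))

print(MarkP$C)
print(identical(MarkP$C, C_int))
```

Both forms should produce the same six-character keys here; which one to prefer is mostly a matter of taste.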

r2evans

I was speaking with a colleague of mine and he helped me figure out the solution. Like r2evans commented, sprintf is the correct function. The syntax that worked for me was:

MarkP$C = paste(MarkP$FY, sprintf("%02d", MarkP$FP), sep="")

What that did in my code was concatenate the two columns FY and FP into a new column named "C":

  • It first added my FY column to the new column.
  • Since sep="" there is no separator character, so FY and FP are simply joined together.
  • Since I wrapped FP in sprintf("%02d", MarkP$FP), FP is left-padded with a zero to two digits before being tacked on.
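To illustrate why the padding matters for ordering, here is a small sketch with made-up FY/FP values (not the real MarkP data). It compares the original sep="0" key with the padded one; as.integer and method = "radix" are my additions, used only so the example behaves the same regardless of input type or locale.

```r
# Made-up values standing in for three fiscal periods of one year
FY <- c(2015, 2015, 2015)
FP <- c(1, 2, 10)

# Original approach: "0" as separator gives keys of different widths
old_key <- paste(FY, FP, sep = "0")

# Padded approach: FP is always two digits, so all keys have equal width
new_key <- paste(FY, sprintf("%02d", as.integer(FP)), sep = "")

# Character sorting puts period 10 right after period 1 for the old key
print(sort(old_key, method = "radix"))

# With equal-width keys, character order matches chronological order
print(sort(new_key, method = "radix"))
```

With the old key, period 10 becomes "2015010", which sorts between "201501" and "201502"; with the padded key it becomes "201510" and lands after period 2.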

Chris