Questions tagged [padr]

The padr tag refers to the padr package for the R programming language. Use this tag for questions about that package, in particular about its functions pad() and thicken(). Typical topics are missing data or NA values in time series, time interval checks, padding data, and date-time formatting and analysis. Please use this tag together with the r tag.

The padr package is useful for date-time data sets. It can be used to "fill in the blanks" in date and time series by inserting rows that hold NA values.

The function pad() checks the datetime variable of a data frame against a fixed interval, e.g. 15 minutes, 1 day or 1 month. Wherever a timestamp is missing at the specified interval, pad() inserts a row for it and sets the other columns to NA.

These NA values can be replaced later with functions such as na.locf() from the zoo package, or simply be used to see where one's data is missing.

Example:

library(padr)

#Example Dates
datetime<-c(Sys.Date(),Sys.Date()+2,Sys.Date()+3,Sys.Date()+5)

#Example Data
data<-c(1,2,3,4)

#Create Df
Df<-data.frame(datetime,data)


#We can see we are missing two dates here!
>Df
    datetime data
1 2017-08-16    1
2 2017-08-18    2
3 2017-08-19    3
4 2017-08-21    4

#With no interval given, pad() detects it from the data (here: day).
#pad() returns a new data frame, so assign the result.
Df<-pad(Df)

pad applied on the interval: day

#Now padded data
>Df
    datetime data
1 2017-08-16    1
2 2017-08-17   NA
3 2017-08-18    2
4 2017-08-19    3
5 2017-08-20   NA
6 2017-08-21    4
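
As a rough sketch of the fill step mentioned earlier, the NA values in the padded rows can be replaced with zoo::na.locf(). The fixed dates and the column name data_filled are assumptions made for this illustration only; na.rm = FALSE keeps the filled vector the same length as the data frame.

```r
library(padr)
library(zoo)

# Fixed dates (instead of Sys.Date()) so the result is reproducible
Df <- data.frame(
  datetime = as.Date(c("2017-08-16", "2017-08-18", "2017-08-19", "2017-08-21")),
  data     = c(1, 2, 3, 4)
)

# pad() returns a new data frame with the two missing days inserted
Padded <- pad(Df, interval = "day")

# Carry the last observed value forward into the padded rows.
# data_filled is a hypothetical column name chosen for this sketch.
Padded$data_filled <- na.locf(Padded$data, na.rm = FALSE)

Padded
```

Keeping the original data column next to data_filled makes it easy to see which rows were padded and which were observed.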

Here is a slightly more complex example using a 15-minute interval:

#Example Dates
Dates<-c("2012-09-28 08:00","2012-09-28 08:15","2012-09-28 08:45")

#Since this is an example we must convert the character dates to POSIXct
Dates<-as.POSIXct(Dates, format="%Y-%m-%d %H:%M")

#Example Data
Data<-c(1,2,3)

#Create DF
DF<-data.frame(Dates,Data)

#We can see we are missing a 15 min interval at 8:30
>DF
                Dates Data
1 2012-09-28 08:00:00    1
2 2012-09-28 08:15:00    2
3 2012-09-28 08:45:00    3

#Pad on interval= 15 min
PaddedDF<-pad(DF, interval='15 min')

>PaddedDF
                Dates Data
1 2012-09-28 08:00:00    1
2 2012-09-28 08:15:00    2
3 2012-09-28 08:30:00   NA
4 2012-09-28 08:45:00    3
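
The tag description also mentions thicken(), which the examples above do not show. The following is a minimal sketch, assuming made-up timestamps and the illustrative column name Hour: thicken() adds a column at a coarser interval, the data can then be aggregated on that column, and pad() fills any interval left without observations.

```r
library(padr)

# Irregular example timestamps (made up for this sketch)
DF <- data.frame(
  Dates = as.POSIXct(c("2012-09-28 08:03", "2012-09-28 08:12", "2012-09-28 10:40"),
                     format = "%Y-%m-%d %H:%M", tz = "UTC"),
  Data  = c(1, 2, 3)
)

# thicken() adds a column with each timestamp rounded down to the hour.
# "Hour" is a hypothetical column name chosen for this sketch.
Thick <- thicken(DF, interval = "hour", colname = "Hour")

# Sum the observations within each hour
Hourly <- aggregate(Data ~ Hour, data = Thick, FUN = sum)

# No observation falls in the 09:00 hour, so pad() inserts it with NA
PaddedHourly <- pad(Hourly, interval = "hour")

PaddedHourly
```

Aggregating before padding leaves a single datetime column in Hourly, so pad() does not need to be told which column to use.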
23 questions
4
votes
3 answers

padr in R: padding at user-defined interval

I'm working with time series data at 5-minute time intervals. Some of the 5-minute time series are missing. I'd like to resample the dataset to fill in the missing 5-minute periods with NaN values. I found great information on how to approach this…
Guy
2
votes
1 answer

Pad within grouped dates in R

library(tidyverse) library(lubridate) library(padr) df <- tibble(`Action Item ID` = c("ABC", "DEF", "GHI", "JKL", "MNO", "PQR"), `Date Created` = as.Date(c("2019-01-01", "2019-01-01", …
Display name
2
votes
1 answer

How to use fill_by_function() with na.approx() [linear interpolation] inside dplyr

I'm going through the documentation for padr: https://cran.r-project.org/web/packages/padr/vignettes/padr.html. Changing the vignette example slightly to make use of linear interpolation (zoo::na.approx()) on the data is generating an…
Dan
1
vote
0 answers

Using do(pad(.)) in R and keeping all column names

I am attempting to fill in the missing dates in my dataset using the pad function. If I use regular pad such as data %>% pad(group = GROUP2) then it works fine and keeps the column values such as brand, device, etc. However, some of my data occurs…
mfalcon
1
vote
1 answer

Using padr::thicken() with an uneven timestamp interval

I have a dataset that looks like this: structure(list(Fish_ID = c("Fork1", "Fork10", "Fork15", "Fork20", "Fork21", "Fork22", "Fork23", "Fork4", "Fork5", "Fork7", "Fork9", "Fork12", "Fork13", "Fork14", "Fork16", "Fork17", "Fork18", "Fork19",…
David Smith
1
vote
1 answer

Fill data gaps with Pad using group returns error

I have time-series data that starts and ends during the calendar year and most fill functions (like pad, package padr) fill gaps between start and end dates. However I need a complete annual record. For example if my data start date is 2016-01-03…
DAY
1
vote
1 answer

Thicken date range using padr where starting value is the same as one of the dates in data frame

I am not totally sure if this is a bug or am I actually doing something wrong. But I will ask the question here and go from there. Suppose we have a dummy data set of number of calls: df_calls = data.frame(Call_date= c("2019-02-18", …
1
vote
3 answers

R Padding multiple time series from within an unbalanced panel data set

I have a panel data set for daily revenue (and other variables) by ID, where the day with 0 revenue go unreported. I want to fill in these blanks with 0 for my analysis, meaning that for each ID's time series, I need to make sure there is an…
Analyst Guy
1
vote
0 answers

Padding multiple columns in a data frame or data table

I have a data frame like the following and would like to pad the dates. Notice that four days are missing for id 3. df = data.frame( id = rep(1,1,1,2,2,3,3,3), date = lubridate::ymd("2017-01-01","2017-01-02","2017-01-03", …
ATMA
1
vote
2 answers

Using padr with thicken results in error "missing value where TRUE/FALSE needed"

I've been trying to get padr to work with my dataset without much success, although I can get the examples to work: # I have a few datetime columns so I convert all to POSIXct with UTC. > df <- mutate_at(DATABASE, vars(ends_with("time")),…
Daren Eiri
1
vote
1 answer

R - Using padr on Multiple Fields

I am using padr 0.3.0 to pad out any missing timestamps for server statistics and it works great. I am currently only padding by timestamp. My question is if I want to pad by "timestamp" and another field I'll call "diskname", can I do that the same…
0
votes
1 answer

Padr Function for incomplete data

Here is a portion of my data: >head(state1) ># A tibble: 6 x 5 date TYP_INT WEATHER pedday intersection 1 2019-01-02 1 2 0.204 1 2 2019-01-04 1 10 0.204 …
Moe
0
votes
1 answer

Is there a way in R to fill half-hourly nighttime data gaps?

I have a set of 10 years (2009-2020), 30-min interval meteorological datasets, but the data has missing values during the night (~17:00 to ~08:00 next day) for two years (2015-2017) due to battery failure of the instrument. Variables are: air…
0
votes
0 answers

R's padr package claiming the "datetime variable does not vary" when it does vary

library(tidyverse) library(lubridate) library(padr) df #> # A tibble: 828 x 5 #> Scar_Id Code Type Value YrMo #> #> 1 0070-179 AA Start_Date …
Display name
0
votes
1 answer

How can I pad a vector of dates by units of less than 1 second in R?

I am trying to figure out how to add time intervals to a dates vector in R using units of less than 1 second. As you can see, padding one second intervals works, however decimals fail. I have looked around and it seems there is no "milliseconds"…
Stonecraft