I am trying to pull out a subset of my data in R to use in another question I have. My data is read in folder by folder, and I am not sure how to extract a subset of it.

My data is currently read in by the following code:

library(data.table, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)

################
## PARAMETERS ##
################

# Set path of major source folder for raw transaction data
in_directory <- "C:/Users/NAME/Documents/Raw Data/"

# List names of sub-folders (currently grouped by first two characters of CUST_ID)
in_subfolders <- list("AA-CA", "CB-HZ", "IA-IL", "IM-KZ", "LA-MI", "MJ-MS",
                  "MT-NV", "NW-OH", "OI-PZ", "QA-TN", "TO-UZ",
                  "VA-WA", "WB-ZZ")

# Set location for output
out_directory <- "C:/Users/NAME/Documents/YTD Master/"
out_filename <- "OUTPUT.csv"

# Set beginning and end of date range to be collected - year-month-day format
date_range <- interval(as.Date("2018-01-01"), as.Date("2018-05-31"))

# Enable or disable filtering of raw files to only grab items bought within certain months to save space.
# If false, all files will be scanned for unique items, which will take longer and be a larger file.
date_filter <- TRUE

I want to include a data set with that question so that I can give a reproducible example.
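To illustrate the kind of reproducible data set I have in mind, here is a minimal sketch. It assumes the full data already sits in a single data.table; `full_dt`, its column names (`CUST_ID`, `PURCHASE_DATE`, `AMOUNT`), and the values are made up for illustration and are not my real data.

```r
library(data.table)

# Hypothetical stand-in for the full table (invented columns and values).
set.seed(1)
full_dt <- data.table(
  CUST_ID = sprintf("C%04d", 1:1000),
  PURCHASE_DATE = as.Date("2018-01-01") + sample(0:150, 1000, replace = TRUE),
  AMOUNT = round(runif(1000, 1, 500), 2)
)

# Take a small random sample of rows to share.
sample_dt <- full_dt[sample(.N, 10)]

# dput() prints R code that recreates the object, so it can be pasted
# straight into a question.
dput(as.data.frame(sample_dt))
```

The printed `dput()` output is what I would paste into the other question, possibly after anonymising the IDs.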

I deal with large amounts of data, so I pull the information from a database and store it by date in folders. I then have it set up so that I can pull whatever dates I need from the data.
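Roughly, the read-and-filter step looks like the sketch below. This is a simplified illustration rather than my actual script: it builds a throwaway folder structure in `tempdir()`, and the file names and the `PURCHASE_DATE` column are assumptions made up for the example.

```r
library(data.table)
library(lubridate, warn.conflicts = FALSE)

# Hypothetical stand-in for the raw data folder: a temp directory with two
# of the sub-folders, each holding one small CSV file.
in_directory <- file.path(tempdir(), "Raw Data")
in_subfolders <- list("AA-CA", "CB-HZ")
for (sub in in_subfolders) {
  dir.create(file.path(in_directory, sub), recursive = TRUE, showWarnings = FALSE)
}
fwrite(data.table(CUST_ID = c("AB1", "BC2"),
                  PURCHASE_DATE = c("2018-02-15", "2017-11-30")),
       file.path(in_directory, "AA-CA", "part1.csv"))
fwrite(data.table(CUST_ID = c("DE3", "FG4"),
                  PURCHASE_DATE = c("2018-05-31", "2018-06-01")),
       file.path(in_directory, "CB-HZ", "part1.csv"))

date_range <- interval(as.Date("2018-01-01"), as.Date("2018-05-31"))

# Collect every CSV file under the listed sub-folders.
files <- unlist(lapply(in_subfolders, function(sub) {
  list.files(file.path(in_directory, sub), pattern = "\\.csv$", full.names = TRUE)
}))

# Read all files into one table, then keep only rows inside the date range.
raw <- rbindlist(lapply(files, fread))
raw[, PURCHASE_DATE := as.Date(PURCHASE_DATE)]
subset_dt <- raw[PURCHASE_DATE %within% date_range]
```

Filtering after `rbindlist()` keeps the sketch short; with large files it may be cheaper to filter each file as it is read.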

I have included a little more of the code than necessary, but this is the first part, which runs before the code that manipulates the data.
