I have many txt files that contain the same type of numerical data in columns separated by ;. Some files have column headers with spaces and some don't (they were created by different people). Some also have extra columns that I don't want.
e.g. one file might have a header like:
ASomeName; BSomeName; C(someName%)
whereas another file's header might be:
A Some Name; B Some Name; C(someName%); D some name
How can I clean the spaces out of the names before I call a "read" command?
library(magrittr)  # provides %>%
# These are the files I have (kept as a plain character vector of paths)
filenames <- list.files(pattern = "\\.txt$", recursive = TRUE, full.names = TRUE)
# These are the columns I would like:
colSelect <- c("Date", "Time", "Timestamp", "PM2_5(ug/m3)", "PM10(ug/m3)", "PM01(ug/m3)", "Temperature(C)", "Humidity(%RH)", "CO2(ppm)")
# This is how I read them if they have the same columns
ldf <- vroom::vroom(filenames, col_select = tidyselect::all_of(colSelect), delim = ";", id = "sensor") %>% janitor::clean_names()
Clean Headers script
I've written a destructive script that reads in the whole file, cleans the spaces out of the header, deletes the file and re-writes it under the same name (vroom sometimes complained about not being able to open X thousands of files). This is not an efficient way of doing things.
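cleanHeaders <- function(filename) {
  # Read the whole file; clean_names() converts the headers to snake_case,
  # which removes the spaces but also changes case and symbols
  d <- vroom::vroom(filename, delim = ";") %>% janitor::clean_names()
  # print(head(d))
  if (file.exists(filename)) {
    # Delete the file if it exists
    file.remove(filename)
  }
  # Re-write the data under the same name
  vroom::vroom_write(d, filename, delim = ";")
}
lapply(filenames, cleanHeaders)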
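For comparison, here is a minimal sketch of the non-destructive behaviour I am after: read only the first line, strip the whitespace from the header fields, then read the data with the cleaned names and keep the wanted columns. The helper names clean_header and read_clean are hypothetical, the sketch uses base R rather than vroom, and it assumes every file has exactly one header line and the same number of fields in every row.

```r
# Hypothetical helper: return the header fields of `path` with all
# whitespace removed. Only the first line of the file is read.
clean_header <- function(path, delim = ";") {
  first_line <- readLines(path, n = 1, warn = FALSE)
  fields <- strsplit(first_line, delim, fixed = TRUE)[[1]]
  gsub("[[:space:]]+", "", fields)
}

# Hypothetical helper: read one file without modifying it on disk.
# The original header line is skipped and replaced by the cleaned names;
# only the columns listed in `keep` that are present are returned.
read_clean <- function(path, keep, delim = ";") {
  header <- clean_header(path, delim)
  d <- utils::read.table(
    path,
    sep = delim,
    skip = 1,             # skip the original header line
    header = FALSE,
    col.names = header,
    check.names = FALSE,  # keep names such as "C(someName%)" unchanged
    comment.char = "",
    strip.white = TRUE,
    stringsAsFactors = FALSE
  )
  d[, intersect(keep, header), drop = FALSE]
}
```

This could be applied with lapply(filenames, read_clean, keep = colSelect) and the results row-bound afterwards, but I have not checked how it performs on thousands of files.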