I have a large list of tables, each containing multiple variables. The simplified example below shows the structure of my data:
dat1 <- structure(list(NR = c("DBD-0006", "DBD-0057",
"DBD-0095", "GHP-0169", "GHP-0237", "NNB-0243", "NNB-0303",
"NNB-0306", "NNB-0359", "NNB-0364"), DATE = c("13-07-2011",
"15-12-2010", "09-03-2011", "14-09-2011", "30-06-2010", "16-05-2016",
"04-07-2012", "11-07-2012", "05-12-2012", "12-12-2012"), CODE= c("1",
"1", "1", "1", "1", "1", "1", "1", "1", "1"), DATE2 = c("18-07-2011",
"15-12-2010", "09-03-2011", "14-09-2011", "05-01-2012", "11-05-2016",
"05-07-2012", "11-07-2012", "06-12-2012", "12-12-2012"), type = c("YY.90.01",
"50.19", "50.37", "50.37", "50.37", "YY.90.00",
"50.37", "YY.50.01", "YY.82.01", "YY.50.02"), center = c("DBD",
"DBD", "DBD", "GHP", "GHP", "NNB", "NNB", "NNB", "NNB",
"NNB")), row.names = c(NA, -10L), class = c("tbl_df", "tbl",
"data.frame"))
dat2 <- dat3 <- dat1
tables <- list(df1 = dat1,
               df2 = dat2,
               df3 = dat3)
My data contain several non-ASCII characters, and I need to identify where they appear in each dataset. I have written a for loop that returns the columns containing these characters. However, the loop takes hours to run! I need to speed it up, and I think a function from the apply family could help. Here is the code:
nonUTF8 <- list()
for (table in names(tables)) {  # loop over the table names
  for (item in names(tables[[table]])) {  # loop over the variables within each table
    nonUTF8[[table]] <- tables[[table]][, grepl("[^\x01-\x7F]", tables[[table]])]
  }
}
I would appreciate your advice.
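For reference, here is a minimal sketch of the kind of apply-style approach I have in mind. The inner loop above never uses `item`, so the same `grepl()` call is repeated once for every column; the sketch checks each column exactly once with `vapply()` and walks the list with `lapply()`. The helper name `non_ascii_cols` and the toy data are made up for illustration, and the sketch returns column names rather than the subsetted table. Whether it is actually fast enough on the real data is untested.

```r
# Hypothetical helper (name invented for this sketch): returns the names of
# the columns in `df` that contain at least one non-ASCII character.
non_ascii_cols <- function(df) {
  flagged <- vapply(
    df,
    # perl = TRUE lets the regex engine interpret the \x escapes itself
    function(col) any(grepl("[^\\x01-\\x7F]", as.character(col), perl = TRUE)),
    logical(1)
  )
  names(df)[flagged]
}

# Toy data (made up): only df1$center contains a non-ASCII character
toy <- list(
  df1 = data.frame(NR = c("DBD-0006", "DBD-0057"),
                   center = c("DBD", "caf\u00e9"),
                   stringsAsFactors = FALSE),
  df2 = data.frame(NR = c("GHP-0169", "GHP-0237"),
                   center = c("GHP", "GHP"),
                   stringsAsFactors = FALSE)
)

# One pass over the list, one check per column
result <- lapply(toy, non_ascii_cols)
```

To get the offending columns themselves instead of their names, the helper could presumably return `df[, flagged, drop = FALSE]`.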