
I've written a script that goes through a given directory and merges all Excel files into one CSV.

Everything works, but I get one additional column with the header

<?xml version="1.0" encoding="UTF-8" standalone="yes"?> 

and NA values.

How can I get rid of it? Ideally, I'd like to prevent it from appearing at all rather than delete it afterwards.

My code is simple:

library(purrr)  # provides map_df() and re-exports %>%

fulldata <- dir("path", full.names = TRUE) %>% map_df(read.csv)

write.table(fulldata, file = "path/Merge.csv", sep = ",", col.names = TRUE, row.names = FALSE, quote = FALSE)
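
One possible cause, which is an assumption and not something confirmed by the details here, is that dir("path") returns every file in the folder, so read.csv may also be reading non-CSV files (for example the original workbooks or a previous Merge.csv). A minimal base-R sketch that restricts the merge to .csv files could look like this; `merge_csv_dir` and the demo file names are hypothetical, made up for illustration:

```r
# Sketch: merge only the CSV files found in a folder (base R, no packages).
# `merge_csv_dir` is a hypothetical helper, not part of any library.
merge_csv_dir <- function(path, output_name = "Merge.csv") {
  # Keep only files whose names end in .csv
  files <- list.files(path, pattern = "\\.csv$", full.names = TRUE,
                      ignore.case = TRUE)
  # Skip an earlier merge result so it is not read back in
  files <- files[basename(files) != output_name]
  tables <- lapply(files, read.csv, check.names = FALSE)
  # Stack the tables; this assumes every file has the same columns
  do.call(rbind, tables)
}

# Demo on a temporary folder: two CSVs plus one unrelated file
demo_dir <- tempfile("merge_demo")
dir.create(demo_dir)
write.csv(data.frame(a = 1:2, b = c("x", "y")),
          file.path(demo_dir, "one.csv"), row.names = FALSE)
write.csv(data.frame(a = 3:4, b = c("z", "w")),
          file.path(demo_dir, "two.csv"), row.names = FALSE)
writeLines("<?xml version=\"1.0\"?>", file.path(demo_dir, "stray.xml"))

fulldata <- merge_csv_dir(demo_dir)
write.csv(fulldata, file.path(demo_dir, "Merge.csv"), row.names = FALSE)
```

With purrr loaded, map_df(files, read.csv) could replace the lapply() and do.call(rbind, ...) pair; the file filter is the part that matters here.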

Additional questions:

Can I do all of this in xlsx instead of CSV? I tried read.xlsx from the openxlsx library, but it didn't work; I got an error on this line:

fulldata <- dir("path", full.names = TRUE) %>% map_df(read.xlsx)
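
The error message is not given, so the following is only a sketch of how an xlsx-to-xlsx merge might be written with openxlsx, assuming each workbook keeps its data on the first sheet and all files share the same columns. `merge_xlsx_dir` and the demo file names are hypothetical:

```r
library(openxlsx)

# Sketch: merge the first sheet of every .xlsx file in a folder.
# `merge_xlsx_dir` is a hypothetical helper, not part of openxlsx.
merge_xlsx_dir <- function(path, output_name = "Merge.xlsx") {
  files <- list.files(path, pattern = "\\.xlsx$", full.names = TRUE)
  # Skip an earlier merge result and Excel's "~$" lock files
  files <- files[basename(files) != output_name]
  files <- files[!startsWith(basename(files), "~$")]
  tables <- lapply(files, read.xlsx, sheet = 1)
  # Stack the tables; this assumes identical columns in every file
  do.call(rbind, tables)
}

# Demo on a temporary folder with two small workbooks
demo_xlsx_dir <- tempfile("merge_xlsx_demo")
dir.create(demo_xlsx_dir)
write.xlsx(data.frame(a = 1:2, b = c("x", "y")),
           file.path(demo_xlsx_dir, "one.xlsx"))
write.xlsx(data.frame(a = 3:4, b = c("z", "w")),
           file.path(demo_xlsx_dir, "two.xlsx"))

merged <- merge_xlsx_dir(demo_xlsx_dir)
write.xlsx(merged, file.path(demo_xlsx_dir, "Merge.xlsx"))
```

Filtering on the .xlsx extension keeps read.xlsx from being handed files it cannot open, which is one plausible (unconfirmed) source of the error.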
rainbowthug
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. What exactly was the error when you ran "read.xlsx"? Maybe try the `readxl` package instead. – MrFlick Jan 21 '21 at 16:38
  • Hey. I can't provide an exact example because it is work data and it is sensitive. Imagine you have 32 columns in every file in the directory, but when you merge them with my piece of code there are suddenly 33, where the 33rd is the `<?xml version="1.0" encoding="UTF-8" standalone="yes"?>` column filled with NA. What I would like to achieve is to get rid of that last column at the moment of merging the files, ideally not by removing it afterwards, and (if possible) to get rid of NA values across the entire data frame. The column issue is the crucial one for me at the moment. Thank you in advance – rainbowthug Jan 21 '21 at 17:05
  • I really can't imagine how that would happen. Maybe there was a problem when you converted your excel files to CSV files. How exactly did you do that? – MrFlick Jan 21 '21 at 17:07
  • I did that with a loop: `setwd("path"); require(openxlsx); file_list <- list.files(getwd()); for (file in file_list) { file.xl <- read.xlsx(file); write.csv(file.xl, file = sub("xlsx$", "csv", file), row.names = FALSE, quote = FALSE) }` – rainbowthug Jan 21 '21 at 17:09
  • @MrFlick sorry, I forgot to tag you – rainbowthug Jan 21 '21 at 17:14
