I have to collect data in R, that has been given to me in a xls format, but when I open it with Excel it says that the extension and the format don't match, the file suggests I should save it as a .txt file.
The file I have to use typically contains 3 sections, with different tables in them, which have different sizes and column names. The sections are announced by a title between square brackets. This is a simplified version of my file.
I am only interested in the third section, called '[DATA]'. So far I have manually saved it as an xlsx file and worked my way to use the data I was interested in using read_excel. After reading the whole sheet in R I collected the row where the title '[DATA]' was (it can vary from file to file, I can't select a row number as in readLines), then I could select the table underneath after taking the column names (T, Time, Tension etc.) as my new dataframe's column names. I'd like to be able to do something similar starting from a txt file, because I have a lot of files to work with and they are formatted exactly the same way.
I've tried several functions to read the file as a .txt, like
1A = data.table::fread(file, header = F, fill=F, sep = '\t')
2) A = read.delim(file)
3)A = data.frame(readLines(file))
4) A = read.table(file)
It saves the first table from SETUP and stops early, with this error message "Stopped early on line 25. Expected 24 fields but found 1. Consider fill=TRUE and comment.char=. First discarded non-empty line: <<Number of Duts: 24>>" . If I type fill = TRUE I get the same result as 3.
It makes a big column of all the cells, line after line and cell by cell. It becomes difficult to rearrange the data in a table from there.
It makes a big column again but each line of the file is a cell in the dataframe, and the content of the cell is a string of all the numbers, separated by \t . Example for line 8: experiment1\group1\t0\t7200\t0.001\t"
I get this error message : Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 4 did not have 2 elements
I don't know which of these functions is the best lead for this task.
NB : The numbers dispayed in the error messages might be different of what I would get with the example, but I don't even get the error messages with the example (when i made it, Excel required me to put an apostrophy in the cell so the 'minus' sign wouldn't be seen as a formula, so I did. I then saved the file in txt and xls, and even added the xls extension to the txt file to create an incoherence of extension like in my original file. It works in any case.)
Thanks for your help !