I need to organize the information from a long (and old) text file containing thousands of items into a dataframe. Every item in the text file follows the same structure. My goal is to put each item in its own row of the dataframe.
Structure of the text file:
Title (number of books) Country
Date time (author) Page number CODES letter,letter...
Notes
An example of the content, showing the first 3 items:
Pride and Prejudice (5) United Kingdom
1981 10:23 h (Jane Austen) Page 241 CODES OB,IT,CA
Deposited by the G.M.W.
Brave New World (2) United Kingdom
1977 09:14 h (Aldous Huxley) Page 205 CODES OB,PU
Deposited by the E.L.
Wide Sargasso Sea (1) Jamaica
1989 16:51 h (Jean Rhys) Page 183 CODES OB,CA
Sent to the N.U.C.
I need to extract the first 6 elements of each item (title, number, country, date, time, author) and ignore the rest. The desired dataframe would be:
Title | NoBooks | Country | Date | Time | Author |
---|---|---|---|---|---|
Pride and Prejudice | 5 | United Kingdom | 1981 | 10:23 | Jane Austen |
Brave New World | 2 | United Kingdom | 1977 | 09:14 | Aldous Huxley |
Wide Sargasso Sea | 1 | Jamaica | 1989 | 16:51 | Jean Rhys |
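To make the target concrete, here is a minimal base R sketch of the kind of result I am after. It rests on assumptions I have not verified against the whole file: every item occupies exactly three lines, the number of books is the only parenthesized number on the first line, and the author is the only parenthesized text on the second line. The variable names (`lines`, `books`) and the file name in the comment are placeholders of mine.

```r
# The three example items from above; in practice this would be something like
# lines <- readLines("books.txt")   # hypothetical file name
lines <- c(
  "Pride and Prejudice (5) United Kingdom",
  "1981 10:23 h (Jane Austen) Page 241 CODES OB,IT,CA",
  "Deposited by the G.M.W.",
  "Brave New World (2) United Kingdom",
  "1977 09:14 h (Aldous Huxley) Page 205 CODES OB,PU",
  "Deposited by the E.L.",
  "Wide Sargasso Sea (1) Jamaica",
  "1989 16:51 h (Jean Rhys) Page 183 CODES OB,CA",
  "Sent to the N.U.C."
)

# Assumption: each item spans exactly three lines, so items start at 1, 4, 7, ...
idx   <- seq(1, length(lines), by = 3)
line1 <- lines[idx]       # Title (number of books) Country
line2 <- lines[idx + 1]   # Date time (author) ...; the third line (notes) is ignored

# Capture groups: title, number of books, country
pat1 <- "^(.*) \\(([0-9]+)\\) (.*)$"
# Capture groups: year, time, author; everything after the author is ignored
pat2 <- "^([0-9]{4}) ([0-9]{2}:[0-9]{2}) h \\(([^)]*)\\)"

# regexec/regmatches return the full match followed by each capture group;
# binding the per-line results gives one matrix row per item
m1 <- do.call(rbind, regmatches(line1, regexec(pat1, line1)))
m2 <- do.call(rbind, regmatches(line2, regexec(pat2, line2)))

books <- data.frame(
  Title   = m1[, 2],
  NoBooks = as.integer(m1[, 3]),
  Country = m1[, 4],
  Date    = as.integer(m2[, 2]),
  Time    = m2[, 3],
  Author  = m2[, 4],
  stringsAsFactors = FALSE
)
print(books)
```

This only covers lines shaped exactly like the example. A line that does not match either pattern would yield an empty match and be silently dropped by `rbind`, so the rows could become misaligned on messier data.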
I have found only two similar posts (converting multiple lines of text into a data frame and Converting text file into dataframe in R), but my file doesn't have key characters that could be used as separators.
Is there a way to separate my elements? I've found a solution using Python libraries, but I would like to do it with R. Any suggestions?