
This is a sample of my output table, which I want to parse back into the table layout shown in the image I will post.

     dput(head(tbl2,20))
structure(list(PMCID = c("PMC7362563", "PMC7362563", "PMC7362563", 
"PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", 
"PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", 
"PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", 
"PMC7362563", "PMC7362563"), table = c("Table 1", "Table 1", 
"Table 1", "Table 1", "Table 1", "Table 1", "Table 1", "Table 1", 
"Table 1", "Table 2", "Table 2", "Table 2", "Table 2", "Table 2", 
"Table 2", "Table 2", "Table 2", "Table 2", "Table 2", "Table 2"
), row = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 1L, 2L, 3L, 4L, 
5L, 6L, 7L, 8L, 9L, 10L, 11L), text = c("subheading=Achieved CR; Best overall response, n (%)=CR; Glasdegib + LDACN = 78=15 (19.2); LDAC aloneN = 38=1 (2.6)", 
"subheading=Did not achieve CR; Best overall response, n (%)=CRi; Glasdegib + LDACN = 78=4 (5.1); LDAC aloneN = 38=1 (2.6)", 
"subheading=Did not achieve CR; Best overall response, n (%)=PR; Glasdegib + LDACN = 78=5 (6.4); LDAC aloneN = 38=0", 
"subheading=Did not achieve CR; Best overall response, n (%)=PRi; Glasdegib + LDACN = 78=2 (2.6); LDAC aloneN = 38=0", 
"subheading=Did not achieve CR; Best overall response, n (%)=MLFS; Glasdegib + LDACN = 78=2 (2.6); LDAC aloneN = 38=0", 
"subheading=Did not achieve CR; Best overall response, n (%)=MR; Glasdegib + LDACN = 78=4 (5.1); LDAC aloneN = 38=4 (10.5)", 
"subheading=Did not achieve CR; Best overall response, n (%)=SD; Glasdegib + LDACN = 78=14 (17.9); LDAC aloneN = 38=9 (23.7)", 
"subheading=Did not achieve CR; Best overall response, n (%)=Treatment failure; Glasdegib + LDACN = 78=9 (11.5); LDAC aloneN = 38=7 (18.4)", 
"subheading=Did not achieve CR; Best overall response, n (%)=Not evaluable; Glasdegib + LDACN = 78=23 (29.5); LDAC aloneN = 38=16 (42.1)", 
"subheading=Age (years), n (%); Characteristic=45–64; Achieved CR: Glasdegib + LDACN = 15=0; Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=1 (1.6); Did not achieve CR: LDAC aloneN = 37=1 (2.7)", 
"subheading=Age (years), n (%); Characteristic=≥ 65; Achieved CR: Glasdegib + LDACN = 15=15 (100); Achieved CR: LDAC aloneN = 1=1 (100); Did not achieve CR: Glasdegib + LDACN = 63=62 (98.4); Did not achieve CR: LDAC aloneN = 37=36 (97.3)", 
"subheading=Age (years), n (%); Characteristic=Median (range); Achieved CR: Glasdegib + LDACN = 15=74 (65–87); Achieved CR: LDAC aloneN = 1=78 (78–78); Did not achieve CR: Glasdegib + LDACN = 63=77 (64–92); Did not achieve CR: LDAC aloneN = 37=76 (58–83)", 
"subheading=Sex, n (%); Characteristic=Female; Achieved CR: Glasdegib + LDACN = 15=5 (33.3); Achieved CR: LDAC aloneN = 1=1 (100); Did not achieve CR: Glasdegib + LDACN = 63=14 (22.2); Did not achieve CR: LDAC aloneN = 37=14 (37.8)", 
"subheading=Sex, n (%); Characteristic=Male; Achieved CR: Glasdegib + LDACN = 15=10 (66.7); Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=49 (77.8); Did not achieve CR: LDAC aloneN = 37=23 (62.2)", 
"subheading=ECOG PS, n (%); Characteristic=0; Achieved CR: Glasdegib + LDACN = 15=0; Achieved CR: LDAC aloneN = 1=1 (100); Did not achieve CR: Glasdegib + LDACN = 63=10 (15.9); Did not achieve CR: LDAC aloneN = 37=2 (5.4)", 
"subheading=ECOG PS, n (%); Characteristic=1; Achieved CR: Glasdegib + LDACN = 15=5 (33.3); Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=21 (33.3); Did not achieve CR: LDAC aloneN = 37=17 (45.9)", 
"subheading=ECOG PS, n (%); Characteristic=2; Achieved CR: Glasdegib + LDACN = 15=10 (66.7); Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=31 (49.2); Did not achieve CR: LDAC aloneN = 37=18 (48.6)", 
"subheading=ECOG PS, n (%); Characteristic=Not reported; Achieved CR: Glasdegib + LDACN = 15=0; Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=1 (1.6); Did not achieve CR: LDAC aloneN = 37=0", 
"subheading=Cytogenetic risk, n (%); Characteristic=Good/intermediate risk; Achieved CR: Glasdegib + LDACN = 15=12 (80.0); Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=41 (65.1); Did not achieve CR: LDAC aloneN = 37=22 (59.5)", 
"subheading=Cytogenetic risk, n (%); Characteristic=Poor risk; Achieved CR: Glasdegib + LDACN = 15=3 (20.0); Achieved CR: LDAC aloneN = 1=1 (100); Did not achieve CR: Glasdegib + LDACN = 63=22 (34.9); Did not achieve CR: LDAC aloneN = 37=15 (40.5)"
)), row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"
))

So I want to turn this data frame back into the original Table 1 and Table 2 layout.

I tried splitting the columns without much success. Apparently I am not able to define the separator pattern, because "=" and ":" occur throughout all the tables.
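To make the separator problem concrete, here is a minimal base-R sketch on a single cell. It assumes that fields are always separated by "; " and that the name/value "=" never has whitespace around it (unlike the " = " in "N = 78"). The helper name `split_key_value` is made up for illustration.

```r
# One cell of the `text` column, copied from row 1 of Table 1 above.
x <- "subheading=Achieved CR; Best overall response, n (%)=CR; Glasdegib + LDACN = 78=15 (19.2); LDAC aloneN = 38=1 (2.6)"

# Hypothetical helper: split one string into a named character vector.
split_key_value <- function(x) {
  # Assumption: "; " separates fields and never occurs inside a value.
  parts <- strsplit(x, "; ", fixed = TRUE)[[1]]
  # Assumption: the name/value "=" has a non-space character on both sides.
  pos <- regexpr("(?<=\\S)=(?=\\S)", parts, perl = TRUE)
  # Everything before the match is the column name, everything after is the value.
  setNames(substr(parts, pos + 1, nchar(parts)), substr(parts, 1, pos - 1))
}

parsed <- split_key_value(x)
print(parsed)
```

The ":" inside names such as "Achieved CR: LDAC aloneN = 1" is left alone here, so it simply stays part of the column name.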

Questions:

  1. How can I split the `text` column of the data frame so that it looks like the figure above?
  2. After separating the column, I would like to write each table into a folder named after its PMCID. If a paper has 3 tables, then 3 tables must be written as separate files in the same folder.

Any suggestions or help would be greatly appreciated.

PesKchan
  • That's a lot of things at once. What is the end goal? Do you want to get a LaTeX table? – s_baldur Jul 30 '20 at 10:34
  • No, I want a CSV or TSV table that I can use downstream. To be specific, my end goal is to curate data for a meta-analysis, so I need data from multiple papers. – PesKchan Jul 30 '20 at 10:49
  • @sindri_baldur this is the question whose output this is a part of: https://stackoverflow.com/questions/63138416/saving-tidypmc-output-which-forms-a-list-object-and-saving-it-into-individual-fi – PesKchan Jul 30 '20 at 11:15
  • There's something tricky with the `=` in text that is also used as a separator. The following code tidies the table a bit: `library(tidyverse); tbl2 %>% separate_rows(text, sep = "; ") %>% separate(text, c("name", "value"), sep = "(?<=\\S)=(?=\\S)")` – Bas Jul 30 '20 at 18:57
  • Let me give it a try. – PesKchan Jul 30 '20 at 19:33
  • @Bas if you have seen the table, you perhaps have an idea of what I'm trying to get as output. But if it doesn't work, then I will directly save the list output, I suppose. – PesKchan Jul 30 '20 at 19:43
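
Following the separator idea from the comments, a rough base-R sketch of both steps (rebuild each table, then write one CSV per table into a folder named after the PMCID) might look like this. It is not tested on the full tidypmc output. The names `tbl2_demo`, `split_key_value`, `rebuild_table`, and `write_tables`, and the output folder under `tempdir()`, are all assumptions made for illustration. `tbl2_demo` holds three rows copied from the `dput` above.

```r
# Small stand-in for tbl2: three rows copied from the dput output above.
tbl2_demo <- data.frame(
  PMCID = "PMC7362563",
  table = c("Table 1", "Table 1", "Table 2"),
  row   = c(1L, 2L, 5L),
  text  = c(
    "subheading=Achieved CR; Best overall response, n (%)=CR; Glasdegib + LDACN = 78=15 (19.2); LDAC aloneN = 38=1 (2.6)",
    "subheading=Did not achieve CR; Best overall response, n (%)=CRi; Glasdegib + LDACN = 78=4 (5.1); LDAC aloneN = 38=1 (2.6)",
    "subheading=Sex, n (%); Characteristic=Male; Achieved CR: Glasdegib + LDACN = 15=10 (66.7); Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=49 (77.8); Did not achieve CR: LDAC aloneN = 37=23 (62.2)"
  ),
  stringsAsFactors = FALSE
)

# Split one text cell into a named vector (name = column header, value = cell).
# Assumptions: "; " separates fields; the name/value "=" has no spaces around it.
split_key_value <- function(x) {
  parts <- strsplit(x, "; ", fixed = TRUE)[[1]]
  pos <- regexpr("(?<=\\S)=(?=\\S)", parts, perl = TRUE)
  setNames(substr(parts, pos + 1, nchar(parts)), substr(parts, 1, pos - 1))
}

# Turn the rows of ONE table into a wide data frame, one column per header.
rebuild_table <- function(df) {
  cells <- lapply(df$text, split_key_value)
  cols  <- unique(unlist(lapply(cells, names)))
  # Headers missing from a given row become NA in that row.
  mat   <- do.call(rbind, lapply(cells, function(cell) unname(cell[cols])))
  wide  <- as.data.frame(mat, stringsAsFactors = FALSE)
  names(wide) <- cols
  data.frame(row = df$row, wide, check.names = FALSE, stringsAsFactors = FALSE)
}

# Write <out_root>/<PMCID>/<Table_n>.csv for every PMCID/table combination.
write_tables <- function(tbl, out_root) {
  groups <- split(tbl, list(tbl$PMCID, tbl$table), drop = TRUE)
  paths <- character(0)
  for (g in groups) {
    dir <- file.path(out_root, g$PMCID[1])
    dir.create(dir, recursive = TRUE, showWarnings = FALSE)
    file <- file.path(dir, paste0(gsub(" ", "_", g$table[1], fixed = TRUE), ".csv"))
    write.csv(rebuild_table(g), file, row.names = FALSE)
    paths <- c(paths, file)
  }
  paths
}

out <- write_tables(tbl2_demo, file.path(tempdir(), "pmc_tables"))
print(out)
```

With the real data, `write_tables(tbl2, "some/output/folder")` would be the equivalent call. When reading the files back, `read.csv(..., check.names = FALSE)` keeps headers such as "LDAC aloneN = 38" unchanged.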

0 Answers