I read in a file and R returns a list, like this:
list 1: "1)" "Seo " "agad " "na " "ciad " "faclan " "a "
list 2: "cannteil " "(canntail) " "a-staigh " "dhan "
list 3: "2)" "Seo " "sinn, " "sin " "direach " "fuirich…"

What I want is a vector in which every element of [[i]] is prefixed with a number. If the 1st element of [[i]] contains a number, all elements of [[i]] get that number; if it doesn't, all elements of [[i]] get the number from the most recent line that had one, like this:

"1\t1)" "1\tSeo " "1\tagad " "1\tna " "1\tciad " "1\tfaclan " "1\ta " ... "2\t2)" "2\tSeo " "2\tsinn, " ...

Could anyone tell me the code for this? Also, is there a way to get a vector containing only the number corresponding to each word, without pasting it before each word?

Thank you

My code was the following, but it didn't give me what I wanted: all the elements get the number 1, even those in the list starting with the number 2. What part of the code is wrong?

word=""
temp=""
for (i in 1:length(file)) {
  if (grepl('\\d+\\)',file[[i]][1])) {
    snum=grep('\\d+',file[[i]][1])
    temp=paste(snum, file[[i]], sep="\t")
  } else {
    temp=paste(snum,utter.short[[i]],sep="\t")
  }
  word=c(word,temp)
}
charlotte
  • Welcome to Stack Overflow! Please provide us with a reproducible example: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Roman Luštrik Oct 07 '12 at 08:22
  • Questions of the form "Could you tell me the code for this?" are not generally answered here. This site works best for answering questions about specific problems in code you are already working on. As @RomanLuštrik says, if you have a block showing your attempt you're much more likely to get help. – Richard Oct 07 '12 at 08:43

2 Answers

Assuming you have a list of lists like...

 list1 = list("1)", "Seo ", "agad ", "na ", "ciad ", "faclan ", "a ")
 list2 = list("cannteil ", "(canntail) ", "a-staigh ", "dhan ")
 list3 = list("2)", "Seo ", "sinn, ", "sin ", "direach ", "fuirich…")

 biglist = list(list1, list2, list3)

Here is an inelegant, inefficient solution that works with this setup:

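 counter = "1"            # used until the first numbered line is seen
 result = character(0)
 for (i in seq_along(biglist)) {
   # digits found in the first element of the line, "" if there are none
   first = gsub("\\D", "", biglist[[i]][[1]])
   if (nchar(first) > 0) {
     counter = first
     # drop the label itself ("1)"); delete this line to keep it in the output
     biglist[[i]] = biglist[[i]][-1]
   }
   # prefix every word of the line with the current number
   result = c(result, paste(counter, unlist(biglist[[i]]), sep="\t"))
 }
 result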

This can deal with any number of lines and any line length, as long as a numbered line carries its number in the first term and the lines are stored in order.
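
As for why the loop in the question labels everything with 1: `grep()` returns the positions of the matching elements, not the matched text, so on a length-one input it can only ever return 1. A minimal sketch of the difference, using `sub()` as one possible way to pull out the digits (`first`, `idx` and `num` are names made up for this illustration):

```r
first <- "2)"

# grep() gives the index of the matching element, which is always 1 here
idx <- grep("\\d+", first)

# sub() with a capture group keeps only the leading digits
num <- sub("^(\\d+).*", "\\1", first)

idx  # 1
num  # "2"
```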

Depending on what this is for, there is probably a better way you can read and store the data.
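
One possible alternative layout, sketched under the assumption that the input is a list of character vectors and that the very first line carries a number: keep one row per word in a data frame, with the line number in its own column. The names `lines`, `first`, `has_num`, `num` and `words` are invented for this illustration.

```r
# Hypothetical input: one character vector per line of the file
lines <- list(
  c("1)", "Seo ", "agad "),
  c("cannteil ", "(canntail) "),
  c("2)", "Seo ", "sinn, ")
)

# First element of each line, and whether it looks like "1)", "2)", ...
first   <- vapply(lines, function(x) x[1], character(1))
has_num <- grepl("^\\d+\\)", first)

# Number for each line: its own number if present, else the last one seen
# (assumes the very first line carries a number)
num <- as.integer(sub("\\D.*$", "", first[has_num]))[cumsum(has_num)]

# One row per word; the number is repeated for every word in its line
words <- data.frame(
  num  = rep(num, lengths(lines)),
  word = unlist(lines),
  stringsAsFactors = FALSE
)
```

With this layout, `words$num` is the vector of numbers on its own, and `paste(words$num, words$word, sep = "\t")` rebuilds the tab-prefixed strings when they are needed.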

Fridiculous

This version is more flexible and easier to understand (elegance?). It handles numbers in any order and a missing first number, and it is easy to change and maintain.

# sample data
list0a = list("cannteil " ,"(canntail) ", "a-staigh " ,"dhan ")
list0b = list("cannteil " ,"(canntail) ", "a-staigh " ,"dhan ")
list1 = list("3)","Seo ","agad ","na ", "ciad ", "faclan " ,"a ")
list2 = list("cannteil " ,"(canntail) ", "a-staigh " ,"dhan ")
list3 = list("2)", "Seo ", "sinn, ", "sin " ,"direach ", "fuirich…")

# separate lists to test on
biglist = list(list1,list2,list3)
biglist2 = list(list0a,list0b,list1, list2, list3)

# get number vector
numlist <- sapply(biglist,function(x){
  as.numeric(gsub('[^0-9]','',x[1]))
})

# fill in gaps with indexing, drops leading items without numbers
numorder <- cumsum(!is.na(numlist))
numreplaced <- na.omit(numlist)[numorder]

# handle missing first numbers however you want. omit if guaranteed first element has number
numfinal <- c(rep('0',times = sum(numorder == 0)),numreplaced)

# make the strings as desired
Map(function(x,num){
  paste0(num,'\t',x)
},x = biglist,num = numfinal)
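
The `Map()` call returns a list with one character vector per line. If a single flat vector is wanted, or only the numbers without the words (the second part of the question), something along these lines should work. The sample data is repeated in shortened form so the block runs on its own, and `tagged`, `flat` and `nums` are names invented for this sketch.

```r
biglist <- list(
  list("3)", "Seo ", "agad "),
  list("cannteil ", "(canntail) "),
  list("2)", "Seo ", "sinn, ")
)
# what the fill-in steps produce for this shortened data
numfinal <- c("3", "3", "2")

tagged <- Map(function(x, num) paste0(num, "\t", x), x = biglist, num = numfinal)

# one flat character vector instead of a list of vectors
flat <- unlist(tagged)

# numbers only, repeated once per word in each line
nums <- rep(numfinal, lengths(biglist))
```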
ARobertson