
I have a data file with a column header, but the position of the header row is not fixed. Can I read the first non-empty value in the first column to get the index of the header row, so that I can process the file?

mydata.txt

                      test   34   45
                      rt     45   56 
                      tet3     67   56
       Col1   Col2    Col3   Col4  Col5
        45    45      23     56    12 
        34    45      67     65    32 
        45    67      78     90    54 
        56    43      32     12    45


   mydata = read.table("mydata.txt")
   mydata[,1]     # how do I find the first non-blank value in the first column?

To simplify the problem above:

df<-c("","","",34,23,45)

How can I find the first non-blank value in df?

  • Welcome to SO. You could improve your question. Please read [how to provide minimal reproducible examples in R](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#answer-5963610). Then edit & improve it accordingly. A good post usually provides minimal input data, the desired output data & code tries - all copy-paste-run'able (not screenshots) in a new/clean R session. – lukeA Oct 13 '16 at 13:46
  • From what you've posted it seems like the .txt file actually contains two data frames, one small with row names "test", "rt" and "tet3", and a larger one with column names "Col1" through "Col5". Is this the case? – AkselA Oct 13 '16 at 14:30
  • It is one file containing data, but I need to find the index of the first non-blank value in a column (here Col1). – manish kumar Oct 13 '16 at 14:33

2 Answers


OK, for example:

writeLines(tf <- tempfile(fileext = ".txt"), text = "
             test   34   45
              rt     45   56 
              tet3     67   56
Col1   Col2    Col3   Col4  Col5
45    45      23     56    12 
34    45      67     65    32 
45    67      78     90    54 
56    43      32     12    45")
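# fill = TRUE pads the short rows above the header with empty fields
mydata <- read.table(tf, fill = TRUE, stringsAsFactors = FALSE)
# the header is the first row whose 4th field is not empty
idx <- which.min(mydata[, 4] == "")
df <- mydata[-(1:idx), ]
# convert the character columns back to their natural types
df <- as.data.frame(lapply(df, type.convert, as.is = TRUE))
names(df) <- unlist(mydata[idx, ], recursive = FALSE, use.names = FALSE)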

gives you

str(df)
# 'data.frame': 4 obs. of  5 variables:
#  $ Col1: int  45 34 45 56
#  $ Col2: int  45 45 67 43
#  $ Col3: int  23 67 78 32
#  $ Col4: int  56 65 90 12
#  $ Col5: int  12 32 54 45
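
A possible variation, sketched here under assumptions rather than taken from the question: instead of reading the preamble rows into the data frame, count the fields on each line with `count.fields()` and skip straight to the header. This assumes the header is the first line with the largest number of whitespace-separated fields, which holds for the sample file but may not hold for every input. The names `n_fields`, `header_row`, and `df2` are illustrative.

```r
# Recreate the sample file; the path and variable names are illustrative
tf <- tempfile(fileext = ".txt")
writeLines(c(
  "             test   34   45",
  "              rt     45   56 ",
  "              tet3     67   56",
  "Col1   Col2    Col3   Col4  Col5",
  "45    45      23     56    12 ",
  "34    45      67     65    32 ",
  "45    67      78     90    54 ",
  "56    43      32     12    45"
), tf)

# Number of whitespace-separated fields on each line
n_fields <- count.fields(tf)

# Assumption: the header is the first line with the maximum field count
header_row <- which(n_fields == max(n_fields))[1]

# Skip everything above the header and let read.table() parse it as names
df2 <- read.table(tf, skip = header_row - 1, header = TRUE)
print(header_row)
str(df2)
```

With this approach the columns are typed directly by `read.table()`, so no separate `type.convert()` step is needed. If the file may contain blank lines, check how they affect the line index, since `count.fields()` ignores blank lines by default.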
lukeA

Trying to answer your "simplified" problem:

df <- c("", "", "", 34, 23, 45)

The purrr package provides such functions with detect() and detect_index():

install.packages("purrr", repos = "https://cloud.r-project.org")
library(purrr)
detect_index(df, function(x) x != "")
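
For comparison, the same lookup can be sketched in base R without an extra package. This is a minimal sketch that assumes "blank" means the empty string `""`; the names `first_idx`, `first_idx2`, and `first_val` are illustrative.

```r
# Same vector as in the question; c() coerces the numbers to character
df <- c("", "", "", 34, 23, 45)

# Index of the first non-empty string
first_idx <- which(nzchar(df))[1]

# Equivalent, using the base R helpers Position() and Find()
first_idx2 <- Position(function(x) x != "", df)
first_val <- Find(function(x) x != "", df)

print(first_idx)
print(first_val)
```

If every element is empty, `which(nzchar(df))[1]` and `Position()` both return `NA`, and `Find()` returns `NULL`, so the caller may want to check for that case.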
Aurèle