
[1] genos IDV_V 24 0.506472 14.0206 1.17 0 P
[2] Lcrop IDV_V 6 0.768434E-06 0.212724E-04 0.00 0 B
[3] Lgenos IDV_V 24 0.768434E-06 0.212724E-04 0.00 0 B
[4] Residual SCA_V 160 1.00000 27.6828 8.83 0 P

Hey, I have some text and would like to convert it into a data frame (7 columns in total). How should I do it? Hope to hear from you!

  • check `read.table` – Maël Feb 22 '23 at 14:23
  • Thank you! It was an irregular .txt file, so read.table did not work in the first place. I used read.csv and then extracted the rows I wanted (those four). It is now a one-column data frame, and each cell contains too many elements. I would like to separate them into different columns. @Maël – Aoxiang Xin Feb 22 '23 at 14:24

2 Answers


You could use readr::read_table() like this:

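library(readr)
library(dplyr)

# Read the whitespace-separated lines without a header row, then merge
# the first two fields (e.g. "genos" and "IDV_V") into a single column
read_table(data, col_names = FALSE) %>%
  mutate(X1 = paste(X1, X2)) %>%
  select(-X2)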

Output:

# A tibble: 4 × 7
  X1                X3          X4         X5    X6    X7 X8   
  <chr>          <dbl>       <dbl>      <dbl> <dbl> <dbl> <chr>
1 genos IDV_V       24 0.506       14.0        1.17     0 P    
2 Lcrop IDV_V        6 0.000000768  0.0000213  0        0 B    
3 Lgenos IDV_V      24 0.000000768  0.0000213  0        0 B    
4 Residual SCA_V   160 1           27.7        8.83     0 P  

Input:

data = c("genos IDV_V 24 0.506472 14.0206 1.17 0 P", "Lcrop IDV_V 6 0.768434E-06 0.212724E-04 0.00 0 B", 
"Lgenos IDV_V 24 0.768434E-06 0.212724E-04 0.00 0 B", "Residual SCA_V 160 1.00000 27.6828 8.83 0 P"
)
langtang
  • Thank you!!! Should I also type the "%>%"? Sorry, I am new to R, so I am not sure whether this is an operator or a Stack Overflow symbol. – Aoxiang Xin Feb 22 '23 at 15:06
  • Yes, that is a pipe operator for chaining operations (see `magrittr`, a package that loads with `dplyr`) – langtang Feb 22 '23 at 15:23
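
To make the pipe concrete, here is a minimal sketch (not part of the answer above) showing that `x %>% f(y)` is another way of writing `f(x, y)`. The small data frame `toy` and its values are made up for illustration, and the sketch assumes `dplyr` is installed.

```r
library(dplyr)

# Hypothetical toy data, only for illustration
toy <- data.frame(X1 = c("genos", "Lcrop"),
                  X2 = c("IDV_V", "IDV_V"),
                  X3 = c(24, 6))

# Piped version: each step receives the previous result as its first argument
piped <- toy %>%
  mutate(X1 = paste(X1, X2)) %>%
  select(-X2)

# Equivalent nested version: the same calls, written inside out
nested <- select(mutate(toy, X1 = paste(X1, X2)), -X2)

# Compare the two results
identical(piped, nested)
```

So `%>%` has to be typed as part of the code; it is an R operator, not page formatting.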

Are you looking for something like this?

# make a vector of the text data
mytext <- c("[1] genos IDV_V 24 0.506472 14.0206 1.17 0 P",
            "[2] Lcrop IDV_V 6 0.768434E-06 0.212724E-04 0.00 0 B",
            "[3] Lgenos IDV_V 24 0.768434E-06 0.212724E-04 0.00 0 B",
            "[4] Residual SCA_V 160 1.00000 27.6828 8.83 0 P")

# Converting text into a data frame
df <- read.table(text = mytext, sep = " ",
                 col.names = c("id", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8"))

# Drop the "[n]" index column (id) and the all-zero column (v7)
df[, c(1, 8)] <- NULL
df

        v1    v2  v3             v4            v5   v6 v8
1    genos IDV_V  24 0.506472000000 14.0206000000 1.17  P
2    Lcrop IDV_V   6 0.000000768434  0.0000212724 0.00  B
3   Lgenos IDV_V  24 0.000000768434  0.0000212724 0.00  B
4 Residual SCA_V 160 1.000000000000 27.6828000000 8.83  P
S-SHAAF
  • Thank you so much Salar!!! The elements in my text are not always separated by a single space; sometimes there are more. How should I deal with that? – Aoxiang Xin Feb 22 '23 at 14:54
  • But when I paste it into Stack Overflow, the spacing is automatically collapsed to single spaces. – Aoxiang Xin Feb 22 '23 at 14:54
  • If the elements are sometimes separated by more than one space, try changing sep=" " to sep="" (no space between the double quotes) in the read.table() function – S-SHAAF Feb 22 '23 at 15:10
  • Thank you! It works in some way ; ) But the first row is gone, and only the column names appear in its place. There is also a warning: "header and 'col.names' are of different lengths". – Aoxiang Xin Feb 22 '23 at 15:32
  • I randomly changed the spacing in your text to one or more spaces and it worked without any problem. – S-SHAAF Feb 22 '23 at 15:35
  • Thank you! I made a mistake in my code ; ) Thank you so much!!! – Aoxiang Xin Feb 22 '23 at 15:38
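
For reference, here is a minimal sketch of the `sep = ""` suggestion from the comments, using base R only. The irregular spacing in `mytext` is made up for illustration, since the original spacing was lost when the text was pasted. `header = FALSE` is written out explicitly: `read.table()` raises the "header and 'col.names' are of different lengths" warning when `header = TRUE` and the first line has a different number of fields than `col.names`. Whether that was the cause here is only a guess, as the asker does not say what the mistake was.

```r
# Same four lines as in the question, with made-up irregular spacing
mytext <- c("[1] genos   IDV_V 24  0.506472 14.0206 1.17 0 P",
            "[2] Lcrop IDV_V    6 0.768434E-06 0.212724E-04 0.00 0 B",
            "[3] Lgenos  IDV_V 24 0.768434E-06   0.212724E-04 0.00 0 B",
            "[4] Residual SCA_V 160   1.00000 27.6828 8.83  0 P")

# sep = "" splits on any run of whitespace (one or more spaces or tabs);
# header = FALSE keeps the first line as data rather than column names
df <- read.table(text = mytext, sep = "", header = FALSE,
                 col.names = c("id", "v1", "v2", "v3", "v4",
                               "v5", "v6", "v7", "v8"))

# Drop the "[n]" index column (id) and the all-zero column (v7)
df[, c(1, 8)] <- NULL
df
```

With `sep = " "`, each extra space would instead be read as an empty field, so the number of fields would differ from line to line.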