I am very new to R and am trying to learn by doing some simple manipulation of my own data (Raman spectra). Each of my .txt files has two columns, but the files have different numbers of rows. The first column is the Raman shift and the second is the intensity. I want to create a single data frame whose column names are "Raman shift", "Sample 1", "Raman shift", "Sample 2", "Raman shift", "Sample 3", and so on, where each "Sample #" column contains the intensity data from the file of that name.

I've tried the read_bulk function, but the resulting data isn't arranged the way I expected.

library(readbulk)  # read_bulk() is not base R; this assumes it comes from the readbulk package
all_data <- read_bulk(directory = ".", extension = "txt", fun = read.delim)

Is there a clean and easy way of doing this or am I thinking about this the wrong way?
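To make the target layout concrete, here is a minimal base R sketch of one way it could be built: read each file, pad the shorter spectra with `NA` rows, and bind them side by side. The demo directory, the file names `Sample 1.txt` and `Sample 2.txt`, and the header names `Raman` and `Intensity` are made up for illustration; the real files may differ.

```r
# Hypothetical demo files standing in for the real tab-delimited spectra
demo_dir <- file.path(tempdir(), "raman_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.table(data.frame(Raman = c(1466.90, 1466.02, 1465.14),
                       Intensity = c(45.6144, 45.6065, 45.5986)),
            file.path(demo_dir, "Sample 1.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(Raman = c(1466.74, 1465.87),
                       Intensity = c(17.1049, 28.5032)),
            file.path(demo_dir, "Sample 2.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)

# Read every .txt file and name each data frame after its file
files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
spectra <- lapply(files, read.delim)
names(spectra) <- tools::file_path_sans_ext(basename(files))

# Length of the longest spectrum; shorter ones are padded to this length
n_max <- max(vapply(spectra, nrow, integer(1)))

padded <- lapply(names(spectra), function(nm) {
  d <- spectra[[nm]][seq_len(n_max), ]  # indexing past the last row gives NA rows
  rownames(d) <- NULL
  names(d) <- c("Raman shift", nm)
  d
})

# cbind() on data frames keeps the repeated "Raman shift" names
wide <- do.call(cbind, padded)
print(wide)
```

Repeated column names are legal here but awkward to work with (`wide[["Raman shift"]]` only returns the first match), so a long layout with one row per sample and shift may be easier for later analysis.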

CDWard
  • I don't know the `read_bulk` function and it isn't base R. Perhaps you could make this question a bit more reproducible? This includes sample code you've attempted (including listing non-base R packages, and any errors/warnings received), sample *unambiguous* data (e.g., `dput(head(x))` or `data.frame(x=...,y=...)`), and intended output given that input. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans Sep 18 '20 at 03:30
  • I'll try and give more information on what I'm doing. This is a sample of data from three samples (Raman Shift, Intensity). Raman Intensity Raman Intensity Raman Intensity 1466.9 45.6144 1466.02 45.6065 1465.14 45.5986 1466.74 17.1049 1464.27 48.4402 1465.87 28.5032 1463.39 59.8275 1464.99 19.9488 1465.14 17.1049 1462.51 39.8781 1464.11 19.9453 1464.27 28.5032 You can see that the three samples do not have the same number of rows. Samples 1 and 3 have matching Raman Shift values but sample 2 does not. – CDWard Sep 25 '20 at 15:59
  • I'm using lapply from tidyverse to create the list of data from sample files: `filelist=list.files(pattern = "*.txt"); datalist=lapply(filelist,function(x)read.table(x,header = T)); alldata=rbindlist(datalist,use.names = T,fill = T)` The problem I'm having is that the data from Sample 3 is getting appended to the columns of Sample 1 because they share the same Raman Shift values. – CDWard Sep 25 '20 at 16:01
  • *"You can see"* ... unfortunately I can't, data in comments is usually pretty bad. Please [edit] your question and include it (in a code-block) there. FYI, `lapply` is base R, and `rbindlist` is from `data.table` (neither is from tidyverse). If you have a column in most but not in all, then use `rbindlist(..., fill=TRUE)` and the missing column will be `NA` for that particular frame/table (and valued correctly everywhere else). – r2evans Sep 25 '20 at 16:29
  • For instance, see `rbindlist(list(data.table(a=1:2,b=2:3),data.table(a=4)))` fails but `rbindlist(list(data.table(a=1:2,b=2:3),data.table(a=4)), fill = TRUE)` succeeds. (Assuming you have loaded `data.table` at some point :-) – r2evans Sep 25 '20 at 16:30
  • Sorry about the data before, this is another attempt. This is the data where S1R is sample 1 Raman Shift and S1I is sample 1 intensity. In reality each sample is a .txt file with two columns, tab delimited. `# sample data S1R=c(1466.9,1466.02,1465.14,1464.27,1463.39,1462.51) S1I=c(45.6144,45.6065,45.5986,48.4402,59.8275,39.8781) S2R=c(1466.74,1465.87,1464.99,1464.11) S2I=c(17.1049,28.5032,19.9488,19.9453) S3R=c(1465.14,1464.27) S3I=c(17.1049,28.5032)` – CDWard Sep 28 '20 at 20:27
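One possible way around the stacking problem described in these comments, sketched with the sample vectors from the last comment: keep a column that records which sample each row came from, so spectra with matching shift values stay distinguishable. This assumes the `data.table` package is installed; the column names `shift`, `intensity` and `sample` and the list names are illustrative choices, not taken from the original files.

```r
library(data.table)

# Sample vectors copied from the comment above
S1R <- c(1466.9, 1466.02, 1465.14, 1464.27, 1463.39, 1462.51)
S1I <- c(45.6144, 45.6065, 45.5986, 48.4402, 59.8275, 39.8781)
S2R <- c(1466.74, 1465.87, 1464.99, 1464.11)
S2I <- c(17.1049, 28.5032, 19.9488, 19.9453)
S3R <- c(1465.14, 1464.27)
S3I <- c(17.1049, 28.5032)

# One two-column table per sample, in a named list
spectra <- list(
  "Sample 1" = data.table(shift = S1R, intensity = S1I),
  "Sample 2" = data.table(shift = S2R, intensity = S2I),
  "Sample 3" = data.table(shift = S3R, intensity = S3I)
)

# idcol adds a "sample" column holding the list name each row came from
long <- rbindlist(spectra, idcol = "sample")

# Optional: one row per distinct shift, one intensity column per sample;
# a sample with no reading at a given shift gets NA there
wide <- dcast(long, shift ~ sample, value.var = "intensity")

print(long)
print(wide)
```

With real files, the named list could be built from `list.files()` and `lapply()` as in the earlier comment, with `names(datalist)` set to the file names before calling `rbindlist()`.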
