I have some data that looks like the following:
"Name","Length","Startpos","Endpos","ID","Start","End","Rev","Match"
"Name_1",140,0,138,"1729",11,112,0,1
"Name_2",132,0,103,"16383",23,232,0,1
"Name_3",102,0,100,"1729",22,226,1,1
"Name_4",112,0,130,"16383",99,992,1,1
"Name_5",132,0,79,"1729",81,820,1,1
"Name_6",112,0,163,"16383",81,820,0,1
"Name_7",123,0,164,"1729",54,542,1,1
"Name_8",123,0,65,"16383",28,289,0,1
I have used the order function to sort the rows first by "ID" and then by "Start":
"Name","Length","Startpos","Endpos","ID","Start","End","Rev","Match"
"Name_1",140,0,138,"1729",11,112,0,1
"Name_3",102,0,100,"1729",22,226,1,1
"Name_7",123,0,164,"1729",54,542,1,1
"Name_5",132,0,79,"1729",81,820,1,1
"Name_2",132,0,103,"16383",23,232,0,1
"Name_8",123,0,65,"16383",28,289,0,1
…
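For reference, the sorting step can be sketched roughly as follows. The data frame name dat is a placeholder of mine, only the columns needed for the pairing are included, and ID is stored as a number so that 1729 sorts before 16383 as in the output above.

```r
# Hypothetical data frame `dat` holding the relevant columns of the example data
dat <- data.frame(
  Name   = paste0("Name_", 1:8),
  Length = c(140, 132, 102, 112, 132, 112, 123, 123),
  ID     = rep(c(1729, 16383), times = 4),
  Start  = c(11, 23, 22, 99, 81, 81, 54, 28),
  stringsAsFactors = FALSE
)

# order() returns the row permutation that sorts by ID first, then by Start
sorted <- dat[order(dat$ID, dat$Start), ]
print(sorted)
```

If ID were kept as a character column, order() would compare it as text, which may give a different group order.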
Now I need to do two things: First I need to create a table that pairs up consecutive rows within each ID group. For an ID group containing the names (1,2,3,4,5), I need to create the pairs (1+2, 2+3, 3+4, 4+5). So for the above example, the pairs for ID 1729 would be (Name_1+Name_3, Name_3+Name_7, Name_7+Name_5).
My output for the above example would look like this:
"Start_Name_X","Start_Name_Y","Length_Name_X","Length_Name_Y","Name_Name_X","Name_Name_Y","ID","New column"
11, 22, 140, 102, "Name_1", "Name_3", 1729,,
22, 54, 102, 123, "Name_3", "Name_7", 1729,,
54, 81, 123, 132, "Name_7", "Name_5", 1729,,
23, 28, 132, 123, "Name_2", "Name_8", 16383,,
…
So I need to pair each row with the next one in ascending order of "Start", but only within each "ID".
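To make the pairing rule concrete, here is a minimal sketch of what I mean, written without an explicit loop. The names dat, make_pairs and pair_table are placeholders of mine, and the approach (split by ID, then line up rows 1..n-1 against rows 2..n) is just one possible way, not necessarily the best one.

```r
# Hypothetical data frame `dat` holding the relevant columns of the example data
dat <- data.frame(
  Name   = paste0("Name_", 1:8),
  Length = c(140, 132, 102, 112, 132, 112, 123, 123),
  ID     = rep(c(1729, 16383), times = 4),
  Start  = c(11, 23, 22, 99, 81, 81, 54, 28),
  stringsAsFactors = FALSE
)
sorted <- dat[order(dat$ID, dat$Start), ]

# Pair every row of one ID group with the row that follows it
make_pairs <- function(g) {
  n <- nrow(g)
  if (n < 2) return(NULL)   # a single row has no neighbour to pair with
  x <- g[-n, ]              # rows 1 .. n-1 (the "X" side of each pair)
  y <- g[-1, ]              # rows 2 .. n   (the "Y" side of each pair)
  data.frame(
    Start_Name_X  = x$Start,
    Start_Name_Y  = y$Start,
    Length_Name_X = x$Length,
    Length_Name_Y = y$Length,
    Name_X        = x$Name,
    Name_Y        = y$Name,
    ID            = x$ID,
    New_Column    = NA,     # left empty, to be filled in later
    stringsAsFactors = FALSE
  )
}

# split() gives one data frame per ID; rbind stacks the per-group results
pair_table <- do.call(rbind, lapply(split(sorted, sorted$ID), make_pairs))
rownames(pair_table) <- NULL
print(pair_table)
```

Because the pairing happens inside each piece returned by split, the last row of one ID is never paired with the first row of the next.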
I am thinking it should be done with a for loop, but I am a newbie. Pulling the data into a new table with a for loop confuses me in itself, and I have no idea how to apply the constraint of doing it within each unique "ID".
I have experimented with splitting the data into groups according to ID using split, but that doesn't really get me any further with creating the new data table.
I have created the new data table with the following code:
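# One-row data frame whose values are the column headers of the output file
column_names <- data.frame(Start_Name_X = "Start_Name_X",
  Start_Name_Y = "Start_Name_Y", Length_Name_X = "Length_Name_X",
  Length_Name_Y = "Length_Name_Y", Name_X = "Name_X", Name_Y = "Name_Y", ID = "ID",
  New_Column = "New_Column")
# Write only the header row; the data rows are meant to be added afterwards
write.table(column_names, file = "datatabel.csv", row.names = FALSE, append = FALSE,
  col.names = FALSE, sep = ",", quote = TRUE)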
This is the table I would like to write to. Is a for loop the right way to handle this, and if so, can you give me a few clues on how to start?
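To show the kind of for loop I have in mind, here is a rough sketch. The names dat, out_file and result are placeholders of mine, and out_file is a temporary file standing in for "datatabel.csv" so the example can be run without touching real data. I am not sure this is the idiomatic way to do it.

```r
# Hypothetical data frame `dat` holding the relevant columns of the example data
dat <- data.frame(
  Name   = paste0("Name_", 1:8),
  Length = c(140, 132, 102, 112, 132, 112, 123, 123),
  ID     = rep(c(1729, 16383), times = 4),
  Start  = c(11, 23, 22, 99, 81, 81, 54, 28),
  stringsAsFactors = FALSE
)

out_file <- tempfile(fileext = ".csv")  # stand-in for "datatabel.csv"

# Write the header row first, as a one-row character matrix
header <- c("Start_Name_X", "Start_Name_Y", "Length_Name_X", "Length_Name_Y",
            "Name_X", "Name_Y", "ID", "New_Column")
write.table(t(header), file = out_file, row.names = FALSE, col.names = FALSE,
            sep = ",", quote = TRUE)

# Outer loop: one pass per unique ID; inner loop: one pass per consecutive pair
for (id in unique(dat$ID)) {
  g <- dat[dat$ID == id, ]
  g <- g[order(g$Start), ]   # ascending Start within this ID
  if (nrow(g) < 2) next      # nothing to pair in a one-row group
  for (i in seq_len(nrow(g) - 1)) {
    pair_row <- data.frame(g$Start[i], g$Start[i + 1],
                           g$Length[i], g$Length[i + 1],
                           g$Name[i], g$Name[i + 1],
                           id, NA, stringsAsFactors = FALSE)
    # append = TRUE adds the row below what is already in the file
    write.table(pair_row, file = out_file, append = TRUE, row.names = FALSE,
                col.names = FALSE, sep = ",", quote = TRUE)
  }
}

result <- read.csv(out_file)
print(result)
```

Writing one row at a time like this opens the file once per pair, so for large data it would probably be better to build the whole table in memory and write it once.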