I have two files. Each file contains a certain # of rows and columns. The second column in each file contains a unique GENE ID that are identical in both files (they are just in different rows). So for example
File 1 could have the gene "LINC00273" in row 3 (in the second column) and File 2 could have "LINC00273" in row 4002 (in the second column).
What I want to do is I want to sort File 2 so that if the first GENE in row 1 of File 1 is "VAGAB" then the first GENE of row 1 of File 2 is "VAGAB" with its entire row intact (the row has coverage information so I need the numbers to be moved accordingly).
Example: File 1 [#] corresponds to the row number
[1] GENE100X VAGAB VAGAB 1.0 1.0 1.0 5.0 11198.0
...
[5000] GENE900X ZZZZ ZZZZ 1.0 0 4.2 1.0 78.0
File 2
[1] GENE44X AAAA AAAA 1.0 1.0 1.0 1.0 1.0
...
[5000] GENE106X VAGAB VAGAB 5.0 5.0 5.0 1.0 55.0
The expected output for file 2 would be:
[1] GENE106X VAGAB VAGAB 5.0 5.0 5.0 1.0 55.0
...
[5000] GENE300X ZZZZ ZZZZ 88.0 88.0 1.0 1.0 1.0
All files are tab delimited. All the numbers in examples (row numbers and column numbers) are made up. The only column that never changes is columns 2 and 3 these are the identical ones in both files so I want to sort using these.
I dont want to create a third file if it can be helped. I simply want the rows in file 2 to be identical to the rows in file 1 and the way to do this is to match them by their "Gene" name found in column 2 of both files.
R solutions preferred as that is what i'm currently learning, but I will take ANY solutions in any language and then I can use that as a template to come up with my own R solution as practice later.