I have a long (~700,000 rows) CSV with 2 columns. One column with locations, which are written as cg and 8 digits following (ex. cg12345678) and the corresponding column has a r value (just a digit between -1 and 1). The other CSV is much smaller (~20 rows) and only contains one column with some cg locations. I want to print only the r values of the big excel that correspond to the cg locations of the small excel.
Here is an abbreviated example of the 2 column CSV:
cg07881041 -0.0192398465425986
*cg03513874 -0.339360471677652
cg25458538 0.0451334622844003
*cg09261072 0.208770797055665
cg02404579 -0.0166889943192668
cg22585117 -0.340873841270817
*cg25552317 -0.0372823043801581
Here is an example of the one column CSV:
cg08829765
*cg25552317
*cg09261072
cg14370485
*cg03513874
cg10855276
cg12406992
I added stars to the 3 locations that match in this example. So I would want to print the following matrix:
Matching cg corresponding rvalue
cg03513874 -0.339360471677652
cg09261072 0.208770797055665
cg25552317 -0.0372823043801581