Approach without merge (not suggested)
This approach uses only plain Python, but it is not recommended because it is inefficient.
def splitBar(strValue):
    # Strip the trailing newline and split the row on the "|" delimiter.
    return strValue.strip().split("|")

with open("df.csv") as f:
    f.readline()  # skip the header line
    df = f.readlines()
df = list(map(splitBar, df))
idDf = [x[0] for x in df]
age = [x[1] for x in df]

with open("df_id.csv") as f:
    f.readline()  # skip the header line
    df_id = f.readlines()
df_id = list(map(splitBar, df_id))
idDf_id = [x[0] for x in df_id]
value = [x[1] for x in df_id]

# Look up each age by id, so the two files need not list the ids in the same order.
ageById = dict(zip(idDf, age))
[[id_, value[index], ageById[id_]] for index, id_ in enumerate(idDf_id) if id_ in ageById]
Output
[['11', '100', '23'],
['22', '109', '21'],
['33', '400', '25'],
['44', '90', '20'],
['55', '1000', '30']]
Note that I assumed the files you are working with are named df.csv and df_id.csv. Also note that the output matrix has three columns: the first is the id, the second is the value, and the last is the age.
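For comparison, here is a minimal sketch of the same join written with the standard csv module and a dictionary lookup. It is only an illustration under assumptions: the header names "age" and "value" are guesses (only "id" is known from the merge call), the helper names read_rows and join_on_id are hypothetical, and in-memory strings stand in for df.csv and df_id.csv so the snippet runs on its own.

```python
import csv
import io

# Stand-ins for df.csv and df_id.csv. The header names "age" and "value"
# are assumptions; only "id" is confirmed by the original answer.
DF_CSV = "id|age\n11|23\n22|21\n33|25\n44|20\n55|30\n"
DF_ID_CSV = "id|value\n11|100\n22|109\n33|400\n44|90\n55|1000\n"


def read_rows(handle):
    """Parse a pipe-delimited file with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="|"))


def join_on_id(value_rows, age_rows):
    """Inner-join the two row lists on "id", using a dict for the lookup."""
    age_by_id = {row["id"]: row["age"] for row in age_rows}
    return [
        [row["id"], row["value"], age_by_id[row["id"]]]
        for row in value_rows
        if row["id"] in age_by_id
    ]


result = join_on_id(
    read_rows(io.StringIO(DF_ID_CSV)),
    read_rows(io.StringIO(DF_CSV)),
)
print(result)
```

With real files, passing open("df_id.csv", newline="") and open("df.csv", newline="") to read_rows in place of the StringIO objects should behave the same way.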
Merge approach (suggested)
If you are using the pandas module, consider using the merge function:
df_id.merge(df, on="id")
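To show the merge call end to end, here is a small sketch that also loads the two pipe-delimited files. As before, the column names "age" and "value" are assumptions, and in-memory strings replace df.csv and df_id.csv so the example is self-contained.

```python
import io

import pandas as pd

# Stand-ins for df.csv and df_id.csv. The column names "age" and "value"
# are assumptions; only "id" is confirmed by the original answer.
DF_CSV = "id|age\n11|23\n22|21\n33|25\n44|20\n55|30\n"
DF_ID_CSV = "id|value\n11|100\n22|109\n33|400\n44|90\n55|1000\n"

# sep="|" tells read_csv that the fields are pipe-delimited.
df = pd.read_csv(io.StringIO(DF_CSV), sep="|")
df_id = pd.read_csv(io.StringIO(DF_ID_CSV), sep="|")

# merge performs an inner join by default, keeping only ids found in both frames.
merged = df_id.merge(df, on="id")
print(merged)
```

Because the default is an inner join, ids that appear in only one file are dropped. Passing how="left" or how="outer" to merge keeps them, with missing values filled in as NaN.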