I have two csv files:
sample1.csv
serial,name,surname,dob,address,phone,zip,country
1,john,smith,1985-12-13,Add1,1111,11,NPL
2,david,anderson,1975-1-23,Add2,2222,22,NPL
3,shyam,luke,1981-2-16,Add3,3333,33,NPL
4,donald,shaw,1972-7-9,Add4,4444,44,NPL
5,steve,singh,1980-11-1,Add5,5555,55,NPL
6,mike,shrestha,1983-5-19,Add6,6666,66,NPL
7,harry,phelp,1979-9-27,Add7,7777,77,NPL
8,sam,butler,1988-3-19,Add8,8888,88,NPL
sample2.csv
name,surname,dob,codenum
david,smith,1981-12-13,ds1213
john,smith,1985-12-13,js1213
donald,phelp,1972-7-9,dp79
donald,shaw,1972-7-9,ds79
mike,shrestha,1983-5-19,ms519
mike,butler,1981-5-19,mb519
shyam,luke,1981-2-16,sl216
shyam,luke,1980-1-16,sl116
I want to match the columns name, surname and dob in these two csv files and generate a new csv such that:
- all columns from sample2 are present
- specific columns from sample1 (serial, phone, zip) are present
Final csv should look like:
final.csv
serial,name,surname,dob,codenum,phone,zip
1,john,smith,1985-12-13,js1213,1111,11
4,donald,shaw,1972-7-9,ds79,4444,44
6,mike,shrestha,1983-5-19,ms519,6666,66
3,shyam,luke,1981-2-16,sl216,3333,33
I searched through various answers but couldn't find a solution that fits my requirement.
How can I do this in an efficient way?
(Please suggest only pandas solutions.)
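One possible approach, as a minimal sketch rather than a definitive solution: an inner merge on name, surname and dob, with sample2 on the left so its row order is kept. The data is inlined through io.StringIO only to make the snippet self-contained; with real files, pd.read_csv("sample1.csv") and pd.read_csv("sample2.csv") would be used instead. The names SAMPLE1, SAMPLE2 and keys are illustrative, and reading everything with dtype=str is an assumption made so that values such as zip are kept exactly as written.

```python
import io

import pandas as pd

# Inlined copies of the two files from the question (assumption: real code
# would read sample1.csv and sample2.csv from disk instead).
SAMPLE1 = """serial,name,surname,dob,address,phone,zip,country
1,john,smith,1985-12-13,Add1,1111,11,NPL
2,david,anderson,1975-1-23,Add2,2222,22,NPL
3,shyam,luke,1981-2-16,Add3,3333,33,NPL
4,donald,shaw,1972-7-9,Add4,4444,44,NPL
5,steve,singh,1980-11-1,Add5,5555,55,NPL
6,mike,shrestha,1983-5-19,Add6,6666,66,NPL
7,harry,phelp,1979-9-27,Add7,7777,77,NPL
8,sam,butler,1988-3-19,Add8,8888,88,NPL
"""

SAMPLE2 = """name,surname,dob,codenum
david,smith,1981-12-13,ds1213
john,smith,1985-12-13,js1213
donald,phelp,1972-7-9,dp79
donald,shaw,1972-7-9,ds79
mike,shrestha,1983-5-19,ms519
mike,butler,1981-5-19,mb519
shyam,luke,1981-2-16,sl216
shyam,luke,1980-1-16,sl116
"""

# Columns that must agree in both files for a row to be kept.
keys = ["name", "surname", "dob"]

# dtype=str keeps every value as text, so dob is compared as written.
df1 = pd.read_csv(io.StringIO(SAMPLE1), dtype=str)
df2 = pd.read_csv(io.StringIO(SAMPLE2), dtype=str)

# Keep only the key columns plus the wanted columns from sample1, then
# inner-merge so that only rows present in both files survive.
final = df2.merge(df1[keys + ["serial", "phone", "zip"]], on=keys, how="inner")

# Reorder the columns to the layout requested for final.csv.
final = final[["serial", "name", "surname", "dob", "codenum", "phone", "zip"]]

# With no path, to_csv returns the CSV text; pass "final.csv" to write a file.
csv_text = final.to_csv(index=False)
print(csv_text)
```

Because dob is compared as plain text here, 1972-7-9 and 1972-07-09 would not match; if the two files format dates differently, both dob columns would need to be normalised (for example with pd.to_datetime) before merging.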