Let's say I have a csv file that is such that:
Dog
Cat
Bird
is a common column with a txt file I have that has two columns of interest:
Cat 8.9
Bird 12.2
Dog 2.1
One column being the identifiers (species name), the other being let's say, speed in mph.
I want to parse through the rows of the csv file, and lookup the species speed from my txt file, and then add it as an extra column to the csv file. Ie, I want to join these two files based on the common species key, and I specifically JUST want to append the speed from the txt (and not other columns that may be in that file) to the csv file. The csv file also has superfluous columns that I would like to keep; but I don't want any of the superfluous columns in the txt file (let's say it had heights, weights, and so on. I don't want to add those to my csv file).
Let's say that this file is very large; I have 1,000,000 species in both files.
What would be the shortest and cleanest way to do this? Should I write some type of a nested loop, should I use a dictionary, or would Pandas be an effective method?
Note: let's say I also want to compute the mean speed of all the species; ie I want to take the mean of the speed column. Would numpy be the most effective method for this?
Additional note: All the species names are in common, but the order in the csv and txt files are different (as indicated in my example). How would I correct for this?
Additional note 2: This is a totally contrived example since I've been learning about dictionaries and input/output files, and reviewing loops that I wrote in the past.
Note 3: The txt file should be tab seperated, but the csv file is (obviously) comma separated.