Hello, I am still very new to Python and am not very adept at programming.
My relevant data looks something like this:
Specimen | Fam_Genus |
---|---|
1 | A |
2 | B |
3 | F |
4 | G |
5 | U |
6 | A |
7 | B |
8 | D |
Just with about 4000 specimens. Since the data is old, the genus names are not up to modern standards, as a lot of them are now subclasses of the family genus.
So I want to create a new column in which, based on the Fam_Genus data, the modern genus is displayed.
Like:
Specimen | Fam_Genus | Modern Genus |
---|---|---|
1 | A | A |
2 | B | B |
3 | F | C |
4 | G | C |
5 | U | A |
6 | A | A |
7 | B | B |
8 | D | C |
I tried:
df["Modern_Genus"] = ""
data['Modern_Genus'] = np.where(data.Fam_Genus.str.contains("A"), "A")
But I get this error back: ValueError: either both or neither of x and y should be given
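For reference, the error message points at the call signature: `np.where(condition, x, y)` needs both a value for rows where the condition is true (`x`) and a value for all other rows (`y`), or neither of them. A minimal sketch of a call that runs, assuming the frame is called `data`, is built here from the example table, and that an empty string is an acceptable placeholder for non-matching rows:

```python
import numpy as np
import pandas as pd

# Small stand-in for the real data (about 4000 specimens), copied from the example table
data = pd.DataFrame({
    "Specimen": [1, 2, 3, 4, 5, 6, 7, 8],
    "Fam_Genus": ["A", "B", "F", "G", "U", "A", "B", "D"],
})

# np.where(condition, x, y): take x where the condition is True, y everywhere else.
# The empty string as y is an assumption; any fallback value would work.
data["Modern_Genus"] = np.where(data["Fam_Genus"].str.contains("A"), "A", "")
```

This only handles a single genus per call, so on its own it does not produce the full table shown above.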
From what I found online, this seems to be the best way, but as I said, I am new to Python, especially NumPy and pandas. So any ideas or suggestions on what I am doing wrong?
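One possible alternative to chaining several `np.where` calls is a lookup dictionary combined with `Series.map`. The sketch below is only an illustration under stated assumptions: the name `genus_map` is hypothetical, and its entries are read off the example table (F, G and D become C, U becomes A), not taken from any real taxonomy.

```python
import pandas as pd

# Small stand-in for the real data, copied from the example table
data = pd.DataFrame({
    "Specimen": [1, 2, 3, 4, 5, 6, 7, 8],
    "Fam_Genus": ["A", "B", "F", "G", "U", "A", "B", "D"],
})

# Hypothetical old-to-modern lookup, inferred from the example table only
genus_map = {"F": "C", "G": "C", "D": "C", "U": "A"}

# map() returns NaN for values that are not keys in the dict,
# so fillna() falls back to the original genus for those rows
data["Modern_Genus"] = data["Fam_Genus"].map(genus_map).fillna(data["Fam_Genus"])
```

With this approach, only the genera that changed need an entry in the dictionary; everything else keeps its existing value.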