1

I created an informational table, which is very long (think nearly 100 observations or just under). I have a main table (around 70K onservations) where I need to create a new column and fill it based on matching values between my two dataframes but I need to fill the new column the cells with data from my information table.

I created a small dataset but my real dataset (which I can't share because my prof signed a non-diclosure) has like 70K observations.

data_1 (info table)

 Animal          Food
  Dog            Stake
  Cat            Fish
  Rabbit         Carrot

data 2 (original table)

 Name     Animal     Age
 Binxy    Dog         1
 Al       Rabbit      4
 Sam      Dog         11
 Dexter   Cat         9
 Dory     Hamster     6
 Chloe    Cat         5

desired data_frame

 Name     Animal     Age    Fed [new column]
 Binxy    Dog         1     Stake
 Al       Rabbit      4     Carrot
 Sam      Dog         11    Stake
 Dexter   Cat         9     Fish
 Dory     Hamster     6     NaN
 Chloe    Cat         5     Fish

my gut tells me it probably has to do with iloc or loc and using some bloonean values. Verbally I would say:

  1. If data_2["Animal"] == data_2["Animal"]
  2. Then fill new column data_2["Fed"] with corresponding food found in data_1["Food"]

I think merge might work but I'm not certain if it would fill it for every matching value. I'm not very good at merges because I struggle to understand the join feature but I don't think values would be inserted where I need them because my dataframes are not the same length.

edit: I've don't manually before but I had like only two or three values in fill in and I don't want to manually do this for 100. But this is as far as my basic understanding goes.

 New_Categorized_Full.loc[
      (New_Categorized["Produce"] == "Apple"), "Fruit"] = "Fuji"
Emm
  • 123
  • 8

1 Answers1

1

Use series.map

df2['Fed'] = df2.Animal.map(dict(df1[['Animal','Food']].to_numpy()))


Out[10]:
     Name   Animal  Age     Fed
0   Binxy      Dog    1   Stake
1      Al   Rabbit    4  Carrot
2     Sam      Dog   11   Stake
3  Dexter      Cat    9    Fish
4    Dory  Hamster    6     NaN
5   Chloe      Cat    5    Fish
Andy L.
  • 24,909
  • 4
  • 17
  • 29
  • 1
    ~ THANK YOU! IT WORKED WITHOUT A PROBLEM! Much Appreication! – Emm Sep 20 '20 at 00:33
  • Out of interest - how do you typically convert the example data to a dataframe? It seems less straightforward when it's posted in this way, but I'm wondering if there's a quick approach to converting the output you have there to a dataframe, for example. – baxx Sep 20 '20 at 00:49
  • 1
    @baxx: high-light the whole example data and copy by `ctrl + C` or by mouse. Within the interactive python prompt run this command: `df = pd.read_clipboard(r'\s+')`. Read this question for more info: https://stackoverflow.com/questions/31610889/how-to-copy-paste-dataframe-from-stack-overflow-into-python – Andy L. Sep 20 '20 at 02:41