
I have a column in my dataframe with barcodes and created a dictionary to map barcodes to item ids.

I am creating a new column:

df['item_id'] = df['bar_code']

Then I build a dictionary from a second dataframe, imdb:

keys = (int(i) for i in imdb['bar_code'])
values = (int(i) for i in imdb['item_id'])
map_barcode = dict(zip(keys, values))

The first five entries of map_barcode:

{0: 1000159, 9000000017515: 11, 7792690324216: 16, 7792690324209: 20, 70942503334: 33}

Then I map the barcodes in the new column to item ids with the dict:

df = df.replace({'item_id':map_barcode})

I expect the column to then hold the item ids. Going back to the dict entries above, for example:

df['item_id'][0] = 1000159
df['item_id'][1] = 11
df['item_id'][2] = 16
df['item_id'][3] = 20
df['item_id'][4] = 33

But I end up getting this error:

Cannot compare types 'ndarray(dtype=int64)' and 'int64' 

I tried changing the type of the dictionary keys and values to np.int64:

keys = (np.int64(i) for i in imdb['bar_code'])
values = (np.int64(i) for i in imdb['item_id'])
map_barcode = dict(zip(keys, values))

But got the same error.

Is there anything I am missing here?

eyllanesc
Franco D'Angelo

1 Answer


replace example

Firstly, I cannot reproduce your error. This works fine:

import pandas as pd

map_dict = {0: 1000159, 9000000017515: 11, 7792690324216: 16, 7792690324209: 20, 70942503334: 33}

df = pd.DataFrame({'item_id': [0, 7792690324216, 70942503334, 9000000017515, -1, 7792690324209]})

df = df.replace({'item_id': map_dict})

Result:

   item_id
0  1000159
1       16
2       33
3       11
4       -1
5       20

Use map + fillna instead

Secondly, manually iterating Pandas series within generator expressions is relatively expensive. In addition, replace is inefficient when mapping via a dictionary.
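To get a rough feel for the difference, here is a small timing sketch. The data is synthetic and the sizes are arbitrary assumptions, so the numbers will vary by machine and pandas version; it only checks that both approaches give the same values and prints how long each takes.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data (assumed sizes): 1,000 known keys, 10,000 rows to map.
rng = np.random.default_rng(0)
keys = np.arange(1_000)
map_dict = dict(zip(keys.tolist(), (keys + 1_000_000).tolist()))

# Values are drawn from 0..1999, so roughly half have no entry in map_dict.
s = pd.Series(rng.integers(0, 2_000, size=10_000))

def with_replace():
    # Dictionary-based replace; unmatched values are left untouched.
    return s.replace(map_dict)

def with_map():
    # map() gives NaN for unmatched values, fillna() restores the originals.
    return s.map(map_dict).fillna(s).astype('int64')

# Both approaches should produce the same values.
same = with_replace().tolist() == with_map().tolist()
print('same result:', same)

print('replace     :', timeit.timeit(with_replace, number=3))
print('map + fillna:', timeit.timeit(with_map, number=3))
```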

In fact, creating a dictionary is not even necessary. There are optimized series-based methods for these tasks:

map_series = imdb[['bar_code', 'item_id']].astype(int).set_index('bar_code')['item_id']

df['item_id'] = df['item_id'].map(map_series).fillna(df['item_id'])
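For reference, a minimal self-contained sketch of the same idea. The imdb and df frames below are made up for illustration, reusing the dictionary entries from the question. One thing to be aware of: map leaves NaN for barcodes that are missing from the lookup, so the column becomes float before fillna restores the original values. The final cast back to int64 is my addition and assumes every value in the column is an integer.

```python
import pandas as pd

# Hypothetical lookup table, built from the dictionary entries in the question.
imdb = pd.DataFrame({
    'bar_code': [0, 9000000017515, 7792690324216, 7792690324209, 70942503334],
    'item_id': [1000159, 11, 16, 20, 33],
})

# Hypothetical data to map; -1 has no entry in the lookup table.
df = pd.DataFrame({'bar_code': [0, 7792690324216, 70942503334, 9000000017515, -1, 7792690324209]})
df['item_id'] = df['bar_code']

# Series indexed by bar_code whose values are item_id.
map_series = imdb[['bar_code', 'item_id']].astype('int64').set_index('bar_code')['item_id']

# map() yields NaN for unmatched barcodes; fillna() puts the original value back.
mapped = df['item_id'].map(map_series).fillna(df['item_id'])

# The NaN step turns the column into float64, so cast back to integers.
df['item_id'] = mapped.astype('int64')

result = df['item_id'].tolist()
print(result)  # [1000159, 16, 33, 11, -1, 20]
```

The unmatched barcode (-1) is kept as is, which matches what replace does in the example above.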


jpp