
Good day,

I have a column from a data frame here:

 A
 23
 10
 11 
 22

My objective is to create a new column and associate the numbers like this:

A     file_number
23        8
10        6
11        6
22        8

As seen above, the numbers 22 and 23 are both associated with the number 8, and the numbers 10 and 11 are associated with the number 6. How can I create such a column? Thanks in advance.

Deepak M

1 Answer

I think you need to take the first digit of each number and `map` it to the new value with a dictionary. If the column holds integers:

print (df['A'].apply(type))
0    <class 'int'>
1    <class 'int'>
2    <class 'int'>
3    <class 'int'>
Name: A, dtype: object

df['new'] = (df['A'] // 10).map({1:6, 2:8})
print (df)
    A  new
0  23    8
1  10    6
2  11    6
3  22    8

Detail:

print ((df['A'] // 10))
0    2
1    1
2    1
3    2
Name: A, dtype: int64
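For reference, the integer approach can be put together as a small self-contained sketch. The frame is rebuilt from the question's sample data; the extra value 35 is an assumption added here only to show what `map` does with a key that is missing from the dictionary.

```python
import pandas as pd

# Sample data from the question, plus 35 (hypothetical) which has no mapping
df = pd.DataFrame({'A': [23, 10, 11, 22, 35]})

# Tens digit -> file number, as in the answer
mapping = {1: 6, 2: 8}

# Floor division by 10 keeps the tens digit of two-digit numbers
tens = df['A'] // 10

# map returns NaN where the key is missing from the dictionary,
# so the new column becomes float when any value is unmapped
df['new'] = tens.map(mapping)
print(df)
```

If every value should have a mapping, checking `df['new'].isna().any()` afterwards is one way to catch entries that were left out of the dictionary.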

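Another solution works with strings. If the column holds integers, convert them to strings first:

df['new'] = df['A'].astype(str).str[0].map({'1':6, '2':8})

If the column already holds strings, `astype` is not needed:

print (df['A'].apply(type))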
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
3    <class 'str'>
Name: A, dtype: object

df['new'] = df['A'].str[0].map({'1':6, '2':8})

If you need the first digit of a positive number of any length, you can use this solution, adapted to numpy/pandas (it assumes `import numpy as np`):

df['new'] = df['A'] // 10 ** np.log10(df['A'].values).astype(int)

print (df)
        A  new
0       2    2
1   10000    1
2     110    1
3  220000    2
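A runnable version of the first-digit step might look like the sketch below. The column names `first` and `new` and the final `map` call are assumptions made here for illustration; the data is the same as in the output above. It only works for positive integers, because `log10` is undefined for zero and negative values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [2, 10000, 110, 220000]})

# Number of digits minus one for each value (positive values only)
exponent = np.log10(df['A'].to_numpy()).astype(int)

# ** binds tighter than //, so the parentheses are only for readability
df['first'] = df['A'] // (10 ** exponent)

# The first digit can then be mapped with the same dictionary as before
df['new'] = df['first'].map({1: 6, 2: 8})
print(df)
```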
jezrael
  • What if there are more grouped values and a thousand rows? @jezrael – Deepak M Jun 13 '18 at 09:29
  • @DeepakM - add them to the dictionary – jezrael Jun 13 '18 at 09:29
  • Ahh I see. Thanks. Lastly, is there a shorter way than writing out every mapping in the `.map`? What if I'm working with a bigger data frame where a lot of numbers are grouped to one number, and there are several such cases? Is there a way to index or slice, or some other way...? @jezrael – Deepak M Jun 13 '18 at 09:36
  • @DeepakM - If you need a well-performing solution, `map` with a dictionary is what you need. Do I understand correctly that `df['A'] // 10` cannot be used because the numbers have different lengths, like `456`, `41`, `7`? – jezrael Jun 13 '18 at 10:04
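On the question of a shorter way to write the dictionary when many numbers share one target: one possible sketch is to list the members of each group once and invert that into a lookup. The `groups` layout and the name `lookup` are assumptions made for this illustration and are not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({'A': [23, 10, 11, 22]})

# Hypothetical layout: one entry per file number, listing the values it covers
groups = {8: [22, 23], 6: [10, 11]}

# Invert to value -> file number so it can be passed to map
lookup = {value: file_number
          for file_number, values in groups.items()
          for value in values}

df['file_number'] = df['A'].map(lookup)
print(df)
```

This keeps the `map` call itself unchanged; only the way the dictionary is written becomes more compact, and it does not depend on the numbers having the same length.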