
I'd like to create a random numeric Pandas Series and assign it to a DataFrame. My DataFrame has an id column; however, it is alphanumeric, which causes some issues when querying the data from a SQL database.

Therefore, I'd like to create a randomly generated numeric column.

import pandas as pd

df = pd.DataFrame({'name': ['A', 'B', 'C'],
                   'id': [1, 2, 3] 
                 })

Each randomly generated numeric id should be six digits long.

Expected output:

name  id  rid
A     1   731721
B     2   831273
C     3   831212
kms
  • Why a random ID - as long as it's unique - why does randomness matter (plus it's more work when you want to add any new rows - as you've gotta check you haven't already picked that random id etc...) - also - what issues is an alphanumeric ID causing - it's just a column...!? – Jon Clements Apr 02 '22 at 20:41
  • I don't know SQL, but this sounds like it might be an [XY problem](https://meta.stackexchange.com/q/66377/343832). Maybe the real issue is that your database setup can't handle alphanumeric IDs. – wjandrea Apr 02 '22 at 20:43
  • @JonClements good point. It doesn't have to be random I guess. I am unable to set the Alphanumeric ID as PK in MySQL without adding additional complexity: https://stackoverflow.com/questions/1827063/mysql-error-key-specification-without-a-key-length#:~:text=Sometimes%2C%20even%20though%20you%20don,its%20length%20or%20characters%20size – kms Apr 02 '22 at 21:07
  • @kms so - can you (optionally) keep a unique constraint on the existing column and alter the table to have a new bigint auto incrementing primary key? – Jon Clements Apr 02 '22 at 21:09

1 Answer


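Draw a random sample of unique six-digit integers (100000 to 999999), one for each row of the DataFrame:

import random

# range() excludes its upper bound, so use 1000000 to include 999999.
# random.sample draws without replacement, so the ids are unique.
df['rid'] = random.sample(range(100000, 1000000), df.shape[0])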
  name  id     rid
0    A   1  953858
1    B   2  244870
2    C   3  306826
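
As the comments point out, random ids have to be checked against existing ones whenever rows are added later. Here is a rough sketch of the same idea using NumPy instead of the random module, with that check included. The seed value and the helper name new_rid are only assumptions for illustration; they are not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'name': ['A', 'B', 'C'],
                   'id': [1, 2, 3]})

# Arbitrary seed, used only so the sketch is reproducible.
rng = np.random.default_rng(seed=0)

# Sample without replacement so every rid is a distinct six-digit number.
df['rid'] = rng.choice(np.arange(100_000, 1_000_000), size=len(df), replace=False)


def new_rid(existing, rng):
    """Hypothetical helper: draw a six-digit id that is not in `existing`."""
    while True:
        # rng.integers excludes the upper bound by default.
        candidate = int(rng.integers(100_000, 1_000_000))
        if candidate not in existing:
            return candidate


# Pick an id for a row that is added later.
next_rid = new_rid(set(df['rid']), rng)
```

If the ids do not actually need to be random, an auto-incrementing key on the database side (as suggested in the comments) avoids the collision check entirely.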
pyaj