I want to fill the rows of a column that have no data with random values.

853                           None
854                   cheese empty
855                   cheese other
856                   yogurt empty
857                   yogurt other
858                   yogurt empty
859                   yogurt other
860                   butter empty
861                   butter other
862                           None
863                           None

I want to get something like:

853                           ASDFGHJAS
854                         cheese empty
855                         cheese other
856                         yogurt empty
857                         yogurt other
858                         yogurt empty
859                         yogurt other
860                         butter empty
861                         butter other
862                           DFGHJRTYT
863                           ERTYUIOIO
864                           TYUIOPPWE
865                           QWERTYUUI
866                           CBNMTYUIO

I have tried to do something like:

df1 = df[['english_name']].fillna(''.join(choice(ascii_uppercase) for i in range(12)), axis=1)

But this gives:

853                          ASDFGHJAS
854                         cheese empty
855                         cheese other
856                         yogurt empty
857                         yogurt other
858                         yogurt empty
859                         yogurt other
860                         butter empty
861                         butter other
862                           ASDFGHJAS
863                           ASDFGHJAS
864                           ASDFGHJAS
865                           ASDFGHJAS
866                           ASDFGHJAS

The problem is that I get the same value in every row, but I need a unique random value in each row.

Night Walker

3 Answers

Use apply with a lambda along axis=1, so that a new random string is generated for each row that contains NaN.

In [243]: df[['english_name']].apply(
     ...:     lambda x: x.fillna(''.join(choice(ascii_uppercase) for i in range(12))),
     ...:     axis=1)
Out[243]:
     english_name
853  BIZLLWLFGUSD
854  cheese empty
855  cheese other
856  yogurt empty
857  yogurt other
858  yogurt empty
859  yogurt other
860  butter empty
861  butter other
862  NMHDRQMTWZXF
863  EGPCZFWEDOFR

Or pre-create a Series of the same length filled with random names, then use df.english_name.fillna(s):

In [259]: s = pd.Series([''.join(choice(ascii_uppercase) for i in range(12))
     ...:                for _ in range(len(df))], index=df.index)

In [260]: df.english_name.fillna(s)
Out[260]:
853    BRFERJPGVDXP
854    cheese empty
855    cheese other
856    yogurt empty
857    yogurt other
858    yogurt empty
859    yogurt other
860    butter empty
861    butter other
862    NYYTRCSSCPWT
863    ZYBNJQIPIWEF
Name: english_name, dtype: object
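
For reference, here is a self-contained sketch of the second approach. The original fillna call repeated one string because its argument is evaluated once, before fillna runs; here a separate string is drawn for every row. The sample frame and the helper name `random_name` are assumptions made up for illustration, not part of the question's data.

```python
from random import choice
from string import ascii_uppercase

import pandas as pd


def random_name(length=12):
    # Hypothetical helper: one random uppercase string of the given length.
    return ''.join(choice(ascii_uppercase) for _ in range(length))


# Made-up sample data mirroring the question's column.
df = pd.DataFrame(
    {'english_name': [None, 'cheese empty', 'cheese other', None, None]},
    index=[853, 854, 855, 862, 863],
)

# One candidate string per row, aligned on df's index; fillna only uses
# the candidates at positions where english_name is missing.
s = pd.Series([random_name() for _ in range(len(df))], index=df.index)
filled = df['english_name'].fillna(s)
print(filled)
```

This generates more strings than strictly needed (one per row rather than one per missing value), which is usually fine for small frames.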
Zero

Using this answer, you can define a function to return a random string with a given size:

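import random
import string

def random_string(N=9):
    # Build an N-character string of random uppercase letters.
    return ''.join(random.SystemRandom().choice(string.ascii_uppercase) for _ in range(N))


# axis=1 calls the lambda once per row, so each row gets its own string.
df[['english_name']].apply(lambda x: x.fillna(random_string()), axis=1)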
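
Random strings can in principle collide, and the question asks for a unique value on each row. A minimal sketch of one way to guarantee that, by drawing strings until enough distinct ones have been collected; `unique_random_strings` is a hypothetical helper, not part of pandas or of the function above.

```python
import random
import string


def unique_random_strings(count, length=9):
    # Hypothetical helper: draw random uppercase strings until `count`
    # distinct ones have been collected.
    alphabet = string.ascii_uppercase
    if count > len(alphabet) ** length:
        raise ValueError("not enough distinct strings of this length")
    rng = random.SystemRandom()
    seen = set()
    result = []
    while len(result) < count:
        candidate = ''.join(rng.choice(alphabet) for _ in range(length))
        if candidate not in seen:  # skip the rare duplicate
            seen.add(candidate)
            result.append(candidate)
    return result


names = unique_random_strings(5)
print(names)
```

Assuming the question's frame, the list could then be assigned to the missing rows only, for example with `mask = df['english_name'].isna()` followed by `df.loc[mask, 'english_name'] = unique_random_strings(int(mask.sum()))`.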
Mahdi

A generic solution for dataframes with more than one column:

import numpy as np
import pandas as pd
from string import ascii_uppercase

df = pd.DataFrame([
        ['a', np.nan, 'b'],
        [np.nan, 'c', np.nan],
        ['d', np.nan, 'e'],
        [np.nan, 'f', np.nan]
    ])

     0    1    2
0    a  NaN    b
1  NaN    c  NaN
2    d  NaN    e
3  NaN    f  NaN

  • Stack df to get a Series
  • Count the nulls

dfs = df.stack(dropna=False)
wherenull = dfs.isnull().values
n = wherenull.sum()

Generate the fill values, one 12-letter string per null:

np.random.seed([3,1415])
fills = pd.DataFrame(
    np.random.choice(
        list(ascii_uppercase),
        (n, 12)
    )).sum(1).values

Fill the missing values and unstack back to the original shape:

dfs.loc[wherenull] = fills
dfs.unstack()

              0             1             2
0             a  QLCKPXNLNTIX             b
1  AWYMWACAUZHT             c  NSMEDTNWHXNU
2             d  FDXFZLYHMGEH             e
3  WSOGGOVSIXKF             f  PYEPNHGRMMPO
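
In newer pandas releases the `dropna` argument of `stack` is, as far as I know, deprecated, so the same idea can also be sketched with a boolean mask on the underlying array. This is an alternative under that assumption, not the method shown above: the variable names are illustrative, and `numpy.random.default_rng` is swapped in for the seeded global generator, so the letters will differ from the output above.

```python
from string import ascii_uppercase

import numpy as np
import pandas as pd

df = pd.DataFrame([
    ['a', np.nan, 'b'],
    [np.nan, 'c', np.nan],
    ['d', np.nan, 'e'],
    [np.nan, 'f', np.nan],
])

mask = df.isna().to_numpy()   # True where a value is missing
n = int(mask.sum())           # number of cells to fill

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
letters = rng.choice(list(ascii_uppercase), size=(n, 12))
fills = [''.join(row) for row in letters]  # one 12-letter string per missing cell

values = df.to_numpy(dtype=object, copy=True)
values[mask] = fills  # boolean indexing assigns in row-major order
filled = pd.DataFrame(values, index=df.index, columns=df.columns)
print(filled)
```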
piRSquared