I want to select a random row during a vector operation on a DataFrame. This is what my inpDF looks like:
string1 string2
0 abc dfe
1 ghi jkl
2 mno pqr
3 stu vwx
I'm looking for a function that would play the role of getRandomRow() here:
outDF['string1'] = inpDF['string1']
outDF['string2'] = inpDF.getRandomRow()['string2']
so that outDF ends up looking (for example) like this:
string1 string2
0 abc jkl
1 ghi pqr
2 mno dfe
3 stu pqr
EDIT 1:
I tried using the sample() function as suggested in this answer, but that just causes the same sample to get replicated across all rows:
outDF['string1'] = inpDF['string1']
outDF['string2'] = inpDF.sample(n=1).iloc[0,:]['string2']
which gives:
string1 string2
0 abc pqr
1 ghi pqr
2 mno pqr
3 stu pqr
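A likely reason for the repeated value: inpDF.sample(n=1).iloc[0,:]['string2'] evaluates to a single string, and assigning a scalar to a column broadcasts it to every row. Below is a minimal sketch of one way around this, assuming sampling with replacement is acceptable (the example output above repeats pqr). It draws len(inpDF) values in one call and assigns them by position. The random_state=0 argument is only an assumption added to make the run reproducible.

```python
import pandas as pd

inpDF = pd.DataFrame({
    "string1": ["abc", "ghi", "mno", "stu"],
    "string2": ["dfe", "jkl", "pqr", "vwx"],
})

outDF = pd.DataFrame()
outDF["string1"] = inpDF["string1"]

# Draw one random value per output row, with replacement.
# random_state is an assumption for reproducibility; drop it for fresh draws.
sampled = inpDF["string2"].sample(n=len(inpDF), replace=True, random_state=0)

# to_numpy() discards the sampled index labels, so the values are assigned
# by position instead of being re-aligned to their original rows.
outDF["string2"] = sampled.to_numpy()

print(outDF)
```

Without the to_numpy() call, pandas would align the sampled values back to their original index labels on assignment, which would undo the shuffle for every label that was drawn.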
EDIT 2:
For my particular use case, even picking the value from the row 'n' rows down would suffice. So I tried doing this (I'm using inpDF.index based on what I read in this answer):
numRows = len(inpDF)
outDF['string1'] = inpDF['string1']
outDF['string2'] = inpDF.iloc[(inpDF.index + 2)%numRows,:]['string2']
but it just ends up picking the value from the same row, and outDF comes out like this:
string1 string2
0 abc dfe
1 ghi jkl
2 mno pqr
3 stu vwx
whereas I expected it to be this:
string1 string2
0 abc pqr
1 ghi vwx
2 mno dfe
3 stu jkl
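The likely cause of this behaviour is index alignment: the iloc selection does return the rows in shifted order, but each value keeps its original index label, and assigning a Series to a column matches values to rows by label rather than by position. The sketch below, which assumes the default 0..n-1 index shown above, makes the labels visible and then assigns by position using to_numpy(). The variable name shifted is only illustrative.

```python
import numpy as np
import pandas as pd

inpDF = pd.DataFrame({
    "string1": ["abc", "ghi", "mno", "stu"],
    "string2": ["dfe", "jkl", "pqr", "vwx"],
})

numRows = len(inpDF)
outDF = pd.DataFrame()
outDF["string1"] = inpDF["string1"]

# Positions shifted two rows down, wrapping around at the end.
positions = (np.arange(numRows) + 2) % numRows
shifted = inpDF["string2"].iloc[positions]

# The values are reordered, but the original labels travel with them.
print(shifted.index.tolist())

# Assigning the raw array bypasses label alignment.
outDF["string2"] = shifted.to_numpy()

print(outDF)
```

With the alignment step removed, the result matches the expected table: pqr, vwx, dfe, jkl.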