How to select random rows in a dataset and modify their values?

Question

I have a dataset of 12000 rows and 3 columns. I want to randomly select 600 rows from this dataset and modify the values of one specific column in those rows. Is there a method in Python that can do this?

score 2 · Answer 1 · answered Sep 27 '22 at 11:48

You could try something like this:

import pandas as pd
import numpy as np

def random_select():
    df = pd.read_csv('data.csv') # read the dataset
    df = df.sample(n=600) # randomly select 600 rows
    df['column'] = np.random.randint(1, 100, size=len(df)) # modify the values of a specific column
    df.to_csv('data.csv', index=False) # save the modified dataset
    return df # return the modified dataset
    
if __name__ == '__main__':
    random_select()

answered Sep 27 '22 at 11:48

Ben


Can you please explain what this line of code means: if __name__ == '__main__': random_select()? – Soumia Bibi Sep 27 '22 at 13:32
@SoumiaBibi You should read this one: https://stackoverflow.com/questions/419163/what-does-if-name-main-do – Ben Sep 27 '22 at 14:21
But, sir, the output of your method is a dataset with only 600 rows. I want the dataset to keep all 12000 rows, including the 600 modified rows (600 modified + 11400 unmodified). – Soumia Bibi Sep 27 '22 at 15:22
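
One possible way to address the concern in the last comment is to sample only the row labels and then assign through .loc, so the DataFrame keeps every row. This is a minimal sketch, not the answerer's own code: the function name modify_random_rows and its seed parameter are hypothetical, 'column' is the placeholder column name from the answer, and a small in-memory DataFrame stands in for data.csv.

```python
import numpy as np
import pandas as pd


def modify_random_rows(df, column, n, seed=None):
    """Return (copy of df with n random rows changed in `column`, labels of those rows)."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    # Sample only the row labels; the DataFrame itself keeps every row.
    idx = out.sample(n=n, random_state=seed).index
    # Overwrite the chosen rows with random integers in [1, 100), as in the answer.
    out.loc[idx, column] = rng.integers(1, 100, size=n)
    return out, idx


# Stand-in for data.csv: 12000 rows, 3 columns, 'column' starts at 0 everywhere.
df = pd.DataFrame({"a": range(12000), "b": range(12000), "column": 0})

result, changed = modify_random_rows(df, "column", 600, seed=0)
print(len(result), len(changed))
```

Because the function works on a copy, the original df is left unchanged; writing result back with to_csv would then save all 12000 rows rather than only the sample.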

raja · Answer 2 · 2022-09-27T11:47:35.027

0

With the help of this:

df.sample(n=600)
# then select the column you want with the DataFrame's .loc indexer and edit it as you wish
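
The suggestion above could be sketched as follows for the case where the edit depends on the rows' current values. The column name "value", the stand-in data, and the doubling step are all assumptions made for illustration; random_state is fixed only so the run is repeatable.

```python
import pandas as pd

# Hypothetical stand-in data: 12000 rows, 3 columns.
df = pd.DataFrame({"a": range(12000), "b": range(12000), "value": [1.0] * 12000})

# Labels of 600 randomly chosen rows (sampled without replacement).
sampled_idx = df.sample(n=600, random_state=42).index

# Edit only the chosen rows of one column, based on their current values.
df.loc[sampled_idx, "value"] = df.loc[sampled_idx, "value"] * 2
```

Any other expression can replace the doubling step, as long as it yields one value per sampled row.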

edited Sep 27 '22 at 11:47

answered Sep 27 '22 at 11:34

raja

