I am trying to draw a stratified sample from a pandas DataFrame using two stratification variables, Year and Genre. What do I need to change to make this work?
Here is the code:
import pandas as pd

# Load the games sales data
Games_Sales = pd.read_csv('C:\\Users\\Jon\\Desktop\\data science\\GamesSales_Data.csv')

# Group by both stratification variables
grouped_game_data = Games_Sales.groupby(['Year', 'Genre'])

# Draw 5 rows (with replacement) from each (Year, Genre) group
samples = []
for key, g_data in grouped_game_data:
    samples.append(g_data.sample(n=5, replace=True, random_state=1))

# Combine the per-group samples into one DataFrame
Games_Sample = pd.concat(samples, ignore_index=True)
print(Games_Sample)
I am not sure this is the correct approach.
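For comparison, here is a minimal sketch of the same idea on made-up data, assuming pandas 1.1 or later, where grouped objects have their own `sample` method. The `games` frame and its `Sales` column are hypothetical stand-ins for the real CSV; only the `Year` and `Genre` column names come from the code above.

```python
import pandas as pd

# Made-up stand-in for the games data (not the real file)
games = pd.DataFrame({
    "Year": [2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001],
    "Genre": ["Action", "Action", "Sports", "Sports"] * 2,
    "Sales": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})

# One stratum per (Year, Genre) pair; draw 5 rows from each, with replacement
games_sample = (
    games.groupby(["Year", "Genre"])
    .sample(n=5, replace=True, random_state=1)
    .reset_index(drop=True)
)

# Check how many rows were drawn from each stratum
print(games_sample.groupby(["Year", "Genre"]).size())
```

With `replace=True`, a group with fewer than 5 rows still yields 5 sampled rows, so duplicates are expected. Without replacement, `n` would have to be no larger than the smallest group.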