I am working on a project for daily fantasy sports.
I have a dataframe of possible lineups (6 columns, one for each player in a lineup).
As part of my process, I generate a possible fantasy point value for all players.
Next, I want to total the points scored by each lineup in my lineups dataframe by looking up each player in the fantasy points dataframe.
For reference:
- Lineups Dataframe: columns = F1, F2, F3, F4, F5, F6 where each column is a player's name + '_' + their player id
- Fantasy Points Dataframe (sim_data in the code below): columns = Name_SlateID (player name + '_' + player id), Points
I go column by column for the 6 players to get the 6 fantasy points values:
for col in ['F1', 'F2', 'F3', 'F4', 'F5', 'F6']:
    lineups = lineups.join(
        sim_data[['Name_SlateID', 'Points']].set_index('Name_SlateID'),
        how='left', on=col, rsuffix='x')
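For anyone trying to reproduce the lookup step, here is a minimal sketch on toy data. It is not the exact code I ran: it swaps the join for Series.map, assumes Name_SlateID is unique in sim_data, and writes the result straight into F1_points, F2_points, and so on. The toy player names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data; the real frames have 6 player columns and ~107k rows.
lineups = pd.DataFrame({
    "F1": ["Ann_1", "Bob_2"],
    "F2": ["Bob_2", "Cy_3"],
})
sim_data = pd.DataFrame({
    "Name_SlateID": ["Ann_1", "Bob_2", "Cy_3"],
    "Points": [10.0, 20.5, 5.25],
})

# Build the lookup once: index = player key, values = simulated points.
# Assumption: each Name_SlateID appears only once in sim_data.
points_by_player = sim_data.set_index("Name_SlateID")["Points"]

for col in ["F1", "F2"]:
    # map() yields NaN for players missing from sim_data,
    # which mirrors what a left join would produce.
    lineups[f"{col}_points"] = lineups[col].map(points_by_player)
```

With this approach each new column gets its final name directly, so no suffix handling or renaming is needed afterwards.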
Then, in what I thought would be the simplest part, I try to sum them up and the process dies with "Segmentation fault: 11":
sum_columns = ['F1_points', 'F2_points', 'F3_points', 'F4_points', 'F5_points', 'F6_points']
lineups = reduce_memory_usage(lineups)
lineups[f'sim_{i}_points'] = lineups[sum_columns].sum(axis=1, skipna=True)
reduce_memory_usage comes from this article: https://towardsdatascience.com/6-pandas-mistakes-that-silently-tell-you-are-a-rookie-b566a252e60d
Before running this line, I reduced the dataframe's memory usage by 50% by choosing appropriate dtypes. I have also tried using pd.eval() instead, and summing the columns one by one in a for loop, but nothing works.
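The summing step can be isolated on synthetic data with the same float16 dtype as my points columns. This is only a sketch: the toy frame, the random values, and the sim_0_points column name are assumptions, and the upcast to float32 before summing is a variation to try, not a confirmed fix for the crash.

```python
import numpy as np
import pandas as pd

sum_columns = [f"F{n}_points" for n in range(1, 7)]

# Hypothetical stand-in for the six points columns, stored as float16.
rng = np.random.default_rng(0)
toy = pd.DataFrame(
    rng.uniform(0, 60, size=(1000, 6)).astype(np.float16),
    columns=sum_columns,
)

# Upcast only the columns being summed, so the rest of the frame
# keeps its reduced-memory dtypes.
totals = toy[sum_columns].astype("float32").sum(axis=1, skipna=True)
toy["sim_0_points"] = totals
```

If this runs cleanly while the original line still crashes on the real data, that would at least narrow the problem down to the dtype or to the real frame itself.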
Any help is greatly appreciated!
Edit: Specs: OS - macOS Monterey 12.2.1, Python - 3.8.8, pandas - 1.4.1
Here are the details of my lineups dataframe right before the line causing the error:
Data columns (total 27 columns):
 #   Column              Non-Null Count   Dtype
---  ------              --------------   -----
 0   F1                  107056 non-null  object
 1   F2                  107056 non-null  object
 2   F3                  107056 non-null  object
 3   F4                  107056 non-null  object
 4   F5                  107056 non-null  object
 5   F6                  107056 non-null  object
 6   F1_own              107056 non-null  float16
 7   F1_salary           107056 non-null  int16
 8   F2_own              107056 non-null  float16
 9   F2_salary           107056 non-null  int16
 10  F3_own              107056 non-null  float16
 11  F3_salary           107056 non-null  int16
 12  F4_own              107056 non-null  float16
 13  F4_salary           107056 non-null  int16
 14  F5_own              107056 non-null  float16
 15  F5_salary           107056 non-null  int16
 16  F6_own              107056 non-null  float16
 17  F6_salary           107056 non-null  int16
 18  total_salary        107056 non-null  int32
 19  dupes               107056 non-null  float32
 20  over_600_frequency  107056 non-null  int8
 21  F1_points           107056 non-null  float16
 22  F2_points           107056 non-null  float16
 23  F3_points           107056 non-null  float16
 24  F4_points           107056 non-null  float16
 25  F5_points           107056 non-null  float16
 26  F6_points           107056 non-null  float16
dtypes: float16(12), float32(1), int16(6), int32(1), int8(1), object(6)
memory usage: 10.3+ MB