I have a baseball stats DataFrame (df) in which some of the column names are:
- "Games" - which indicates the number of games a player has played in that year.
- "Year" - which contains values between 1990 and 2020
- "PlayerID" - not unique. Each row shows which player played for which "Team" in which "Year".
- "Home Runs" - number of home runs hit by that player in that year.
I want to find out:
- who has the max number of home runs where the number of games played is greater than 100.
- which column/feature has the highest correlation with "Home Runs"
Data is in the form of a csv file.
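
For reference, here is a minimal pandas sketch of what I am trying to do. The inline sample data, the "Hits" column, and the variable names are made up for illustration; with the real file, the `io.StringIO(...)` argument would be replaced by the CSV path. It treats each row as one player-season, so "max" here means the best single season, not a career total.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the real CSV (values are invented).
SAMPLE_CSV = """PlayerID,Team,Year,Games,Home Runs,Hits
p01,AAA,1999,150,40,170
p02,BBB,1999,90,55,80
p03,CCC,2001,120,30,140
p01,DDD,2002,110,25,120
p04,AAA,2005,60,5,40
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# 1) Player-season with the most home runs among rows with more than 100 games.
eligible = df[df["Games"] > 100]
top_row = eligible.loc[eligible["Home Runs"].idxmax()]

# 2) Pearson correlation of every other numeric column with "Home Runs".
numeric = df.select_dtypes("number")
corr_with_hr = numeric.corr()["Home Runs"].drop("Home Runs")

# Rank by absolute value so strong negative correlations are not ignored.
best_feature = corr_with_hr.abs().idxmax()

print(top_row[["PlayerID", "Year", "Games", "Home Runs"]])
print(corr_with_hr.sort_values(key=abs, ascending=False))
print("Most correlated column:", best_feature)
```

If a career total is wanted instead, the filtered rows could be summed per player first, e.g. `eligible.groupby("PlayerID")["Home Runs"].sum().idxmax()`.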