1

I want to calculate the median of the "MP" and "FG" columns, but I can't calculate it when some rows in the table contain strings, as in rows 8, 18, 19, 20, 21 and 24 in the picture.

https://i.stack.imgur.com/SybWM.png

List of NBA Stats

I tried different ways to do it. But my latest idea was this:

resultfg1920 = morantSeason1920.loc[morantSeason1920['FG'] == float]

print(resultfg1920)

Output

Empty DataFrame
Columns: [Rk, G, Date, Age, Tm, Unnamed: 5, Opp, Unnamed: 7, GS, MP, FG, FGA, FG%, 3P, 3PA, 3P%, FT, FTA, FT%, ORB, DRB, TRB, AST, STL, BLK, TOV, PF, PTS, GmSc, +/-]
Index: []

All I get from it is an empty DataFrame: just the header row with all the column names.

wjandrea
krix
  • 1
    How can a value here be equal to the type of the value? – roganjosh May 23 '22 at 17:49
  • Welcome to Stack Overflow! Please take the [tour]. [Please don't post pictures of text](https://meta.stackoverflow.com/q/285551/4518341). Instead, copy the text itself, [edit] it into your post, and use the formatting tools like [code formatting](/editing-help#code). See also [How to make good reproducible pandas examples](/q/20109391/4518341). For more tips, see [ask]. – wjandrea May 23 '22 at 17:52

2 Answers

1

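You can't compare a value to its type using ==. The comparison checks each value against the type object itself, so it is False for every row and the result is empty. Compare the type of each value instead: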

resultfg1920 = morantSeason1920.loc[morantSeason1920['FG'].apply(type) == float]

or

resultfg1920 = morantSeason1920.loc[morantSeason1920['FG'].apply(type) != str]

print(resultfg1920)
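To see the difference between the two comparisons, here is a minimal sketch on made-up data. The `toy` frame and its values are assumptions for illustration only, not the actual game log, and it assumes the numeric cells really are Python numbers rather than text.

```python
import pandas as pd

# Hypothetical stand-in for the game log: numbers mixed with text markers.
toy = pd.DataFrame({"FG": [7, 5.0, "Inactive", 9, "Did Not Dress"]})

# Comparing the values to the type object itself is False for every row.
empty = toy.loc[toy["FG"] == float]

# Comparing the type of each value keeps only the non-string rows.
numeric_rows = toy.loc[toy["FG"].apply(type) != str]

# The column is still object dtype, so cast it before taking the median.
fg_median = numeric_rows["FG"].astype(float).median()
print(len(empty), len(numeric_rows), fg_median)
```

If the table was loaded from a file and every cell arrived as text, the type check would treat all rows as strings, so this filter only helps when the numbers are stored as numbers.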
Zorgoth
  • Thank you for your help. That is the easy solution I was searching for. I still get an empty DataFrame, but I guess that has nothing to do with this. – krix May 23 '22 at 18:39
  • 1
    The other values look like `int`s. Try the `resultfg1920 = morantSeason1920.loc[morantSeason1920['FG'].apply(type) != str]` variant. – Zorgoth May 23 '22 at 18:43
-1

I would start by either removing the rows with string values or replacing those values with null (NaN). If you replace them with null, make sure they are excluded from the calculation; pandas' median() skips NaN by default.

Do this for whatever column you need calculations from:

df[df.columnName != 'text']

Once you remove the strings and the column has a numeric dtype, you can use df.describe() to print summary statistics, including the median (the 50% row).

There are many ways to do this; this one is a more manual way of handling a DataFrame.
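One of the less manual ways is to let pandas do the replacing with `pd.to_numeric(..., errors="coerce")`. The sketch below uses a made-up `toy` frame and assumes both columns hold plain numbers stored as text; if "MP" is actually formatted as minutes:seconds, it would need separate parsing first.

```python
import pandas as pd

# Hypothetical stand-in for a file-loaded log where every cell is text.
toy = pd.DataFrame({
    "MP": ["30", "Inactive", "28", "35", "Did Not Dress"],
    "FG": ["7", "Inactive", "5", "9", "Did Not Dress"],
})

# Convert both columns; anything that is not a number becomes NaN.
numeric = toy[["MP", "FG"]].apply(pd.to_numeric, errors="coerce")

# median() skips NaN by default (skipna=True), so no explicit dropna is needed.
medians = numeric.median()
print(medians)

# describe() reports the same medians in its "50%" row.
print(numeric.describe().loc["50%"])
```

This keeps the original rows in place, which can be handy if the other columns of those games are still needed.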

wjandrea
  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community May 23 '22 at 18:08