I am trying to draw a bar graph from my CSV file showing only the top 20 rows of the data. Below, I tried to delete rows 0 through 175 so that only the remaining rows are plotted, but it deleted only rows 0 and 175. How do I drop every row from 0 to 175? I tried `0:175`, but that raised an error.

import pandas as pd
from matplotlib import pyplot as plt

df = pd.read_csv('MAY_2021_COVID_CLEANED.csv')


df = df.drop([0,175],axis=0)

print(df)

#assigning variables
x = df['Country']
y = df['Death Count']

#Graphing
plt.bar(x,y)
plt.xticks(rotation=90,fontsize=5)
plt.title("Number of Death per Country")
plt.ylabel("Number of Death")
plt.xlabel("Country")
plt.show()
Himaa
  • Are you trying to plot the last 20 rows of the data? Or the 20 highest values from `df["Death Count"]` ? Are these the same thing? – Alex Jul 14 '21 at 09:30
  • You could sort the data frame in descending order of the number of deaths and specify the top 20 cases. For example, let's say `x=df['country'].head(20)`. – r-beginners Jul 14 '21 at 09:31
  • @r-beginners I think they want the last 20 rows of the df, so tail would work here. There's also [`nlargest`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.nlargest.html) – Alex Jul 14 '21 at 09:35
  • The last 20 rows, because I already sorted the data from lowest to highest (the last row is the highest). Sorry for not mentioning that. – Himaa Jul 14 '21 at 09:36
  • [`df.tail(20)`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.tail.html) will get you the last 20 rows – Alex Jul 14 '21 at 09:36

3 Answers

How do I make it from 0 to 175, I tried doing 0:175 but it marked as an error

You can use the built-in `range` for this task. Consider the following simple example:

import pandas as pd
df = pd.DataFrame({'value':[0,1,2,3,4,5,6,7,8,9]})
df = df.drop(range(0,5),axis=0)
print(df)

Output:

   value
5      5
6      6
7      7
8      8
9      9

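Note that `range` includes its start and excludes its end, so rows 0, 1, 2, 3 and 4 were removed: the start (0) is inclusive and the end (5) is exclusive. To drop rows 0 through 175 inclusive, the call would therefore be `range(0, 176)`.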
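
As a minimal sketch of why `0:175` fails and what can replace it: slice syntax is only valid inside square brackets, so it cannot be passed to `drop` directly. The small frame below is a hypothetical stand-in for the CSV, and it assumes the default 0-based integer index that `read_csv` produces.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: 10 rows with the default index 0..9.
df = pd.DataFrame({"value": list(range(10))})

# Drop labels 0 through 5 inclusive: the stop value is one past the last label.
dropped_by_range = df.drop(range(0, 6), axis=0)

# Slice syntax works inside square brackets, so slice the index first.
dropped_by_index = df.drop(df.index[0:6])

# Or skip dropping and keep everything from position 6 onward.
kept_by_iloc = df.iloc[6:]

print(dropped_by_range["value"].tolist())  # [6, 7, 8, 9]
```

All three variants leave the same rows here. With the question's data, the equivalent of the first form would be `df.drop(range(0, 176))`.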

Daweo

Have you tried slicing? This should work:

import pandas as pd
from matplotlib import pyplot as plt

df = pd.read_csv('MAY_2021_COVID_CLEANED.csv')
df = df[-20:]  # keep the last 20 rows; the data is sorted ascending, so these are the highest

#assigning variables
x = df['Country']
y = df['Death Count']

#Graphing
plt.bar(x,y)
plt.xticks(rotation=90,fontsize=5)
plt.title("Number of Death per Country")
plt.ylabel("Number of Death")
plt.xlabel("Country")
plt.show()
Alfie Grace

If you just want to plot the last 20 rows, you can use `DataFrame.tail`:

df = df.tail(20)

If you want the 20 rows with the largest values in a column, regardless of row order, use `DataFrame.nlargest`:

df = df.nlargest(20, "Death Count")
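
To illustrate the difference between the two, here is a small sketch on made-up data. The country names and counts are hypothetical; only the column names `Country` and `Death Count` come from the question.

```python
import pandas as pd

# Hypothetical, unsorted data using the question's column names.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D", "E"],
    "Death Count": [50, 10, 40, 20, 30],
})

# tail selects by position: the last two rows as they appear in the frame.
last_two = df.tail(2)

# nlargest selects by value: the two rows with the highest counts.
top_two = df.nlargest(2, "Death Count")

# Once the frame is sorted ascending, both pick the same countries.
sorted_df = df.sort_values("Death Count")
same_rows = set(sorted_df.tail(2)["Country"]) == set(top_two["Country"])

print(last_two["Country"].tolist())  # ['D', 'E']
print(top_two["Country"].tolist())   # ['A', 'C']
print(same_rows)                     # True
```

Since the asker's file is already sorted from lowest to highest, either call should give the same 20 countries there; `nlargest` returns them in descending order, while `tail` keeps the file's ascending order.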
Alex