I was trying to count the number of houses in each of the 5 ocean proximity categories (NEAR BAY, <1HR OCEAN, INLAND, NEAR OCEAN, ISLAND). The problem is that I couldn't figure out how to make it work.
I tried writing code to count the number of houses labeled NEAR BAY, <1HR OCEAN, INLAND, NEAR OCEAN, and ISLAND in the Excel file. Here is the code:
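import pandas as pd

# `data` is the housing DataFrame (the loading step is not shown here)
proximities = []
for i in range(20640):
    # collect each distinct ocean_proximity label once
    if data.ocean_proximity[i] not in proximities:
        proximities.append(data.ocean_proximity[i])
proximities

df = pd.DataFrame(data.ocean_proximity)
print(df.count())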
And here is the result:
ocean_proximity 20640
dtype: int64
And this is the result I was expecting (note that the numbers below are made up; the real counts can be completely different):
NEAR BAY 927
<1HR OCEAN 2284
INLAND 2060
NEAR OCEAN 9482
ISLAND 4133
So is there a way to make the code produce output like this, or is Google Colab bugged?
By the way, here is the Excel file used: file:///C:/Users/Minerva%20Panganiban/Downloads/housing.pdf
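For anyone trying to reproduce this, here is a minimal sketch of the kind of per-category tally described above. It uses a small made-up DataFrame in place of the real file, so the sample rows and the names `counts` and `labels` are assumptions for illustration only. As far as I can tell, pandas' `value_counts()` counts rows per distinct value, while `count()` only reports the number of non-null rows in the column, which would explain the single 20640.

```python
import pandas as pd

# Hypothetical stand-in for the real housing data; these rows are made up.
data = pd.DataFrame({
    "ocean_proximity": ["NEAR BAY", "INLAND", "INLAND", "NEAR OCEAN",
                        "ISLAND", "INLAND", "NEAR BAY"]
})

# value_counts() tallies how many rows carry each distinct label,
# whereas count() would only give the total number of non-null rows.
counts = data["ocean_proximity"].value_counts()
print(counts)

# unique() collects the distinct labels in order of first appearance,
# which is what the manual loop over 20640 rows was doing.
labels = data["ocean_proximity"].unique().tolist()
print(labels)
```

With the real data, the same two calls would be applied to the `ocean_proximity` column of the loaded DataFrame instead of the sample one.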