I'm trying to analyze a dataset with Python. The dataset has columns of several types (int, float, string). I was able to convert all of them except two attributes, source port and destination port, whose dtype is still object.
When I inspect these attributes in Python, I get:
Column Non-Null Count Dtype
--- ------ -------------- -----
0 sport 668522 non-null object
1 dport 668522 non-null object
The values are:
sport dport
0 6226 80
1 6227 80
2 6228 80
3 6229 80
4 6230 80
As far as I can see, these are just numeric values, so why does Python treat the ports as object?
I also tried the Weka tool, but it can't read the values either. Can anyone explain the reason, or how to solve the problem?
The port is an important feature that is useful for mining the data, so I don't want to drop it from the dataset.
Update: The dataset is in CSV format, and a sample of the values is shown above. There are 2 features: source port ("sport" for short) and destination port ("dport" for short).
In Python, I read the values with:
import pandas as pd
dt = pd.read_csv("port.csv")
Printing dt shows the values, but an ML algorithm such as k-means can't deal with these columns.
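One check that might narrow this down is to look for entries in the two columns that cannot be parsed as numbers, since a single such entry is enough for pandas to keep a column as object. The snippet below is only a minimal sketch: the sample frame and its non-numeric entries ("unknown", "n/a") are made up for illustration and are not taken from port.csv, and non_numeric_values is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical sample: the non-numeric entries below are assumptions for
# illustration only; the real port.csv may contain something different.
dt = pd.DataFrame({
    "sport": ["6226", "6227", "unknown", "6229"],
    "dport": ["80", "80", "n/a", "80"],
})

def non_numeric_values(frame, column):
    """Return the distinct values in `column` that cannot be parsed as numbers."""
    parsed = pd.to_numeric(frame[column], errors="coerce")
    # Present in the original column, but turned into NaN by the parser
    mask = parsed.isna() & frame[column].notna()
    return sorted(set(frame.loc[mask, column]))

bad_sport = non_numeric_values(dt, "sport")
bad_dport = non_numeric_values(dt, "dport")
print(bad_sport, bad_dport)

# Convert both columns to numbers; unparseable entries become NaN
numeric = dt[["sport", "dport"]].apply(pd.to_numeric, errors="coerce")
print(numeric.dtypes)
```

With the real data, dt would come from pd.read_csv("port.csv") instead of the sample frame. If the helper returns an empty list for both columns, the cause is probably something else.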
In Weka, on the other hand, the following message was displayed after importing the CSV file: "Attribute is neither numeric nor nominal"