I have a dataset that records how many times each user visited each page during a year.
For example:
0: the user never visited the page
27: the user visited the page 27 times during the year
I want to cluster the users based on their page visits. The problem is that more than half of the values in each variable are zeros, so when I draw a box plot, the values greater than 20 look like outliers. I don't think they are outliers, though. They are real data, because a user visiting a page 27 times in a year is perfectly normal.
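To illustrate what I mean, here is a minimal sketch with made-up counts for a single page (synthetic numbers, not my real data). It assumes the usual box plot rule, where anything above Q3 + 1.5 * IQR is drawn as an outlier. The name `boxplot_upper_fence` is just a helper I wrote for this example.

```python
import statistics

# Hypothetical visit counts for one page (synthetic, not my real data):
# more than half of the users never visited the page.
visits = [0] * 12 + [1, 2, 3, 5, 8, 12, 21, 27]

def boxplot_upper_fence(values):
    """Upper limit of the standard box plot rule: Q3 + 1.5 * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)

fence = boxplot_upper_fence(visits)

# Values above the fence are the ones a box plot would draw as outlier points.
flagged = [v for v in visits if v > fence]

print("upper fence:", fence)
print("flagged as outliers:", flagged)
```

Because so many values are zero, the quartiles sit very low and the fence ends up low as well, so ordinary visit counts get flagged even though they are plausible.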
In this scenario, how should I deal with these apparent outliers?
Thanks in advance