I have the following dataframe and want to count its unique values. The problem is that some of the values differ only in upper or lower case but are exactly the same thing. In my case "Wifi" and "wifi" should be counted together, as 2 occurrences of the same value, and the same goes for the others. Is there a way to do that, for example by ignoring upper and lower case? There are also different spellings for wifi (for example "Wifi 230 mb/s"). Is there a way to count a value as wifi whenever "wifi" appears in the string?
import pandas as pd

d = {'host_id': [1, 1, 1, 2, 2, 3, 3, 3, 3],
     'value': ['Hot Water', 'Wifi', 'Kitchen',
               'Wifi', 'Hot Water',
               'Coffe Maker', 'wifi', 'hot Water', 'Wifi 230 mb/s']}
df = pd.DataFrame(data=d)
print(df)
# number of rows whose value contains "wifi", ignoring case
print(len(df[df['value'].str.contains("Wifi", case=False)]))
# distinct values (still case-sensitive) and how many there are
print(df['value'].unique())
print(len(df['value'].unique()))
[out]
   host_id          value
0        1      Hot Water
1        1           Wifi
2        1        Kitchen
3        2           Wifi
4        2      Hot Water
5        3    Coffe Maker
6        3           wifi
7        3      hot Water
8        3  Wifi 230 mb/s
4 # count wifi
['Hot Water' 'Wifi' 'Kitchen' 'Coffe Maker' 'wifi' 'hot Water' 'Wifi 230 mb/s'] # unique values
7 # len unique values
What [out] should look like:
         value  count
0    Hot Water      3
1         Wifi      4
2      Kitchen      1
3  Coffe Maker      1
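
One possible way to get a table like the one above, as a rough sketch rather than a definitive solution: lowercase the values, collapse anything containing "wifi" into a single label, and count the groups. The names `normalized`, `labels` and `counts` are made up for illustration, and using `str.title()` to restore the capitalisation is an assumption that happens to fit this sample data.

```python
import pandas as pd

d = {'host_id': [1, 1, 1, 2, 2, 3, 3, 3, 3],
     'value': ['Hot Water', 'Wifi', 'Kitchen',
               'Wifi', 'Hot Water',
               'Coffe Maker', 'wifi', 'hot Water', 'Wifi 230 mb/s']}
df = pd.DataFrame(data=d)

# Normalise whitespace and case so 'hot Water' and 'Hot Water' compare equal.
normalized = df['value'].str.strip().str.lower()

# Collapse every value that mentions "wifi" into the single label 'wifi'.
normalized = normalized.mask(normalized.str.contains('wifi', regex=False), 'wifi')

# Restore a readable capitalisation (assumption: title case is acceptable).
labels = normalized.str.title()

# Count each label, keeping the order of first appearance.
counts = (labels.groupby(labels, sort=False)
                .size()
                .rename_axis('value')
                .reset_index(name='count'))
print(counts)
```

If other values come in several spellings too, the `mask` step would presumably have to be repeated per keyword, or replaced by a mapping from keyword to label.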