What you are looking for is pd.factorize, which encodes each distinct value as an integer code (0, 1, 2, ... in order of first appearance), so identical values always get the same number. You can use it as follows:
df['Col1'] = 'C' + pd.Series(pd.factorize(df['Col1'])[0] + 1, dtype='string')
or, if your pandas version does not support the string dtype, use:
df['Col1'] = 'C' + pd.Series(pd.factorize(df['Col1'])[0] + 1).astype(str)
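To see why this works, it may help to look at what pd.factorize returns on its own. The following is a minimal sketch, with a short made-up Series `s` used purely for illustration: the call returns a pair (codes, uniques), and the one-liners above use only the codes, shifted by 1 so numbering starts at C1.

```python
import pandas as pd

# Hypothetical sample values, chosen only to illustrate the return value.
s = pd.Series(['XXXXXXXXXXXXXX', 'YYYYYYYYYYYYYY', 'XXXXXXXXXXXXXX', 'ZZZZZZZZZZZZZZ'])

# factorize returns (codes, uniques): codes is an integer array giving,
# for each row, the position of its value in uniques.
codes, uniques = pd.factorize(s)

# Shift codes to start at 1 and prefix with 'C'.
labels = 'C' + pd.Series(codes + 1).astype(str)

print(codes.tolist())    # integer code per row
print(list(uniques))     # distinct values in order of first appearance
print(labels.tolist())   # final labels
```

Keeping `uniques` around is useful if you later need to map a label such as C2 back to its original value.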
Demo
Data Input
import pandas as pd

data = {'Col1': ['XXXXXXXXXXXXXX', 'YYYYYYYYYYYYYY', 'XXXXXXXXXXXXXX', 'YYYYYYYYYYYYYY', 'XXXXXXXXXXXXXX', 'ZZZZZZZZZZZZZZ']}
df = pd.DataFrame(data)
print(df)
Col1
0 XXXXXXXXXXXXXX
1 YYYYYYYYYYYYYY
2 XXXXXXXXXXXXXX
3 YYYYYYYYYYYYYY
4 XXXXXXXXXXXXXX
5 ZZZZZZZZZZZZZZ
Output (after running the assignment above):
print(df)
Col1
0 C1
1 C2
2 C1
3 C2
4 C1
5 C3
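One caveat: the demo relies on the default 0..n-1 index. Assigning a Series to a column aligns on index labels, so if your DataFrame has some other index, the freshly built Series will not line up with it. A possible variant, assuming a made-up frame with index labels 10, 20, 30, 40, passes the frame's own index when building the Series:

```python
import pandas as pd

# Hypothetical frame with a non-default index, for illustration only.
df = pd.DataFrame({'Col1': ['X', 'Y', 'X', 'Z']}, index=[10, 20, 30, 40])

codes = pd.factorize(df['Col1'])[0]

# Reuse df.index so the assignment aligns row by row.
df['Col1'] = 'C' + pd.Series(codes + 1, index=df.index).astype(str)

print(df)
```

With the default index this behaves the same as the one-liners above, so it is a safe drop-in if you are unsure what index your data carries.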