I have a dataset as follows:
id paramgroup1
1 CURRENCY=USD~COUNTRY=USA~CUSTCATEGORY=REGULAR
2 CURRENCY=USD~COUNTRY=USA~CUSTCATEGORY=GUEST
3 CURRENCY=INR~COUNTRY=IND
Now I want to add a count column that holds the number of parameters separated by the delimiter (~). After the Spark transformation, the final dataset should look like this:
id paramgroup1 count
1 CURRENCY=USD~COUNTRY=USA~CUSTCATEGORY=REGULAR 3
2 CURRENCY=USD~COUNTRY=USA~CUSTCATEGORY=GUEST 3
3 CURRENCY=INR~COUNTRY=IND 2
Any help would be appreciated.
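For reference, one possible approach is sketched below in PySpark. It assumes the data is already loaded as a DataFrame with a string column named paramgroup1, and it splits that column on the delimiter and takes the size of the resulting array. The helper names `count_params` and `add_count_column` are hypothetical and only for illustration; `split` and `size` are standard functions in `pyspark.sql.functions`. This is a sketch under those assumptions, not a definitive solution.

```python
def count_params(paramgroup, delimiter="~"):
    """Plain-Python reference: number of delimiter-separated parameters."""
    return len(paramgroup.split(delimiter))


def add_count_column(df, column="paramgroup1", delimiter="~"):
    """Return df with an extra 'count' column (hypothetical helper).

    Assumes PySpark is installed and that `delimiter` has no special
    meaning as a regular expression, because Spark's split() treats its
    second argument as a pattern.
    """
    # Imported lazily so the plain-Python helper above works without Spark.
    from pyspark.sql import functions as F

    # split() yields an array of "KEY=VALUE" strings; size() counts them.
    return df.withColumn("count", F.size(F.split(F.col(column), delimiter)))
```

With a DataFrame `df` holding the rows above, `add_count_column(df).show()` would display the extra column. The plain-Python `count_params` mirrors the same logic for a single string and can be used to sanity-check individual values.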