Grouping by character matching & string length

Question

Suppose I have a column of strings in a data frame. I want to group the strings so that two strings fall into the same group when they have the same length and contain the same characters, regardless of order.

The output should be grouped as in the sample below:

Rule                      Group
x                           1
x                           1
xx                          2
xx                          2
xy                          3
yx                          3
xx                          2
xyx                         4
yxx                         4
yyy                         5
xyxy                        6
yxyx                        6
xyxy                        6

I have been able to write a function that gives the desired output in Python, but I am unable to get the same output in R. — NiMbuS, Apr 18 '19 at 09:35
Please read https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example and reformat your question accordingly. We can't help you without a clear description of your problem and knowing what you have done. — Julian_Hn, Apr 18 '19 at 09:38
Suppose the column in the data frame is similar to the "Rule" column in the sample. I want to group the column based on string length and string characters. — NiMbuS, Apr 18 '19 at 09:53

score 2 · Accepted Answer · answered Apr 18 '19 at 09:56

You can split each Rule into its characters, sort them, and paste them back together. Matching the result against its unique values then gives you the group numbers. In R:

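# Sort the characters of each rule so that reorderings share one key
v1 <- sapply(strsplit(df$Rule, ''), function(i) paste(sort(i), collapse = ''))
# Number the keys in order of first appearance
match(v1, unique(v1))
#[1] 1 1 2 2 3 3 2 4 4 5 6 6 6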
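
For a self-contained version, here is a minimal sketch built on the sample from the question. It assumes the data frame is called df and that Rule is stored as character rather than factor. The Group column name and the key variable are only illustrative choices. Because the sorted key keeps every character, two strings share a key only when they also have the same length, so no separate length check is needed.

```r
# Sample data from the question. stringsAsFactors = FALSE keeps Rule as
# character (before R 4.0.0 the default would turn it into a factor).
df <- data.frame(
  Rule = c("x", "x", "xx", "xx", "xy", "yx", "xx",
           "xyx", "yxx", "yyy", "xyxy", "yxyx", "xyxy"),
  stringsAsFactors = FALSE
)

# Canonical key: the characters of each string in sorted order,
# so "xy" and "yx" both become "xy".
key <- vapply(strsplit(df$Rule, ""),
              function(chars) paste(sort(chars), collapse = ""),
              character(1))

# Number the keys in order of first appearance and store the result.
df$Group <- match(key, unique(key))
```

If Rule is a factor, strsplit() stops with a "non-character argument" error; wrapping the column as as.character(df$Rule) should avoid that.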
