I have several rows of records (id, query,count
) that I would like to collapse by observing the query
on multiple rows. I want to retain the row with the longest query
and the sum of the count
field for the rows that are collapsed.
Sample input:
24, que, 2
24, querie, 1
24, queries, 1
25, term1, 3
25, term1+term2, 11
25, term1+term2+term3, 1
26, inventory, 5
26, issues, 10
27, close, 1
27, sclosed, 2
28, abcde, 2
28, abcfe, 2
Required output:
24, queries, 4
25, term1+term2+term3, 15
26, inventory, 5
26, issues, 10
27, close, 1
27, sclosed, 2
28, abcde, 2
28, abcfe, 2
I'm only after a special case of substrings: 24,25
is collapsed but 27
is not, due to the prefix on close
. 26
is also not collapsed as the query
field in the second row is not a substring (not prefixed) by the first.
Edit: Added id 28
which is another case where the records should not be collapsed.