0

Column A contains texts. Each text has values in the neighboring cells. Now I want to check for each row whether the other rows also contain these values. Restriction: At least three values must match and regardless of the order in the row. The rows, which belong together because of at least 3 common values, should be grouped. A group name should be assigned to them dynamically. An example table might look like this: enter image description here INPUT

Word Value01 Value02 Value03 Value04
banana A B C D
sweet banana L N O J
cheap banana B D A F
best banana D C A H
banana banana N L I O
how to make banana bread O A I K
I love banana B C A D
banana bread I O A C

EXPECTED OUTPUT

Word Value01 Value02 Value03 Value04 group name
banana A B C D banana group 1
sweet banana L N O J banana group 2
cheap banana B D A F banana group 1
best banana D C A H banana group 1
banana banana N L I O banana group 2
how to make banana bread O A I K banana group 3
I love banana B C A D banana group 1
banana bread I O A C banana group 3

I tried the following formula:

=ArrayFormula(IF(B1:E1=B1:E8,"banana group 1","banana group 2"))

But the problems are, only the exact match is recognized and I am limited to two group names. Here is a link to the example table: Example Table

TheMaster
  • 45,448
  • 6
  • 62
  • 85
user3392296
  • 644
  • 4
  • 16
  • Your question can be greatly improved if you add input table and expected output table to the question. [Tables](https://webapps.stackexchange.com/a/161855/) are a better alternative than spreadsheets to show your data structure. If you share spreadsheets, make sure to also add images of your sheet to avoid closure of your question, as questions here must be [self](https://meta.stackoverflow.com/a/260455) [contained](https://meta.stackexchange.com/a/149892). [Your email address can also be accessed by the public](https://meta.stackoverflow.com/questions/394304/), when you share Google files. – TheMaster Sep 28 '22 at 10:12
  • @TheMaster Thanks for the tipp, I've added a link to the example table – user3392296 Sep 28 '22 at 10:24
  • Could you add it as a markdown table(help link in my previous comment)? – TheMaster Sep 28 '22 at 10:41
  • my knowledge does not go so far that i can write a script – user3392296 Sep 28 '22 at 11:02
  • What are the groupings indicating? – Osm Sep 28 '22 at 12:33
  • @Osm Sorry, think I didn't understand your question. Any word that shares at least 3 values with other words belongs to a group together with these words. For example "banana" shares the values A,D,B with the word "cheap banana", so they belong both to the group "banana group 1" – user3392296 Sep 28 '22 at 13:03
  • Share the same values regardless of order? right – Osm Sep 28 '22 at 13:05
  • @Osm right! This is the reason while IF ist not working here. – user3392296 Sep 28 '22 at 13:11
  • Are the values representative? Will the `Value01` be 1 character letter or can it be coerced to a 1 character letter? – TheMaster Sep 28 '22 at 14:00
  • @TheMaster No, in reality the values are URLs. The URLs each Keyword is ranking on the first site of Google. – user3392296 Sep 28 '22 at 14:06
  • @user3392296 `"banana" shares A,D,B with "cheap banana"` ..... `D` is not shared ? no – Osm Sep 28 '22 at 14:31
  • @Osm Sorry, I don't understand. Both Share A,B and D – user3392296 Sep 28 '22 at 14:37
  • See this [Image](https://i.imgur.com/CjwvCz7.png) Typo(Columns)x A2 Cell I mean-----if cell A2 share at least 3 characters with the characters in the range `B2:E2` then what – Osm Sep 28 '22 at 14:43
  • @TheMaster I believe I understand how to execute it, but I don't comprehend the desired outcome. – Osm Sep 28 '22 at 15:13
  • The desired outcome is to group the keywords. Example: If keyword A ranks for at least 3 of the same URLs as keyword B, then these keywords belong together in one group. – user3392296 Sep 28 '22 at 15:19
  • @osm Now I understand. No, it is not a question of whether the values are contained in the words. In real the values are URLs. – user3392296 Sep 28 '22 at 15:24
  • 1
    @user3392296 Something like [This?](https://i.imgur.com/AaIvMiI.png) – Osm Sep 28 '22 at 15:40
  • 1
    @Osm Rows 2(`ABCD`),4,5 and 8 belong to the same group, because they all contain any 3 of `ABCD`. So, all those rows belong to `group#1`. Get it? – TheMaster Sep 28 '22 at 15:48

2 Answers2

1

There must be a easier way to do this, but this is one (expensive) way to do this. SORT each of row in ascending order, then get all the permutations of each row(for 4 items, that is 24 permutations per row). Check if any permutation match any of the previous row. If it matches, take the group it belongs to(F:F). If not, create a new group based on row number.

F2:

=ARRAYFORMULA(IFNA(INDEX(F$1:F1, MATCH( true,BYROW(B$1:E1,LAMBDA(r, OR(BYROW(REDUCE(TRANSPOSE(SORT(TRANSPOSE(B2:E2))),SEQUENCE(PERMUT(4,4)-1),LAMBDA(a,c,{a;GET_NEXT_LEX(INDEX(a,c,0))})),LAMBDA(perm, SUMPRODUCT(perm=r)>=3 ))))),0)),"group#"&ROW()))

Drag fill down. It gets more expensive based on number of rows.

For this to work, you need to install GET_NEXT_LEX function and it's dependencies from here.

Word Value01 Value02 Value03 Value04 output
banana A B C D group#2
sweet banana L N O J group#3
cheap banana B D A F group#2
best banana D C A H group#2
banana banana N L I O group#3
how to make banana bread O A I K group#7
I love banana B C A D group#2
banana bread I O A C group#7
TheMaster
  • 45,448
  • 6
  • 62
  • 85
1

I thought I would take a look at all similarities. But you could get only three if you change just one line. I will point it out below:

function sortgroups() {
  const ss = SpreadsheetApp.getActive();
  const sh = ss.getSheetByName("Sheet0");
  const vs = sh.getRange("A2:E9").getValues();
  let obj = { pA: [] };
  vs.forEach((r, i) => {
    let a = [r[1], r[2], r[3], r[4]];
    obj[r[0]] = { chars: a, sub: [] };
    obj.pA.push(r[0]);
    vs.forEach((s, j) => {
      if (j != i) {
        let b = [s[1], s[2], s[3], s[4]];
        let m = [];//matches
        b.forEach((c, k) => {
          let idx = a.indexOf(c);
          if (~idx) {
            m.push(c);
          }
        })
        if (m.length == 3) {//this is the line to change to limit similarites
          obj[r[0]].sub.push({ p: s[0], chars: m.join(',') });
        } 
      }
    });

  });
  let r = [];
  obj.pA.forEach((p, i) => {
    r.push([p, obj[p].chars[0], obj[p].chars[1], obj[p].chars[2], obj[p].chars[3], '', '']);
    obj[p].sub.forEach((o, j) => {
      r.push(['', '', '', '', '', o.p, o.chars]);
    })

  })
  r.unshift(["Words","Value1","Value2","Value3","Value4","Similar Words","Similarities"]);
  sh.getRange(sh.getLastRow() + 2, 1, r.length, r[0].length).setValues(r);
}

Output:

Words Value1 Value2 Value3 Value4 Similar Words Similarities
banana A B C D
cheap banana B,D,A
best banana D,C,A
how to make banana bread A
I love banana B,C,A,D
banana bread A,C
sweet banana L N O J
banana banana N,L,O
how to make banana bread O
banana bread O
cheap banana B D A F
banana A,B,D
best banana D,A
how to make banana bread A
I love banana B,A,D
banana bread A
best banana D C A H
banana A,C,D
cheap banana D,A
how to make banana bread A
I love banana C,A,D
banana bread A,C
banana banana N L I O
sweet banana L,N,O
how to make banana bread O,I
banana bread I,O
how to make banana bread O A I K
banana A
sweet banana O
cheap banana A
best banana A
banana banana I,O
I love banana A
banana bread I,O,A
I love banana B C A D
banana A,B,C,D
cheap banana B,D,A
best banana D,C,A
how to make banana bread A
banana bread A,C
banana bread I O A C
banana A,C
sweet banana O
cheap banana A
best banana C,A
banana banana I,O
how to make banana bread O,A,I
I love banana C,A

Only Three:

Words Value1 Value2 Value3 Value4 Similar Words Similarities
banana A B C D
cheap banana B,D,A
best banana D,C,A
sweet banana L N O J
banana banana N,L,O
cheap banana B D A F
banana A,B,D
I love banana B,A,D
best banana D C A H
banana A,C,D
I love banana C,A,D
banana banana N L I O
sweet banana L,N,O
how to make banana bread O A I K
banana bread I,O,A
I love banana B C A D
cheap banana B,D,A
best banana D,C,A
banana bread I O A C
how to make banana bread O,A,I
Cooper
  • 59,616
  • 6
  • 23
  • 54
  • If you go back to the script editor, if possible, let me know the logged total execution time for this script run. – TheMaster Sep 28 '22 at 21:30