0

I have a table with Long Words like 'Condemnation' and 'Income' in column A, and Shorter Words such as 'Con' and 'Come' in column B.

I'd like to create a cell to the right which will search through the 'LONG WORD' column if it contains the text of the 'SHORTER WORD' column and print them as a pair.

I only need it to return the first instance it comes across as it goes down.

I have looked at various MATCH and LOOKUP commands, but none seem quite to be able to do the 'return one matching word from a whole column' bit.

Thanks

Tardy

tardy pigeon
  • 225
  • 6
  • 18
  • 1
    It would help if you could include an example spreadsheet with some sample data in it. Out of curiosity, would a script based solution work for you, or do you need this to be a spreadsheet function? – Douglas Gaskell Apr 23 '16 at 21:07
  • @douglasg14b Sorry, didn't realise you could @ reply here! I'd be happy to try a script-based solution. I've been messing around with ROW and SORT and even REGEXREPLACE but to no avail! – tardy pigeon Apr 24 '16 at 08:51
  • I'll see if I can throw together a script based solution for you when I get up. If you can provide a small sample of your data in a public sheet that would be great. Otherwise I'll have to make assumptions on how it's laid out. – Douglas Gaskell Apr 24 '16 at 09:43
  • Add some sample of the input data and the result that you expect. – Rubén Apr 24 '16 at 15:28

4 Answers4

1

I've thrown together a script based solution for you. Other solutions that require a formula on every line where you might have partials will end up bogging down the sheet by quite a bit for large data sets. This should generate a range of matches after a couple seconds for data several tens of thousands of rows long.

Note: Since you opted to not provide a sample dataset, I had to assume how it's laid out. However, this will work regardless of where your columns are, as long as they are titled as Full Words, Partials, and Matches.

Link to spreadsheet (Must be signed into a google account to use the button): Google Sheet

Just click the Get Matches button to have it generate the matches.

The source is a bit more complex/dynamic than it needs to be, but I had a bunch of functions already laying around that I just reused.

Source:

//Retrieves all the necessary word matches
function GetWordMatches() {
  var spreadsheet = SpreadsheetApp.openById('1s0S2iJ7L0wEXgVsKrpuK-aLysaxfHYRDQgp3ShPR8Ns').getSheetByName('Matches');
  var dataRange = spreadsheet.getDataRange();
  var valuesRange = dataRange.getValues(); 
  var columns = GetColumns(valuesRange, dataRange.getNumColumns(), 0);

  var fullWordsData = GetColumnAsArray(valuesRange, columns.columns['Full Words'].index, true, 1);
  var partialsArray = GetColumnAsArray(valuesRange, columns.columns['Partials'].index, true, 1);
  var partialsData = GeneratePartialsRegexArray(partialsArray);

  var matches = GenerateMatches(fullWordsData, partialsData); 

  WriteMatchesToSheet(spreadsheet, columns.columns['Matches'].index, matches, partialsArray);  
}

//Writes the matches to the sheet
function WriteMatchesToSheet(spreadsheet, matchesColumnIndex, matches, partialsArray){
  var sortedMatches = SortByKeys(matches, partialsArray);
  var dataRange = spreadsheet.getRange(2, matchesColumnIndex+1, sortedMatches.length);
  dataRange.setValues(sortedMatches);
}

//Generates an array of matches for the full words and partials
function GenerateMatches(fullwordsData, partialsData){
  var output = [];
  var totalLoops =  0;

  for(var  i = 0; i < fullwordsData.length; i++){
    totalLoops++;
    for(var ii = 0; ii < partialsData.length; ii++){
      totalLoops++;
      var result = fullwordsData[i].match(partialsData[ii].regex)
      if(result){
        output.push([fullwordsData[i], partialsData[ii].value]);
        partialsData.splice(ii, 1);
        break;
      }
    }
  }
  if(partialsData.length > 0){
    var missedData = GenerateMissedPartialsArray(partialsData);
    output = output.concat(missedData);
  }  
  return output;
}

//Generates a missed partials array based on the partials that found no match.
function GenerateMissedPartialsArray(partialsData){
  var output = [];
  for(var  i = 0; i < partialsData.length; i++){
    output.push(['No Match', partialsData[i].value])
  }
  return output;
}

//Generates the regex array for the partials
function GeneratePartialsRegexArray(partialsArray){
  var output = [];
  for(var  i = 0; i < partialsArray.length; i++){
    output.push({regex: new RegExp(partialsArray[i], 'i'), value: partialsArray[i]});
  }
  return output;
}

//http://stackoverflow.com/a/13305008/3547347
function SortByKeys(itemsArray, sortingArray){
  var itemsMap = CreateItemsMap(itemsArray), result = [];
  for (var i = 0; i < sortingArray.length; ++i) {
    var key = sortingArray[i];
    result.push([itemsMap[key].shift()]);
  }
  return result;  
}

//http://stackoverflow.com/a/13305008/3547347
function CreateItemsMap(itemsArray) {
  var itemsMap = {};
  for (var i = 0, item; (item = itemsArray[i]); ++i) {
    (itemsMap[item[1]] || (itemsMap[item[1]] = [])).push(item[0]);
  }
  return itemsMap;
}

//Gets a column of data as an array
function GetColumnAsArray(valuesRange, columnIndex, ignoreBlank, startRowIndex){
  var output = [];
  for(var  i = startRowIndex; i < valuesRange.length; i++){
    if(ignoreBlank){
      if(valuesRange[i][columnIndex] !== ''){
        output.push(valuesRange[i][columnIndex]);        
      }
      continue;
    }
    output.push(valuesRange[i][columnIndex]);
  }
  return output;
}

//Gets a columns object for the sheet for easy indexing
function GetColumns(valuesRange, columnCount, rowIndex)
{
  var columns = {
    columns: {},
    length: 0
  }

  Logger.log("Populating columns...");
  for(var i = 0; i < columnCount; i++)
  {
    if(valuesRange[0][i] !== ''){
      columns.columns[valuesRange[0][i]] = {index: i ,value: valuesRange[0][i]};
      columns.length++;      
    }
  }  
  return columns;
}

A note on some decisions: I opted to not use map, or other more concise array functions for the sake of performance.

Douglas Gaskell
  • 9,017
  • 9
  • 71
  • 128
  • Just a note: The doc you shared is writable by anyone and has subsequently been corrupted by anonymous changes. You might want to restore it to an older version and then set comment-only permissions so that others need to copy the sheet before making changes :) – Motin May 23 '18 at 11:55
  • Thanks for letting me know, I've restored it – Douglas Gaskell May 23 '18 at 17:56
0

MATCH and LOOKUP doesn't work for partial matches.

One alternative is to use SEARCH or FIND together with other functions in an array formula.

Example:

  • Column A contains a list of long strings
  • Cell B1 contain a short string
  • Cell C1 contain a formula that returns the first long string in column a that contains the short string in B1
=ArrayFormula(INDEX(A1:A,SORT(IF(search(B1,A1:A),ROW(A1:A),),1,TRUE)))

Data

+---+--------------+-------+-------------+
|   |      A       |   B   |      C      |
+---+--------------+-------+-------------+
| 1 | Orange juice | apple | Apple cider |
| 2 | Apple cider  |       |             |
| 3 | Apple pay    |       |             |
+---+--------------+-------+-------------+
Rubén
  • 34,714
  • 9
  • 70
  • 166
  • Thanks Ruben. I think the problem with this formula is, although it does output what looks like the right kind of data, it doesn't do it in the square adjacent to the cell with the word in. If I could explain a little better, I have a huge list of words, and if somewhere in that list is 'concept', I want the cell next to the word 'con' to extract 'concept' and put that next to 'con'. – tardy pigeon Apr 23 '16 at 17:27
  • @tardypigeon What if that list contains `concept` and `congregate`? Is there a preference to which word you want to choose when searching for `con`, or do you want to list all words out that match next to `con`? – Douglas Gaskell Apr 23 '16 at 21:10
  • Hi Douglas. I only need it to return the first instance it comes across as it goes down, as I'm going to be randomising the list of longer words. – tardy pigeon Apr 23 '16 at 23:40
  • Also, I would have included an example spreadsheet but couldn't figure out how to paste what I've got in... – tardy pigeon Apr 23 '16 at 23:42
  • I mean, what surprises me is that there must be a way of getting the spreadsheet to return the actual result of the first match found (and not just 'TRUE' or the cell reference). – tardy pigeon Apr 24 '16 at 08:55
0

OK, I think I've found an answer. I'll post it here in case it's of use to anyone else.

To give credit where's credit's due, I found it here

This does what I was looking for:

=INDEX($D$1:$D$3,MATCH(1,COUNTIF(A1,"*"&$D$1:$D$3&"*"),0))

It does slow EVERYTHING down a lot because everything is cross-referencing like mad (I had 3000 lines on my spreadsheet), but if there's a list of words in D1-3 it will see if cell A1 contains one of those words and print the word it matches with.

Thanks to everyone who offered solutions, particularly @douglasg14b - if there is one that is less taxing in terms of memory, that would be great, but this does the trick in a slow kind of way!

Thanks

Tardy

Max Makhrov
  • 17,309
  • 5
  • 55
  • 81
tardy pigeon
  • 225
  • 6
  • 18
0

This works too:

=QUERY(FILTER($D$1:$D$3,REGEXMATCH(A1,"(?i)"&$D$1:$D$3)),"limit 1")

we use REGEXMATCH and (?i) makes the search case-insensitive. limit 1 in query gives only first occurrence.

Max Makhrov
  • 17,309
  • 5
  • 55
  • 81