1

I have an array of Strings in JavaScript. I am trying to develop a function that

  1. Takes a substring as an input.
  2. Searches through the array.
  3. Returns strings from the array close to the substring. The list will be provided as suggestions to the caller.

For example:-

Array contains the below entries.

Hello
What is hello
World
Spacearenotthereinthishello
HELLO
Highway to hell
JavaScript
StackOverflow

I invoke the function as shown below

var result[] = searchFunc('hell');

The result array should contain

Hello
What is hello
Spacearenotthereinthishello
HELLO
Highway to hell

It is possible that the array could contain atleast 100 strings ( or more). I am looking for a scalable solution.

Initially, i figured i should sort and then do a binary search but its cumbersome to do if you wanna pull of all the suggestions from the master array for a particular string input. I am looking for algorithms that can help me achieve a faster search. I am not that worried about insertion timecomplexity in master array.

I did look up multiple stack overflow posts. They do speak about searching a big book for specific strings. None of them talk about returning suggestions from an array for a substring.

Your help is appreciated.

NishM
  • 1,706
  • 2
  • 15
  • 26
  • 2
    If you're looking on the order of 100 strings, and each string is maybe a couple of dozen characters long, a simple filter using indexOf should scale just fine. If you're going to end up with 100,000 elements, then you might want to start looking into advanced data structures and algorithms for ngram searches. – StriplingWarrior Apr 07 '17 at 22:58
  • 1
    Yes. The purpose of this question was to see if others know of an algorithm i can implement to create a solution that is scalable . ( even open source it if its good enough ) . Do you know any that can help me process huge string arrays ? – NishM Apr 10 '17 at 19:42
  • 1
    You could use [a tool like this](https://www.npmjs.com/package/n-gram) to get character combos found in each of your phrases, and build an object/map from each distinct character combination to the phrases it's found in. Or you could use [something like this](https://lunrjs.com/guides/getting_started.html) which just lets you do full-text searches generally. – StriplingWarrior Apr 11 '17 at 15:21
  • 1
    @StriplingWarrior I will checkout Lunr. I owe you a beer for even suggesting it. Thanks! – NishM Sep 28 '17 at 05:10

3 Answers3

3

YourArray.filter() will do the job. Quick prototype:

var results = arrayName.filter(function(value) {
  return value.toLowerCase().indexOf(searchStr.toLowerCase()) >= 0;
});
Cameron
  • 1,049
  • 12
  • 24
Daniel AG
  • 320
  • 1
  • 9
  • 1
    Might want to evaluate `searchStr.toLowerCase()` outside of the filter to avoid unnecessary extra work. – StriplingWarrior Apr 07 '17 at 23:01
  • there's a 2nd arg you can use with filter to pass the lowerCase: .`filter(fn, searchStr.toLowerCase())`, then inside it's `.indexOf(this)`. add `"use strict" to avoid extra type-casting, but it still works as is... – dandavis Apr 08 '17 at 00:47
  • Thanks. I am gonna test it out a bit. I am all up for using native functions but a little skeptical about scalability. What if i have more than 500 entries in the array? – NishM Apr 10 '17 at 17:27
  • 1
    @NishM With 500 you will be OK, but you can easily test it. Make a loop, push 100s or even 1000s of strings to the array, it can be the same string, it won't make a difference. Truly scalable solutions for this type of query are better implemented at the database server, e.g. using full text indexes. – Daniel AG Apr 11 '17 at 14:02
0

Quick Answer (TL;DR)

  • Use Array.prototype.filter()

Detailed Answer

Context

  • Javascript
  • Array filtering

Problem

  • Scenario: Developer wishes to filter an array for matches on a substring

Solution

// declare array of string fragments
myArrayOfStringFragments = /* TODO: get the content [ ... ] */

// optional step, normalize all strings to lower-case
/* TODO: input normalization with Array.prototype.map() or something similar */

// declare comparison function
function strContainsMatch(slookfor) {
  return slookfor.includes("ello");
}

// iterate using Array.prototype.filter()
var filtered = myArrayOfStringFragments.filter(strContainsMatch);

Pitfalls

  • Data input normalization (are differences based on letter-case significant?)
  • string comparison based on non-ascii characters
  • string.includes() method may not be available, instead use string.indexOf()
  • avoiding false-postive matches based on chosen substring
    • (e.g., if you are looking for hello then hell will match more than you want, and it will miss Hello if your input is not normalized to lowercase)

See also

Community
  • 1
  • 1
dreftymac
  • 31,404
  • 26
  • 119
  • 182
-1

Alternate solution would be to create a regex to do the same.

var condition = new RegExp("hell", 'g');

var results = arrayName.filter(function(value) {
    return condition.test(value.toLowerCase())
});
Umang
  • 212
  • 5
  • 18