-1

GOAL: I want to find programming-related terms (HTML, JavaScript, back-end, ...) in a large text (1000+ words) and put every term I find into a new array.

PROBLEM: So far I can think of only one way to do this:

  1. Write a very large array of programming terms.
  2. Write a loop that compares every word of the big text against every word in that array.

But I think there must be a solution that greatly simplifies the task.

Any ideas on how to make this easier?

I am writing server-side JavaScript.

EDIT: I know about indexOf and similar methods, thank you, but I am looking for:

  1. A very fast algorithm to do this
  2. A way to avoid writing out the programming terms (500+) by hand

SOLVED: I found underscore.js. Example:

var _ = require('underscore');

// Keys are the programming terms to look for; values are their tags.
var tagsObject = {
  "Java": "JAVA",
  "J2EE": "J2EE"
};

var content = "Java is a big language ! ! ";

var words = content.split(/\b/); // split the text into an array at word boundaries
words = _.uniq(words); // keep only the unique words
// _.keys(tagsObject) returns an array of the keys;
// _.intersection returns the values that are present in all of the given arrays
console.log(_.intersection(words, _.keys(tagsObject)));
nl pkr
  • JavaScript has hashtables (maybe they're called "dictionaries"?) Just put all your programming words into a hashtable (as keys; the values you pair them with aren't important) and then loop through each word in the "big text", looking it up in your hashtable. – j_random_hacker Jul 13 '15 at 21:36
  • What do you want to do when you find the words? Count them? Just acknowledge that the text contains them? There's some information missing from your question. – Andy Jul 13 '15 at 21:40
  • You *can* avoid writing the programming words, or at least have the computer help decide what they should be, but that is a big topic, and could take more time to implement than just writing the words by hand. I suggest you restrict this question to just the first part, which still needs some extra details from you to understand exactly what you are trying to do. Then if that goes well, and you understand the search part OK, come back to discovering the "programming words". – Neil Slater Jul 14 '15 at 07:25
  • @NeilSlater I understand the first part, but what about the second? Should I ask a new question? – nl pkr Jul 14 '15 at 09:49
  • @nlpkr: Yes I suggest asking a second question. You will need to make clear what data you have in order to build the word list, and show what you have researched/tried. Any short code or data examples that clarify what it is you are looking for would help make it a better question. Also, if you now understand the first part thanks to the comments and answers given, it is normal practice on Stack Overflow to accept (tick) the answer that helped you. – Neil Slater Jul 14 '15 at 10:22
  • @NeilSlater Done, see this [new question](http://stackoverflow.com/questions/31404445/make-object-with-programming-terms) – nl pkr Jul 14 '15 at 11:47
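
The hashtable lookup suggested in the comments could be sketched roughly as follows. This is only an illustration: the term list, the function name findProgrammingTerms, and the choice of a Set as the hashtable are assumptions, not something given in the question.

```javascript
// Hypothetical term list; a real one would hold the 500+ programming terms.
var programmingTerms = new Set(["Java", "J2EE", "HTML", "JavaScript"]);

// Returns the distinct terms from `terms` that occur as whole words in `text`.
function findProgrammingTerms(text, terms) {
  var found = new Set();
  // Split on runs of non-word characters (assumption: terms contain only
  // letters, digits and underscores, so "Back-end" or "C++" would need a
  // different tokenizer).
  var words = text.split(/\W+/);
  for (var i = 0; i < words.length; i++) {
    // Set lookup is a hash lookup, so there is no inner loop over the term list.
    if (terms.has(words[i])) {
      found.add(words[i]);
    }
  }
  return Array.from(found);
}

var result = findProgrammingTerms(
  "Java is a big language, and JavaScript is not Java",
  programmingTerms
);
console.log(result);
```

Each word of the text is looked up once, so the work grows with the length of the text rather than with text length times list length. The lookup is case-sensitive as written.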

3 Answers

0

Several fast string-searching algorithms come to mind, especially Rabin-Karp.

An implementation can be found in this gist, where you could also compare run times between the different functions:

function simpleSearch(text, str) {
   ...

function searchRabinKarp(text, str) {
   ...

The performance of the standard methods (regex and indexOf) has already been compared in this post.
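
For illustration only, a minimal Rabin-Karp sketch might look like the following. It is not the code from the gist mentioned above; the function name rabinKarpSearch and the constants BASE and MOD are assumptions chosen for this example.

```javascript
// Illustrative Rabin-Karp search: returns the index of the first occurrence
// of `pattern` in `text`, or -1 if there is none.
function rabinKarpSearch(text, pattern) {
  var BASE = 256;      // assumed radix for the rolling hash
  var MOD = 1000003;   // assumed prime modulus, small enough to stay exact in JS numbers
  var n = text.length;
  var m = pattern.length;
  if (m === 0) return 0;
  if (m > n) return -1;

  // high = BASE^(m-1) % MOD, the weight of the leftmost character in a window.
  var high = 1;
  for (var i = 0; i < m - 1; i++) {
    high = (high * BASE) % MOD;
  }

  // Hash the pattern and the first window of the text.
  var patternHash = 0;
  var windowHash = 0;
  for (i = 0; i < m; i++) {
    patternHash = (patternHash * BASE + pattern.charCodeAt(i)) % MOD;
    windowHash = (windowHash * BASE + text.charCodeAt(i)) % MOD;
  }

  for (var start = 0; start <= n - m; start++) {
    // Equal hashes may be a collision, so confirm with a real comparison.
    if (patternHash === windowHash && text.slice(start, start + m) === pattern) {
      return start;
    }
    if (start < n - m) {
      // Roll the window: drop text[start], append text[start + m].
      windowHash = (windowHash - text.charCodeAt(start) * high) % MOD;
      if (windowHash < 0) windowHash += MOD;
      windowHash = (windowHash * BASE + text.charCodeAt(start + m)) % MOD;
    }
  }
  return -1;
}

console.log(rabinKarpSearch("Java is a big language", "big"));
```

This searches for a single pattern at a time. Whether it beats the built-in indexOf in practice would have to be measured.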

adrianus
0

Well, what I would recommend is calling the JavaScript array method indexOf on your array. That way you remove at least one of your explicit loops. If indexOf returns -1, the value was not found in the array.

var sourceArray = ["a", "b", "c", "d", "e"];

var toBeFoundValues = ["a", "x", "z", "d"];

for (var i = 0; i < toBeFoundValues.length; i++) {

  if (sourceArray.indexOf(toBeFoundValues[i]) !== -1) {

    // logic here: toBeFoundValues[i] is present in sourceArray
  }
}

I hope the above code helps. Sorry if the code does not look pretty, as I am answering from my smartphone!

Ali
-3

You could try using a regular expression. This one checks whether HTML, JavaScript, or Back-end appears in the string:

var passingWords = "HTML,blah,otherWordsHere,JavaScript,Back-end";
var failingWords = "blah, otherWordsHere, h.tml, H.TML";
// Matches if any one of the three alternatives occurs anywhere in the string (case-sensitive)
var re = new RegExp('(HTML)|(JavaScript)|(Back-end)');
console.log(re.test(passingWords)); // true
console.log(re.test(failingWords)); // false

It returns true if any of the words listed in the RegExp pattern appears in the given string, and false if none of them do.

In your case you would probably want to test each word individually or write a different regular expression that allows for any combination of the words with any characters between each defined word.

This only tells you whether a word is present, not where it is or any other information about it.
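
If the matched words and their positions are needed as well, one possible sketch is to build a global pattern from a word list and collect the matches with exec. The names terms, escapeRegExp and findTerms are made up for this example, and the term list is only a placeholder.

```javascript
// Hypothetical term list; replace it with the real one.
var terms = ["HTML", "JavaScript", "Back-end"];

// Escape regex metacharacters so that a term such as "C++" is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns an array of { term, index } objects, one per occurrence in `text`.
function findTerms(text, termList) {
  if (termList.length === 0) {
    return []; // an empty alternation would match the empty string everywhere
  }
  var pattern = new RegExp(termList.map(escapeRegExp).join("|"), "g");
  var found = [];
  var match;
  // With the "g" flag, exec resumes after the previous match on each call.
  while ((match = pattern.exec(text)) !== null) {
    found.push({ term: match[0], index: match.index });
  }
  return found;
}

console.log(findTerms("HTML,blah,otherWordsHere,JavaScript,Back-end", terms));
```

As written, the pattern is case-sensitive and has no word boundaries, so "HTML" would also match inside "HTML5". Whether that is desirable depends on the text being scanned.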

JavaScript RegExp reference

camiblanch