
I have 5 HTML files and a search form that I would like to use to search for text in those files.

<form>
   <input type ='text' />
   <input type ='submit' />
</form>

My idea is to use XMLHttpRequest to get the files:

var xhr = new XMLHttpRequest();
xhr.open("GET", "file1.html", false);
xhr.send();
var guid = xhr.responseText;

var xhr = new XMLHttpRequest();
xhr.open("GET", "file2.html", false);
xhr.send();
var guid = xhr.responseText;

...

and then search for text in these files, but I don't know how to search the files using JavaScript.

How can I search the files after getting them with XMLHttpRequest? Or is there another way to do the search using JavaScript?

tommy
  • Regular expressions could work, or indexOf() ... but you need to wait for the request to finish before you try to process the content. Google XMLHttpRequest events for more info on how to do that. – theGleep Oct 18 '17 at 18:06
  • @theGleep, if you have an example or a resource that shows how to do this, please share it. – tommy Oct 18 '17 at 18:30

2 Answers


I'd use DOMParser so that the search is a bit "smarter". Let's say you are looking for text about the word "viewport"; you wouldn't want every HTML file that has a "viewport" <meta> tag to come back as a valid result, would you?

Step one is parsing the string to a Document instance:

const parseHTMLString = (() => {
  const parser = new DOMParser();
  return str => parser.parseFromString(str, "text/html");
})();

Put a valid HTML string in here, and you'll get a document in return that behaves just like window.document! This means we can do all kinds of cool stuff like using querySelector and properties like innerText.

The next step is to define what we want to search. Here's an example that joins a document's title and body text:

const getSearchStringForDoc = doc => {
  return [ doc.title, doc.body.innerText ]
   .map(str => str.toLowerCase().trim())
   .join(" ");
};

Pass your parsed document to this function, and you'll get a plain string in return that contains just the content, without attributes, tag names, or metadata.

Now, it's a matter of defining the right search method. It could be a RegExp-based match, or just a (slower) split & includes:

const stringMatchesQuery = (str, query) => {
  return query
    .toLowerCase()
    .split(/\W+/)
    .filter(Boolean) // drop empty strings, which would match everything
    .some(q => str.includes(q));
};
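
For comparison, here is a rough sketch of the RegExp-based variant mentioned above. The name `stringMatchesQueryRegExp` is made up for illustration, and the whole-word behaviour is an assumption about what a "smarter" match might mean; it is not part of the approach shown so far.

```javascript
// Hypothetical alternative to stringMatchesQuery: returns true if any word
// of the query appears in str as a whole word, ignoring case.
const stringMatchesQueryRegExp = (str, query) => {
  // Splitting on \W+ leaves only letters, digits and underscores,
  // so the words need no further escaping before going into a RegExp.
  const words = query.split(/\W+/).filter(Boolean);

  // An empty query should not match anything.
  if (words.length === 0) return false;

  const pattern = new RegExp(`\\b(?:${words.join("|")})\\b`, "i");
  return pattern.test(str);
};
```

Unlike the `includes` version, this sketch would not count "interest" as a match for "interesting", which may or may not be what you want.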

Chain those functions together and you get the following conversion:

String -> Document -> String -> Boolean

If you ever want to include more information in the search content, you just update the getSearchStringForDoc function using the standard DOM API.

A running example (that's a bit messy and could do with some refactoring, but hopefully gets the point across):

const htmlString =  (
`<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>The title</title>
</head>
<body>
  Some text about an interesting thing.
</body>
</html>`);

const parseHTMLString = (() => {
  const parser = new DOMParser();
  return str => parser.parseFromString(str, "text/html");
})();

const getSearchStringForDoc = doc => {
  return [
    doc.title,
    doc.body.innerText
  ].map(str => str.trim())
   .join(" ");
};

const stringMatchesQuery = (str, query) => {
  str = str.toLowerCase();
  query = query.toLowerCase();
  
  return query
    .split(/\W+/)
    .filter(Boolean) // drop empty strings, which would match everything
    .some(q => str.includes(q));
};

const htmlStringMatchesQuery = (str, query) => {
  const htmlDoc = parseHTMLString(str);
  const htmlSearchString = getSearchStringForDoc(htmlDoc);
  
  return stringMatchesQuery(htmlSearchString, query);
};

console.log("Match 'viewport':", htmlStringMatchesQuery(htmlString, "viewport"));
console.log("Match 'Interesting':", htmlStringMatchesQuery(htmlString, "Interesting"));
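
To run this against several files, one possibility is to load each file and keep the names of those that match. The sketch below is only an illustration: `findMatchingFiles` and its `loadText` parameter are hypothetical names, the default loader assumes the browser's `fetch` is available, and a file that fails to load is simply treated as "no match".

```javascript
// Hypothetical helper, not part of the code above.
// filenames: array of URLs to search
// query:     the text typed into the search box
// matches:   (html, query) => boolean, for example htmlStringMatchesQuery
// loadText:  (filename) => Promise<string>; defaults to a fetch-based loader
const findMatchingFiles = (
  filenames,
  query,
  matches,
  loadText = name =>
    fetch(name).then(res => {
      if (!res.ok) throw new Error(`Could not load ${name}: ${res.status}`);
      return res.text();
    })
) =>
  Promise.all(
    filenames.map(name =>
      loadText(name)
        .then(html => (matches(html, query) ? name : null))
        .catch(() => null) // a file that fails to load counts as "no match"
    )
  ).then(results => results.filter(name => name !== null));
```

In the browser, a call might look like `findMatchingFiles(["file1.html", "file2.html"], "interesting", htmlStringMatchesQuery).then(names => console.log(names))`. Passing the loader in as a parameter keeps the function easy to try out without a server.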
user3297291
  • Thank you, but I don't understand all of the code, so I would like to ask you a few things if you don't mind. – tommy Oct 18 '17 at 19:01
  • Text will be entered into the text input. So, for example, I'd make a function that runs when the submit button is clicked, gets the input's value using JavaScript, searches all 5 HTML files to find whether the entered text exists in any of them, and then loads the file that contains the text. – tommy Oct 18 '17 at 19:05
  • To get the search query from the text input, you attach a listener to the `change` event and use `element.value`. Once you have the matching HTML snippet, you can load it using [this answer](https://stackoverflow.com/a/11984907/3297291) – user3297291 Oct 19 '17 at 08:04

First, change:

<input type ='text' />

To:

<input id='text' type='text' />

Then, the code below creates an array called 'files' made up of objects. Once the requests have finished, the 'position' property of each object contains the position of the search text within that file's contents, -1 if the text is not found, or -2 if the file did not load.

// Read the search text when this code runs, e.g. inside the form's submit handler.
var text = document.getElementById( 'text' ).value;

var loadCount = 0;
var files = [];
files[ 0 ] = {};
files[ 0 ][ 'filename' ] = "file1.html";
files[ 1 ] = {};
files[ 1 ][ 'filename' ] = "file2.html";
files[ 2 ] = {};
files[ 2 ][ 'filename' ] = "file3.html";
files[ 3 ] = {};
files[ 3 ][ 'filename' ] = "file4.html";
files[ 4 ] = {};
files[ 4 ][ 'filename' ] = "file5.html";

function search( item, index ) {

  var xhr = new XMLHttpRequest();

  // Fires once the request has finished, whether it succeeded or failed.
  xhr.onloadend = function () {
    if ( xhr.status === 200 ) {
      files[ index ][ 'contents' ] = xhr.responseText;
      files[ index ][ 'position' ] = xhr.responseText.indexOf( text );
    } else {
      files[ index ][ 'position' ] = -2;
    }
    loadCount = loadCount + 1;
    if ( loadCount == files.length ) {
      // all files have been processed; do whatever you want here
    }
  };

  xhr.open( "GET", item[ 'filename' ], true ); // true = asynchronous
  xhr.send();

}

files.forEach( search );
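
Once every file has been processed, the 'position' values can be used to pick out the files that contain the text. A minimal sketch, assuming the 'files' array has the shape described above (`filesContaining` and `exampleFiles` are hypothetical names used only for this illustration):

```javascript
// Hypothetical helper: return the entries whose contents contain the
// search text, i.e. those with a position of 0 or greater.
function filesContaining(files) {
  return files.filter(function (file) {
    return typeof file.position === "number" && file.position >= 0;
  });
}

// Example data in the shape the code above produces.
var exampleFiles = [
  { filename: "file1.html", position: -1 },  // loaded, text not found
  { filename: "file2.html", position: 120 }, // text found at index 120
  { filename: "file3.html", position: -2 }   // file did not load
];

var hits = filesContaining(exampleFiles);

// Logs an array containing only "file2.html".
console.log(hits.map(function (f) { return f.filename; }));
```

In the "do whatever you want here" branch, one option would be to navigate to the first hit, for example with `window.location.href = hits[ 0 ].filename` when `hits` is not empty.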
Ben Shoval
  • Thanks, but where can I use the variable that contains the text entered into the search input? And what if I want to load the file that contains the text that matches what was entered into the search input? – tommy Oct 18 '17 at 19:26
  • @tommy I’ve updated the answer to get the text from the text input field and put the contents of each file in the ‘contents’ property of each object. You can access it there. – Ben Shoval Oct 18 '17 at 19:41