
I am working on a search problem using regex pattern matching and other techniques, so I thought I would look at how Google, Yahoo, Bing, Ask, etc. behave.

Considering that Firefox, Chrome, Opera and other browsers also have a URL bar or a search bar, I started trying out different words, then symbols.

In Firefox I see a lot of different results. Here are some screenshots:

^ Symbol - Gives some random results.

^ symbol

$ Symbol - This too gives random results. I also tried adding a string along with it, and that gave no results.

Dollar Symbol

() parentheses - When used, these give proper results. They seem to be treated as ordinary symbols and compared like any other string.

parenthesis symbols

* Symbol - This also gave a set of results, none of which contain the symbol itself. I am not sure why the results are so different.

Star symbol

~ Symbol - This also gave a set of results, none of which contain the symbol itself. I am not sure why the results are so different.

Tilde symbol

I would like to know why the behaviour differs so much for many of these symbols, whereas other strings and symbols work as expected.

-

@thanksd Not really a duplicate. That question is about how string matching works, and I know that part already. I am specifically asking about symbols: how are symbols treated during matching?

bozzmob
  • 2
    I'm pretty sure that it's not regex. – Bergi Dec 19 '15 at 08:47
  • @Bergi Oh ok. But, if I start typing(string I mean), the pattern recognition(filtering and bold fonts for the match) looks like they are using regex to match and filter results. I may be wrong, but, I thought so. Any idea what they are using? – bozzmob Dec 19 '15 at 08:49
  • And symbols should match right? As per my observation, I see that, lot of my history dump of URLs have the above symbols posted in the question. – bozzmob Dec 19 '15 at 08:51
  • 1
    Don't they use a [`trie`](https://en.wikipedia.org/wiki/Trie) that's been loaded with your history and favorites? – Jonny Henly Dec 19 '15 at 08:51
  • 2
    Firefox uses `textFromtheWholeOfTheInternet.match(new RegExp('.*' + urlbar.value + '.*'));` - pretty fast considering it's matching the complete contents of the internet with every keystroke – Jaromanda X Dec 19 '15 at 09:11
  • That's good to know Jonny. Will read more on Trie.. looks interesting... – bozzmob Dec 19 '15 at 09:40
  • Jaromanda, Wow! That's one hell of an operation! Good to know the operation being used. Will read more upon it. Thanks. – bozzmob Dec 19 '15 at 09:41
  • 1
    Possible duplicate of [How does Firefox's 'awesome' bar match strings?](http://stackoverflow.com/questions/540725/how-does-firefoxs-awesome-bar-match-strings) – thanksd Jan 20 '16 at 14:25
  • @thanksd Not really. That is about how string matching works. I know that part already. I am clearly asking about symbols. How are symbols considered for matching. – bozzmob Jan 20 '16 at 15:49
  • 1
    @JaromandaX Not true. That would be vulnerable to regex injection, so `urlbar.value` is first sanitized with some kind of [`RegExp.escape`](https://github.com/benjamingr/RegExp.escape). Then the complete contents of the internet can be matched safely :P – Oriol Jan 20 '16 at 19:15

1 Answer


No mainstream browser interprets what you type in its location bar as a regex, because the average user does not know regex.

This is how Firefox works (basically):

  1. Choose what to search in. This is done by checking the `browser.urlbar.default.behavior` preference as well as looking for special characters in the query:

    You can restrict what kind of results are shown in the drop down list by using customizable characters. Include the character anywhere in the address bar separated by spaces to have it restrict what results are displayed.

    The characters are as follows:

    • #: Returns results that match the text in the title.
    • @: Returns results that match the text in the URL.
    • *: Returns only results that are from the bookmarks.
    • ^: Returns only results that are from the browser’s history.
    • +: Returns only results that have been tagged.
    • ~: Returns only results that have been typed.
    • %: Returns only open tabs (visible tabs, not active tab), available in Firefox 4 (SeaMonkey 2.1) and later
  2. When searching in a given source, each whitespace-separated sequence of characters (except the special characters above) must be present in its text (website title, URL, etc.), case-insensitively. (Sequences may overlap.)
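
The two steps above can be sketched roughly as follows. This is only an illustration of the behaviour described here, not Firefox's actual code: the `RESTRICTIONS` table, the `parseQuery` and `matches` functions, and the `{ title, url, sources }` entry shape are all hypothetical names made up for the example, and the `browser.urlbar.default.behavior` preference is ignored.

```javascript
// Restriction characters from the list above, mapped to made-up filter names.
const RESTRICTIONS = new Map([
  ["#", "title"],
  ["@", "url"],
  ["*", "bookmark"],
  ["^", "history"],
  ["+", "tag"],
  ["~", "typed"],
  ["%", "openTab"],
]);

// Step 1: split the query on whitespace. A token that is exactly one of the
// restriction characters becomes a filter; anything else is a literal term.
function parseQuery(query) {
  const filters = new Set();
  const terms = [];
  for (const token of query.split(/\s+/)) {
    if (token === "") continue;
    if (RESTRICTIONS.has(token)) {
      filters.add(RESTRICTIONS.get(token));
    } else {
      terms.push(token.toLowerCase());
    }
  }
  return { filters, terms };
}

// Step 2: every term must occur somewhere in the searched text,
// compared case-insensitively as a plain substring (no regex involved).
// `entry` is a hypothetical record: { title, url, sources: ["history", ...] }.
function matches(entry, query) {
  const { filters, terms } = parseQuery(query);

  // Source filters (*, ^, +, ~, %) restrict where the entry may come from.
  for (const filter of filters) {
    if (filter !== "title" && filter !== "url" && !entry.sources.includes(filter)) {
      return false;
    }
  }

  // "#" limits matching to the title, "@" to the URL; otherwise search both.
  // (Treating "#" and "@" together as "search both" is a simplification.)
  const haystacks = [];
  if (filters.has("title") || !filters.has("url")) haystacks.push(entry.title.toLowerCase());
  if (filters.has("url") || !filters.has("title")) haystacks.push(entry.url.toLowerCase());

  return terms.every((term) => haystacks.some((text) => text.includes(term)));
}
```

Under this sketch, a query consisting only of `^` has no search terms left, so every history entry matches, which would look like "random" results. A query such as `(tutorial)` or `a$b` is compared literally, because regex metacharacters have no special meaning in a substring test.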

lydell