
I want to search a text file that I prepared. Its content looks like this:

a.txt

Amy Jefferson
Nathalie Johnson
Emma West
Donna Jefferson
Tanya Nathalie
George West
Emma Watson
Emma Jefferson

If I use this code:

a.php

$filename = "a.txt";
$example = file($filename, FILE_IGNORE_NEW_LINES);
$searchword = 'Emma Jefferson';
$matches = array();
foreach($example as $k=>$v) {
    if(preg_match("/\b$searchword\b/i", $v)) {
        $matches[$k] = $v;
        echo $matches[$k]."<br>";
    }
}

The only result is "Emma Jefferson".

If I use this code instead:

b.php

$filename = "a.txt";
$example = file($filename, FILE_IGNORE_NEW_LINES);
$searchword = 'Emma Jefferson';
$matches = array();
foreach($example as $k=>$v) {
    $searchword2 = str_ireplace(" ", "|", $searchword);
    if(preg_match("/\b$searchword2\b/i", $v)) {
        $matches[$k] = $v;
        echo $matches[$k]."<br>";
    }
}

The result looks like this:

Amy Jefferson
Emma West
Donna Jefferson
Emma Watson
Emma Jefferson

The results are unique, but "Emma Jefferson" comes last.

So my question is: how can I search for "Emma Jefferson" and get the results sorted like this?

Emma Jefferson
Emma Watson
Emma West
Amy Jefferson
Donna Jefferson

So basically it should match the entire phrase "Emma Jefferson" first, then "Emma", and finally "Jefferson".

UPDATE: I voted for Don't Panic's code for this problem, but I want to say thank you to all the contributors here: Don't Panic, RomanPerekhrest, Sui Dream, Jere, i-man. All of you are the best!!

pattygeek

5 Answers


I don't know of a way to take the position of the matches into account with a regex solution, but if you convert the search string and the terms to arrays of words, it can be done.

With this approach, we iterate the text items and build an array of position matches for each word in the search term, then sort the result by number of matches, then position of matches.

$search_words = explode(' ', strtolower($searchword));

// initialize the result so the sorts below never receive an undefined variable
$result = array();

foreach ($example as $item) {
    $item_words = explode(' ', strtolower($item));

    // look for each word in the search term
    foreach ($search_words as $i => $word) {
        if (in_array($word, $item_words)) {

            // add the index of the word in the search term to the result
            // this way, words appearing earlier in the search term get higher priority
            $result[$item][] = $i;
        }
    }
}

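// pre-sort alphabetically so that ties (uasort callback returns 0) stay in alphabetical order;
// this relies on a stable sort, which PHP guarantees only since 8.0
ksort($result);

// sort by number of matches, then position of matches
// (the <=> spaceship operator requires PHP 7 or later)
uasort($result, function($a, $b) {
    return count($b) - count($a) ?: $a <=> $b;
});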

// convert keys to values    
$result = array_keys($result);
Don't Panic
  • Hi Don't Panic, thanks for the response :) but the order is still not what I want. This is what I get: Emma Jefferson, Emma Watson, Donna Jefferson, Emma West, Amy Jefferson. The result I want is the list I wrote above – pattygeek Sep 22 '17 at 21:33
  • Ah, I see, so it's not just number of matches, it's order of matches as well? – Don't Panic Sep 22 '17 at 21:34
  • Yes :D that part is a bit confusing to me :D – pattygeek Sep 22 '17 at 21:40
  • @pattygeek I rewrote the answer after (hopefully) correctly understanding the problem. – Don't Panic Sep 22 '17 at 22:11
  • Hi Don't Panic, sorry for the late reply. I'm almost there with your code, but I get an error on this part: `return count($b) - count($a) ?: $a <=> $b;`. It says: PHP Parse error: syntax error, unexpected '>' in /Applications/MAMP/htdocs/a.php on line 26 – pattygeek Sep 23 '17 at 07:18
  • @Don'tPanic, you should have mentioned that `<=>` is only available since PHP 7 – RomanPerekhrest Sep 23 '17 at 07:30
  • Thanks Roman for letting me know that the spaceship operator is a PHP 7 feature. I just checked and it works the way I want, so I'll vote for this code. Thanks to Don't Panic for this beautiful code. If Don't Panic has a solution for PHP 5.6, I would be very thankful :) – pattygeek Sep 23 '17 at 16:07
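
For PHP 5.6, where `<=>` is not available, the same ordering can be sketched with an explicit comparison function. This is only a minimal sketch and not part of the original answer: `rank_names()` is a hypothetical helper name, and it breaks ties alphabetically inside the comparator instead of relying on `ksort()` followed by a stable sort.

```php
<?php
// rank_names() is a hypothetical helper, not code from the answer above.
// It returns the lines that contain at least one search word, ordered by:
//   1. number of matched search words (more first),
//   2. position of the matched words in the search term (earlier first),
//   3. alphabetical order of the line.
function rank_names(array $lines, $searchword)
{
    $search_words = preg_split('/\s+/', strtolower(trim($searchword)), -1, PREG_SPLIT_NO_EMPTY);

    $rows = array();
    // array_unique() collapses duplicate lines, like keying the result by item does
    foreach (array_unique($lines) as $line) {
        $line_words = preg_split('/\s+/', strtolower(trim($line)), -1, PREG_SPLIT_NO_EMPTY);

        // collect the indexes of the search words found in this line
        $positions = array();
        foreach ($search_words as $i => $word) {
            if (in_array($word, $line_words, true)) {
                $positions[] = $i;
            }
        }
        if ($positions) {
            $rows[] = array('name' => $line, 'positions' => $positions);
        }
    }

    usort($rows, function ($a, $b) {
        $count_a = count($a['positions']);
        $count_b = count($b['positions']);
        if ($count_a !== $count_b) {
            return $count_b - $count_a;          // more matches first
        }
        if ($a['positions'] != $b['positions']) {
            // same length, so PHP compares the arrays value by value
            return ($a['positions'] < $b['positions']) ? -1 : 1;
        }
        return strcmp($a['name'], $b['name']);   // alphabetical tie-break
    });

    $names = array();
    foreach ($rows as $row) {
        $names[] = $row['name'];
    }
    return $names;
}
```

Calling `rank_names(file('a.txt', FILE_IGNORE_NEW_LINES), 'Emma Jefferson')` on the sample file from the question should give Emma Jefferson, Emma Watson, Emma West, Amy Jefferson, Donna Jefferson.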

You currently echo results immediately, so they are printed in the order they appear in the file.

You can search for full-string and partial matches separately, and then concatenate the results.

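$searchword2 = str_replace(' ', '|', $searchword);
$fullMatches = array();
$matches = array();
foreach($example as $k=>$v) {
    if(preg_match("/\b$searchword\b/i", $v)) {
        $fullMatches[] = $v;
    }
    // group the alternation so that \b applies to every word
    if(preg_match("/\b(?:$searchword2)\b/i", $v)) {
        $matches[] = $v;
    }
}
// full matches first, then partial matches, without duplicates
$matches = array_unique(array_merge($fullMatches, $matches));
foreach($matches as $k => $v)
    echo $v . "<br>";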

Update:

Multiple words variant:

$words = ['Emma', 'Jefferson'];
$matches = array();
foreach($example as $k => $v) {
    $fullStr = implode(' ', $words);
    if(preg_match("/\b$fullStr\b/i", $v))
        $matches[0][] = $v;
    $str = "";
    $i = 1;
    foreach($words as $word) {
        if ($str === "")
            $str = $word;
        else
            $str .= '|' . $word;
        if(preg_match("/\b(?:$str)\b/i", $v))
            $matches[$i][] = $v;
        $i++;
    }
}
$result = array();
foreach($matches as $firstKey => $arr) {
    foreach($arr as $secondKey => $v) {
        $result[] = $v;
    }
}
$result = array_unique($result);
foreach($result as $k => $v)
    echo $v . "<br>";
Sui Dream
  • Thanks Sui for the response :) It works for putting Emma Jefferson on top, but the rest is still not what I want. The result of your code was: Emma Jefferson, Amy Jefferson, Emma West, Donna Jefferson, Emma Watson. The result I want is: Emma Jefferson, Emma Watson, Emma West, Amy Jefferson, Donna Jefferson – pattygeek Sep 22 '17 at 21:12
  • Is it possible to make it dynamic? I mean the search can be any number of words, not only 2 – pattygeek Sep 22 '17 at 21:25
  • I am sorry Sui, but why does all the data in a.txt show up when I try your code? – pattygeek Sep 22 '17 at 22:10
  • @pattygeek , you put your search parts in $words, right? This variant should sort answers in the order the words appear in the search array. If you want the sorting to consider word count, you could combine that with a preg_match_all() variant. – Sui Dream Sep 22 '17 at 22:10
  • Yes, this is the code I use: https://pastebin.com/aVVq8a7V It still shows all the data, Sui. – pattygeek Sep 23 '17 at 07:16
  • @pattygeek , I was implying that `$words` is an array: `$words = ['Emma', 'Jefferson'];` – Sui Dream Sep 23 '17 at 08:16

Complex solution:

$lines = file('a.txt', FILE_IGNORE_NEW_LINES);
$name = 'Emma';
$surname = 'Jefferson';
$emmas = $jeffersons = [];

foreach ($lines as $l) {
    if (strpos($l, $name) === 0) {
        $emmas[] = $l;
    } elseif ( strrpos($l, $surname) === (strlen($l) - strlen($surname)) ) {
        $jeffersons[] = $l;
    }
}

usort($emmas, function($a,$b){
    return strcmp(explode(' ', $a)[1], explode(' ', $b)[1]);
});
usort($jeffersons, function($a,$b){
    return strcmp($a, $b);
});

$result = array_merge($emmas, $jeffersons);
print_r($result);

The output:

Array
(
    [0] => Emma Jefferson
    [1] => Emma Watson
    [2] => Emma West
    [3] => Amy Jefferson
    [4] => Donna Jefferson
)
RomanPerekhrest
  • Hi Roman, thanks for the response :) The result is just what I want. One question: is it possible to make $name and $surname dynamic? The search can be more than 2 words here – pattygeek Sep 22 '17 at 21:38
  • @pattygeek, if *the search can be more than 2 words*, how will you sort by 4 or more words separately? That will be a pretty difficult situation for you. My solution solves the current issue – RomanPerekhrest Sep 23 '17 at 07:25

You would have to write a new loop or sort your array afterwards, because the foreach loop takes one name at a time, tests whether it matches your search word and, if it does, appends the name to the end of your new array $matches[]. So the

if(preg_match("/\b$searchword2\b/i", $v)) {
    $matches[$k] = $v;
    echo $matches[$k]."<br>";
}

part does not know anything about the names that are or aren't already inside of $matches[].

So my suggestion would be:

$filename = "a.txt";
$example = file($filename, FILE_IGNORE_NEW_LINES);
$searchword = 'Emma Jefferson';
$matches = array();

// search for the full phrase first, then for each word on its own
$searchword2 = array($searchword, explode(" ", $searchword)[0], explode(" ", $searchword)[1]);
$isThisNameAlreadyInTheList = false;

foreach($searchword2 as $actualSearchword) {

    foreach($example as $k=>$v) {

        $isThisNameAlreadyInTheList = false;
        foreach($matches as $match) {   
            if(preg_match("/\b$match\b/i", $v)) {
                $isThisNameAlreadyInTheList = true;
            }
        }

        if (!$isThisNameAlreadyInTheList) {
            if(preg_match("/\b$actualSearchword\b/i", $v)) {
                $matches[$k] = $v;
                echo $matches[$k]."<br>";
            }
        }
    }

}
Jere
  • Thanks Jere for the response :) For this part, $searchword2 = array("Emma Jefferson", "Emma", "Jefferson");, can it be made dynamic? I tried $searchword2 = explode("|", $searchword); but I'm still confused about where to put "Emma Jefferson" – pattygeek Sep 22 '17 at 21:26
  • Yep, that was a mistake I made when I uploaded the code. It should be the right one now. – Jere Sep 22 '17 at 22:23
  • Sorry for the late reply. Jere, it works! But what if there are more than 2 words? Can it search without editing $searchword2? – pattygeek Sep 23 '17 at 07:11
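
One way to avoid editing `$searchword2` by hand is to build the list of search terms from the search string itself: the full phrase first, then each word in order. The following is only a sketch under that assumption and not the answer's own code; `search_in_passes()` is a hypothetical helper name, and `preg_quote()` is added so that special characters in the search string are not treated as regex syntax.

```php
<?php
// search_in_passes() is a hypothetical helper, not code from the answer above.
// It runs one pass over the lines per search term (full phrase first, then
// each word) and keeps every line only the first time it matches.
function search_in_passes(array $lines, $searchword)
{
    $words = preg_split('/\s+/', trim($searchword), -1, PREG_SPLIT_NO_EMPTY);

    // full phrase first, then the single words; array_unique() avoids a
    // repeated pass when the search string is just one word
    $terms = array_unique(array_merge(array(implode(' ', $words)), $words));

    $matches = array();
    foreach ($terms as $term) {
        $pattern = '/\b' . preg_quote($term, '/') . '\b/i';
        foreach ($lines as $k => $line) {
            // skip lines that an earlier (higher-priority) pass already matched
            if (!isset($matches[$k]) && preg_match($pattern, $line)) {
                $matches[$k] = $line;
            }
        }
    }
    return array_values($matches);
}
```

Within each pass the lines keep their order from the file, so for the sample data and 'Emma Jefferson' this should return Emma Jefferson, Emma West, Emma Watson, Amy Jefferson, Donna Jefferson. Sorting alphabetically inside each group would need an extra sort step.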

I would use a preg_match_all solution like so:

$searchName = "Emma Jefferson";
$searchTerms = explode(' ', $searchName);

$pattern = "/(\b$searchTerms[0]\b \b$searchTerms[1]\b)|(\b$searchTerms[0]\b \w+)|(\w* \b$searchTerms[1]\b)/i";

$output = [];
preg_match_all($pattern, implode(' | ', $example), $out);

foreach($out as $k => $o){
    if($k == 0){
        continue;
    }

    foreach($o as $item){
        if(!empty($item)){
            $output[] = $item;
        }
    }
}

print_r($output);

You could also bring the file in as a string and avoid the implode portion.

i-man