3

I have an array like this:

['ball', 'football', 'volleyball', 'football player', 'football league', 'tennis']

I want to sort it like the following based on "football" keyword:

['football', 'football player', 'football league', 'ball', 'volleyball', 'tennis']

How can I achieve this?

Don't Panic
  • 41,125
  • 10
  • 61
  • 80
Shoaib
  • 167
  • 2
  • 14
  • 3
    Whats the order logic there? so `football` first, then anything with `ball` after, or just football stuff first, and last doesn't matter? Can you just put your array in code blocks so we have something to work with? not sure if it has keys, or what.. but in your example, keys will be auto starting at 0. http://stackoverflow.com/questions/3701855/how-do-i-sort-a-php-array-by-an-element-nested-inside – blamb Jun 22 '16 at 17:13
  • 1
    Like Brian said, be more specific what the order logic is, because it's totally not clear at all. – ksbg Jun 22 '16 at 17:15
  • 2
    Why is football player before football league? Isn't is supposed to be alphabetical order? – Andreas Jun 22 '16 at 17:16
  • actually i want to implement a search system. the system searches for the string on more than one column. So i would like to put the strings which are the closest to the search term at the beganning of the array – Shoaib Jun 22 '16 at 17:44
  • Regarding your comment rather than the original question: there are many and varied methods of determining "closeness" for this type of thing, and determining the method(s) you want to use can be a complex problem. – Don't Panic Jun 22 '16 at 17:51
  • I given the solution as you wanted. – Niklesh Raut Jun 22 '16 at 18:06

4 Answers4

4

You need to make a custom sort function, and then use it with usort.

$array=["ball","football","volleyball","football player","football league","tennis"];

function footsort($a,$b) {
    $afoot=substr($a,0,8)=="football";
    $bfoot=substr($b,0,8)=="football";
    if ($afoot==$bfoot) return strcmp($a,$b);
    /*else*/ 
    if ($afoot) return -1;
    if ($bfoot) return 1;
}

usort($array,"footsort");

print_r($array);

Response:

Array
(
    [0] => football
    [1] => football league
    [2] => football player
    [3] => ball
    [4] => tennis
    [5] => volleyball
)
Keith Tyler
  • 719
  • 4
  • 18
  • I think I added more sorting than you were looking for. If you don't like the sub-segment sort, just remove the "if ($afoot==$bfoot)" line. Thing is, you won't get the preserved order. The `usort` routine seems to place later items before newer items that are "weighted" the same by the comparison method. A fix that might work would be to change the "if ($afoot==$bfoot)" case to always return 1, meaning the newer one is "lower" than the older one. – Keith Tyler Jun 22 '16 at 17:25
2

If you want to sort based on whether the keyword is found anywhere in the word rather than just at the beginning, you can use strpos in the comparison function.

$keyword = 'football';

usort($things, function($a, $b) use ($keyword) {
    $x = strpos($a, $keyword) === false;
    $y = strpos($b, $keyword) === false;
    if ($x && !$y) return 1;
    if ($y && !$x) return -1;

    // use this if you want to sort alphabetically after the keyword sort:
    return strcmp($a, $b);

    // or if you only want to sort by whether or not the keyword was found:
    return 0;
});

If you have a more general goal of sorting the array based on "closeness" of its terms to the keyword, the comparison must become more complex, and the way it should be done really depends on what aspects of "closeness" are most important to you. Here is an example of a more complex sort, probably not exactly what you want, but just to show what I mean about the possible complexity of determining "closeness":

$keyword = 'football';
usort($things, function($a, $b) use ($keyword) {

    // prioritize exact matches first
    if ($a == $keyword) return -1;
    if ($b == $keyword) return 1;

    // prioritize terms containing the keyword next
    $x = strpos($a, $keyword);
    $y = strpos($b, $keyword);
    if ($x !== false && $y === false) return -1;
    if ($y !== false && $x === false) return 1;
    if ($x !== false && $y !== false) {  // both terms contain the keyword, so...

        if ($x != $y) {  // prioritize matches closer to the beginning of the term
            return $x > $y ? 1 : -1;            
        }

        // both terms contain the keyword at the same position, so...
        $al = strlen($a);
        $bl = strlen($b);
        if ($al != $bl) { // prioritize terms with fewer characters other than the keyword
            return $al > $bl ? 1 : -1;
        }
        // both terms contain the same number of additional characters
        return 0;
        // or sort alphabetically with strcmp($a, $b);
        // or do additional checks...
    }
    // neither terms contain the keyword

    // check the character similarity...
    $ac = levenshtein($keyword, $a);
    $bc = levenshtein($keyword, $b);
    if ($ac != $bc) {
        return $ac > $bc ? 1 : -1;
    }

    return 0;
    // or sort alphabetically with strcmp($a, $b);
    // or do additional checks, similar_text, etc.
});
Don't Panic
  • 41,125
  • 10
  • 61
  • 80
1

I tried to understant your question, and tried to solve like this

 <?php 
 $abc =["ball","football","volleyball","football player", "football league", "tennis"];
 $word ="football";
 $final = array();
 // collect complete match
 foreach($abc as $key=>$value){
  if($value==$word){
  $final[] = $value;
  unset($abc[$key]);
    }
  }
  //collect if word found in another string
  foreach($abc as $key=>$value){
  if(strpos($value,$word)!==false){
   $final[] = $value;
   unset($abc[$key]);
   }
 }
// collect if another string have some part of word
foreach($abc as $key=>$value){
  if(strpos($word,$value)!==false){
   $final[] = $value;
  unset($abc[$key]);
  }
}
// collect rest of the elements
$final = array_merge($final,$abc);
print_r($final);
?>

Ouput is

 Array
(
[0] => football
[1] => football player
[2] => football league
[3] => ball
[4] => volleyball
[5] => tennis
)

Check here: https://eval.in/593747

Niklesh Raut
  • 34,013
  • 16
  • 75
  • 109
0

This is a fun little problem where you need to assign some sort of score to each element of your haystack (the array of words). I think the best way to score each element is based on a classic dynamic programming problem "longest common substring". The longer this substring, the higher the sort score.

//find the longest commson substring between 2 strings, return all substrings of that length
function longestCommonSubstring($string1, $string2) {
    $helper = array();
    //create two dimensional array, to keep track
    for($i =0; $i < strlen($string1); $i++) {
        $helper[$i] = array();
        for($j=0; $j< strlen($string2); $j++) {
            //intialize all values to 0
            $helper[$i][] = 0;
        }
    }

    $max= 0;
    $ans = array();
    for($i =0; $i <strlen($string1); $i++) {
        for($j =0; $j < strlen($string2); $j++) {
            if ($string1[$i] == $string2[$j]) {
                if($i==0 || $j==0) {
                    $helper[$i][$j] = 1;
                } else {
                    $helper[$i][$j] = $helper[$i-1][$j-1] + 1;
                }
                if ($helper[$i][$j] > $max) {
                    $max = $helper[$i][$j];
                    $ans = array(substr($string1, $i-$max+1, $max));
                } elseif($helper[$i][$j] == $max) {
                    $ans[] = substr($string1, $i-$max+1, $max);
                }
            } else {
                $helper[$i][$j] = 0;
            }
        }
    }
    return $ans;
}

Now that the function is written we need to use it.

foreach($words as $word) {
    $lcs = longestCommonSubstring($keyword, $word);
}

Well, that is all and good, but just using the function is only half the battle, now we need to apply some logic to results. Let's save the results in an array and give each word a score. A good score would be the length of longest substring. football would be a better match than ball because it has a longer string in common. But what about football and football player, they both have the same length of longest common substring? To solve this problem, we could use length as a percentage of the total word length. Combining these two idea, longest substring length and percentage, we get the code below.

//an associative array to save the scores
// $wordsMeta[$word] = array(lengthOfCommonSubstring, percentageOfWordMatched)
$wordsMeta = array();
//go through each word and assign a score
foreach($words as $word) {
    $lcs = longestCommonSubstring($keyword, $word);

    if (count($lcs) ==0 ) {
        $wordPercentage = 0;
        $wordLength = 0;
    } else {
        $wordLength = strlen($lcs[0]);
        $wordPercentage = $wordLength/strlen($word);

    }
    $wordsMeta[$word] = array(
            "percentageOfWordMatched" => $wordPercentage,
            "lengthOfCommonSubstring" => $wordLength
    );
}

Now we just need a sorting function that looks at length first, if they are equal, it will look at percentage and return an appropriate integer.

//our special sorting function
//checks length, if that is equal, then it checks percentage of word matched
//if both are eqaul, then those two elements are considered equal
$sort = function($a, $b) {
    $ans = $a["lengthOfCommonSubstring"] - $b["lengthOfCommonSubstring"];
    if ($ans == 0) {
        $ans = $a["percentageOfWordMatched"] - $b["percentageOfWordMatched"];
    }

    if ($ans < 0) {
        $ans = -1;
    } elseif ($ans > 0){
        $ans = 1;
    } else {
        $ans = 0;
    }
    //higher number = lower sort order
    $ans *= -1;
    return $ans;
};

Now the easy part: uasort($wordsMeta) and $answer= array_keys($wordsMeta)

Beware of Demons - This algorithm is slow. Very slow. lcs is O(n*m), and we are calling it count($words) times. Making the scoring process O(n*m*x) where:

  • n is strlen($keyword)
  • m is strlen($word)
  • x is count($words)

Additionally we are sorting, which is O(n * log(n)). So in total this algorithm is O(n*m*x + n*log(n)) which is not good. Keeping the wordlist short, the words in the word list short, and the keyword short will keep speed down.

Scott
  • 12,077
  • 4
  • 27
  • 48