
I would like the song titles in $Array2 to be sorted in the same order as $Array1 without losing the values from $Array2. Each value in $Array2 should stay attached to its key as the keys are reordered. I believe the function I have provided below is a solid start...

I have two arrays (please notice the differences in each):

  1. $Array1 is the user-entered data.
  2. $Array2 is the data looked up from an external source; it is SIMILAR to $Array1 but not EXACT.

For example...

$Array1 contains:

Array
(
    [0] => 3oh!3 - Don't Trust me
    [1] => Taylor Swift - You Belong with me
    [2] => Sean Kingston - Fire Burning
    [3] => Green Day - Know Your Enemy
    [4] => Kelly Clarkson - Gone
)

$Array2 contains:

Array
(
    [Taylor Swift - You Belong With Me] => bbbbbb
    [Sean Kingston - Fire Burning] => cccccc
    [Kelly Clarkson - Gone] => eeeeee
    [3OH!3- Don't Trust Me lyrics] => aaaaaa
    [Green Day Know Your Enemy Official] => dddddd
)

I already have a function started that I found on this website:

function sortArrayByArray(array $toSort, array $sortByValuesAsKeys)
{
    $commonKeysInOrder = array_intersect_key(array_flip($sortByValuesAsKeys), $toSort);
    $commonKeysWithValue = array_intersect_key($toSort, $commonKeysInOrder);
    $sorted = array_merge($commonKeysInOrder, $commonKeysWithValue);
    return $sorted;
}

However...

$sortArray = sortArrayByArray($Array2, $Array1);
print_r($sortArray);

$sortArray is only returning two results:

Array
(
    [Sean Kingston - Fire Burning] => cccccc
    [Kelly Clarkson - Gone] => eeeeee
)
Michael Ecklund
  • It's because that function matches the values exactly and does not take into account differences in case, like "3OH!3" vs. "3oh!3". This is really a regex issue: you also have one Green Day song with no dash and an extra word... there is no 4-line function for this in PHP – Trey Jun 18 '11 at 00:17
  • A potential approach: (1) strip all non-alphanumeric characters from all strings (2) convert to lower case (3) find longest common substrings, and reject anything where the LCS is shorter than 50% of the shortest string being considered. This will probably work for your example, but will not work as well on messier data. – Frank Farmer Jun 18 '11 at 00:22
  • @Keoki Zee Here is the link: http://stackoverflow.com/questions/348410/sort-an-array-based-on-another-array @Trey I cannot control the song titles being looked up from the external source in $Array2; they are formatted however the external source formats them. – Michael Ecklund Jun 18 '11 at 00:44
  • @Frank Farmer What do you think about something like this? Follow the link, http://www.php.net/manual/en/function.similar-text.php#62715 – Michael Ecklund Jun 18 '11 at 00:47

2 Answers


Here's a solution:

<?php

$array1 = array(
  0 => '3oh!3 - Don\'t Trust me',
  1 => 'Taylor Swift - You Belong with me',
  2 => 'Sean Kingston - Fire Burning',
  3 => 'Green Day - Know Your Enemy',
  4 => 'Kelly Clarkson - Gone',
);

$array2 = array(
  'Taylor Swift - You Belong With Me' => 'bbbbbb',
  'Sean Kingston - Fire Burning' => 'cccccc',
  'Kelly Clarkson - Gone' => 'eeeeee',
  '3OH!3- Don\'t Trust Me lyrics' => 'aaaaaa',
  'Green Day Know Your Enemy Official' => 'dddddd'
);


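// Find matching song titles (case-insensitive). array_flip() turns the titles
// (the keys of $array2) into values so they can be compared; this assumes the
// values of $array2 are unique.
$tmp = array_values(array_uintersect($array1, array_flip($array2), 'strcasecmp'));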

if ( ! empty($tmp) )
{
  // Generate the array.
  $matches = array_flip(array_uintersect(array_flip($array2), $tmp, 'strcasecmp'));

  print_r($matches);
}
else
  echo 'No matches found.';

?>

This will output:

Array
(
    [Taylor Swift - You Belong With Me] => bbbbbb
    [Sean Kingston - Fire Burning] => cccccc
    [Kelly Clarkson - Gone] => eeeeee
)

The other 2 titles are not 100% identical, so they are not matched. As others have suggested, you could use similar_text() or other functions to determine how similar two strings are. If you'd like to do this, you can replace the 'strcasecmp' in the array_uintersect call with your own function that uses similar_text() (or another function) to decide whether or not the values do in fact intersect.
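A minimal sketch of that idea, under a few assumptions: normalizeTitle() and sortBySimilarTitle() are hypothetical helper names, and the 70% cutoff is a guess that would need tuning on real data. As far as I know, array_uintersect sorts its inputs internally and expects a consistent ordering callback, so this sketch uses plain loops instead of a fuzzy callback. It also returns the entries in the order of $array1, which the intersect-based code above does not.

```php
<?php
// Hypothetical helper: lower-case a title and keep only letters and digits.
function normalizeTitle(string $title): string
{
    return (string) preg_replace('/[^a-z0-9]/', '', strtolower($title));
}

// Hypothetical helper: reorder $lookup (title => value) to follow $userTitles,
// pairing each user title with its most similar key that has not been used yet.
// $minPercent is an assumed cutoff; matches scoring below it are dropped.
function sortBySimilarTitle(array $userTitles, array $lookup, float $minPercent = 70.0): array
{
    $sorted = [];
    foreach ($userTitles as $userTitle) {
        $wanted = normalizeTitle($userTitle);
        $bestKey = null;
        $bestPercent = 0.0;
        foreach ($lookup as $key => $value) {
            if (array_key_exists($key, $sorted)) {
                continue; // this key is already paired with an earlier title
            }
            // similar_text() writes the similarity percentage into $percent.
            similar_text($wanted, normalizeTitle((string) $key), $percent);
            if ($percent > $bestPercent) {
                $bestPercent = $percent;
                $bestKey = $key;
            }
        }
        if ($bestKey !== null && $bestPercent >= $minPercent) {
            $sorted[$bestKey] = $lookup[$bestKey];
        }
    }
    return $sorted;
}

$userTitles = [
    '3oh!3 - Don\'t Trust me',
    'Taylor Swift - You Belong with me',
    'Sean Kingston - Fire Burning',
    'Green Day - Know Your Enemy',
    'Kelly Clarkson - Gone',
];

$lookup = [
    'Taylor Swift - You Belong With Me' => 'bbbbbb',
    'Sean Kingston - Fire Burning' => 'cccccc',
    'Kelly Clarkson - Gone' => 'eeeeee',
    '3OH!3- Don\'t Trust Me lyrics' => 'aaaaaa',
    'Green Day Know Your Enemy Official' => 'dddddd',
];

$sorted = sortBySimilarTitle($userTitles, $lookup);
print_r($sorted);
```

With the sample data from the question, all five entries are kept and their values come out in the order aaaaaa, bbbbbb, cccccc, dddddd, eeeeee. On messier data a greedy best match like this can pair the wrong titles, so the cutoff should be checked against real input.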

Francois Deschenes

Try using something like the levenshtein() or similar_text() functions to compare the strings in the arrays. You would just need to determine a threshold that matches as accurately as possible with the fewest false positives.
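As a rough sketch of the Levenshtein variant: canonicalTitle() and findClosestKey() are hypothetical names, and the 0.35 threshold (edit distance divided by the length of the longer string) is only an assumption that happens to work for the sample titles in the question.

```php
<?php
// Hypothetical helper: lower-case a title and keep only letters and digits.
function canonicalTitle(string $title): string
{
    return (string) preg_replace('/[^a-z0-9]/', '', strtolower($title));
}

// Hypothetical helper: return the key closest to $needle, or null when even the
// best candidate differs by more than $maxRatio of the longer string's length.
function findClosestKey(string $needle, array $keys, float $maxRatio = 0.35): ?string
{
    $needle = canonicalTitle($needle);
    $bestKey = null;
    $bestRatio = INF;
    foreach ($keys as $key) {
        $candidate = canonicalTitle((string) $key);
        // Guard against division by zero when both strings are empty.
        $longest = max(strlen($needle), strlen($candidate), 1);
        $ratio = levenshtein($needle, $candidate) / $longest;
        if ($ratio < $bestRatio) {
            $bestRatio = $ratio;
            $bestKey = (string) $key;
        }
    }
    return $bestRatio <= $maxRatio ? $bestKey : null;
}

$keys = [
    'Taylor Swift - You Belong With Me',
    'Sean Kingston - Fire Burning',
    'Kelly Clarkson - Gone',
    '3OH!3- Don\'t Trust Me lyrics',
    'Green Day Know Your Enemy Official',
];

$match = findClosestKey('Green Day - Know Your Enemy', $keys);
var_dump($match);
```

Dividing by the longer length keeps the threshold comparable between short and long titles. A lower threshold gives fewer false positives but misses titles with longer suffixes such as "Official" or "lyrics".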

Justin ᚅᚔᚈᚄᚒᚔ