Suppose I have a one-dimensional, ordered array of strings:

$arr = [
    'Something else',
    'This is option: one',
    'This is option: two',
    'This is option: ',
    'This is second option: 2',
    'This is second option: 3'
];

I would like to turn it into a two-dimensional array, using the common beginning as the key. For example:

$target = [
    'Something else',
    'This is option:' => [
        'one',
        'two',
        ''
    ],
    'This is second option:' => [
        '2',
        '3'
    ]
];

It sounds simple, but I have gone completely blank.

function convertArr(array $src): array {
    $prevString = null;
    $newArray = [];
    foreach ($src as $string) {
        if ($prevString) {
            // stuck here
        }
        $prevString = $string;
    }
    return $newArray;
}

Pre-made fiddle: https://3v4l.org/eqGDc

How can I check if two strings start with the same words, without having to loop on each letter?

As of now I have written this overly-complicated function, but I wonder if there is a simpler way:

function convertArr(array $src): array {
    $prevString = null;
    $newArray = [];

    $size = count($src);
    for ($i = 0; $i < $size; $i++) {
        if (!$prevString || strpos($src[$i], $prevString) !== 0) {
            if ($i == $size - 1) {
                $newArray[] = $src[$i];
                break;
            }
            $nowWords = explode(' ', $src[$i]);
            $nextWords = explode(' ', $src[$i + 1]);
            foreach ($nowWords as $k => $v) {
                if ($v != $nextWords[$k]) {
                    break;
                }
            }
            if ($k) {
                $prevString = implode(' ', array_splice($nowWords, 0, $k));
                $newArray[$prevString][] = implode(' ', $nowWords);
            }
        } else {
            $newArray[$prevString][] = trim(substr($src[$i], strlen($prevString)));
        }
    }
    return $newArray;
}
Dharman
  • Do all the strings in `$arr` have a common beginning? What do you want as output if `$arr[2] = 'This is not an option: three'`? – Nick Sep 30 '19 at 21:24
  • What if the elements in the original array were 'This is option: two' and 'This is option: three'? Would the key of the new array then be 'This is option: t'? It isn't clear whether you have a defined split point, or if it varies depending on the input (match until the strings differ) – Patrick Q Sep 30 '19 at 21:24
  • @Nick No, only some of them might start the same. – Dharman Sep 30 '19 at 21:24
  • @Dharman Is there always a common delimiter like `:`? – Barmar Sep 30 '19 at 21:25
  • @PatrickQ Whitespace is the delimiter. – Dharman Sep 30 '19 at 21:26
  • @PatrickQ I meant in a greedy way. I do not want to split inside of a word. The splitting is word-based – Dharman Sep 30 '19 at 21:28
  • Here's a messy solution and only considers 2 values, https://3v4l.org/ignqt – user3783243 Sep 30 '19 at 21:41
  • @Barmar I don't think it's a dupe. All those strings had the same prefix, but OP's don't – Nick Sep 30 '19 at 21:56
  • If there's no common prefix, I think the answers there will simply return an empty string as the common prefix – Barmar Sep 30 '19 at 22:06
  • @Dharman Post your code in the question, not at a remote site. – Barmar Sep 30 '19 at 22:07
  • What about something like `["Prefix: one", "Other Prefix: two", "Prefix: three", "Other Prefix: four", "No prefix"]`? – Barmar Sep 30 '19 at 22:08
  • @Dharman for your example, you have `Something else` as a value in the result, but shouldn't it be a key pointing to an array with an empty string to be consistent? – Nick Sep 30 '19 at 22:09
  • @Barmar the array is sorted. – Nick Sep 30 '19 at 22:09
  • @Nick It doesn't matter, I made it so because it was easier for me. – Dharman Sep 30 '19 at 22:10
  • You can translate a sequence of words into a regexp like `/^((((This\s+)is\s+)option:\s+)one)/`. You can then match that against the next element, and find the first non-empty capture group to get the longest common prefix. – Barmar Sep 30 '19 at 22:14
  • What should the result be for `["A B w", "A B x", "A y", "A z"]`? Is it `["A B" => ["w", "x"], "A" => ["y", "z"]]` or `["A" => ["B w", "B x", "y", "z"]]`? – Barmar Sep 30 '19 at 22:20
  • @Barmar The first one. It is supposed to be as greedy as possible on the words, not rows. – Dharman Sep 30 '19 at 22:22

2 Answers


I don't have a complete solution, but this may serve as a starting point. The following JavaScript finds the longest common starting sequence of the two strings in a two-element array:

var s=["This is option: one","This is option: two"]; 
var same=s.join('|').match(/(.*)(.*?)\|\1.*/)[1];
// same="This is option: "

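`same` holds the longest common beginning of the two strings in array `s`. The regular expression pairs a greedy group with a non-greedy one and uses the backreference `\1` to force the first group to be repeated after the `|` separator. Note that the match is character-based, not word-based, so 'This is option: two' and 'This is option: three' would yield 'This is option: t'.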

You could apply this method to short sub-arrays obtained with slice() from your sorted input array, and monitor whether `same` stays unchanged across several of these sub-arrays. You can then perform the intended grouping on the sections that share the same value of `same`.
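The grouping described above could be sketched in PHP roughly as follows. This is not the regex from the snippet: it swaps in a word-by-word comparison, in line with the word-based splitting the asker describes in the comments. The function names `commonWordPrefix` and `groupByWordPrefix` are hypothetical, and the sketch assumes a sorted input in which a group's prefix is determined by its first two elements.

```php
<?php
// Hypothetical helper: the leading words shared by $a and $b, joined by spaces.
function commonWordPrefix(string $a, string $b): string
{
    $wordsA = explode(' ', $a);
    $wordsB = explode(' ', $b);
    $shared = [];
    foreach ($wordsA as $i => $word) {
        // Stop at the first empty, missing, or differing word.
        if ($word === '' || !isset($wordsB[$i]) || $wordsB[$i] !== $word) {
            break;
        }
        $shared[] = $word;
    }
    return implode(' ', $shared);
}

// Hypothetical sketch: group neighbouring strings under their shared leading words.
// Assumption: the prefix of a group is fixed by its first two elements.
function groupByWordPrefix(array $src): array
{
    $src = array_values($src);
    $count = count($src);
    $result = [];
    $prefix = null; // prefix of the group currently being filled, if any

    for ($i = 0; $i < $count; $i++) {
        $string = $src[$i];

        // Still inside the current group?
        if ($prefix !== null
            && ($string === $prefix || strpos($string, $prefix . ' ') === 0)) {
            $result[$prefix][] = trim(substr($string, strlen($prefix)));
            continue;
        }

        // Otherwise look ahead one element to see whether a new group starts here.
        $prefix = $i + 1 < $count ? commonWordPrefix($string, $src[$i + 1]) : '';
        if ($prefix === '') {
            $prefix = null;
            $result[] = $string; // no shared leading words: keep as a plain value
        } else {
            $result[$prefix][] = trim(substr($string, strlen($prefix)));
        }
    }
    return $result;
}

print_r(groupByWordPrefix(['A B w', 'A B x', 'A y', 'A z']));
// Groups: "A B" => [w, x] and "A" => [y, z]
```

Because the prefix is taken from the first pair only, an input such as `['A y', 'A B w', 'A B x']` ends up in a single "A" group; a fully greedy version would need a further look-ahead.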

[[ Sorry, I just realized I coded this in JavaScript and you wanted PHP - but the idea is so simple you can translate that easily into PHP yourself. ]]
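One possible PHP translation of the JavaScript snippet is sketched below. It reuses the same pattern with `preg_match`; the function name `commonStart` is made up for illustration, and the sketch assumes that neither string contains a `|` or a line break.

```php
<?php
// Hypothetical helper: longest common leading substring of two strings,
// found with a greedy group, a lazy group, and a backreference.
// Assumption: neither string contains "|" or a line break.
function commonStart(string $a, string $b): string
{
    if (preg_match('/(.*)(.*?)\|\1.*/', $a . '|' . $b, $m)) {
        return $m[1];
    }
    return '';
}

var_dump(commonStart('This is option: one', 'This is option: two'));
// string(16) "This is option: "
```

As in the JavaScript version, the comparison is per character, so the result may end in the middle of a word.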

Edit

Looking at the question and the expected result again, it seems to me that what the OP really wants is to combine elements that share the same part before the colon (:) into a common sub-array. This can be done with the following code:

$arr = [
    'Is there anything new and',
    'Something old',
    'This is option: one',
    'This is option: two',
    'This is option: ',
    'This is second option: 2',
    'This is second option: 3',
    'Abc: def',
    'Abc: ghi',
    'the same line',
    'the same words'
];
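$target = [];
foreach ($arr as $v) {
    // Reverse the parts so $t[0] is the value and $t[1], if present, is the key.
    $t = array_reverse(explode(':', $v));
    // Strings without a colon are collected under key 0.
    $target[isset($t[1]) ? trim($t[1]) : 0][] = trim($t[0]);
}
print_r($target);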

output:

Array
(
    [0] => Array
        (
            [0] => Is there anything new and
            [1] => Something old
            [2] => the same line
            [3] => the same words
        )

    [This is option] => Array
        (
            [0] => one
            [1] => two
            [2] => 
        )

    [This is second option] => Array
        (
            [0] => 2
            [1] => 3
        )

    [Abc] => Array
        (
            [0] => def
            [1] => ghi
        )

)

See a demo here https://rextester.com/JMB6676

Carsten Massmann

This might do the job:

function convertArray(array $values): array
{
    $newArray = [];
    foreach ($values as $value) {
        if (stripos($value, ':') !== false) {
            $key = strtok($value, ':');
            $newArray[$key][] = trim(substr($value, stripos($value, ':') + 1));
        }
    }

    return $newArray;
}

Essentially, this relies on the format of your array of strings: as long as each string has only one ":" character followed by the option value, this should work well enough. Note that strings without a ":" are skipped.

I'm sure there will be a more advanced and more fail-safe solution but this may be a start.
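As a small step in that direction, here is a sketch of a variant that splits on the first ":" only and keeps strings without a colon as plain values. The name `convertArrayKeepAll` is hypothetical, and the approach still assumes the colon is the split point, which the asker's comments do not guarantee.

```php
<?php
// Hypothetical variant of convertArray(): splits on the first ":" only
// and keeps strings without a colon instead of dropping them.
function convertArrayKeepAll(array $values): array
{
    $newArray = [];
    foreach ($values as $value) {
        // Limit of 2: anything after the first colon stays in the value part.
        $parts = explode(':', $value, 2);
        if (count($parts) === 2) {
            $newArray[trim($parts[0])][] = trim($parts[1]);
        } else {
            $newArray[] = $value; // no colon: keep as a plain value
        }
    }
    return $newArray;
}

print_r(convertArrayKeepAll([
    'Something else',
    'This is option: one',
    'This is option: two',
    'This is option: ',
]));
```

Unlike the target array in the question, the keys produced here do not include the trailing colon.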

blessing