
I am trying to write a function that takes the following 2 parameters:

  1. A sentence as a string
  2. A number of lines as an integer

So if I were to call formatLines("My name is Gary", 2); ...

The possible outcomes would be:

  • array("My name is", "Gary");
  • array("My name", "is Gary");
  • array("My", "name is Gary");

It would return: array("My name", "is Gary"); because the difference in character counts for each line is as small as possible.

So the part I am ultimately stuck on is creating an array of possible outcomes where the words are in the correct order, split over x lines. Once I have an array of possible outcomes I would be fine working out the best result.

So how would I go about generating all the possible combinations?

Regards

Joe

Joeseppi
  • permutations is the word you're looking for. – Tschallacka Mar 22 '18 at 15:08
  • "because the difference in character counts for each line is as small as possible" - is that what you want to achieve? – NoOorZ24 Mar 22 '18 at 15:10
  • I think you will find what you are looking for in this post: https://stackoverflow.com/questions/5506888/permutations-all-possible-sets-of-numbers – wayneOS Mar 22 '18 at 15:10
  • @NoOorZ24 yes, I want the function to return the combination where the average difference in character counts for each line is as small as possible. – Joeseppi Mar 22 '18 at 15:16
  • Thanks @wayneOS but that is no help as the possible permutations need to be in the correct word order – Joeseppi Mar 22 '18 at 15:18
  • Difference in string length or difference in word count? – Anthony Mar 22 '18 at 15:48
  • So for something like "Mississippi is so nice", would it return `["Mississippi is", "so nice"]` (equal word count) or `["Mississippi", "is so nice"]` (smallest difference in string lengths)? – Anthony Mar 22 '18 at 15:57
  • @Anthony it should return ["Mississippi", "is so nice"], as that gives the smallest difference in string lengths. – Joeseppi Mar 22 '18 at 16:11
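
For reference, the enumeration the question asks about can be sketched recursively: the first line takes some number of leading words, and the remaining words are split over the remaining lines. This is only a sketch under stated assumptions. The helper name orderedSplits and the scoring rule (gap between the longest and shortest line) are illustrative choices, not something given in the question; formatLines is the name the question uses.

```php
<?php
// Hypothetical helper: every way to split $words, in their original order,
// into exactly $lines non-empty lines.
function orderedSplits(array $words, int $lines): array
{
    $n = count($words);
    if ($lines < 1 || $lines > $n) {
        return [];                          // no valid split exists
    }
    if ($lines === 1) {
        return [[implode(' ', $words)]];    // one line holds everything
    }
    $splits = [];
    // The first line takes $i words; leave at least one word per remaining line.
    for ($i = 1; $i <= $n - $lines + 1; $i++) {
        $head = implode(' ', array_slice($words, 0, $i));
        foreach (orderedSplits(array_slice($words, $i), $lines - 1) as $rest) {
            $splits[] = array_merge([$head], $rest);
        }
    }
    return $splits;
}

// Assumed scoring rule: prefer the split whose longest and shortest
// lines differ least in character count.
function formatLines(string $sentence, int $lines): array
{
    $words = preg_split('/\s+/', trim($sentence), -1, PREG_SPLIT_NO_EMPTY);
    $best = [];
    $bestSpread = PHP_INT_MAX;
    foreach (orderedSplits($words, $lines) as $split) {
        $lengths = array_map('strlen', $split);
        $spread = max($lengths) - min($lengths);
        if ($spread < $bestSpread) {
            $bestSpread = $spread;
            $best = $split;
        }
    }
    return $best;
}
```

With this sketch, formatLines("My name is Gary", 2) picks ["My name", "is Gary"] out of the three candidates listed in the question. The number of candidates grows quickly with word count, so it is only practical for short titles.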

2 Answers


Generating every possible way of splitting the text and then picking the best one seems unnecessarily inefficient. Instead, you can count the characters and divide by the number of lines to find approximately the right number of characters per line, then adjust that width until the wrapped text has the requested number of lines.

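function lineSplitChars($text, $lines) {
    if (str_word_count($text) < $lines) {
        throw new InvalidArgumentException('lines must not exceed word count', 1);
    }

    $width = (int) (strlen($text) / $lines);                // initial width estimate (wordwrap expects an int)
    $tried = [];                                            // widths already attempted

    while ($width > 0 && !isset($tried[$width])) {
        $tried[$width] = true;

        $result = explode("\n", wordwrap($text, $width));   // generate result

        // check for correct number of lines. return if correct, adjust width if not
        $n = count($result);
        if ($n == $lines) return $result;
        if ($n > $lines) {
            $width++;                                       // too many lines: widen
        } else {
            $width--;                                       // too few lines: narrow
        }
    }

    // revisiting a width means the line count jumps past $lines (e.g. "ab cd ef gh" over 3 lines)
    throw new RuntimeException('no wrap width produces the requested number of lines');
}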
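
As a small worked illustration of the initial estimate only (not the full retry loop), the snippet below applies the "characters divided by lines" idea to the example sentence from the question. The variable names are arbitrary.

```php
<?php
$text  = "My name is Gary";
$lines = 2;

// 15 characters over 2 lines is 7.5, truncated to a width of 7.
$width = (int) (strlen($text) / $lines);

// wordwrap() breaks at spaces so that no line exceeds $width characters
// (unless a single word is longer than $width).
$result = explode("\n", wordwrap($text, $width));
// $result is ["My name", "is Gary"]
```

Here the first guess already gives two lines, so no adjustment is needed. Inputs with one very long word, like the test case raised in the comments, are where the width has to be adjusted.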
Don't Panic
  • Thanks @Don't Panic, but given a 3-word string over 2 lines, that function returns an array with 3 words... the function must only return an array whose count matches the number of lines – Joeseppi Mar 22 '18 at 15:39
  • Ah, yeah, I see how that would happen. I think it can still work this way, though. I'll make an adjustment. – Don't Panic Mar 22 '18 at 15:41
  • Thanks for your help – Joeseppi Mar 22 '18 at 15:42
  • @Joeseppi I modified the function to retry with increased line width until it ends up with the correct number of lines. – Don't Panic Mar 22 '18 at 15:57
  • Something to keep in mind would be that some scenarios couldn't be fulfilled, specifically : `formatLines("Hello World", 3)`. Should this throw an exception? Or return an array with 2 entries? Or return an array like `["Hello", "World", ""]`? – Anthony Mar 22 '18 at 16:04
  • That works better, but after putting through some test cases such as: **lineSplitChars("hukjfbvsjvbnvlsknv j kjds", 3);** where the first word is rather long, it only returns 2 lines! – Joeseppi Mar 22 '18 at 16:05
  • @Anthony, I've actually covered this outside the function to determine if the wordcount is less than lines then lines = wordcount and is then passed to the function. – Joeseppi Mar 22 '18 at 16:07
  • @Anthony yeah, it's true. I should mention that in the answer. I don't think this is really a finished product. Just more of an idea of another way to do it that can be expanded upon. I would still like to try to make it work, though. I think it's an interesting problem. – Don't Panic Mar 22 '18 at 16:07
  • @Joeseppi Oh yeah, I guess it should adjust the line width the other way if count < lines. – Don't Panic Mar 22 '18 at 16:08
  • That's good, but the function won't know that. How would you want it to react if it did encounter that scenario? – Anthony Mar 22 '18 at 16:09
  • @Anthony I've written a script that auto-generates adverts of multiple sizes for 200+ products, so given the advert sizes I have an allotted space for the product title. The script will always know how many lines it has available; it then loops through the lines returned by this function and applies the text to the images. So when there are fewer words than available lines, the number of lines passed to the function will be equal to the word count. – Joeseppi Mar 22 '18 at 16:15
  • @Joeseppi okay, new version haha. now reduces width if count < lines. Also, added an exception for invalid input. Not sure how you want to actually handle that, but at least the condition is there. – Don't Panic Mar 22 '18 at 16:21
  • @Don't Panic you have saved my day! Passed all the test cases I had set up. Awesome thanks! – Joeseppi Mar 22 '18 at 16:27
  • @Joeseppi great! I think I underestimated the complexity a bit. – Don't Panic Mar 22 '18 at 16:31
  • @Don't Panic you should see how much scribbling I have down on paper due to my original approach! I Appreciate your help. – Joeseppi Mar 22 '18 at 16:45
  • That's really impressive. One possible gotcha is how `str_word_count` determines word boundaries. From the example in their documentation `fri3nd` would be considered 2 words. – Anthony Mar 22 '18 at 16:47
  • @Anthony thanks! I'd never noticed that about `str_word_count`. Maybe it would be better to just count spaces, or use a regex for that part. (Speaking of regex, there's probably some way to do this whole thing with one preg_split or something that will make this approach look silly. Unfortunately I'm not much of a regex expert, so I wouldn't even know where to start.) – Don't Panic Mar 22 '18 at 17:00
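
The word-counting concern raised in the last two comments can be illustrated with a short sketch. The helper name countWords is hypothetical; it counts whitespace-separated tokens with preg_split, which is one possible reading of "use a regex for that part" and not necessarily what either commenter had in mind.

```php
<?php
// Hypothetical helper: count words as runs of non-whitespace characters,
// so digits inside a word do not split it.
function countWords(string $text): int
{
    return count(preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY));
}

$sample = "Hello fri3nd";
$builtin = str_word_count($sample);   // treats "fri3nd" as "fri" and "nd"
$regex   = countWords($sample);       // treats "fri3nd" as one word
```

For this sample, str_word_count reports 3 words while the whitespace-based count reports 2, which matches how wordwrap breaks lines at spaces.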

An answer has been accepted here - but this strikes me as a rather cumbersome method for solving the problem when PHP already provides a wordwrap() function which does most of the heavy lifting:

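 function format_lines($str, $lines)
 {
     // start below the average line length so the width only ever has to grow
     $guess_length=(int)(strlen($str)/($lines+1));
     do {
         $out=explode("\n", wordwrap($str, $guess_length));
         $guess_length++;
     } while ($guess_length<strlen($str) && count($out)>$lines);
     return $out;   // can hold fewer than $lines entries if no width gives exactly $lines
 }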

As it stands, this is a rather brute-force method. For very large inputs, a better solution would use a more efficient search: adjust the guess by a larger initial interval, then shrink that interval on each iteration.
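
One way to read the "optimum searching" idea is a binary search over the wrap width, sketched below. This swaps in plain binary search and is not necessarily the interval scheme the answer has in mind. It relies on the assumption that widening the wrap never increases the line count, and that the input contains no newlines. The function name format_lines_bsearch is hypothetical.

```php
<?php
// Sketch: find the smallest wrap width that yields at most $lines lines.
function format_lines_bsearch(string $str, int $lines): array
{
    $lo = 1;                       // narrowest width to try
    $hi = max(1, strlen($str));    // at this width everything fits on one line
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi, 2);
        $count = substr_count(wordwrap($str, $mid), "\n") + 1;
        if ($count > $lines) {
            $lo = $mid + 1;        // too many lines: need a wider wrap
        } else {
            $hi = $mid;            // few enough lines: try a narrower wrap
        }
    }
    return explode("\n", wordwrap($str, $lo));
}
```

Like the loop above, this returns at most $lines lines rather than exactly $lines, but it calls wordwrap a logarithmic number of times instead of once per candidate width.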

symcbean
  • I like the idea of decreasing the guess length at the beginning with `$lines+1`, so you only have to increase guess length! That simplifies things. It kind of reads like you're saying my answer doesn't use `wordwrap`, though. Did I misunderstand? – Don't Panic Mar 22 '18 at 19:19