Suppose the text consists of m words, which we will number from 1 to m. Define f(i, j) to be the maximum width (in characters) of any line in an optimal (minimal-width) solution to the subproblem consisting of just the first i words, under the restriction that exactly j line breaks are used. Then the width of the best possible sequence of breaks to the entire problem will be given by f(m, n). This can be solved fairly efficiently using dynamic programming.
Let the total length in characters of the fragment between word i and word j >= i, inclusive, be len(i, j). (This is easy to calculate: Just compute an array of m+1 values len0[j], for 0 <= j <= m, each giving the total length in characters of the fragment consisting of the first j words; then len(i, j) is just len0[j] - len0[i-1] - 1, with the convention that len0[0] = -1.)
The base case is easy:
f(i, 0) = len(0, i) (i.e., if there are no line breaks)
The recursive case is:
f(i, j) = the minimum over all 0 <= k < i of max(f(k, j-1), len(k+1, i))
That is, to find the best way to break the first i words into j+1 lines (i.e. using j line breaks), we can try the following for every shorter k-word prefix: determine the best way to break that k-word prefix into j lines (i.e. using j-1 line breaks), and compare the maximum width we get from that with the width that results from putting the rest of the i-k words on a single line at the end. Each prefix gives us a different candidate solution, so we can pick the best of all of them.
Constructing a solution
Now that we can calculate the optimal width f(m, n), how can we use this to actually construct a solution? Fortunately there is a standard technique in dynamic programming for this. The fastest way is to record, during calculation of f(i, j), the (actually a, since there may in general be multiple optimal solutions) value of k that produced the minimum in a predecessor table pred[i][j]. Having calculated f(m, n) and filled in the predecessor table, we can then construct an optimal solution by walking backwards through it: pred[i][j] tells us a value k such that we can produce an optimal solution by adding a line break after word k, so add a line break there, and then look at pred[k][j-1] to find the position of the previous line break, continuing on until j reaches 0.
Time and space complexity
If the recursion is memoised using dynamic programming, then there are at most O(mn) different parameter combinations that f() could be called with (i ranges between 0 and m, and j ranges between 0 and n), and the time spent outside of recursive calls is O(m) (k can range between 0 and m, and the computation for each value of k is O(1)), so the time complexity of this solution is O(nm^2). The space complexity is O(mn), but by switching to a bottom-up DP this can easily be reduced to O(m), since while computing f(i, j) we only ever need access to values of f() for which the second parameter is j-1, meaning it suffices to actually store just the size-(m+1) array of computed values f(q, j-1) for 0 <= q <= m.