2

Let's say I have a long block of text (saved in notepad), 326 characters long.

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam et nibh augue. Sed dignissim eu odio nec efficitur. Nulla aliquam imperdiet ipsum, eu mollis lacus cursus quis. Nulla dictum sem sem in auctor erat imperdiet sed suscipit elit ut lacus vestibulum vitae consequat risus volutpat. Suspendisse suscipit velit id.

I want it to say this:

Lorem ipsum dolor sit amet,
consectetur adipiscing elit.
Nam et nibh augue.
Sed dignissim eu odio nec efficitur.
Nulla aliquam imperdiet ipsum,
eu mollis lacus cursus quis.
Nulla dictum sem sem in auctor erat imperdiet sed suscipit
elit ut lacus vestibulum vitae consequat risus volutpat.
Suspendisse suscipit velit id.

The steps I want it to take:

  • If there is a period, add a new line.

  • If there is a comma, add a new line.

  • If the line is still too long (eg above 60 characters) add a new line at the next space.

Chris Martin
  • 30,334
  • 10
  • 78
  • 137
Eleven-87
  • 53
  • 4
  • 1
    Have a look at the python module [`re`](https://docs.python.org/2/library/re.html), especially at [`re.sub`](https://docs.python.org/2/library/re.html#re.sub). – morxa Mar 17 '16 at 23:24
  • Yes, also see http://stackoverflow.com/questions/4427542/how-to-do-sed-like-text-replace-with-python – Alfredo Gimenez Mar 17 '16 at 23:24
  • If all you want to do is wrap a long string of text to a given width, the standard library's [`textwrap` module](https://docs.python.org/3/library/textwrap.html) will do it for you. If you have to meet those _exact_ requirements (for a homework assignment?), then you'll have to create your own function. – Kevin J. Chase Mar 18 '16 at 01:22

2 Answers2

2

This will do:

s = """Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam et nibh augue. Sed dignissim eu odio nec efficitur. Nulla aliquam imperdiet ipsum, eu mollis lacus cursus quis. Nulla dictum sem sem in auctor erat imperdiet sed suscipit elit ut lacus vestibulum vitae consequat risus volutpat. Suspendisse suscipit velit id."""
result = "\n".join(re.findall(r"(.{,59}?(?:,|\.)|.{58}\S*)\s*", s))
print(result)

result:

Lorem ipsum dolor sit amet,
consectetur adipiscing elit.
Nam et nibh augue.
Sed dignissim eu odio nec efficitur.
Nulla aliquam imperdiet ipsum,
eu mollis lacus cursus quis.
Nulla dictum sem sem in auctor erat imperdiet sed suscipit
elit ut lacus vestibulum vitae consequat risus volutpat.
Suspendisse suscipit velit id.

re explanation:

.{,59}?(?:,|\.) matches any , or . that has less than 59 characters before it. .{58}\S* matches anything that has more than 58 chars till the next word. In the end, \s* matches any empty space in order to trim it off.

Bharel
  • 23,672
  • 5
  • 40
  • 80
0

A solution using native Python code would be to use replace for the first case, and then word wrap after. Using regex through re is probably more efficient, but this is far more readable to anyone trying to understand your code. It's also much longer, I don't know if you care about that. Anyway, here's my solution:

def wordwrap(text, limit=60):
    """Just a function to wrap words to line length 60."""
    words = text.split(" ")
    lines_out = [words[0]]
    for word in words[1:]:
        last_line = lines_out[-1]
        if len(last_line)+len(word) > limit:
            lines_out.append(word)
        else:
            lines_out[-1] = last_line+" "+word
    return "\n".join(lines_out)


def process(int):
    """Process your input to follow all you rules"""
    # Add newlines after commas and periods
    inp = inp.replace(".", ".\n").replace(",", ",\n")

    lines = inp.split("\n")
    lines = [l.strip() for l in lines]  # Remove any spaces before lines

    # Wrap remaining lines
    lines = [wordwrap(l) for l in lines]
    return "\n".join(lines)

# A test.
if __name__ == "__main__":
    text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam et nibh augue. Sed dignissim eu odio nec efficitur. Nulla aliquam imperdiet ipsum, eu mollis lacus cursus quis. Nulla dictum sem sem in auctor erat imperdiet sed suscipit elit ut lacus vestibulum vitae consequat risus volutpat. Suspendisse suscipit velit id."
    print process(text)
Luke Taylor
  • 8,631
  • 8
  • 54
  • 92