Modify a line in a file after finding it with wildcard re.match

Question

I am trying to rewrite a very simple HTML page dynamically after using a socket to retrieve a value. Essentially, this pulls a track name from my Squeezebox and tries to write it into the HTML. The first part of the line is always the same, but the track title needs to change. I'm sure it's simple, but I've spent hours trawling different sites and looking at different methods, so it's time to ask for help.

The HTML contains the following line, among others:

<p class="GeneratedText">Someone Like You</p>

I am then trying to run the following to find that line. It's always on the same line number, but I tried readlines and, from what I've read, it reads the whole file in anyway:

import socket
import urllib
import fileinput
import re
# connect to my Squeezebox, retrieve the track name and clean it up ready for insertion
clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
clientsocket.connect(("192.168.1.10", 9090))
clientsocket.send("00:04:00:00:00:00 title ?\n")
str = clientsocket.recv(100)
title=str.strip( '00%3A00%3A00%3A00%3A00%3A00 title' );
result = urllib.unquote(title)
# try to overwrite the line in we.html so it looks like <p class="GeneratedText">Now playing track</p>
with open('we.html', 'r+') as f:
        for line in f:
           if re.match("(.*)p class(.*)",line):
              data=line
              print data
              f.write( line.replace(data,'<p class="GeneratedText">'title'</p>'))

I think [this](http://stackoverflow.com/questions/5453267/is-it-possible-to-modify-lines-in-a-file-in-place) might be what you're going for. You'd still be rewriting the entire file, though. — AMacK, Apr 18 '15 at 00:05
I don't really understand your use case. Wouldn't it be better for user experience to bold the current playing song title or mark it with a * than replace the title? Anyway, a program edits any kind of text file by writing out a new copy of the entire file and then using `mv` to replace the old file. Also, instead of parsing html with regex, which is ugly and problematic and can [lead to insanity](http://stackoverflow.com/a/1732454/103081), in python you can use [beautiful soup](http://www.crummy.com/software/BeautifulSoup/), also known as bs4, an HTML parser and utility library for python. — Paul, Apr 18 '15 at 00:11
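The "write a new copy, then replace the old file" approach mentioned in the comment above could be sketched roughly as follows. `rewrite_file` and `transform` are hypothetical names chosen for illustration, and the sketch assumes Python 3.3+ for `os.replace`.

```python
import os
import tempfile

def rewrite_file(path, transform):
    """Write transform(line) for every line to a temp file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp, open(path, "r") as src:
            for line in src:
                tmp.write(transform(line))
        # Assumes Python 3.3+; on Python 2 (POSIX), os.rename behaves similarly.
        os.replace(tmp_path, path)
    except Exception:
        # Clean up the partial copy and leave the original file untouched.
        os.remove(tmp_path)
        raise
```

A call such as `rewrite_file('we.html', some_function)` would then replace the page in one step, so a reader of the file never sees a half-written version.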

score 1 · Answer 1 · answered Apr 18 '15 at 00:50

A quick solution might be to use the fileinput module you tried importing.

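Thus your code would look something like this:

import fileinput
import re
import sys

# With inplace=True, anything written to stdout replaces the file's contents.
for line in fileinput.input('we.html', inplace=True):
    if re.match("(.*)p class(.*)", line):
        sys.stdout.write('<p class="GeneratedText">' + title + '</p>\n')
    else:
        sys.stdout.write(line)  # line already ends with a newline

This loop replaces the with block in your original code.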

However, if you would like a cleaner solution, you should check out Beautiful Soup, which is a python library for manipulating structured documents.

You'll need to install the module through pip (the package is `beautifulsoup4`) and import BeautifulSoup from bs4, but this code should get you running afterwards:

from bs4 import BeautifulSoup

with open('we.html', 'r') as html:
    soup = BeautifulSoup(html, 'html.parser')

for paragraph in soup.find_all('p', class_='GeneratedText'):
    paragraph.string = title

# prettify('utf-8') returns encoded bytes, so open the file in binary mode
with open('we.html', 'wb') as html:
    html.write(soup.prettify('utf-8'))
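Both snippets in this answer assume `title` already holds the decoded track name. In the question's code, `str.strip('...')` removes any of the listed characters from the ends of the string rather than a prefix, so it can also eat leading characters of the title itself. A minimal sketch of a safer cleanup is below. It assumes the server echoes the request as `<player-id> title <url-encoded title>`, which is inferred from the question rather than confirmed, and `extract_title` is a hypothetical helper name.

```python
try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2, as used in the question

def extract_title(response, command="title"):
    """Return the decoded track title from one raw response line.

    Assumes the response looks like "<player-id> title <url-encoded title>".
    """
    if isinstance(response, bytes):
        response = response.decode("utf-8")
    # Split on the command word instead of calling str.strip(), which
    # treats its argument as a set of characters, not as a prefix.
    _, _, encoded = response.strip().partition(" " + command + " ")
    return unquote(encoded)
```

The result of `extract_title(clientsocket.recv(100))` could then be used wherever `title` appears above.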

score 0 · Answer 2 · answered Apr 18 '15 at 00:17

If you have a single occurrence of this in the whole page, you can simply do:

new_html = re.sub(r'(?<=<p class="GeneratedText">)(.*?)(?=</p>)',
                  "WhateverYouWantGoesHere",
                  html_file_as_string)

It will replace everything between the two tags with whatever you want.
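For completeness, here is one way the file could be read into a string and written back around that substitution. This is a sketch under assumptions: `set_generated_text` and `update_file` are hypothetical names, and the page is assumed to contain the paragraph on a single line.

```python
import re

# Lookarounds keep the tags themselves out of the match.
PATTERN = re.compile(r'(?<=<p class="GeneratedText">).*?(?=</p>)')

def set_generated_text(html, new_text):
    """Return html with the first GeneratedText paragraph's text replaced."""
    # A function replacement keeps backslashes in new_text from being
    # interpreted as group references.
    return PATTERN.sub(lambda match: new_text, html, count=1)

def update_file(path, new_text):
    """Read the whole page, substitute the text, and write it back."""
    with open(path, "r") as f:
        html = f.read()
    with open(path, "w") as f:
        f.write(set_generated_text(html, new_text))
```

Calling `update_file('we.html', title)` would then rewrite the page in place.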

score 0 · Answer 3 · answered Apr 18 '15 at 00:21

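import re
# Keep the opening and closing tags; replace only the text between them.
pattern = re.compile(r'(<p class="GeneratedText">).*?(</p>)')
with open('output.html', 'w') as o:
    with open('we.html', 'r') as f:
        for line in f:
            o.write(pattern.sub(lambda m: m.group(1) + newTitle + m.group(2), line))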

answered Apr 18 '15 at 00:21

Jules G.M.
