0

I'm working on a Python script that stitches together text from different files so that I can create alternative versions of a website. This lets me compare different designs easily while making sure they all share the same underlying data, viz. that the menu items are consistent across all versions.

One specific problem area is making sure that the menu, which is a cornucopia of different meals, is always the same. To that end I've written this function:

def insert_menu():
    with open("index.html", "r+") as index:
        with open("menu.html", "r+") as menu:
            for i in index:
                if "<!-- Insert Menu here -->" in i:
                    for j in menu:
                        index.write(j)

However, it doesn't behave the way I want, and I have been unable to extract what I need from other answers here on Stack Overflow.

In its current state it will append the text that I have stored in menu.html at the end of index.html.

I want it to write the text in menu.html below the line in index.html that contains <!-- Insert Menu here -->. That line is not always at the same line number, which rules out writing at a specific line. Then, after everything in menu.html has been written to index.html, I want index.html to "continue", so to speak.

Basically, I mean to wedge the text from menu.html into index.html after the line containing <!-- Insert Menu here -->, but there is more text underneath that comment that I must retain (scripts et al.).

Copied from the index.html document, this is what surrounds <!-- Insert Menu here -->:

<html>
    <body>
        <div id="wrapper">
            <div id="container">

                <div class="map">
                </div>

                <!-- Insert Menu here -->

            </div><!-- Container ends here -->
        </div><!-- Wrapper ends here -->
    </body>
</html>

Note that in index.html the above block is all indented inside a larger div, and I cannot easily replicate this here on SO.

How can I change my code to achieve the desired result, or am I going about this in a very roundabout way?

And, how can I clarify this question to help you help me?

  • http://stackoverflow.com/questions/1325905/inserting-line-at-specified-position-of-a-text-file-in-python may help you. – Ashwin Iyengar Jul 31 '13 at 19:41
  • I came across that earlier and am currently trying to write something along those lines, using it as a guide; however, I have yet to succeed. –  Jul 31 '13 at 19:53
  • You may want to rename `line in index` and `line in menu` to something like `i in index` and `ix in menu`. Because you're not reading lines. – yurisich Jul 31 '13 at 19:57
  • Post more code. Generally, you'll want to use [BeautifulSoup](http://www.crummy.com/software/BeautifulSoup/) to generate HTML documents, not the standard library's `read` and `write` methods. – yurisich Jul 31 '13 at 19:58
  • @Droogans Amended the naming. Seeing as the function is standalone, what code would you like posted? Checking out BeautifulSoup now. It provides far more than this project stands to require and as such I'm not too keen on implementing it just yet. –  Jul 31 '13 at 20:23

4 Answers

1

Trying to update the source file in place will not work. Writing to the "index" while reading it will not do what you think - it will overwrite the lines following your eye-catcher string.

Instead, you should treat both the 'index' and the 'menu' source files as inputs and create a third file as an output (you are basically merging the two input files into a combined output file). Using 'classic' syntax:

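# Read both inputs and write the merged result to a third file.
output = open('merged.html', 'w')
for line in open('index.html'):
    if '<!-- Insert Menu here -->' in line:
        # The marker line itself is replaced by the contents of menu.html.
        for menuline in open('menu.html'):
            output.write(menuline)
    else:
        output.write(line)
output.close()  # make sure the merged output is flushed to disk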

It's trivial to change that to the "with" syntax if you prefer that.
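
For reference, here is a minimal sketch of the same merge written with `with`, assuming you want to keep the marker line and place the menu directly below it, as the question describes. The function name `merge_menu` and its parameters are made up for this illustration.

```python
def merge_menu(index_path, menu_path, output_path,
               marker="<!-- Insert Menu here -->"):
    """Copy index_path to output_path, adding the menu below the marker line.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    with open(index_path) as index, \
         open(menu_path) as menu, \
         open(output_path, "w") as output:
        menu_text = menu.read()
        for line in index:
            # Every original line is kept, including the marker line itself.
            output.write(line)
            if marker in line:
                # Guard against a marker on a last line without a newline.
                if not line.endswith("\n"):
                    output.write("\n")
                output.write(menu_text)
                # Make sure the rest of the index starts on a fresh line.
                if menu_text and not menu_text.endswith("\n"):
                    output.write("\n")
```

Calling `merge_menu("index.html", "menu.html", "merged.html")` should leave both inputs untouched and write the combined page to `merged.html`.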

ked
0

Can you try this and tell me if it works? Since I don't have your file, assume you have read it and it is stored in string.

import re

string = "blablabla \n  blebleble \n <!-- Insert Menu here --> \n bliblibli"
pattern = ".*<!-- Insert Menu here -->(.*)"
# re.S lets "." match newlines, so the group captures everything after the marker.
print(re.match(pattern, string, re.S).groups())

This captures whatever comes after the <!-- Insert Menu here --> marker, including any spaces and the line break. If you want to skip to the next line instead:

pattern = ".*<!-- Insert Menu here --> .*\n (.*)"

Update: Just realized this implies reading the whole file, is that acceptable? Cheers!
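
If reading the whole file is acceptable, the same regex idea can do the insertion itself rather than only capturing the tail. The following is a rough sketch, not a tested fit for your file: `insert_after_marker` is a name invented here, and it assumes the index has already been read into one string.

```python
import re

MARKER = "<!-- Insert Menu here -->"


def insert_after_marker(index_text, menu_text, marker=MARKER):
    """Return index_text with menu_text added below the first marker line.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # With re.M, ^ and $ anchor at line boundaries; without re.S, "." stops
    # at newlines, so the pattern matches exactly the line holding the marker.
    pattern = re.compile(r"^.*" + re.escape(marker) + r".*$", re.M)
    # A function replacement keeps backslashes in menu_text from being
    # interpreted as regex escape sequences.
    return pattern.sub(
        lambda m: m.group(0) + "\n" + menu_text.rstrip("\n"),
        index_text,
        count=1,
    )
```

Because the pattern matches the whole line, the indentation in front of the comment does not matter, and the text is returned unchanged when the marker is missing.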

Jblasco
  • This didn't work, sadly. I get this error, despite my best efforts: `print re.match(pattern, string, re.S).groups()` `AttributeError: 'NoneType' object has no attribute 'groups'` –  Jul 31 '13 at 20:17
  • With my code, my string, does it work for you? If so, check your file and see that is exactly what appears in your code. – Jblasco Jul 31 '13 at 20:21
  • Your code, viz. the code you posted, runs fine. Applied to my situation it doesn't. Probably because the pattern is indented. I've supplied a raw copy up above. –  Jul 31 '13 at 20:31
  • You need to change the code to match exactly what you have in your own code... If that text is there it has to work... hmmm. Can you copy and paste the line of the "Insert Menu here" +/- 1 to see what the problem might be? – Jblasco Jul 31 '13 at 20:34
  • Did. Check up above by my question. As noted there I couldn't adequately replicate the indentation that I have in my file here without supplying so much code it would only serve to confuse. If the above example that I have now edited in doesn't suffice let me know and I'll supply the indentation as is. –  Jul 31 '13 at 20:37
  • I am too confused... I copied and pasted the bit you put up there, including \t for the tabs and \n for the new lines and it perfectly finds the correct thing... You are reading the whole file into a single string, correct? and you have printed the string to check the line is indeed there? – Jblasco Jul 31 '13 at 20:44
  • I am certain that it is read as a single string. I have printed it, and it is indeed there. –  Jul 31 '13 at 20:49
0

Assuming that the files aren't huge, a simple string replacement will work:

def insert_menu():
    with open("index.html") as index:
        index_text = index.read()
    with open("menu.html") as menu:
        menu_text = menu.read()
    # I called it index2 so it doesn't overwrite... you can change that
    with open("index2.html", "w") as index2:
        index2.write(index_text.replace('<!-- Insert Menu here -->', menu_text))
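
If you do want the result back in index.html with the marker comment kept in place, a variation along these lines might work. It is only a sketch under those assumptions: the name `insert_menu_in_place` and the `ValueError` on a missing marker are choices made for this example.

```python
def insert_menu_in_place(index_path="index.html", menu_path="menu.html",
                         marker="<!-- Insert Menu here -->"):
    """Rewrite index_path so the menu follows the marker comment.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    with open(index_path) as index:
        index_text = index.read()
    with open(menu_path) as menu:
        menu_text = menu.read()
    if marker not in index_text:
        raise ValueError("marker not found in " + index_path)
    # Re-emit the marker in front of the menu so the comment is not lost;
    # only the first occurrence is touched.
    merged = index_text.replace(
        marker, marker + "\n" + menu_text.rstrip("\n"), 1)
    # The read handle is closed by now, so reopening in "w" mode is safe.
    with open(index_path, "w") as index:
        index.write(merged)
```

Note that running it twice inserts the menu twice, because the marker stays in the file, and that anything following the comment on the same line would end up after the menu.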
tdelaney
0

Since the other one does not seem to be working, what about this option (less elegant, admittedly):

found_marker = False
new_string = ""
with open("index.html") as f:
    for line in f:
        if found_marker:
            # Each line already ends with its own newline, so none is added.
            new_string += line
        if "<!-- Insert Menu here -->" in line:
            # Start collecting from the line after the marker.
            found_marker = True

new_string then holds everything after the marker line, and you can write it into the new file.

Does this work?

Jblasco