3

I have a text file that contains something like this:

Cl1 Cl 0.21988(6) 0.2500 0.15016(5) 0.01587(14) Uani 1 2 d S T P . .
O1 O 1.05820(17) 0.2500 0.48327(16) 0.0206(3) Uani 1 2 d DS TU P . .
H2 H 1.1042 0.2224 0.3900 0.025 Uiso 0.5 1 calc DR U P . .
O2 O 0.78198(19) 0.2500 0.29119(17) 0.0306(4) Uani 1 2 d S TU P . .
N1 N 0.7887(2) 0.2500 0.92083(19) 0.0152(3) Uani 1 2 d DS TU P . .
H1 H 0.8568 0.2500 1.0305 0.018 Uiso 1 2 calc DR U P . .

I am trying to write a program that looks for parentheses and removes them along with anything in between. So line 1 would end up looking like

Cl1 Cl 0.21988 0.2500 0.15016 0.01587 Uani 1 2 d S T P . .

This is what I have so far. It only seems to work for the 'Uiso' lines, because they contain no parentheses. It does not remove the parentheses from the 'Uani' lines.

for line in myfile:

    if "Uani" in line:

        re.sub('\(\w*\)', '', line)
        print >> energy, line

    elif 'Uiso' in line:

        re.sub('\(\w*\)', '', line)
        print >> energy, line

print myfile.read()

Any tips would be appreciated!

msturdy
Michael R
  • Welcome to Stack Overflow! It looks like you want us to write some code for you. While many users are willing to produce code for a coder in distress, they usually only help when the poster has already tried to solve the problem on their own. A good way to demonstrate this effort is to include the code you've written so far, example input (if there is any), the expected output, and the output you actually get (console output, stack traces, compiler errors - whatever is applicable). The more detail you provide, the more answers you are likely to receive. – Martijn Pieters Jul 15 '13 at 18:33
  • ...not to mention that the more code you write, and the more you try to investigate your problem, the more *you* will get out of this site! – msturdy Jul 15 '13 at 18:36
  • You'll want to use regular expressions — it's definitely worth reading up on them, because they're so useful (and universal). Try [this guide](http://www.zytrax.com/tech/web/regex.htm) or the [Python documentation](http://docs.python.org/2/library/re.html). – Jamie Niemasik Jul 15 '13 at 18:38
  • Any reason it has to be Python? Seems a good use of `sed` to me :) – Ben Jul 15 '13 at 18:43
  • Instead of the two `if` branches that contain identical code, you could do `if "Uani" in line or "Uiso" in line:` and save some typing. – SethMMorton Jul 15 '13 at 22:24

2 Answers

4
output = re.sub(r'\(\w*\)', '', input)

EDIT:

There's a mistake in the code you just added: you are not assigning the result of the re.sub function. Strings are immutable, so re.sub returns a new string and leaves line untouched. Change re.sub(...) to line = re.sub(...).
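
To make the difference concrete, here is a minimal sketch in Python 3 using the first line of the sample data from the question. The helper name `strip_parens` and the constant `PAREN_RE` are made up for illustration; only `re.compile` and the pattern's `sub` method come from the standard library.

```python
import re

# Matches "(", any run of word characters, then ")".
PAREN_RE = re.compile(r'\(\w*\)')

def strip_parens(line):
    """Return a copy of line with every parenthesized group removed."""
    return PAREN_RE.sub('', line)

line = "Cl1 Cl 0.21988(6) 0.2500 0.15016(5) 0.01587(14) Uani 1 2 d S T P . ."

# Calling re.sub without keeping the return value changes nothing.
re.sub(r'\(\w*\)', '', line)
unchanged = line  # still contains the parentheses

# Rebinding the name to the return value keeps the cleaned string.
line = strip_parens(line)
print(line)
```

The pattern relies on the parenthesized values being digits only, as in the sample; `\w*` would not match a group containing spaces or punctuation.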

Racso
3
import re

# Read the whole file, then strip every parenthesized group in one pass.
with open('file') as f:
    text = f.read()

output = re.sub(r'\(\w*\)', '', text)
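
If only the 'Uani' and 'Uiso' lines should reach the output, as in the question's loop, a line-by-line variant might look like the sketch below (Python 3). The function name `clean_lines` and the file names in the trailing comment are hypothetical, and the third sample line is invented to show the filtering.

```python
import re

def clean_lines(lines):
    """Yield only Uani/Uiso lines, with parenthesized groups removed."""
    for line in lines:
        if "Uani" in line or "Uiso" in line:
            yield re.sub(r'\(\w*\)', '', line)

sample = [
    "O1 O 1.05820(17) 0.2500 0.48327(16) 0.0206(3) Uani 1 2 d DS TU P . .\n",
    "H2 H 1.1042 0.2224 0.3900 0.025 Uiso 0.5 1 calc DR U P . .\n",
    "some other line (ignored)\n",
]
cleaned = list(clean_lines(sample))

# With real files (names are placeholders), the same generator can be
# fed straight into writelines:
#
#     with open('input.txt') as src, open('output.txt', 'w') as dst:
#         dst.writelines(clean_lines(src))
```

Because a file object iterates line by line, this version never holds the whole file in memory, unlike the `f.read()` approach.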
astrognocci
  • I don't understand this accepted answer. I may be wrong, but it doesn't work as the `\W` class matches "anything that is NOT a word character". The correct regexp should use `\w` instead. – Racso Jul 15 '13 at 19:17