-1

I have a text file that contains a table like the one below. The rest of the text in the file is of no interest to me.

TMP   [%]        [KT]      [1/dm]      [SF]   
1    0.10020    -0.0000      -60.0     0.0000
2   14.12826     0.0000        0.0     0.0000
3    4.00802  -120.3636       -6.0   191.5646
4    4.80962     0.0000        0.0     0.0000
   .....

I want to extract only this portion of the text, and only the first 3 columns. I wrote some code like this:

import codecs
f = codecs.open("dmp.txt", "r",'utf-16-le')
fr = f.readlines()
f.close()
for line in fr:
  if line.startswith("TMP")...

However, I am not able to figure out how to read this data column-wise, and only the first 3 columns at that. Any ideas?

user741592

3 Answers

0
for line in fr: 
    v = line.split()
    print " ".join(v[:3])

Gives:

TMP [%] [KT]
1 0.10020 -0.0000
2 14.12826 0.0000
3 4.00802 -120.3636
4 4.80962 0.0000
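
Since the question asks for the data column-wise, here is a minimal sketch of one way to turn those rows into columns. It is written for Python 3 and runs on the sample rows copied from the question rather than on the real file; the helper name `first_columns` is made up for illustration.

```python
# Sample rows copied from the question; in practice these lines
# would come from the file instead.
sample = """TMP   [%]        [KT]      [1/dm]      [SF]
1    0.10020    -0.0000      -60.0     0.0000
2   14.12826     0.0000        0.0     0.0000
3    4.00802  -120.3636       -6.0   191.5646
4    4.80962     0.0000        0.0     0.0000
"""


def first_columns(lines, n=3):
    """Return the first n whitespace-separated fields of each non-empty line.

    Hypothetical helper, not part of any library.
    """
    return [line.split()[:n] for line in lines if line.strip()]


rows = first_columns(sample.splitlines())
header, data = rows[0], rows[1:]

# zip(*data) transposes the list of rows into a sequence of columns.
columns = [list(col) for col in zip(*data)]

for name, col in zip(header, columns):
    print(name, col)
```

The values are still strings at this point; they would need `float()` before doing any arithmetic with them.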
perreal
0

You can use regex:

import codecs
import re
f = codecs.open("dmp.txt", "r",'utf-16-le')
fr = f.readlines()
f.close()
for line in fr:
    if not line.startswith('TMP'):
        print re.findall('-?[0-9]+\.?[0-9]*', line)[:3]

This will output:

['1', '0.10020', '-0.0000']
['2', '14.12826', '0.0000']
['3', '4.00802', '-120.3636']
['4', '4.80962', '0.0000']
daouzli
  • Thanks for your thoughts. However, this also prints other numbers in the text file that are not part of this table. What if I know the number of TMP rows, e.g. I know I have, say, 4 rows (1...4)? Can I do something to read the block directly? I want to use those numbers further. – user741592 Jun 20 '14 at 09:25
  • @user741592 the last line ends with `[:3]`, which means take the first 3 elements. You can replace it with something like `[A:B]`, where `A` is the index (starting at 0) of the first element and `B` is the index after the last one to keep. See [slicing](http://stackoverflow.com/questions/509211/pythons-slice-notation) for more information – daouzli Jun 20 '14 at 09:30
  • I'm not sure about what you mean by *read the block directly* – daouzli Jun 20 '14 at 09:33
  • @user741592 if that helped don't hesitate to accept the answer and to vote up ;) – daouzli Jun 24 '14 at 11:20
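
Regarding the follow-up about reading the block directly when the number of rows is known: one possible approach is to skip ahead to the header line and then take exactly that many lines. The sketch below is for Python 3 and makes a few assumptions: the header is the only line starting with `TMP`, the function name `read_table` and its parameters are made up for illustration, and the non-table lines in the sample are invented stand-ins for the "other text" in the file.

```python
from itertools import islice


def read_table(lines, n_rows, n_cols=3, header_prefix="TMP"):
    """Return the first n_cols fields of the n_rows lines after the header.

    Hypothetical helper: `lines` can be any iterable of strings,
    including an open file object.
    """
    it = iter(lines)
    for line in it:
        if line.lstrip().startswith(header_prefix):
            break  # the iterator is now positioned just after the header
    else:
        return []  # header never found
    return [line.split()[:n_cols] for line in islice(it, n_rows)]


# The first and last lines are invented placeholders for unrelated text;
# the table rows are copied from the question.
sample = [
    "Report generated for run 17, 3 warnings",
    "TMP   [%]        [KT]      [1/dm]      [SF]",
    "1    0.10020    -0.0000      -60.0     0.0000",
    "2   14.12826     0.0000        0.0     0.0000",
    "3    4.00802  -120.3636       -6.0   191.5646",
    "4    4.80962     0.0000        0.0     0.0000",
    "Total 23.04610 over 4 entries",
]

rows = read_table(sample, n_rows=4)
for row in rows:
    print(row)
```

Because the numbers outside the table are never split or matched, they do not show up in the result. With the real file, the open file object (opened with the `utf-16-le` encoding used in the question) could be passed in place of `sample`.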
0
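with open("dmp.txt") as f:
    next(f)  # skip the header line
    lines = []  # must exist before += is used on it
    for x in range(4):  # the table is known to have 4 rows
        lines += next(f).split()[0:3]  # keep only the first 3 columns
    print(lines)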

['1', '0.10020', '-0.0000', '2', '14.12826', '0.0000', '3', '4.00802', '-120.3636', '4', '4.80962', '0.0000']
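
The result above is a flat list of strings. Since the asker wants to use the numbers further, a possible next step is to regroup them three at a time and convert them to numbers. This is only a sketch for Python 3; the helper name `chunk` is made up, and it assumes the first column always holds an integer row number.

```python
# The flat list shown above, copied verbatim.
flat = ['1', '0.10020', '-0.0000', '2', '14.12826', '0.0000',
        '3', '4.00802', '-120.3636', '4', '4.80962', '0.0000']


def chunk(values, size):
    """Split a flat list into consecutive sublists of length `size`.

    Hypothetical helper, not part of any library.
    """
    return [values[i:i + size] for i in range(0, len(values), size)]


# One (row number, [%] value, [KT] value) tuple per table row.
rows = [(int(tmp), float(pct), float(kt)) for tmp, pct, kt in chunk(flat, 3)]

for row in rows:
    print(row)
```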
Padraic Cunningham