
I have a LAS file that I am trying to read in Python using the lasio library. One of the columns is TIME, which is in the following format: 00:00:00.22-04-23

Sample of data copied from las file:

TIME               col1 col2
00:00:00.22-06-23  1010  20
00:00:05.22-06-23  1020  25
00:00:10.22-06-23  1015  32

My code to read the data:

import lasio
df = lasio.read(file_path).df().reset_index()

This returns the df in the following format:

TIME               col1 col2 UNKNOWN:1  UNKNOWN:2
00:00:00.22         -06 -23    1010       20
00:00:05.22         -06 -23    1020       25
00:00:10.22         -06 -23    1015       32

As you can see, my TIME column has been split into three columns, one at each hyphen. The data from col1 and col2 has been shifted into UNKNOWN:1 and UNKNOWN:2 (these columns were probably created by lasio during reading). I need the TIME column returned in its original form, without shifting the values of col1 and col2, so that I can strip, split and manipulate TIME using pandas once it is read into a dataframe.

Any advice is appreciated.

serdar_bay

1 Answer


You can try pd.read_csv with a whitespace delimiter, which splits only on whitespace and leaves the hyphens in TIME alone. For example:

import pandas as pd

df = pd.read_csv('your_file.txt', sep=r"\s+", engine="python")
print(df)

Prints:

                TIME  col1  col2
0  00:00:00.22-06-23  1010    20
1  00:00:05.22-06-23  1020    25
2  00:00:10.22-06-23  1015    32
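
Once TIME arrives as an intact string, it can be converted to a timestamp. The sketch below assumes the value is laid out as HH:MM:SS, a dot, then DD-MM-YY; that reading is a guess from the sample rows and is not confirmed in the question. `parse_las_time` and `TIME_FORMAT` are hypothetical names used only for illustration.

```python
from datetime import datetime

# Assumption: HH:MM:SS, then a dot, then DD-MM-YY (guessed from the sample rows).
TIME_FORMAT = "%H:%M:%S.%d-%m-%y"


def parse_las_time(value: str) -> datetime:
    """Parse one TIME string under the assumed layout (hypothetical helper)."""
    return datetime.strptime(value, TIME_FORMAT)


parsed = parse_las_time("00:00:05.22-06-23")
print(parsed.isoformat())
```

If the assumption holds, the same format string should also work for the whole column via pd.to_datetime(df["TIME"], format=TIME_FORMAT).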

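EDIT: With the updated file, which appears to contain header sections before the ~A data line, strip everything up to ~A before parsing: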

import re
import pandas as pd
from io import StringIO

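with open('your_file.txt', 'r') as f_in:
    # Drop everything up to and including the last "~A" (re.S lets "." match newlines)
    data = re.sub(r'\A.*~A', '', f_in.read(), count=1, flags=re.S)
    # Parse what is left as a whitespace-separated table
    df = pd.read_csv(StringIO(data), sep=r"\s+", engine="python")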

print(df)

Prints:

                TIME     col1  col2   col3
0  00:00:00.23-04-23  1977.47   160  160.5
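
To see what the substitution leaves behind, here is a small sketch that runs the same pattern on an in-memory string, using only the standard library. The header sections in `sample` are invented for illustration and are not taken from the asker's file; the sketch also assumes the column names follow ~A on the same line.

```python
import re

# Hypothetical LAS-like text: header sections, then the ~A data section.
# The header contents are invented for illustration only.
sample = """~Version
VERS. 2.0 :
~Curve
TIME. :
col1. :
col2. :
~A TIME col1 col2
00:00:00.22-06-23 1010 20
00:00:05.22-06-23 1020 25
"""

# Same substitution as above: with re.S, ".*" also spans newlines, and because
# it is greedy it removes everything up to and including the last "~A".
data = re.sub(r'\A.*~A', '', sample, count=1, flags=re.S)

# Split each remaining line on runs of whitespace, roughly what sep=r"\s+" does.
rows = [line.split() for line in data.strip().splitlines()]
header, records = rows[0], rows[1:]
print(header)
print(records)
```

If the data marker in a file is spelled ~ASCII rather than ~A, this pattern would leave "SCII" in front of the column names; matching `~A\S*` instead would consume the whole marker.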
Andrej Kesely