I have a text file whose content looks like this:

H26      1         2         3         4         5         6         7         8
H26 5678901234567890123456789012345678901234567890123456789012345678901234567890
H26                                                                        

A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 67 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 66 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 87 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.5
A           416.0  2008.51117 80   1  -1112380 86 58 96 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 57 99 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112180 85 58102 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 58103 264340.6 4122580.8 370.2
A           416.0  2008.51120 80   1  -2122179 82 51 67 264331.3 4122588.1 370.0
A           416.0  2008.51120 80   1  -2122279 82 51 69 264331.3 4122588.1 370.0

I would like to split it into columns and keep only these three:

 2008.5 264363.6 4122568.8
 2008.5 264363.6 4122568.8
 2008.5 264363.6 4122568.8

I tried pandas as shown below, but it only outputs one column:

import pandas
df = pandas.read_csv("data.txt", header=4)

Any help? Thank you in advance

ayaz alp
  • Try reading the documentation for the read_csv method: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html – Giuseppe Angora Apr 19 '20 at 12:00
  • Does this answer your question? [How to read file with space separated values in pandas](https://stackoverflow.com/questions/19632075/how-to-read-file-with-space-separated-values-in-pandas) – Björn Apr 19 '20 at 12:13
  • No, it does not. df = pandas.read_csv("data.txt", header=4, delim_whitespace=True) gives this error: pandas.errors.ParserError: Error tokenizing data. C error: Expected 12 fields in line 27, saw 13 – ayaz alp Apr 19 '20 at 12:25
  • Try: `pd.read_csv('data.txt', skiprows=3, delim_whitespace=True, header = None)` – DarrylG Apr 19 '20 at 12:32
  • Is your data different from `data.txt` file in [this implementation](https://repl.it/@DarrylGurganiou/MagnificentSpatialSyntax) which runs okay? – DarrylG Apr 19 '20 at 12:38
  • But, do you see that this implementation runs successfully and prints out the Dataframe? Try it by pressing the green run button at the top middle. – DarrylG Apr 19 '20 at 12:40
  • You are right @DarryIG, my mistake: the data that follows has a row like `A 416.0 2008.51114 80 1 -4195081 88 68168 264363.6 4122568.8 370.6`, which should be 68 68 rather than 68168 – ayaz alp Apr 19 '20 at 12:43
  • The last message was incomplete, i.e. 'the following data has row like'. – DarrylG Apr 19 '20 at 12:45
  • Not sure. Could you update your data in your post. Hard to tell in the comments. – DarrylG Apr 19 '20 at 12:49
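
The tokenizing error discussed in the comments can be reproduced without pandas. This is only a small sketch using two rows copied from the question: splitting on whitespace gives a different number of fields whenever two neighbouring fixed-width fields touch, as in `57101`.

```python
# Two rows copied from the question: in the second one the fields "57" and
# "101" have no space between them.
rows = [
    "A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6",
    "A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3",
]

# Whitespace splitting, which is what delim_whitespace=True relies on.
counts = [len(row.split()) for row in rows]
print(counts)  # the two rows do not yield the same number of fields
```

Because the field count is not stable from row to row, a whitespace-delimited parse is fragile for this file, and a position-based parse is the safer choice.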

1 Answer

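Use `pandas.read_fwf`, which reads fixed-width columns by character position, so fields that run together (such as `57101`) are not a problem: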

import io
import pandas as pd

s = """
H26      1         2         3         4         5         6         7         8
H26 5678901234567890123456789012345678901234567890123456789012345678901234567890
H26                                                                        

A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 67 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 66 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 87 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.5
A           416.0  2008.51117 80   1  -1112380 86 58 96 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 57 99 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112180 85 58102 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 58103 264340.6 4122580.8 370.2
A           416.0  2008.51120 80   1  -2122179 82 51 67 264331.3 4122588.1 370.0
A           416.0  2008.51120 80   1  -2122279 82 51 69 264331.3 4122588.1 370.0"""

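f = io.StringIO(s)
# Half-open, 0-based character ranges of the three wanted columns.
cols = [(19,30),(56,65),(65,75)]
# Skip the leading empty line and the three H26 ruler lines; the data has no header row.
df = pd.read_fwf(f,colspecs=cols,skiprows=[0,1,2,3],header=None)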

df.loc[8:]
             0         1          2
8   2008.51117  264340.6  4122580.8
9   2008.51117  264340.6  4122580.8
10  2008.51117  264340.6  4122580.8
11  2008.51117  264340.6  4122580.8
12  2008.51120  264331.3  4122588.1
13  2008.51120  264331.3  4122588.1
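
To get closer to the output asked for in the question, the same `colspecs` can be combined with column names and rounding. This is a hedged sketch, not part of the original answer: the names `t`, `x` and `y` are placeholders (the question does not say what the columns mean), and it assumes the `2008.5` in the desired output means rounding the first column to one decimal.

```python
import io
import pandas as pd

# Three data rows copied from the question (ruler lines left out).
sample = (
    "A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6\n"
    "A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3\n"
    "A           416.0  2008.51120 80   1  -2122179 82 51 67 264331.3 4122588.1 370.0\n"
)

# Same half-open, 0-based character ranges as in the answer above.
colspecs = [(19, 30), (56, 65), (65, 75)]
names = ["t", "x", "y"]  # placeholder names, not from the question

df = pd.read_fwf(io.StringIO(sample), colspecs=colspecs, header=None, names=names)

# Assumption: the asker wants the first column rounded to one decimal.
df["t"] = df["t"].round(1)

print(df.to_string(index=False))
```

For the real file, `io.StringIO(sample)` would be replaced with the path `"data.txt"` plus a `skiprows` argument that matches the number of header lines in that file.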
wwii