-1

I have a .txt file that has the text output from a SQL Server 19 database query. I am trying to organize the columns of the file into columns in a Pandas DataFrame. The following text is an example of the .txt file (it has 193k lines in this format):

2                    99          SAÚDE PARCERIAS
2                    100         SAÚDE QUÂNTICA
2                    101         TESTE SAÚDE
2                    102         UNIDADE DE SAÚDE DA AMADORA
2                    103         VICENTE SAÚDE
3                    104         DUNHILL
4                    105         DEGÁ
4                    106         FLOR DE MUGA
4                    107         M MEGA RETAILER.CO.UK
4                    108         makè
4                    109         MEGA MAKE EARTH GREAT AGAIN
4                    110         MOGAO
4                    111         O MOEGA
4                    112         SABORES À MESA
5                    113         MAN
5                    114         MAN
6                    115         A CASA DO MONTE
6                    116         ALMA DA PONTE
6                    117         BELLO MONTE
6                    118         CANADA DO MONTE

Any suggestions for this issue? If the question is not clear enough, let me know. Thank you for your time in advance.

João Moço
  • 129
  • 7

1 Answers1

2

To load file with fixed width columns, you can use pd.read_fwf:

df = pd.read_fwf("your_file.txt", header=None)
print(df)

Prints:

    0    1                            2
0   2   99              SAÚDE PARCERIAS
1   2  100               SAÚDE QUÂNTICA
2   2  101                  TESTE SAÚDE
3   2  102  UNIDADE DE SAÚDE DA AMADORA
4   2  103                VICENTE SAÚDE
5   3  104                      DUNHILL
6   4  105                         DEGÁ
7   4  106                 FLOR DE MUGA
8   4  107        M MEGA RETAILER.CO.UK
9   4  108                         makè
10  4  109  MEGA MAKE EARTH GREAT AGAIN
11  4  110                        MOGAO
12  4  111                      O MOEGA
13  4  112               SABORES À MESA
14  5  113                          MAN
15  5  114                          MAN
16  6  115              A CASA DO MONTE
17  6  116                ALMA DA PONTE
18  6  117                  BELLO MONTE
19  6  118              CANADA DO MONTE

Or:

df = pd.read_csv("your_file.txt", sep=r"\s{2,}", engine="python", header=None)
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91