3

I have a text table as a string like this and it's quite long. It can also be written to a file,

+--------------+----------+---------+------------+
| Endpoint     | Table    | Request | Is Updated |
+--------------+----------+---------+------------+
| /api/test1   | test1    | True    | True       |
+--------------+----------+---------+------------+
| /api/test2   | test2    | False   | False      |
+--------------+----------+---------+------------+
| /api/test3   | test3    | False   | True       |
+--------------+----------+---------+------------+

I want to convert this to a pandas dataframe. Here is my expected output:

>>> import pandas as pd
>>> df = pd.DataFrame(
    {'Endpoint': ['/api/test1', '/api/test2', '/api/test3'],
     'Table': ['test1', 'test2', 'test3'],
    'Request': [True, False, False],
    'Is Updated': [True, False, True]},
)... ... ... ... ...
>>> df
     Endpoint  Table  Request  Is Updated
0  /api/test1  test1     True        True
1  /api/test2  test2    False       False
2  /api/test3  test3    False        True

Thanks in advance.

accdias
  • 5,160
  • 3
  • 19
  • 31
E. Zeytinci
  • 2,642
  • 1
  • 20
  • 37

1 Answers1

9

IIUC, use re.sub to replace regex and io.StringIO to read:

import re
from io import StringIO


text = """
+--------------+----------+---------+------------+
| Endpoint     | Table    | Request | Is Updated |
+--------------+----------+---------+------------+
| /api/test1   | test1    | True    | True       |
+--------------+----------+---------+------------+
| /api/test2   | test2    | False   | False      |
+--------------+----------+---------+------------+
| /api/test3   | test3    | False   | True       |
+--------------+----------+---------+------------+
"""

df = pd.read_csv(StringIO(re.sub(r'[-+|]', '', text)), sep='\s{2,}', engine='python')
print(df)

Output:

     Endpoint  Table  Request  Is Updated
0  /api/test1  test1     True        True
1  /api/test2  test2    False       False
2  /api/test3  test3    False        True
Chris Adams
  • 18,389
  • 4
  • 22
  • 39