I have a log file like this:

20201006T153833.159+0200 0243 request.log,
20201006T153833.159+0200 0244 request.ESelection,
20201006T153833.159+0200 0245 request.DateOn,   Assure.dateDeNaissance,

20201006T153833.159+0200 0289 Gestion, request.ESelection, request.ESelectionGestion, SelectionReference

and the desired output is:


Date                       Number                 Request1               Request2               Request3               

20201006T153833.159+0200    0243      request.log

20201006T153833.159+0200    0244      request.ESelection
20201006T153833.159+0200    0245      request.DateOn           Assure.dateDeNaissance,

20201006T153833.159+0200    0289      Gestion,                 request.ESelection            SelectionReference

I tried this:

df = pd.read_table("request.log", sep=r'^([\d+]+? [\d:]+?) ',header= None )

Can anyone help?

Shaido
stone
  • You can do something similar to [here](https://stackoverflow.com/a/55129746/5302442): `pd.read_table('test.log', sep='\n', header=None)[0].str.split(' ', expand=True)` – matq007 Oct 07 '20 at 08:34

1 Answer

The best I can do is read the file line by line, build a dictionary for each line, and finally form the DataFrame from the list of dictionaries:

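import pandas as pd

# Read all lines up front; the context manager closes the file afterwards
with open('log_data.txt', 'r') as logs:
    lines = logs.readlines()

rows = []
for line in lines:
    line_split = line.split()
    if len(line_split) < 2:
        continue  # skip blank lines, which have no date/number pair
    curr_dict = {'Date': line_split[0], 'Number': line_split[1]}
    # Every remaining token becomes Request1, Request2, ...
    for i in range(2, len(line_split)):
        curr_dict['Request' + str(i-1)] = line_split[i]
    rows.append(curr_dict)
# Rows with fewer requests get NaN in the missing Request columns
data = pd.DataFrame(rows)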
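
A variation on the loop above, in case the blank lines and trailing commas in the sample log get in the way. This is only a sketch: the helper name `parse_log_lines` is made up for illustration, and it assumes commas never appear inside a field, so they can be treated as plain separators. It uses only the standard library and returns a list of dictionaries.

```python
def parse_log_lines(lines):
    """Turn raw log lines into a list of row dicts, one per non-blank line."""
    rows = []
    for line in lines:
        # Assumption: commas only separate fields, so treat them like whitespace.
        tokens = line.replace(",", " ").split()
        if len(tokens) < 2:
            continue  # blank or malformed line: no date/number pair
        row = {"Date": tokens[0], "Number": tokens[1]}
        # Remaining tokens become Request1, Request2, ...
        for i, token in enumerate(tokens[2:], start=1):
            row[f"Request{i}"] = token
        rows.append(row)
    return rows


# Sample lines copied from the question, including a blank line
sample = [
    "20201006T153833.159+0200 0243 request.log,\n",
    "\n",
    "20201006T153833.159+0200 0245 request.DateOn,   Assure.dateDeNaissance,\n",
]
rows = parse_log_lines(sample)
```

Passing the result to `pd.DataFrame(rows)` should then give the same column layout as before, with missing requests filled in as NaN.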
Rajesh Bhat