Parse list and create DataFrame

Question

I have been given a list called data with the following content:

data=[b'Name,Age,Occupation,Salary\r\nRam,37,Plumber,1769\r\nMohan,49,Elecrician,3974\r\nRahim,39,Teacher,4559\r\n']

I want to build a pandas DataFrame from it that looks like the one in the link: Expected Dataframe

How can I achieve this?

score 1 · Answer 1 · answered Jun 07 '20 at 11:47

You can try this:

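import pandas as pd

data = [b'Name,Age,Occupation,Salary\r\nRam,37,Plumber,1769\r\nMohan,49,Elecrician,3974\r\nRahim,39,Teacher,4559\r\n']

# Decode the bytes, drop the carriage returns, then split into lines and fields
processed_data = [x.split(',') for x in data[0].decode().replace('\r', '').strip().split('\n')]
# The first row holds the column names; the remaining rows are the records
df = pd.DataFrame(columns=processed_data[0], data=processed_data[1:])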
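One thing to be aware of with this approach: every field comes out of `split(',')` as a string, so Age and Salary are text columns rather than numbers. A minimal sketch of one way to convert them, assuming those two columns are the only numeric ones (the `numeric_cols` name is just for illustration):

```python
import pandas as pd

data = [b'Name,Age,Occupation,Salary\r\nRam,37,Plumber,1769\r\nMohan,49,Elecrician,3974\r\nRahim,39,Teacher,4559\r\n']

# splitlines() handles the \r\n line endings and leaves no trailing empty row
rows = [line.split(',') for line in data[0].decode().splitlines()]
df = pd.DataFrame(rows[1:], columns=rows[0])

# All columns hold strings at this point; convert the ones assumed to be numeric
numeric_cols = ['Age', 'Salary']
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric)
```

After the conversion, arithmetic such as `df['Salary'].mean()` works as expected.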

Hope it helps.


Abhinav Goyal


Thanks, this worked perfectly. Will it have any performance issues on large data? This example was a simple one; the actual use case may have millions of records. – jyotiska Jun 08 '20 at 04:03
I think it should be fine (it may take a few seconds at most). List comprehensions in Python are very fast. But let me know if it slows down too much. – Abhinav Goyal Jun 08 '20 at 07:21
Please consider accepting the answer if it solved your problem. :) – Abhinav Goyal Jul 31 '21 at 14:18
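On the performance question raised in the comments: for millions of records, it may be worth handing the raw bytes to `pd.read_csv` rather than splitting in Python, since its default parser is implemented in C and also infers column types. This is only a sketch, assuming the whole payload fits in memory; timings on real data would need to be measured.

```python
import io
import pandas as pd

data = [b'Name,Age,Occupation,Salary\r\nRam,37,Plumber,1769\r\nMohan,49,Elecrician,3974\r\nRahim,39,Teacher,4559\r\n']

# Join the byte chunks (only one here) and let pandas parse them directly;
# read_csv accepts a binary file-like object and handles \r\n line endings
df = pd.read_csv(io.BytesIO(b''.join(data)))
```

Unlike the list-comprehension version, Age and Salary come back as integer columns without a separate conversion step.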

score 0 · Answer 2 · answered Jun 07 '20 at 11:46

I would recommend converting this list to a string, since it contains only one element. The element is a bytes object, so join as bytes and then decode:

str1 = b''.join(data).decode()

Then use the solution provided here:

import sys
if sys.version_info[0] < 3: 
    from StringIO import StringIO
else:
    from io import StringIO

import pandas as pd

TESTDATA = StringIO(str1)
df = pd.read_csv(TESTDATA, sep=",")
