How to merge multiple DataFrames inside one file in Python

Question

In my code I have received a result like this one:

A B C
1 1 1
A B C
2 2 2
A B C
3 3 3

I need to merge those dataframes into one big dataframe with a single header.

Merging dataframes that come from different files is easy with pd.merge(df1, df2), but how do I do it when the dataframes are in one file? Thanks in advance!

EDIT: To get my data, I convert each line of my dataset into a dataframe, so the output contains a separate dataframe for every line. My code:

from io import StringIO

import pandas as pd

def coordinates():
    with open('file.txt') as file:
        for lines in file:
            # I need only these fields in each line
            lines = StringIO(lines[35:61])
            abc = pd.read_csv(lines, sep=' ', header=None)
            abc.columns = ['A', 'B', 'C', 'D', 'E', 'F']
            print(abc)  # prints a separate one-row dataframe for every line

coordinates()

EDIT2: The suggestion from s_vishnu only works for a prepared file in which the same header is repeated several times. In my case, many DataFrames are written to the output, each with only one row, and the row after each header has index 0.

EDIT3: My file.txt contains a large number of lines, each about 80 characters long, like this:

AAA S S SSDAS ASDJAI A 234 33 43 234 2342999 2.31 22 33 SSS S D W2UUQ Q231WQ A 222 11 23 123 1231299 2.31 22 11

From each line I need only part of the information, which is why I used lines = StringIO(lines[35:61]) to take it. In this example I would need characters [30:55], and I want to create a dataframe from them with columns=['A', 'B', 'C', 'D', 'E', 'F'] and sep=' '.

Maybe the answer to this one will help you: https://stackoverflow.com/questions/44715393/how-to-concatenate-multiple-pandas-dataframes-without-running-into-memoryerror – Tbaki Jun 23 '17 at 11:12

void · Answer 1 · 2017-06-23T10:52:31.550

0

my_test.csv:

A, B, C
1, 1 ,1
A, B, C
2, 2, 2
A, B, C
3, 3, 3

Use row slicing with a step.

import pandas as pd
df = pd.read_csv("my_test.csv")
df=df[::2]
print(df)

output:

   A    B   C
0  1   1    1
2  2    2   2
4  3    3   3

df=df[::2] uses extended slicing with a step. In df[::2], the 2 means start from row 0 and move in steps of 2, so only every second row is kept and the repeated header rows are skipped.
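As a minimal sketch of what the step in df[::2] does, the snippet below builds a small frame in memory instead of reading my_test.csv. The inline data and the names raw and every_second are assumptions made for illustration. It shows that a step of 2 keeps the rows at positions 0, 2 and 4 and drops the repeated header rows in between.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for my_test.csv (without stray spaces, for simplicity).
raw = "A,B,C\n1,1,1\nA,B,C\n2,2,2\nA,B,C\n3,3,3\n"

df = pd.read_csv(io.StringIO(raw))

# The repeated header lines are read as ordinary rows at positions 1 and 3.
# Slicing with a step of 2 starts at row 0 and keeps rows 0, 2 and 4.
every_second = df[::2]

print(list(every_second.index))
print(every_second["A"].tolist())
```

Because the repeated headers were read as data, the remaining values are strings rather than numbers until they are converted.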

Note that the index values are also in steps of 2, i.e. 0, 2, 4, ... To renumber the index, do this:

import pandas as pd
df = pd.read_csv("my_test.csv")
df=df[::2]

df.index = range(len(df['A']))
print(df)

output:

   A    B   C
0  1   1    1
1  2    2   2
2  3    3   3

So you get the values you desire.
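An alternative to assigning a range to df.index by hand is reset_index(drop=True), which renumbers the rows from 0. The sketch below again uses in-memory sample data (an assumption for illustration, not the asker's file) and also converts the values to integers, since columns that contained header text are read as strings.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for my_test.csv.
raw = "A,B,C\n1,1,1\nA,B,C\n2,2,2\nA,B,C\n3,3,3\n"
df = pd.read_csv(io.StringIO(raw))

# Keep every second row, then renumber the index as 0, 1, 2.
clean = df[::2].reset_index(drop=True)

# The kept rows hold numeric strings; convert them to integers.
clean = clean.astype(int)

print(clean)
```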

edited Jun 23 '17 at 10:52

answered Jun 23 '17 at 10:45

Hello. My output is still the same; it doesn't work in my case with my code. I am still receiving ` A B C 0 1 1 1 A B C 0 2 2 2` and so on. – Pawe Jun 23 '17 at 12:01
`In my code i have received result like this one` is what you specified, right? How did you get this? Apply my code once you have the output you mentioned and it will work. – void Jun 23 '17 at 12:09
Please take a look at the code in my question. It's not one frame with multiple headers; it's multiple dataframes, each with the same header. Maybe that's why it doesn't work on my dataset? – Pawe Jun 24 '17 at 12:33
Okay, I will take a look. Can you post what's in your `file.txt`? – void Jun 24 '17 at 13:48
I have edited my question. I think I can improve my code at the beginning, where I take characters from the lines. If someone could help me create only one dataframe, I would not need to merge the dataframes I have right now. Thanks – Pawe Jun 26 '17 at 06:37

score 0 · Answer 2 · answered Jun 26 '17 at 14:39

I have found the solution: I changed the code at the beginning, and that was helpful:

import pandas as pd

def coordinates():
    with open('file.txt') as abc:
        lines = abc.readlines()
    for line in lines:
        # I just cut the lines from the beginning and from the end,
        # so I don't need to take data from the middle
        abc2 = line[20:-7]
        abc3 = abc2.split()
        pd.DataFrame(abc3)  # note: this DataFrame is created but not stored
        print(abc3)

coordinates()
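The loop above still handles one line at a time, and the result of pd.DataFrame(abc3) is not kept. One way to end up with a single frame, sketched here under assumptions, is to collect the split fields of every line in a list and build the DataFrame once after the loop. The helper name lines_to_frame, the sample lines, the slice bounds and the column names are all hypothetical; the real bounds depend on the layout of file.txt.

```python
import pandas as pd

def lines_to_frame(lines, start, stop, columns):
    """Slice each line, split it on whitespace, and build one DataFrame."""
    rows = []
    for line in lines:
        fields = line[start:stop].split()
        # Skip lines whose field count does not match the expected columns.
        if len(fields) == len(columns):
            rows.append(fields)
    # A single DataFrame with one header and one row per input line.
    return pd.DataFrame(rows, columns=columns)

# Hypothetical sample lines standing in for the contents of file.txt.
sample = [
    "ID001 10 20 30 end\n",
    "ID002 11 21 31 end\n",
    "ID003 12 22 32 end\n",
]

df = lines_to_frame(sample, 6, 14, ["A", "B", "C"])
print(df)
```

With a real file, the open file object could be passed in place of sample, since iterating over it yields one line at a time. If one-row dataframes already exist, passing them as a list to pd.concat with ignore_index=True is another way to combine them.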
