-2

I am trying to join 4 dataframes for to make it more readable. I did try pd.concat([df1, df4], ignore_index=True, sort=False) but cannot produce the needed output.

# code being used
import pandas as pd 

data = []
with open('datafile.txt', 'r') as file:
    for line in file:
        cols = line.strip().split()
        data.append(cols)

df = pd.DataFrame(data, columns=['Object','Change'])
df['Change'] = df['Change'].astype(float)

df_sorted = df.sort_values(by='Change', ascending=False)
top_positive = df_sorted[df_sorted['Change'] > 0].head(5)
middle_values = df_sorted[(df_sorted['Change'] >= -2) & (df_sorted['Change'] <= 2)] 

middle_positive = middle_values[middle_values['Change'] > 0].head(5)
middle_negative = middle_values[middle_values['Change'] < 0].tail(5)

bottom_values = df_sorted.tail(5)
df1 = top_positive(index=False)               # Upper 5
df2 = middle_negative.iloc[::-1](index=False) # Middle 1
df3 = middle_positive(index=False)            # Middle 2
df4 = bottom_values.iloc[::-1](index=False)   #Lower 5

Current Output: # Truncated

| Object   |   Change |  #- Upper 5
|:---------|---------:|
| 123ABDCA |   11.658 |
| ABDC123  |   11.468 |
| LLLSKS13 |    3.398 |

| Object   |   Change |  #- Middle(-0.x to -2)
|:---------|---------:|
| AS716201 |   -1.99  |
| 26SSASS  |   -1.989 |
| 1ASDS2A  |   -1.92  |

| Object    |   Change |  #- Middle(0.x to 2)
|:----------|---------:|
| AS2QASD   |    1.926 |
| AASSSD001 |    1.687 |
| 76ASDSD   |    1.295 |

| Object       |   Change |  #- Lower 5
|:-------------|---------:|
| 3ISKJSDA     |  -19.029 |
| 3474ADS      |  -14.391 |
| 5665SAA2     |   -7.098 |

Need Output: # Intended Output

|       Upper 5       |  Middle(-0.x to -2) |    Middle(0.x to 2)  |          Lower 5        |
|:---------|---------:|:---------|---------:|:----------|---------:|:-------------|---------:|
| 123ABDCA |   11.658 | AS716201 |   -1.99  | AS2QASD   |    1.926 | 3ISKJSDA     |  -19.029 |
| ABDC123  |   11.468 | 26SSASS  |   -1.989 | AASSSD001 |    1.687 | 3474ADS      |  -14.391 |
| 8BADC81  |    5.041 | 9KSKSUS  |   -1.939 | A009SKSS  |    1.679 | 663KJDJS     |   -9.2   |
| 828282AS |    3.963 | A2DFASA  |   -1.922 | JFHHDW1   |    1.643 | 63ISMDSA     |   -9.153 |
| LLLSKS13 |    3.398 | 1ASDS2A  |   -1.92  | 76ASDSD   |    1.295 | 5665SAA2     |   -7.098 |
DarCoy
  • 49
  • 4
  • What's wrong with `pd.concat` exactly? At a glance, that looks like the right solution, though you would need to use `[df1, df2, df3, df4]`. Please make a [mre] including input and current output. For specifics see [How to make good reproducible pandas examples](/q/20109391/4518341). BTW, welcome to Stack Overflow! Check out the [tour], and [ask] if you want more tips. – wjandrea Aug 25 '23 at 00:35
  • `result_df = pd.concat([df1, df2, df4, df4], axis=1)` – taller_ExcelHome Aug 25 '23 at 00:40
  • 1
    All your `df*` variables are strings (since that's what `to_markdown` produces). You might want to adjust your naming conventions so that others reading your code don't mistakenly treat your `df*` variables as if they were DataFrames. – Henry Ecker Aug 25 '23 at 00:42
  • Do you realize that the "Merge Output" you're showing here isn't correct Markdown? Only four of the columns are shown and the other four are hidden. (At least, in CommonMark, which SO uses.) Are you only going to be using it as plain text ultimately? Another consequence of that is that Pandas can't generate it for you, so you'd need to munge the result of `.to_markdown` yourself. – wjandrea Aug 25 '23 at 00:47

2 Answers2

0

Looks like you want index based merges.

df = pd.merge(df1, df2, left_index=True, right_index=True)
df = pd.merge(df, df3, left_index=True, right_index=True)
df = pd.merge(df, df4, left_index=True, right_index=True)

The final df should give you the desired output minus the header information.

Danyal Imran
  • 2,515
  • 13
  • 21
  • getting error `raise TypeError( TypeError: Can only merge Series or DataFrame objects, a was passed` – DarCoy Aug 25 '23 at 00:30
  • @DarCoy In your code, what you're calling "df"s are actually strings. Remove the `.to_markdown()` calls to get the actual dataframes, and use some other name for the markdown versions. – wjandrea Aug 25 '23 at 00:42
  • OP hasn't shown the indexes, so what makes you say that? Maybe you're thinking of `pd.concat(..., ignore_index=True)`? – wjandrea Aug 25 '23 at 00:53
0

To stack the dataframes :

df = pd.concat([df1, df2, df3, df4], axis=1)

If rows order is not proper after merging :

df = df.sort_values(by="Change", ascending=False)
wjandrea
  • 28,235
  • 9
  • 60
  • 81