I am completely new to Python but am currently doing some web scraping of some tables and I am trying to combine the 2 dataframes that I get into 1 csv file.
So lets say I have df_1 which looks like:
Rank | Team | Points |
---|---|---|
64 | AR Lit Rock | 35 |
102 | Abl Christian | 75 |
34 | Air Force | 45 |
And df_2 which looks like:
Rank | Team | Points |
---|---|---|
354 | AR Lit Rock | 46 |
284 | Abl Christian | 68 |
82 | Air Force | 42 |
(I have both dataframes being sorted by Team already) I am wanting to combine both of these into 1 csv file to eventually end up looking like:
Rank | Team | Points | Rank | Team | Points |
---|---|---|---|---|---|
64 | AR Lit Rock | 35 | 354 | AR Lit Rock | 46 |
102 | Abl Christian | 75 | 284 | Abl Christian | 68 |
34 | Air Force | 45 | 82 | Air Force | 42 |
or if at all possible to separate the 2 tables with a blank column inbetween:
a | b | c | d | e | f | g | h | |
---|---|---|---|---|---|---|---|---|
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | |
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
I am looking at doing some calculations between the 2 tables within the csv at a later point and just want it to be more readable. Since I am completely new to Python, this already may be a bad idea to have both tables in 1 csv file. If that is also the case, please let me know.
Currently I have both dataframes in their own respective objects and so far have only attempted concat.
df = pd.concat([df_1, df_2], axis=1)
This had kind of done what I want except the fact the I also have both dataframes sorted alphabetically based off of a certain column and then when they concat, the 1st dataframe is sorted and the 2nd gets sorted dependent on another value from the 1st dataframe. So end result looked like this instead:
Rank | Team | Points | Rank | Team | Points |
---|---|---|---|---|---|
64 | AR Lit Rock | 35 | 64 | Oregon | 57 |
102 | Abl Christian | 75 | 102 | Sac State | 81 |
34 | Air Force | 45 | 34 | Ark Pine Bl | 59 |
2nd dataframe getting added by Rank rather than acting independently.
(Sorry, had trouble making sample tables)
I have added a little snippet of what the current csv looks like after putting both dataframes into the csv. Both dataframes are sorted alphabetically by the Team name initially, but when the 2nd dataframe gets added it gets sorted by the Rank from the 1st dataframe.
UPDATE (edited code from Soroosh to better show the issue)
import pandas as pd
data1 = {
'Rank': [64, 102, 34,],
'Team': ["AR Lit Rock", "Abl Christian", "Air Force",],
'Points': [35, 75, 45,],
}
data2 = {
'Rank': [354, 284, 82,],
'Team': ["Air Force", "Abl Christian", "AR Lit Rock",],
'Points': [46, 68, 42,],
}
df_1 = pd.DataFrame(data1)
df_2 = pd.DataFrame(data2)
df_1 = df_1.sort_values('Team')
df_2 = df_2.sort_values('Team')
data_blk = {
"Blank": [""]*len(df_1["Rank"]),
}
df_blk = pd.DataFrame(data_blk)
new_df = pd.concat([df_1, df_blk], axis=1)
new_df = pd.concat([new_df, df_2], axis=1)
print(new_df.head())
I take 2 unsorted tables, sort by the Team column, and then both dataframes to stay sorted by that Team column when concat the dataframes.