I have the problem of merging two dataframes in an alternated way based on a key.
I can in fact merge two dataframes but the large amount of data the two have just mix up and become chaos to read.
The first dataframe is "sales" and the second is "products_sales" that i recover from a database.
I need them to merge on the sale code more or less in this order so i can export to a CSV file.
#i get the path and the cnxn from other places.
def sales_format(cnxn,path):
csv_builder = []
sales= open(path+'sales_no_products.csv')
header_sales= vendas.readlines()[0]
products= open(path+'products_sale.csv')
header_products= produtos.readlines()[0]
products= open(path+'products_sale.csv')
sales= open(path+'sales_no_products.csv')
line_products= products.readlines()[1:]
line_sales= sales.readlines()[1:]
product= pd.read_csv(path+'products_sale.csv',sep=';')
sale= pd.read_csv(path+'sales_no_products.csv',sep=';')
merging = pd.merge_ordered(sale, product, on='pedido', how='inner',left_by='pedido')
print (merging)
merging.to_csv(path+'test.csv', sep=';')
#here is some commented code that in theory can do what i need, the problem here is that the large amount of columns makes it take 2 - 3 hours to complete.
# for i in line_sales:
# csv_builder.append(header_sales.split(';'))
# id_sales= i.split(';')[1]
# csv_builder.append(i.split(';'))
# csv_builder.append(header_products.split(';'))
# for j in line_products:
# id_products_sale= j.split(';')[1]
# if(id_products_sale== id_sales):
# csv_builder.append(j.split(';'))
# csv_builder.append('---;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;')
# file = open(path+"Sales.csv",'w')
# writer = csv.writer(file,delimiter=';')
# writer.writerows(csv_builder)
I need the results to be structured like:
Sales header
Sale data
Product_sale header
product data
- product data
- product data
...
Sales header
Sale data
Product_sale header
product data
- product data
- product data
...
As requested here is a print of the expected output.
csv expected output
Exemple of the data in the image.
HeaderS1 HeaderS2 HeaderS3 HeaderS4 HeaderS5 HeaderS6 HeaderS7 HeaderS8 HeaderS9
'DataS1' 0 'DataS3' 12 'DataS5' 1111 23 'DataS8' 'DataS9'
HeaderP1 HeaderP2 HeaderP3 HeaderP4
'DataP1' 23 67 'DataP4'
'DataP1' 22 1212 'DataP4'
'DataP1' 11 39 'DataP4'
'DataP1' 1 342 'DataP4'
--------- --------- --------- --------- --------- --------- --------- --------- ---------