1

I have the problem of merging two dataframes in an alternated way based on a key.

I can in fact merge two dataframes but the large amount of data the two have just mix up and become chaos to read.

The first dataframe is "sales" and the second is "products_sales" that i recover from a database.

I need them to merge on the sale code more or less in this order so i can export to a CSV file.

#i get the path and the cnxn from other places.
def sales_format(cnxn,path):
    csv_builder = []
    sales= open(path+'sales_no_products.csv')

    header_sales= vendas.readlines()[0]

    products= open(path+'products_sale.csv')
    header_products= produtos.readlines()[0]


    products= open(path+'products_sale.csv')
    sales= open(path+'sales_no_products.csv')
    line_products= products.readlines()[1:]
    line_sales= sales.readlines()[1:]



    product= pd.read_csv(path+'products_sale.csv',sep=';')
    sale= pd.read_csv(path+'sales_no_products.csv',sep=';')
    merging = pd.merge_ordered(sale, product, on='pedido', how='inner',left_by='pedido')

    print (merging)
    merging.to_csv(path+'test.csv', sep=';')

    #here is some commented code that in theory can do what i need, the problem here is that the large amount of columns makes it take 2 - 3 hours to complete.

    # for i in line_sales:
    #   csv_builder.append(header_sales.split(';'))
    #   id_sales= i.split(';')[1]
    #   csv_builder.append(i.split(';'))
    #   csv_builder.append(header_products.split(';'))

    #   for j in line_products:
    #       id_products_sale= j.split(';')[1]
    #       if(id_products_sale== id_sales):
    #           csv_builder.append(j.split(';'))


    #   csv_builder.append('---;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;-------------------------------------;')

    # file = open(path+"Sales.csv",'w')
    # writer = csv.writer(file,delimiter=';')
    # writer.writerows(csv_builder)

I need the results to be structured like:

Sales header

  • Sale data

    Product_sale header

  • product data

  • product data
  • product data

...

Sales header

  • Sale data

    Product_sale header

  • product data

  • product data
  • product data

...

As requested here is a print of the expected output. csv expected output

Exemple of the data in the image.

HeaderS1  HeaderS2  HeaderS3  HeaderS4  HeaderS5  HeaderS6  HeaderS7  HeaderS8  HeaderS9
'DataS1'  0  'DataS3'  12  'DataS5'  1111  23  'DataS8'  'DataS9'
HeaderP1  HeaderP2  HeaderP3  HeaderP4
'DataP1'  23  67  'DataP4'
'DataP1'  22  1212  'DataP4'
'DataP1'  11  39  'DataP4'
'DataP1'  1  342  'DataP4'

---------  ---------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Start PDV
  • 21
  • 4
  • 3
    Can you create a small mock set of data along with expected output from this small set of mock data? – Scott Boston Apr 03 '19 at 13:42
  • i will try, gimme a min. – Start PDV Apr 03 '19 at 13:46
  • i added a image of the expected output on libreOffice – Start PDV Apr 03 '19 at 14:00
  • 1
    you should never post an image of data/code. refer this , it will help you create [good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – anky Apr 03 '19 at 14:04
  • the thing is i have to generate a CSV from data i get from a database, i cant by any means post the data i get here so i need to generate data from head to post here and show what i need to get as an output. – Start PDV Apr 03 '19 at 14:12

0 Answers0