
I'm trying to process several tables (collections) in a database, merge them, and write the result to another table.

The table structure is like this:

[image: table structure]

There are 3 source tables in the database.

I want to merge them into one table on `entryId`.

My idea is to extract the data with pandas and process it there.

But I run into problems when writing the result back to the database.

Below is the code I tried:

```python
# -*- coding: UTF-8 -*-
import pandas as pd
import pymongo


def data_Process():
    client = pymongo.MongoClient(host="mongodb://localhost:27017/")

    collection1 = client.test.declare_customs
    collection2 = client.test.tax
    collection3 = client.test.ship
    collection4 = client.test.jieguan

    data1 = pd.DataFrame(list(collection1.find()))
    data2 = pd.DataFrame(list(collection2.find()))
    data3 = pd.DataFrame(list(collection3.find()))
    data3 = data3.groupby(by=['entryId']).agg(';'.join)

    data4 = pd.merge(data1, data2, on='entryId', how='left')
    data5 = pd.merge(data4, data3, on='entryId', how='left')

    data5.to_excel('data5.xlsx')
    collection4.insert(data5)


if __name__ == '__main__':
    data_Process()
```

Error: [image: screenshot of the error message]

If you have a better idea, please let me know.

Thanks.

hakukou

1 Answer


Refer to the answer below:

Insert a Pandas Dataframe into mongodb using PyMongo

https://stackoverflow.com/a/20167984/3704501

You need to convert the pandas DataFrame into JSON-style records (a list of dictionaries) before inserting it into the MongoDB collection.
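As a rough illustration of that conversion, here is a minimal sketch. The helper names `frame_to_documents` and `write_frame` are made up for this example, and dropping `_id` before inserting is an assumption about what you want (it lets the target collection generate fresh ids). `DataFrame.to_dict("records")` and `Collection.insert_many` are the existing pandas and PyMongo calls it relies on.

```python
import pandas as pd


def frame_to_documents(df: pd.DataFrame) -> list:
    """Turn each row of a DataFrame into a dict that PyMongo can insert."""
    # Assumption: "_id" values carried over from find() are not wanted in the
    # target collection, so drop the column if it is present.
    df = df.drop(columns="_id", errors="ignore")
    # One dict per row, keyed by column name.
    return df.to_dict("records")


def write_frame(collection, df: pd.DataFrame) -> int:
    """Insert all rows of df into collection and return how many were sent."""
    docs = frame_to_documents(df)
    if docs:  # insert_many rejects an empty list, so skip the call in that case
        collection.insert_many(docs)
    return len(docs)
```

With the names from the question, the last line of `data_Process` would then become something like `write_frame(collection4, data5)` instead of `collection4.insert(data5)`.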

Sathish
  • Hi, I found the cause: ` data3 = pd.DataFrame(list(collection3.find())) if True: del data3['_id'] `. I need to remove the `_id` column from the data exported from MongoDB. – hakukou Nov 29 '19 at 05:41
  • @hakukou `if True`? You shouldn’t delete columns using `del` by the way, use `drop()`. – AMC Nov 29 '19 at 05:56
  • @Alexander Cécile Hi, yes, I need to delete `'_id'`. – hakukou Dec 05 '19 at 23:19
  • @hakukou Where is the `if True` coming from? Use [`.drop()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html) to remove the column. – AMC Dec 05 '19 at 23:22
  • @Alexander Cécile The `if True` is not required. Yes, I used `.drop(columns='_id')`. Thank you very much. – hakukou Dec 05 '19 at 23:34
  • @hakukou You’re welcome :) Everything is clear now? – AMC Dec 05 '19 at 23:41
  • @Alexander Cécile Yes, I understand :) – hakukou Dec 05 '19 at 23:49
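
To summarize the fix discussed in these comments, here is a minimal sketch of dropping `_id` with `.drop()` before aggregating and merging. The function name `merge_by_entry_id` is hypothetical, and casting values to `str` before joining is an assumption added so that non-string columns do not break `';'.join`.

```python
import pandas as pd


def merge_by_entry_id(declare: pd.DataFrame, tax: pd.DataFrame,
                      ship: pd.DataFrame) -> pd.DataFrame:
    """Left-merge three frames on entryId after removing MongoDB's _id column."""
    # Drop the ObjectId column from every frame; errors="ignore" keeps this
    # safe when a frame has no "_id" column.
    declare, tax, ship = (
        f.drop(columns="_id", errors="ignore") for f in (declare, tax, ship)
    )
    # Collapse several ship rows per entryId into one ';'-separated string
    # per column (values are cast to str first, an assumption of this sketch).
    ship = (
        ship.groupby("entryId")
        .agg(lambda s: ";".join(s.astype(str)))
        .reset_index()
    )
    merged = declare.merge(tax, on="entryId", how="left")
    return merged.merge(ship, on="entryId", how="left")
```

In the question's code this would correspond to `data5 = merge_by_entry_id(data1, data2, data3)`.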