
I have a dataframe that looks like this:

import pandas as pd
k = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['a', 'b', 'c', 'd']})

I want to insert it into MongoDB as a document that looks like this:

dic = {1:'a', 2:'b',3:'c',4:'d'}

How could I do it?

I have checked things like this but they do not seem to work on my df:

convert pandas dataframe to json object - pandas

Thanks in advance!

Borja_042
  • I strongly suggest you "don't" insert like that into MongoDB at all. Whilst you can store "flexible" data structures, "garbage in" is "garbage out". Anything which is a "data point" should not be used as the name of a "key" within a database. If you want meaningful queries on this after import, you really should rethink how you want to store it. – Neil Lunn Jun 11 '18 at 09:31
  • So you are suggesting not specifying an index and letting Mongo create one? That is good advice and I thought about it too, but I need the structure to be: json = {key:{key:value, key:value}}. Maybe I can do it in a better way, but how? This form may work too, right? dic = {1:'a', 2:'b',3:'c',4:'d'} – Borja_042 Jun 11 '18 at 10:12
  • What I am saying is that `{ key1: value1, key2: value2 }` IS the problem which you need to avoid. You cannot "query a database" in an efficient way to match `key1` or `key2`. `[{ k: "key1", v: "value1" },{ k: "key2", v: "value2" }]` on the other hand works just fine, since `k` and `v` are static in each element. – Neil Lunn Jun 11 '18 at 10:17
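The shape suggested in the last comment can be sketched with the question's DataFrame. This is only an illustration, assuming the field names `k` and `v` from the comment; the `collection` object in the final comment is hypothetical and is not created here.

```python
import pandas as pd

k = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['a', 'b', 'c', 'd']})

# One document per row, with fixed field names "k" and "v".
# int() makes sure the keys are plain Python ints rather than NumPy integers.
docs = [{'k': int(a), 'v': b} for a, b in zip(k['A'], k['B'])]

print(docs)

# With a pymongo Collection (hypothetical name `collection`), the list could
# then be stored with: collection.insert_many(docs)
```

Because every document has the same `k` and `v` fields, a query such as "find the document whose `k` is 2" can use an ordinary index on `k`.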

2 Answers


Set column A as the index and use DataFrame.to_json; if you need a different name for the outer key, add rename:

print(k.set_index('A').rename(columns={'B':'index1'}).to_json())
{"index1":{"1":"a","2":"b","3":"c","4":"d"}}

If you need to export to a file:

k.set_index('A').rename(columns={'B':'index1'}).to_json('file.json')
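If the goal is the flat mapping from the question rather than a nested one, a minimal sketch is to select column B as a Series before calling to_json and parse the result back into a Python dict. Note that JSON object keys (and MongoDB document keys) are always strings, so the keys come out as "1", "2", and so on. The `collection` name in the final comment is hypothetical.

```python
import json
import pandas as pd

k = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['a', 'b', 'c', 'd']})

# Series.to_json returns a JSON string keyed by the index (column A here).
json_text = k.set_index('A')['B'].to_json()

# json.loads turns the string into a plain dict with string keys.
doc = json.loads(json_text)

print(json_text)
print(doc)

# With a pymongo Collection (hypothetical name `collection`), the dict could
# then be stored with: collection.insert_one(doc)
```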
jezrael

Although this does not answer the question as asked, I am posting the solution to a small problem I was facing, which is what led me here from a search.

Problem: how to create a dictionary from a pandas DataFrame with a column as the key and a constant value (1 in my case) as the, you guessed it, value.

Solution:

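# df is an existing DataFrame; 'col_name' is the column whose values become the keys
f = pd.Series(data=[1] * df.shape[0], index=df['col_name'])
x = f.to_json(orient='index')  # 'index' is the documented default orient for a Series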

Output:

{"one":1,"two":1,"three":1}

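Why would I do that? Because lookup in a dictionary is highly optimized (yes, I could use a set as well). Note that to_json returns a JSON string, so for an actual Python dictionary use f.to_dict() or parse the string with json.loads.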
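For membership lookups inside Python itself, a sketch along the following lines may be closer to what is wanted. The sample DataFrame and the column name `col_name` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the real DataFrame.
df = pd.DataFrame({'col_name': ['one', 'two', 'three']})

# Series with the column as index and a constant value of 1.
f = pd.Series(1, index=df['col_name'])

lookup = f.to_dict()                          # a real Python dict
same_lookup = dict.fromkeys(df['col_name'], 1)  # same mapping without a Series
seen = set(df['col_name'])                    # a set, if only membership matters
as_json = f.to_json()                         # a str, not a dict

print('two' in lookup, 'two' in seen, type(as_json).__name__)
```

A dict or set gives constant-time membership checks on average; the JSON string is only useful for serialization.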

P.S. Novice in Python so please be gentle with me :).

adjr2