How can I make a Dataframe from a dict that maps (column, row) index pairs to values?

Question

Using pandas, given this dictionary which has tuples as keys:

dictionary = {("a", "c"): 1, ("a", "d"): 3, ("b", "c"): 2, ("b", "d"): 4}

How can I get a dataframe like so?

	a	b
c	1	2
d	3	4

I thought about using df.at[] to assign the correct values to each row/column location - e.g. df.at[a,c] = 1. However, I'm not clear on how to use the tuple with .at[].

What dataframe library are you using, Pandas? Please add the tag for it. BTW, welcome to Stack Overflow! Check out the [tour], and [ask] if you want tips. — wjandrea, Sep 02 '23 at 14:27
What should go into cells that *aren't* specified by the dictionary? What column dtypes should be used? — Karl Knechtel, Sep 02 '23 at 14:29
@wjandrea as far as I'm aware, Pandas is the only Python library that uses the name "dataframe" for a type that it creates. — Karl Knechtel, Sep 02 '23 at 14:30
Hi wjandrea! I'm using Pandas. Thanks for your help, I'll give df.at[key] = value a try. Sorry if it was a basic question, I'm a beginner at coding — David, Sep 02 '23 at 14:33
@Karl Polars also has a [`DataFrame`](https://pola-rs.github.io/polars/py-polars/html/reference/dataframe/index.html). Although, if people don't mention it, I assume they're using Pandas, but I like to ask to make sure. — wjandrea, Sep 02 '23 at 15:02

score 3 · Answer 1 · answered Sep 02 '23 at 15:14

I would build a MultiIndex Series with the constructor, then unstack its outer level:

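import pandas as pd

dictionary = {("a", "c"): 1, ("a", "d"): 3, ("b", "c"): 2, ("b", "d"): 4}

# The tuple keys become a two-level MultiIndex; unstack(0) moves the first level to the columns.
df = pd.Series(dictionary).unstack(0)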

Output:

print(df)

   a  b
c  1  2
d  3  4
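
One of the comments under the question asks what should go into cells the dictionary does not specify. As a rough sketch of how this approach behaves in that case, assuming a made-up input (`partial`, which is not from the question) that leaves out the ("b", "d") pair:

```python
import pandas as pd

# Hypothetical input: the ("b", "d") pair is deliberately left out.
partial = {("a", "c"): 1, ("a", "d"): 3, ("b", "c"): 2}

# Cells with no matching key come back as NaN.
with_nan = pd.Series(partial).unstack(0)
print(with_nan)

# fill_value puts a placeholder of your choice into those cells instead.
filled = pd.Series(partial).unstack(0, fill_value=0)
print(filled)
```

Whether NaN or a placeholder such as 0 is the right choice depends on what a missing pair means in your data.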

Timeless

Should add that if the dtypes were different per column, you'd want a different approach. But this works well where they're all ints. – wjandrea Sep 02 '23 at 15:27

score 1 · Answer 2 · answered Sep 02 '23 at 14:46

You could loop through your dictionary and build a new dictionary of dictionaries, where the outer dictionary's keys are the column names and the inner dictionaries' keys are the row labels. To save a few lines of code, I'm going to use a defaultdict(dict) as the outer dictionary:

from collections import defaultdict
import pandas as pd

dictionary = {('a','c'): 1, ('a','d'): 3,
              ('b','c'): 2, ('b','d'): 4}


dd = defaultdict(dict)

for (col_name, row_name), value in dictionary.items():
    dd[col_name][row_name] = value

This results in the following dd:

defaultdict(<class 'dict'>, {'a': {'c': 1, 'd': 3}, 'b': {'c': 2, 'd': 4}})

Finally, use this to create your dataframe:

df = pd.DataFrame.from_dict(dd)

Which gives the desired dataframe:

   a  b
c  1  2
d  3  4
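
A comment under the other answer points out that per-column dtypes can matter. As a small sketch of how the nested-dictionary route handles that, assuming a made-up input (`mixed`, not from the question) in which column "a" holds integers and column "b" holds strings:

```python
from collections import defaultdict
import pandas as pd

# Hypothetical input: ints under "a", strings under "b".
mixed = {("a", "c"): 1, ("a", "d"): 3,
         ("b", "c"): "x", ("b", "d"): "y"}

nested = defaultdict(dict)
for (col_name, row_name), value in mixed.items():
    nested[col_name][row_name] = value

# Each inner dict becomes its own column, so dtypes are inferred per column.
df_mixed = pd.DataFrame.from_dict(nested)
print(df_mixed.dtypes)
```

Here column "a" should come out with an integer dtype while "b" holds the strings, since each column is built from its own inner dictionary.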

Nice! This is a better approach than what OP had in mind; cf. this other SO question: [Creating an empty Pandas DataFrame, and then filling it](/q/13784192/4518341) — wjandrea, Sep 02 '23 at 15:38
