How can I create a 2-way table in python? I have 2 categorical variables in a data set and would like to look at the relationship between the 2 variables by creating a 2-way table. Thank you.
-
Possible Duplicate http://stackoverflow.com/questions/6129693/python-creating-a-table – Cody Piersall Jun 19 '13 at 21:19
-
Do you mean a bidirectional map? – brice Jun 19 '13 at 21:21
-
Can you give us a sample data set? – rickcnagy Jun 19 '13 at 21:21
-
will creating two dictionaries, each one the opposite of the other, i.e. the key in one is the value in the other solve your problem? – Fredrik Pihl Jun 19 '13 at 21:32
-
I thought it was obvious what you wanted, but I was obviously wrong. Can you please elaborate on what you mean by "two-way table"? Do you mean a table where you look up a value by row and column, or a bidirectional mapping as brice suggested? – Henry Keiter Jun 19 '13 at 21:34
-
I'm voting to close as no-one can work out what the question means, and OP has refused to actually update the question, despite being repeatedly asked. – Marcin Jun 19 '13 at 23:39
-
It's clear what's being asked here... This should not have been closed. A two-way table is another name for a contingency table: https://en.wikipedia.org/wiki/Contingency_table – Greg Mar 18 '17 at 22:07
-
As of pandas 0.17.0, you can use `pd.crosstab(df["column_1"], df["column_2"])`. – Aabesh Karmacharya Jan 08 '20 at 16:36
6 Answers
There's a bidict package:
>>> from bidict import bidict
>>> husbands2wives = bidict({'john': 'jackie'})
>>> husbands2wives['john'] # the forward mapping is just like with dict
'jackie'
>>> husbands2wives.inverse['jackie'] # use .inverse for the reverse lookup (current bidict releases no longer support the slice syntax)
'john'
You can install it using pip install bidict.
EDIT: For your actual problem - if I understand you correctly - I would use pandas:
# data.csv (index numbers dropped; "HS GPA" written as HS_GPA so that whitespace splitting yields seven columns)
Gender Height GPA HS_GPA Seat WtFeel Cheat
Female 64 2.60 2.63 M AboutRt No
Male 69 2.70 3.72 M AboutRt No
Female 66 3.00 3.44 F AboutRt No
Female 63 3.11 2.73 F AboutRt No
Male 72 3.40 2.35 B OverWt No
In [1]: import pandas as pd
In [2]: df = pd.read_csv('data.csv', sep=r'\s+')
In [3]: grouped = df.groupby(['Gender', 'WtFeel'])
In [4]: grouped.size()
Out[4]:
Gender  WtFeel
Female  AboutRt    3
Male    AboutRt    1
        OverWt     1
dtype: int64
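The grouped counts above are in long form. For the wide two-way layout the question asks for, `pd.crosstab` is one option. The following is a minimal sketch, assuming the five sample rows are cleaned as described (no index numbers, and an `HS_GPA` header, which is my spelling rather than the asker's):

```python
import io

import pandas as pd

# Sample rows from the thread; HS_GPA is an assumed header spelling.
raw = """Gender Height GPA HS_GPA Seat WtFeel Cheat
Female 64 2.60 2.63 M AboutRt No
Male 69 2.70 3.72 M AboutRt No
Female 66 3.00 3.44 F AboutRt No
Female 63 3.11 2.73 F AboutRt No
Male 72 3.40 2.35 B OverWt No
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Rows are Gender, columns are WtFeel, cells are counts.
table = pd.crosstab(df["Gender"], df["WtFeel"])
print(table)

# margins=True adds row and column totals labelled "All".
with_totals = pd.crosstab(df["Gender"], df["WtFeel"], margins=True)
print(with_totals)
```

Unlike `groupby(...).size()`, the crosstab also shows combinations that never occur (here Female/OverWt) as 0.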

-
The 2 columns are Gender (M, F) and Body Figure (AboutRt, OverWt, UnderWt). All I want to do is count how many males are AboutRt, Overwt and underWt, and do the same for the females. I was hoping that there is a package in python similar to the gmodels library in R where you could create a cross table which is found in R. Thanks again. – Tarek Deeb Jun 19 '13 at 22:05
-
@TarekDeeb -- please edit your question populating it with a small sample of your data... – root Jun 19 '13 at 22:07
-
@TarekDeeb -- You should probably take a look at `pandas`. Added an example... – root Jun 19 '13 at 22:28
-
data.head() Gender Height GPA HS GPA Seat WtFeel Cheat 0 Female 64 2.60 2.63 M AboutRt No 1 Male 69 2.70 3.72 M AboutRt No 2 Female 66 3.00 3.44 F AboutRt No 3 Female 63 3.11 2.73 F AboutRt No 4 Male 72 3.40 2.35 B OverWt No – Tarek Deeb Jun 19 '13 at 22:53
-
@TarekDeeb -- Please update your question with well-formatted data; don't paste it in the comments. This makes it much easier both for the people who answer your questions and for later readers who may have a similar problem. – root Jun 19 '13 at 23:13
You may be able to use a DoubleDict, as shown in recipe 578224 in the Python Cookbook.

Assuming you don't have to do any interpolation, you can use a dictionary. Use (x, y) tuples as the keys, and whatever your values are as the values. For instance, a trivial 2x2 table like this:
___0___1___
0 | 0 0.5
1 | 0.5 1
Would look like this in code:
two_way_lookup = {
    (0, 0): 0,
    (0, 1): 0.5,
    (1, 0): 0.5,
    (1, 1): 1,
}
print(two_way_lookup.get((0, 1))) # prints 0.5
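The same tuple-key idea can also build a table of counts for two categorical variables. This is a rough sketch using `collections.Counter`; the `observations` list is hypothetical data shaped like the Gender/WtFeel sample discussed elsewhere on this page:

```python
from collections import Counter

# Hypothetical observations: (gender, weight_feeling) pairs.
observations = [
    ("Female", "AboutRt"),
    ("Male", "AboutRt"),
    ("Female", "AboutRt"),
    ("Female", "AboutRt"),
    ("Male", "OverWt"),
]

# Each distinct (row, column) pair becomes a key; the value is its count.
counts = Counter(observations)

rows = sorted({r for r, _ in observations})
cols = sorted({c for _, c in observations})

# Print a small grid; a Counter returns 0 for combinations it never saw.
print("".ljust(8) + "".join(c.ljust(9) for c in cols))
for r in rows:
    print(r.ljust(8) + "".join(str(counts[(r, c)]).ljust(9) for c in cols))
```

This needs nothing outside the standard library, at the cost of doing the formatting by hand.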

-
Dicts provide only one way lookup, OP asked for 2-way table. – Ashwini Chaudhary Jun 19 '13 at 21:25
-
@Marcin (& Ashwini) I understand "2-way table" to mean "table where you look up `x` and `y` to get a value representing something about that combination of elements." It is not, at least in my mind, the same thing as 2-way lookup, where key/value pairs can be used in either direction. – Henry Keiter Jun 19 '13 at 21:27
-
"would like to look at the relationship between the 2 variables by creating a 2-way table. Thank you." Suggests that OP wants two-way lookup. – Marcin Jun 19 '13 at 21:28
-
@Marcin That's exactly the line that I think supports my interpretation. The asker presumably has two keys already, and wants to look up the relationship between them. – Henry Keiter Jun 19 '13 at 21:30
Probably the best solution in the standard library, if your data is moderately large, is to use sqlite3, which can run as an in-memory database: http://docs.python.org/2/library/sqlite3.html
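As a minimal sketch of that approach, the snippet below loads a few rows into an in-memory SQLite database and counts each combination with `GROUP BY`. The table name `survey`, its column names, and the rows are assumptions made for illustration:

```python
import sqlite3

# ":memory:" keeps the whole database in RAM; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (gender TEXT, wtfeel TEXT)")

# Hypothetical rows shaped like the sample data in this thread.
conn.executemany(
    "INSERT INTO survey VALUES (?, ?)",
    [
        ("Female", "AboutRt"),
        ("Male", "AboutRt"),
        ("Female", "AboutRt"),
        ("Female", "AboutRt"),
        ("Male", "OverWt"),
    ],
)

# One output row per (gender, wtfeel) combination that actually occurs.
rows = conn.execute(
    "SELECT gender, wtfeel, COUNT(*) FROM survey "
    "GROUP BY gender, wtfeel ORDER BY gender, wtfeel"
).fetchall()

for gender, wtfeel, n in rows:
    print(gender, wtfeel, n)

conn.close()
```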

You can create something like a two-level dict (that is, a dict that contains two dicts mapping the same data in opposite directions):
>>> mappings=[(0, 6), (1, 7), (2, 8), (3, 9), (4, 10)]
>>> view = dict(view1=dict(mappings), view2=dict(reversed(k) for k in mappings))
>>> view
{'view2': {8: 2, 9: 3, 10: 4, 6: 0, 7: 1},
'view1': {0: 6, 1: 7, 2: 8, 3: 9, 4: 10}}
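If the pairs are not one-to-one, a plain reversed dict silently keeps only one key per repeated value. One possible workaround, sketched here with hypothetical `pairs`, is to store lists in both directions using `collections.defaultdict`:

```python
from collections import defaultdict

# Hypothetical pairs in which the value 4 is shared by two keys.
pairs = [(-2, 4), (2, 4), (3, 9)]

forward = defaultdict(list)  # x -> every y paired with it
inverse = defaultdict(list)  # y -> every x paired with it
for x, y in pairs:
    forward[x].append(y)
    inverse[y].append(x)

print(forward[2])
print(inverse[4])

# For comparison: the plain reversed dict keeps just one key per value.
lossy = dict((y, x) for x, y in pairs)
print(len(lossy), "entries instead of", len(pairs))
```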

-
This works well for a one-to-one function, but gets a bit tricky when you don't assume this condition. Ex: x = 4, y = (-2, 2). – dckrooney Jun 19 '13 at 22:07
If you want a home-brewed, wonky solution, you could do something like this:
from collections.abc import Hashable

class BDMap:
    """Two-way multimap: each x maps to a list of ys, each y to a list of xs."""
    def __init__(self):
        self.x_table = {}
        self.y_table = {}

    def get(self, x=None, y=None):
        # Both given: return the pair if stored (None if y is not among x's values).
        if x is not None and y is not None:
            y_vals = self.x_table[x]
            if y in y_vals:
                return (x, y)
        elif x is not None:
            return self.x_table[x]
        elif y is not None:
            return self.y_table[y]

    def set(self, x, y):
        # Both values are used as dict keys, so both must be hashable.
        if isinstance(x, Hashable) and isinstance(y, Hashable):
            self.x_table[x] = self.x_table.get(x, []) + [y]
            self.y_table[y] = self.y_table.get(y, []) + [x]
        else:
            raise TypeError("unhashable type")
For anything other than a one-off script with a small data set, you're undoubtedly better off with one of the other approaches mentioned, though :)
