3

How can I create a 2-way table in python? I have 2 categorical variables in a data set and would like to look at the relationship between the 2 variables by creating a 2-way table. Thank you.

Tarek Deeb
  • Possible Duplicate http://stackoverflow.com/questions/6129693/python-creating-a-table – Cody Piersall Jun 19 '13 at 21:19
  • Do you mean a bidirectional map? – brice Jun 19 '13 at 21:21
  • Can you give us a sample data set? – rickcnagy Jun 19 '13 at 21:21
  • Will creating two dictionaries, each the inverse of the other (i.e. the key in one is the value in the other), solve your problem? – Fredrik Pihl Jun 19 '13 at 21:32
  • I thought it was obvious what you wanted, but I was obviously wrong. Can you please elaborate on what you mean by "two-way table"? Do you mean a table where you look up a value by row and column, or a bidirectional mapping as brice suggested? – Henry Keiter Jun 19 '13 at 21:34
  • I'm voting to close as no-one can work out what the question means, and OP has refused to actually update the question, despite being repeatedly asked. – Marcin Jun 19 '13 at 23:39
  • It's clear what's being asked here... This should not have been closed. A two-way table is another name for a contingency table: https://en.wikipedia.org/wiki/Contingency_table – Greg Mar 18 '17 at 22:07
  • As of pandas 0.17.0, you can use `pd.crosstab(df["column_1"], df["column_2"])` – Aabesh Karmacharya Jan 08 '20 at 16:36

6 Answers

5

There's a bidict package:

>>> from bidict import bidict
>>> husbands2wives = bidict({'john': 'jackie'})
>>> husbands2wives['john']  # the forward mapping is just like with dict
'jackie'
>>> husbands2wives.inverse['jackie']  # use .inverse for the inverse mapping (older bidict releases used slice syntax)
'john'

You can install it using pip install bidict.


EDIT: For your actual problem - if I understand you correctly - I would use pandas:

# data.csv (HS GPA written as HS_GPA so that every column name is a single token)
Gender Height GPA HS_GPA Seat WtFeel Cheat
Female 64 2.60 2.63 M AboutRt No
Male 69 2.70 3.72 M AboutRt No
Female 66 3.00 3.44 F AboutRt No
Female 63 3.11 2.73 F AboutRt No
Male 72 3.40 2.35 B OverWt No

In [1]: import pandas as pd

In [2]: df = pd.read_csv('data.csv', sep=r'\s+')  # columns are separated by whitespace

In [3]: grouped = df.groupby(['Gender', 'WtFeel'])

In [4]: grouped.size()  # number of rows for each (Gender, WtFeel) combination
Out[4]: 
Gender  WtFeel 
Female  AboutRt    3
Male    AboutRt    1
        OverWt     1
dtype: int64
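
For the two-way (contingency) table layout itself, `pd.crosstab`, mentioned in the comments on the question, may be a closer fit than a grouped count. The following is only a sketch: the five rows are the asker's sample typed in by hand, and the names `df` and `table` are chosen here for illustration.

```python
import pandas as pd

# Sample rows copied by hand from the asker's data.head() output.
df = pd.DataFrame({
    "Gender": ["Female", "Male", "Female", "Female", "Male"],
    "WtFeel": ["AboutRt", "AboutRt", "AboutRt", "AboutRt", "OverWt"],
})

# Rows are Gender values, columns are WtFeel values, cells are counts.
# Combinations that never occur are filled with 0.
table = pd.crosstab(df["Gender"], df["WtFeel"])
print(table)
```

Unlike `groupby(...).size()`, which returns one row per combination that actually occurs, this gives a rectangular table with one cell for every Gender/WtFeel pair.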
root
  • The 2 columns are Gender (M, F) and Body Figure (AboutRt, OverWt, UnderWt). All I want to do is count how many males are AboutRt, OverWt and UnderWt, and do the same for the females. I was hoping that there is a package in Python similar to the gmodels library in R, where you can create a cross table. Thanks again. – Tarek Deeb Jun 19 '13 at 22:05
  • @TarekDeeb -- please edit your question populating it with a small sample of your data... – root Jun 19 '13 at 22:07
  • @TarekDeeb -- You should probably take a look at `pandas`. Added an example... – root Jun 19 '13 at 22:28
  • data.head() Gender Height GPA HS GPA Seat WtFeel Cheat 0 Female 64 2.60 2.63 M AboutRt No 1 Male 69 2.70 3.72 M AboutRt No 2 Female 66 3.00 3.44 F AboutRt No 3 Female 63 3.11 2.73 F AboutRt No 4 Male 72 3.40 2.35 B OverWt No – Tarek Deeb Jun 19 '13 at 22:53
  • @TarekDeeb -- Please update your question with well-formatted data; don't paste it in the comments. This makes it much easier both for the people who answer your questions and for later readers who may have a similar problem. – root Jun 19 '13 at 23:13
1

You may be able to use a DoubleDict as shown in recipe 578224 on the Python Cookbook.

Noctis Skytower
0

Assuming you don't have to do any interpolation, you can use a dictionary. Use (x, y) tuples as the keys, and whatever your values are as the values. For instance, a trivial 2x2 table like this:

   ___0___1___
0 |   0   0.5
1 |   0.5 1

Would look like this in code:

two_way_lookup = {
                  (0, 0) : 0,
                  (0, 1) : 0.5,
                  (1, 0) : 0.5,
                  (1, 1) : 1
                 }
print(two_way_lookup.get((0, 1))) # prints 0.5
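
If the goal is to count how often each pair of categories occurs, the same tuple-keyed idea works with `collections.Counter` from the standard library. This is a rough sketch rather than a full solution: the `rows` list is sample data typed in by hand from the asker's comment, and the variable names are made up for illustration.

```python
from collections import Counter

# (gender, body figure) pairs, one per respondent.
rows = [
    ("Female", "AboutRt"),
    ("Male", "AboutRt"),
    ("Female", "AboutRt"),
    ("Female", "AboutRt"),
    ("Male", "OverWt"),
]

# Counter maps each (gender, figure) tuple to its number of occurrences.
counts = Counter(rows)

genders = sorted({g for g, _ in rows})
figures = sorted({f for _, f in rows})

# Print one line per gender, one count per body-figure category.
# Missing combinations read as 0 because Counter returns 0 for absent keys.
for g in genders:
    print(g, [counts[(g, f)] for f in figures])
```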
Henry Keiter
  • This doesn't look like it actually responds to the question. – Marcin Jun 19 '13 at 21:23
  • Dicts provide only one way lookup, OP asked for 2-way table. – Ashwini Chaudhary Jun 19 '13 at 21:25
  • @Marcin (& Ashwini) I understand "2-way table" to mean "table where you look up `x` and `y` to get a value representing something about that combination of elements." It is not, at least in my mind, the same thing as 2-way lookup, where key/value pairs can be used in either direction. – Henry Keiter Jun 19 '13 at 21:27
  • "would like to look at the relationship between the 2 variables by creating a 2-way table. Thank you." Suggests that OP wants two-way lookup. – Marcin Jun 19 '13 at 21:28
  • @Marcin That's exactly the line that I think supports my interpretation. The asker presumably has two keys already, and wants to look up the relationship between them. – Henry Keiter Jun 19 '13 at 21:30
0

Probably the best solution in the standard library, if your data is moderately large, is to use sqlite3, which can run as an in-memory database: http://docs.python.org/2/library/sqlite3.html
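
As a minimal sketch of that approach, the counts for each pair of categories can be obtained with a `GROUP BY` query against an in-memory database. The table name `survey`, its column names, and the sample rows are assumptions made for this illustration only.

```python
import sqlite3

# Sample (gender, body figure) rows, typed in by hand for illustration.
rows = [
    ("Female", "AboutRt"),
    ("Male", "AboutRt"),
    ("Female", "AboutRt"),
    ("Female", "AboutRt"),
    ("Male", "OverWt"),
]

# ":memory:" creates a database that lives only for this connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (gender TEXT, wtfeel TEXT)")
conn.executemany("INSERT INTO survey VALUES (?, ?)", rows)

# Count the rows in each (gender, wtfeel) combination.
result = conn.execute(
    "SELECT gender, wtfeel, COUNT(*) FROM survey "
    "GROUP BY gender, wtfeel ORDER BY gender, wtfeel"
).fetchall()
conn.close()

for gender, wtfeel, n in result:
    print(gender, wtfeel, n)
```

Note that this returns only the combinations that occur in the data; pairs with a count of zero do not appear in the result.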

Marcin
0

You can create something like a two-level dict (that is, a dict containing two dicts that map the same data in opposite directions):

>>> mappings=[(0, 6), (1, 7), (2, 8), (3, 9), (4, 10)]
>>> view = dict(view1=dict(mappings), view2=dict(reversed(k) for k in mappings))
>>> view
{'view2': {8: 2, 9: 3, 10: 4, 6: 0, 7: 1},
'view1': {0: 6, 1: 7, 2: 8, 3: 9, 4: 10}}
michaelmeyer
  • This works well for a one-to-one function, but gets a bit tricky when you don't assume this condition. Ex: x = 4, y = (-2, 2). – dckrooney Jun 19 '13 at 22:07
0

If you want a home-brewed, wonky solution, you could do something like this:

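from collections.abc import Hashable  # collections.Hashable was removed in Python 3.10

class BDMap:
    """Two-way multimap: each x maps to a list of y values, and vice versa."""

    def __init__(self):
        self.x_table = {}  # x -> list of associated y values
        self.y_table = {}  # y -> list of associated x values

    def get(self, x=None, y=None):
        if x is not None and y is not None:
            # Both keys given: return the pair if it is stored, otherwise None.
            if y in self.x_table[x]:
                return (x, y)
        elif x is not None:
            return self.x_table[x]
        elif y is not None:
            return self.y_table[y]

    def set(self, x, y):
        # Both values are used as dict keys, so both must be hashable.
        if isinstance(x, Hashable) and isinstance(y, Hashable):
            self.x_table[x] = self.x_table.get(x, []) + [y]
            self.y_table[y] = self.y_table.get(y, []) + [x]
        else:
            raise TypeError("unhashable type")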

For anything other than a one-off script with a small data set, you're undoubtedly better off with one of the other approaches mentioned, though :)

dckrooney