14

In R, we can count how often each combination of values occurs using `table`. Here is an example:

x <- c(1,1,1,1,2,2)
y <- c("a","a","b","a","a","b")
table(x,y)
#   y
#x   a b
#  1 3 1
#  2 1 1

How can I do the same in Python when x and y are columns of a DataFrame? I am new to Python and searched a lot, but I could not find an answer. I should mention that I read this article, but I couldn't apply it to my case.

Brian Tompsett - 汤莱恩
Hadij
  • 3
    Try with `crosstab` – akrun Jan 14 '18 at 17:16
  • 2
    To deal with tabular data, Python has a very good library - [Pandas](https://pandas.pydata.org/pandas-docs/stable/10min.html). Read this 10 minute introduction and you'll be able to handle table manipulation tasks easily. – TrigonaMinima Jan 14 '18 at 17:19
  • 3
    @akrun Thank you very much. `pd.crosstab(X,Y)` was exactly what I needed. – Hadij Jan 14 '18 at 17:26
  • @TrigonaMinima Thank you, I will go over it. Actually, I use basic pandas frequently, but `crosstab` was new to me. – Hadij Jan 14 '18 at 17:26

2 Answers

19

We can do this with `crosstab` from pandas:

import numpy as np
import pandas as pd

x = np.array([1, 1, 1, 1, 2, 2])
y = np.array(["a", "a", "b", "a", "a", "b"])

# Cross-tabulate x against y; rownames/colnames label the two axes
pd.crosstab(x, y, rownames=['x'], colnames=['y'])
# y  a  b
# x
# 1  3  1
# 2  1  1
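
Since the question asks about x and y held in a DataFrame, the same call can be sketched with columns instead of arrays. The frame name `df` and the column names `x` and `y` are assumptions made for illustration; they are not taken from the asker's real data.

```python
import pandas as pd

# Hypothetical DataFrame holding the same values as the R example
df = pd.DataFrame({
    "x": [1, 1, 1, 1, 2, 2],
    "y": ["a", "a", "b", "a", "a", "b"],
})

# When Series are passed, crosstab labels the axes with the Series names,
# so rownames/colnames are not needed here
counts = pd.crosstab(df["x"], df["y"])
print(counts)
```

Individual cells can then be read with `counts.loc[1, "a"]`, which for this data is the count of rows where x is 1 and y is "a".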
akrun
12

Counting occurrences in R:

sort(table(df$source), decreasing = TRUE)

In Python with pandas:

df.source.value_counts() 
#or
df["source"].value_counts()

Source: R vs Python - a One-on-One Comparison
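
As a rough, self-contained sketch of the one-column case: the values in `source` below are made up for illustration, and only the column name comes from the snippet above. `value_counts()` already sorts by count in descending order, which lines up with `sort(table(...), decreasing = TRUE)` in R.

```python
import pandas as pd

# Hypothetical data; only the column name "source" comes from the snippet above
df = pd.DataFrame({"source": ["web", "app", "web", "email", "web", "app"]})

# Counts per distinct value, largest first by default
freq = df["source"].value_counts()
print(freq)

# Relative frequencies instead of raw counts
share = df["source"].value_counts(normalize=True)
print(share)
```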


To count occurrences across two columns:

With R:

table(cdc$gender,cdc$smoke100)

With Python:

pd.crosstab(index=df['gender'], columns=df['smoke100'])

Source: look at this answer
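
A small sketch of the two-column case, assuming a made-up stand-in for the `cdc` data (the values below are hypothetical). It also shows two things that may be useful in practice: `margins=True` adds row and column totals, and a `groupby` followed by `size()` and `unstack()` gives the same counts without `crosstab`.

```python
import pandas as pd

# Hypothetical stand-in for the cdc data; the values are invented
df = pd.DataFrame({
    "gender": ["m", "f", "f", "m", "f", "m"],
    "smoke100": [1, 0, 1, 1, 0, 0],
})

# Contingency table with row and column totals in an extra "All" row/column
with_totals = pd.crosstab(index=df["gender"], columns=df["smoke100"], margins=True)
print(with_totals)

# Same counts without crosstab: count rows per pair, then pivot the
# inner level (smoke100) into columns, filling missing pairs with 0
by_group = df.groupby(["gender", "smoke100"]).size().unstack(fill_value=0)
print(by_group)
```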

anasmorahhib