How to transform a set of qualitative data into numbers according to quantity?

Question

I have a data frame like this:

country level_of_smoking clasification_country
germany  high              A
greece   low               B
USA      medium            A
france   none              A
italy    low               B
spain    medium            A

and so on (the list is longer; this is just an example).

I would like to know how to transform this data frame into something like this:

       high   medium    low    none
classA 1        2        0      1
classB 0        0        2      0

Any help with R or Python code that does this would be appreciated.

In `tidyverse`, `df1 %>% count(class, smoking) %>% spread(smoking, n, fill = 0)` — Ronak Shah, Sep 12 '19 at 02:21

Trenton McKinney · Answer 1 · 2019-09-11T21:50:05.980

0

Python

Data:

country smoking class
 greece     low     B
    USA  medium     A
 france    none     A
  italy     low     B
  spain  medium     A

import pandas as pd

# to read test data from clipboard
df = pd.read_clipboard(sep='\\s+')
df.groupby(['class', 'smoking'])['smoking'].count().unstack()

Output:
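For a version that does not depend on the clipboard, here is a minimal sketch that builds the same five rows inline and counts them with `pd.crosstab`, which fills missing combinations with 0 rather than NaN. The `levels` list is my own assumption: it takes the column order from the table requested in the question and adds a `high` column even though this sample has no such row.

```python
import pandas as pd

# Same five rows as the test data above, built inline
df = pd.DataFrame({
    "country": ["greece", "USA", "france", "italy", "spain"],
    "smoking": ["low", "medium", "none", "low", "medium"],
    "class": ["B", "A", "A", "B", "A"],
})

# Cross-tabulate: one row per class, one column per smoking level
counts = pd.crosstab(df["class"], df["smoking"])

# Assumed column order, taken from the question's desired table;
# levels absent from the data (here "high") become columns of zeros
levels = ["high", "medium", "low", "none"]
counts = counts.reindex(columns=levels, fill_value=0)

print(counts)
```

If only the levels present in the data are wanted, the `reindex` step can be dropped.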

edited Sep 11 '19 at 21:50

answered Sep 11 '19 at 21:44

Trenton McKinney


score 0 · Answer 2 · answered Sep 11 '19 at 21:48

In R, this can be done with `table` (no packages needed):

table(df1[3:2])
#   smoking
#class low medium none
#    A   0      2    1
#    B   2      0    0

data

df1 <- structure(list(country = c("greece", "USA", "france", "italy", 
"spain"), smoking = c("low", "medium", "none", "low", "medium"
), class = c("B", "A", "A", "B", "A")), class = "data.frame", row.names = c(NA, 
-5L))

score 0 · Answer 3 · answered Sep 11 '19 at 21:53

This is a long-to-wide reshaping problem, so you can also use `reshape2::dcast()`:

data <- structure(list(country = c("greece", "USA", "france", "italy", 
"spain"), value = c("low", "medium", "none", "low", "medium"), 
    class = c("B", "A", "A", "B", "A")), class = c("data.frame"), row.names = c(NA, -5L))


reshape2::dcast(
    data,
    class ~ value, # formula
    value.var = "country", 
    fun.aggregate = length # use length as aggregation function
    )

  class low medium none
1     A   0      2    1
2     B   2      0    0
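For readers working in Python, a rough pandas counterpart of the `dcast` call is `pivot_table` with a counting aggregation. This is only a sketch under the assumption that the same column names (`country`, `value`, `class`) are used; it is not part of the original answer.

```python
import pandas as pd

# Same rows and column names as the R `data` object above
data = pd.DataFrame({
    "country": ["greece", "USA", "france", "italy", "spain"],
    "value": ["low", "medium", "none", "low", "medium"],
    "class": ["B", "A", "A", "B", "A"],
})

# Long to wide: rows are classes, columns are smoking levels,
# each cell counts the countries in that combination
wide = data.pivot_table(
    index="class",
    columns="value",
    values="country",
    aggfunc="count",
    fill_value=0,  # combinations that never occur become 0
)

print(wide)
```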
