I have a data frame like this:

country level_of_smoking clasification_country
germany  high              A
greece   low               B
USA      medium            A
france   none              A
italy    low               B
spain    medium            A

and so on (the list is longer; this is just an example).

How can I transform this data frame into something like this:

       high   medium    low    none
classA 1        2        0      1
classB 0        0        2      0

Please help me with R or Python code that does this.

Ronak Shah

3 Answers

Python

Data:

country smoking class
 greece     low     B
    USA  medium     A
 france    none     A
  italy     low     B
  spain  medium     A

import pandas as pd

# to read test data from clipboard
df = pd.read_clipboard(sep='\\s+')
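# count rows per (class, smoking) pair, then pivot the smoking levels into columns;
# fill_value=0 shows 0 instead of NaN for combinations that never occur
df.groupby(['class', 'smoking'])['smoking'].count().unstack(fill_value=0)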

Output:

[image: table of counts per class and smoking level]
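
As a possible alternative, `pd.crosstab` builds the same kind of count table directly and fills missing combinations with 0. The sketch below is not part of the original answer: it rebuilds the question's six sample rows by hand (using the shortened column names `smoking` and `class` from above), and the `levels` list is just an assumed column order taken from the desired output in the question.

```python
import pandas as pd

# Sample rows from the question, typed in by hand (assumed column names)
df = pd.DataFrame({
    "country": ["germany", "greece", "USA", "france", "italy", "spain"],
    "smoking": ["high", "low", "medium", "none", "low", "medium"],
    "class": ["A", "B", "A", "A", "B", "A"],
})

# Column order wanted in the question's desired output (an assumption)
levels = ["high", "medium", "low", "none"]

# crosstab counts rows per (class, smoking) pair; reindex fixes the column
# order and adds a zero column for any level that is absent from the data
counts = pd.crosstab(df["class"], df["smoking"]).reindex(columns=levels, fill_value=0)
print(counts)
```

For these six rows, class A gets 1, 2, 0, 1 and class B gets 0, 0, 2, 0 across high, medium, low, none, which matches the table asked for in the question.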

Trenton McKinney

In R, this can be done with table() (no packages needed):

# columns 3 and 2 (class, then smoking) give classes as rows and smoking levels as columns
table(df1[3:2])
#   smoking
#class low medium none
#    A   0      2    1
#    B   2      0    0

data

df1 <- structure(list(country = c("greece", "USA", "france", "italy", 
"spain"), smoking = c("low", "medium", "none", "low", "medium"
), class = c("B", "A", "A", "B", "A")), class = "data.frame", row.names = c(NA, 
-5L))
akrun

This is a long-to-wide reshaping problem, so you can also use reshape2::dcast():

data <- structure(list(country = c("greece", "USA", "france", "italy", 
"spain"), value = c("low", "medium", "none", "low", "medium"), 
    class = c("B", "A", "A", "B", "A")), class = c("data.frame"), row.names = c(NA, -5L))


reshape2::dcast(
    data,
    class ~ value, # formula
    value.var = "country", 
    fun.aggregate = length # use length as aggregation function
    )

  class low medium none
1     A   0      2    1
2     B   2      0    0
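
For the Python side of the question, a rough counterpart to this long-to-wide call might be `DataFrame.pivot_table`. This is only a sketch under the same sample data, not something from the answer above: `index` and `columns` play the role of the `class ~ value` formula, `values="country"` mirrors `value.var`, and `aggfunc="count"` stands in for `length`.

```python
import pandas as pd

# Same five sample rows as the R `data` object above, typed in by hand
data = pd.DataFrame({
    "country": ["greece", "USA", "france", "italy", "spain"],
    "value": ["low", "medium", "none", "low", "medium"],
    "class": ["B", "A", "A", "B", "A"],
})

# Rows come from `class`, columns from `value`; each cell counts the
# countries in that group, and fill_value=0 replaces empty cells with 0
wide = data.pivot_table(
    index="class",
    columns="value",
    values="country",
    aggfunc="count",
    fill_value=0,
)
print(wide)
```

The columns come out in alphabetical order (low, medium, none), the same layout as the dcast() result.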
yusuzech