Highest Voted 'data-wrangling' Questions

127

votes

8 answers

Good alternative to Pandas .append() method, now that it is being deprecated?

I use the following method a lot to append a single row to a dataframe. One thing I really like about it is that it allows you to append a simple dict object. For example: # Creating an empty dataframe df = pd.DataFrame(columns=['a', 'b']) #…

asked Jan 24 '22 at 16:46

Glenn

4,195
9
33
41
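
For readers landing on this question from the listing, a minimal sketch of one common replacement for `.append()`: wrap the dict in a one-row DataFrame and pass it to `pd.concat`. The column names `a` and `b` come from the excerpt; the row values are made up for illustration, and this is not necessarily the approach the accepted answer takes.

```python
import pandas as pd

# Collecting rows as dicts and building the frame once is usually cheaper
# than growing a DataFrame row by row. The values here are illustrative.
rows = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]
df = pd.DataFrame(rows, columns=["a", "b"])

# To add a single dict to an existing frame, wrap it in a one-row
# DataFrame and concatenate; ignore_index=True renumbers the index.
new_row = {"a": 5, "b": 6}
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

print(df)
```

Each `pd.concat` call copies the data, so inside a loop it is generally better to accumulate the dicts in a list and build the DataFrame once at the end.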

17

votes

5 answers

How to swap the column and row entries in R

library(data.table) dat1 <- data.table(id = c(1, 2, 34, 99), class = c("sports", "", "music, sports", ""), hobby = c("knitting, music, sports", "", "", "music")) > dat1 id class hobby 1 1 …

r data.table data-wrangling

asked Sep 08 '22 at 03:01

Adrian

9,229
24
74
132

14

votes

1 answer

R: Changing column names in pivot_wider() -- suffix to prefix

I'm trying to figure out how to alter the way in which tidyr's pivot_wider() function creates new variable names in resulting wide data sets. Specifically, I would like the "names_from" variable to be added to the prefix of the new variables rather…

r reshape tidyr data-wrangling

asked Jul 23 '20 at 19:40

mkpcr

431
1
3
13

7

votes

3 answers

How to summarise a dataframe retaining all the columns

Consider the following dataframe: dummy_df <- tibble( A=c("ABC", "ABC", "BCD", "CDF", "CDF", "CDF"), B=c(0.25, 0.25, 1.23, 0.58, 0.58, 0.58), C=c("lorem", "ipsum", "dolor", "amet", "something", "else"), D=c("up", "up", "down", "down",…

r dplyr group-by aggregate data-wrangling

asked May 04 '23 at 18:10

jpm92

143
1
8

5

votes

2 answers

Pandas: normalize values by group

I find it hard to explain with words what I want to achieve, so please don't judge me for showing a simple example instead. I have a table that looks like…

python pandas dataframe data-science data-wrangling

asked Sep 27 '22 at 13:51

Max Skoryk

404
2
10
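
The excerpt is cut off before the example table, so what "normalize" means here is unknown. As a rough sketch under one possible reading (each value divided by the sum of its group), with hypothetical column names `group` and `value` and made-up data:

```python
import pandas as pd

# Made-up data; the question's actual table is not shown in the excerpt.
df = pd.DataFrame(
    {
        "group": ["x", "x", "y", "y"],
        "value": [1.0, 3.0, 2.0, 2.0],
    }
)

# transform("sum") returns the group total aligned to every original row,
# so the division happens row by row without a merge.
group_total = df.groupby("group")["value"].transform("sum")
df["share"] = df["value"] / group_total

print(df)
```

Other readings, such as min-max scaling or z-scores within each group, follow the same `groupby(...).transform(...)` pattern with a different function.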

5

votes

3 answers

New column based on values from other columns AND respecting pre-established rules

I'm looking for an algorithm to create a new column based on values from other columns AND respecting pre-established rules. Here's an example: artificial data df = data.frame( col_1 =…

python r data-wrangling

asked Jul 19 '22 at 10:58

Henrique

146
7

5

votes

1 answer

Mutate across multiple columns using dplyr

I am trying to calculate rowwise averages for a number of columns. Could somebody please explain why the code below only calculates the mean for the two variables in the code (var_1 and var_13), rather than the mean for all 13 columns? df %>%…

r dplyr data-wrangling

asked Apr 13 '22 at 10:03

EvieeG

53
3

5

votes

3 answers

Transforming complete age from character to numeric in R

I have a dataset with people's complete age as strings (e.g., "10 years 8 months 23 days") in R, and I need to transform it into a numeric variable that makes sense. I'm thinking about converting it to how many days of age the person has (which is…

r data-cleaning lubridate stringr data-wrangling

asked Dec 01 '21 at 20:59

Ruam Pimentel

1,288
4
16

5

votes

3 answers

How do I create new columns based on the values of a different column and count the percentage value of another numerical column in R?

The sample data frame: no <- rep(1:5, each=2) type <- rep(LETTERS[1:2], times=5) set.seed(4) value <- round(runif(10, 10, 30)) df <- data.frame(no, type, value) df no type value 1 1 A 22 2 1 B 10 3 2 A 16 4 2 B …

r dataframe dplyr data-wrangling

asked Sep 17 '21 at 06:11

Shibaprasadb

1,307
1
7
22

5

votes

1 answer

Julia. Summarise one column into a new DataFrame with multiple columns

I need to group a dataframe by one variable and then summarise it by adding the number of rows (I can already do this) and a number of columns relative to the .25, .5, .75 quantiles of another variable. In R I would do e.g.: iris %>% …

dataframe julia data-wrangling

asked Jun 07 '21 at 16:33

Bakaburg

3,165
4
32
64

5

votes

3 answers

Create new dataframe by dividing all possible column combinations from another table

I'm struggling to find an easy and fast solution to create a new data frame by multiplying all "groups" of columns between them. Data for example a1 <- rnorm(n = 10) b1 <- rnorm(n = 10) c1 <- rnorm(n = 10) a2 <- rnorm(n = 10) b2 <- rnorm(n = 10) c2 <-…

r dplyr tidyverse tidyr data-wrangling

asked Jun 06 '21 at 04:31

Ian.T

1,016
1
9
19

5

votes

4 answers

How to write an efficient wrapper for data wrangling, allowing to turn off any wrapped part when calling the wrapper

To streamline data wrangling, I wrote a wrapper function consisting of several "verb functions" that process the data. Each one performs one task on the data. However, not all tasks are applicable to all datasets that pass through this process, and…

r function user-defined-functions wrapper data-wrangling

asked Mar 03 '21 at 13:35

Emman

3,695
2
20
44

5

votes

1 answer

Data manipulation in Pandas: create a boolean column from values on column then fill with value from yet another column

ok, I've been trying this for too long, time to ask for help. I have a dataframe that looks a bit like this: person fruit quantity all_fruits 0 p1 grapes 2 [grapes, banana] 1 p1 banana 1 [grapes, banana] 2 p2…

python pandas function dataframe data-wrangling

asked Sep 03 '20 at 12:48

Giovanna Fernandes

117
1
10

4

votes

3 answers

Tidyverse column-wise differences

Suppose I have a data frame like this: df = data.frame(preA = c(1,2,3),preB = c(3,4,5),postA = c(6,7,8),postB = c(9,8,4)) I want to add columns having column-wise differences, that is: diffA = postA - preA diffB = postB - preB and so on... Is…

r data-wrangling

asked Oct 14 '22 at 13:15

Ravi

41
3

4

votes

2 answers

Remove rows of a certain value, before values change in R

I have a data frame like the following: dat <- data.frame(Target = c(rep("01", times = 8), rep("02", times = 5), rep("03", times = 4)), targ2clicks = c(1, 1, 1, 1, 0, 0 ,0 , 1, 1, 0, 0, 0, 1, …

r dataframe rows data-wrangling

asked Jun 23 '22 at 14:38

milsandhills

99
8
