Questions tagged [wrangle]

Wrangle is the domain-specific language used to build transformation recipes in Cloud Dataprep.

A Wrangle recipe is a sequence of transforms, which are applied to your dataset in order to produce your results.

  • A transform is a single action applied to your dataset. For most transforms, you can pass one or more parameters to define the context (columns, rows, or conditions) where the transform is applied to your dataset.
  • Within some parameters of a transform, you can specify one or more functions. A function is a computational action performed on one or more columns of data in your dataset.
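The recipe model described above can be illustrated with a minimal Python sketch. This is not Wrangle syntax and not Dataprep's implementation; the helper names `keep_rows`, `derive`, and `run_recipe` are hypothetical and exist only to show transforms being applied in order, with a function supplied as a parameter of a transform.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Transform = Callable[[List[Row]], List[Row]]


def keep_rows(condition: Callable[[Row], bool]) -> Transform:
    """Hypothetical transform: keep only the rows that satisfy a condition."""
    def transform(rows: List[Row]) -> List[Row]:
        return [row for row in rows if condition(row)]
    return transform


def derive(column: str, function: Callable[[Row], object]) -> Transform:
    """Hypothetical transform: add a column computed by a function of each row."""
    def transform(rows: List[Row]) -> List[Row]:
        return [{**row, column: function(row)} for row in rows]
    return transform


def run_recipe(rows: List[Row], recipe: List[Transform]) -> List[Row]:
    """Apply each transform in order; the output of one step feeds the next."""
    for transform in recipe:
        rows = transform(rows)
    return rows


# Made-up sample data for illustration only.
dataset = [
    {"city": "a", "students": 20},
    {"city": "b", "students": 25},
    {"city": "c", "students": 5},
]

# A recipe is an ordered list of transforms.
recipe = [
    keep_rows(lambda row: row["students"] >= 10),
    derive("city_upper", lambda row: row["city"].upper()),
]

result = run_recipe(dataset, recipe)
```

Each step returns a new list, so the original `dataset` is left unchanged and the order of the steps determines the result.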
24 questions
3
votes
2 answers

How can I pivot wider and transform my data frame?

I have a data frame like this: tibble( School = c(1, 1, 2, 3, 3, 4), City = c("A","A", "B", "C", "C", "B"), Grade = c("7th", "7th", "7th", "6th", "8th", "8th"), Number_Students = c(20, 23, 25, 21, 28, 34), Type_school = c("public",…
Tormod
  • 83
  • 6
3
votes
1 answer

Label day timing into morning, afternoon and evening in R

How can I label the time of day (Morning, Afternoon and Evening) for given timestamps? Initial Data Id Time_stamp 3083188c 2016-08-29 13:10:51 924d500e 2016-08-29 09:22:33 ad4dd7ff 2016-08-25 20:29:35 Final data Id …
ajax
  • 131
  • 1
  • 11
2
votes
1 answer

Wrangle data from long to wide format for cox regression in R

I am trying to wrangle some data for a cox regression... #generate some data set.seed(1) ID <- sort(rep(1:10, times = 5)) conditions <- rep(c("asthma", "copd", "af", "cvd", "ckd"), times = 10) day <- sample(1:100, 50) #assign to dataframe df <-…
Richard Summers
  • 143
  • 1
  • 10
2
votes
2 answers

How to subtract rows between two different dataframes and replace original value?

I have two dataframes shown as below. How can I replace Bank1 data by subtracting 10 by 3, and 55 by 2? import pandas as pd data = [['Bank1', 10, 55], ['Bank2', 15,65], ['Bank3', 14,54]] df1 = pd.DataFrame(data, columns = ['BankName',…
Jiamei
  • 405
  • 3
  • 14
2
votes
1 answer

Unpack json columns into a dataframe

I have json strings inside a dataframe column. I want to bring all these new json columns into the dataframe. # Input JsonID <- as.factor(c(1,2,3)) JsonString1 = "{\"device\":{\"site\":\"Location1\"},\"tags\":{\"Engine…
Brad
  • 580
  • 4
  • 19
1
vote
1 answer

Wrangle dataframe in R, possibly with dcast

I have a fairly large data.frame that I need to wrangle a bit. The current structure is: V1 V2 V3 V4 V5 V6 V7 V8 ... Vn Vn+1 chr1 1 A T sample_1 value_1 sample_2 value_4 ... sample_n …
Lu_Ste
  • 21
  • 6
1
vote
2 answers

Data Wrangling in R using tidyverse?

So I have this dataset. The main transition is pivoting the table, so the population names are in the first column and the names are the headers for each column (and they are renamed, so Chlorophyll is renamed to CHLa, for example). The other alteration…
1
vote
2 answers

How can I calculate time duration for given time points in R

I'm trying to find a package or R code that can help to calculate the time duration of different time points for multiple subjects. This is what the data looks like ------------------------------------ SubjectID | Task …
Mr.M
  • 111
  • 1
  • 9
1
vote
1 answer

Is there a function in R that will let me convert a dataset into "long format" but also merge columns?

I have a dataset derived from Pokemon statistics containing a lot of the numerical and categorical data. My end goal is to create a model or recommendation system that a user can input a list of Pokemon and the model finds similar Pokemon they may…
1
vote
1 answer

How do you add a column to data frames in a list?

I have a list of data frames. I want to add a new column to each data frame. For example, I have three data frames as follows: a = data.frame("Name" = c("John","Dor")) b = data.frame("Name" = c("John2","Dor2")) c = data.frame("Name" =…
jtam
  • 814
  • 1
  • 8
  • 24
1
vote
1 answer

How to transform from tibble to dataframe as it is?

First, I averaged the player's data and scaled it down. player <- player %>% group_by(NM) %>% summarise_all(funs(mean(., na.rm = TRUE))) And this is the result. head(player) # A tibble: 6 x 26 NM NO MIN `2PTM` `2PTA` `2PT(%)`…
Sang won kim
  • 524
  • 5
  • 21
1
vote
2 answers

Split dataframe into list based on identical consecutive element

Is there an efficient way to split a dataframe into a list based on identical consecutive elements in a column (keeping the order of the dataframe elements inside the list), as follows? The dataframe:…
Sofiane M'barki
  • 193
  • 1
  • 1
  • 11
0
votes
0 answers

I want to load data present in multiple CSV files in BigQuery corresponding tables by using Data Fusion

I want to load data from multiple CSV files into their corresponding BigQuery tables by using Data Fusion. How can we handle this in Wrangle?
Ram D
  • 1
0
votes
1 answer

convert range of numeric variable maintaining ratio

How do I change the range of a numeric variable? Now it’s 100.000-1.000000, which is way too big. The values are personIDs of survey respondents. The number of observations is actually only 926. If I don't change this, my plot looks wrong, since it looks…
dilly
  • 63
  • 1
  • 7
0
votes
2 answers

How can I add empty columns to the output schema in data fusion - wrangle?

I'm developing a pipeline in Data Fusion that must read a JSON from Google Cloud Storage, transform some fields (erase or rename some of them) and then send the info into a BigQuery table. I'm doing the transformation in Wrangle. My problem is that…