Questions tagged [splitstackshape]

Use the splitstackshape R package to stack and reshape datasets after splitting concatenated values

Online data collection tools like Google Forms often export multiple-response questions with the responses concatenated in a single cell. The concat.split (cSplit) family of functions splits such data into separate cells. The package also includes functions to stack groups of columns and to reshape wide data, even when the data are "unbalanced", a case that reshape (from base R) does not handle and that melt and dcast from reshape2 do not handle easily.

The package has data.table as a dependency and some of its functions return data.tables.
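As a rough illustration of the splitting described above, here is a minimal sketch using cSplit on a made-up survey export. The data frame `responses` and its column names are hypothetical and exist only for this example; the exact types of the split columns may vary by package version.

```r
library(splitstackshape)

# Hypothetical survey export: one multiple-response column, comma-separated
responses <- data.frame(
  id = 1:3,
  likes = c("apples,pears", "pears", "apples,plums,pears"),
  stringsAsFactors = FALSE
)

# Wide: one row per id, one new column per split value (padded with NA)
wide <- cSplit(responses, "likes", sep = ",")

# Long: one row per id/value pair
long <- cSplit(responses, "likes", sep = ",", direction = "long")

print(wide)
print(long)
```

Both calls return a data.table, in line with the dependency note above. The long form is often the more convenient starting point for counting or grouping responses.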

60 questions

5 answers

Splitting a single column into multiple observations using R

I am working on HCUP data, which has a range of values in a single column that needs to be split into multiple columns. Below is the HCUP data frame for reference: code label 61000-61003 excision of CNS 0169T-0169T ventricular…

r data.table medical data-cleaning splitstackshape

asked Oct 13 '15 at 21:50

x1carbon

2 answers

cSplit library(splitstackshape) is always dropping the column

I was searching for a way to split a column's content by a separator and convert the table into long format. I found cSplit from the splitstackshape package, and it almost does what I was looking for. The problem now is with the drop option. I…

r splitstackshape

asked May 13 '15 at 06:45

drmariod

2 answers

Using sep = "." in `fread` from "data.table"

Can fread from "data.table" be forced to successfully use "." as a sep value? I'm trying to use fread to speed up my concat.split functions in "splitstackshape". See this Gist for the general approach I'm taking, and this question for why I want to…

r data.table fread splitstackshape

asked Oct 08 '13 at 04:55

A5C1D2H2I1M1N2O1R2T1

2 answers

Stratified data splitting in R

I've been using caret::createDataPartition() to split the data in a stratified way. Now I'm trying another approach that I found here on Stack Overflow, splitstackshape::stratified(), and the reason I'm interested in it is that it allows…

r machine-learning caret splitstackshape

asked Nov 25 '22 at 13:05

Programming Noob

2 answers

How to prevent data.table from forcing numeric variables into character variables without manually specifying these?

Consider the following dataset: dt <- structure(list(lllocatie = structure(c(1L, 6L, 2L, 4L, 3L), .Label = c("Assen", "Oosterwijtwerd", "Startenhuizen", "t-Zandt", "Tjuchem", "Winneweer"), class = "factor"), lat = c(52.992, 53.32,…

r data.table splitstackshape

asked Jul 22 '15 at 16:08

Jaap

3 answers

Split multiple columns into rows

I'm working with a very raw set of data and need to shape it up in order to work with it. I am trying to split selected columns based on the separator '|' d <- data.frame(id = c(022,565,893,415), name = c('c|e','m|q','w','w|s|e'), score =…

r split strsplit splitstackshape

asked Nov 14 '16 at 14:29

Davis

2 answers

R cSplit only using first delimiter in string

I had a long list with two columns where I had the same string in each column in multiple rows. So I used paste to concatenate using - and then used setDT to return the unique set of concats with their frequency. Now I want to reverse my…

r concatenation delimiter splitstackshape

asked May 18 '16 at 15:41

Oli

2 answers

split dataframe with multiple delimiters in R

df1 <- Gene GeneLocus CPA1|1357 chr7:130020290-130027948:+ GUCY2D|3000 chr17:7905988-7923658:+ UBC|7316 chr12:125396194-125399577:- C11orf95|65998 chr11:63527365-63536113:- …

r splitstackshape

asked Sep 22 '15 at 13:37

Kryo

1 answer

Splitting text to words with R and cSplit()

I'm trying to split a series of sentences into separate words, that is, to tokenize the text. I have found an R package splitstackshape that is able to do what I want, well almost... it truncates the output to the first and last 5 rows. Anyway, this…

r splitstackshape

asked Sep 17 '15 at 06:49

Joshua

2 answers

Project Euler #22, off by 158,055

I'm currently working through Project Euler problem 22 which has the following challenge: Using names.txt (right click and 'Save Link/Target As...'), a 46K text file containing over five-thousand first names, begin by sorting it into alphabetical…

r splitstackshape

asked May 21 '14 at 15:06

rmbaughman

4 answers

Splitting concatenated column and populating corresponding columns with values

I have a nasty data table that has a couple of different kinds of messiness, and I can't figure out how to combine some of the other answers that use the tidyr and splitstackshape packages. subject <- c("A", "B", "C") review <- c("Bill: [1.0]",…

r tidyr splitstackshape

asked Mar 08 '18 at 21:33

bikeclub

2 answers

Modification of cSplit_e function to account for multiple values

I understand that "cSplit_e" in "splitstackshape" can be used to convert multiple values under one column to separate columns with binary values. I am dealing with a text problem for calculating tf-idf and it is not necessary to have all unique…

r dplyr text-mining tidyr splitstackshape

asked Feb 23 '17 at 17:08

syebill

1 answer

How do I convert a 2x2 contingency table into a long format dataframe?

How do I convert a 2x2 contingency table into a long format data frame? I tried this: library(reshape2) Table <- matrix(c(7,67,19,71), 2, 2, byrow=TRUE) rownames(Table) <- c('Drug', 'No_Drug') colnames(Table) <- c('Comp', 'No_Comp') melt(Table) I…

r reshape2 tidyr splitstackshape

asked Nov 07 '15 at 14:49

FTF

1 answer

Combining irrelevant/similar observations into one (others)

After performing a survey on perceived problems per neighborhood I get this dataframe. Since the survey had different options to choose from + an open one, the results on the open question are frequently irrelevant (see…

r dataframe dplyr splitstackshape

asked Jul 24 '15 at 11:47

ccamara

1 answer

How can I reshape a data.table (long into wide) without doing a function like sum or mean?

How can I reshape a data.table (long into wide) without doing a function like sum or mean? I was looking at dcast/melt/reshape/etc. But I don't get the desired results. This is my data: DT <- data.table(id = c("1","1","2","3"), score = c("5", "4",…

r data.table reshape reshape2 splitstackshape

asked Dec 07 '14 at 19:05

peter
