Questions tagged [data-manipulation]

Data manipulation is the process of altering data from a less useful state to a more useful state.

Data manipulation is the process of taking data from a source or format that is hard to read or search and converting it into a format or storage solution that can be read and searched quickly. For example, a log's output could be split into rows of a database so that only the entries relevant to a given situation can be pulled out, or simply sorted so that entries are easier to locate by the sorted field. Data manipulation can also make data mining easier.

It covers taking raw data and parsing, filtering, extracting, organizing, combining, cleaning, or otherwise converting it into a consistent, usable form for further processing or as input to an algorithm or system.
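
The log example in the description can be sketched roughly as follows. This is only an illustration under stated assumptions: the log format (`<date> <time> <LEVEL> <message>`), the sample lines, the table layout, and the helper names `parse_line` and `load_logs` are all hypothetical and do not come from any particular question under this tag.

```python
import re
import sqlite3

# Hypothetical raw log lines; the "<date> <time> <LEVEL> <message>" layout is an assumption.
LOG_LINES = [
    "2024-01-02 10:00:01 INFO service started",
    "2024-01-02 10:00:05 ERROR disk full",
    "2024-01-02 10:00:09 INFO retrying write",
    "2024-01-02 10:00:12 ERROR write failed",
]

# Timestamp (two whitespace-free tokens), level (one word), then the rest of the line.
LINE_RE = re.compile(r"^(\S+ \S+) (\w+) (.*)$")


def parse_line(line):
    """Split one raw log line into a (timestamp, level, message) tuple."""
    match = LINE_RE.match(line)
    if match is None:
        return None  # line does not fit the assumed format
    return match.groups()


def load_logs(lines):
    """Insert the parsed lines as rows of an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logs (ts TEXT, level TEXT, message TEXT)")
    rows = [row for row in map(parse_line, lines) if row is not None]
    conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", rows)
    return conn


conn = load_logs(LOG_LINES)

# Once the log is stored as rows, pulling out only the relevant entries is a single query.
errors = conn.execute(
    "SELECT ts, message FROM logs WHERE level = ? ORDER BY ts", ("ERROR",)
).fetchall()
```

Here the raw text is parsed once, and after that both filtering (`WHERE level = ?`) and ordering (`ORDER BY ts`) are handled by the storage layer instead of by rescanning the log.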

3845 questions
1119
votes
31 answers

How can I access and process nested objects, arrays, or JSON?

I have a nested data structure containing objects and arrays. How can I extract the information, i.e. access a specific or multiple values (or keys)? For example: var data = { code: 42, items: [{ id: 1, name: 'foo' }, { …
Felix Kling
158
votes
6 answers

Split delimited strings in a column and insert as new rows

I have a data frame as follows:

+-----+-------+
| V1  | V2    |
+-----+-------+
| 1   | a,b,c |
| 2   | a,c   |
| 3   | b,d   |
| 4   | e,f   |
| .   | .     |
+-----+-------+

Each letter is a character separated by a comma. I would like to…
Boxuan
129
votes
8 answers

Removing elements with Array.map in JavaScript

I would like to filter an array of items by using the map() function. Here is a code snippet: var filteredItems = items.map(function(item) { if( ...some condition... ) { return item; } }); The problem is that filtered out items…
91
votes
20 answers

How do I remove objects from an array in Java?

Given an array of n Objects, let's say it is an array of strings, and it has the following values: foo[0] = "a"; foo[1] = "cc"; foo[2] = "a"; foo[3] = "dd"; What do I have to do to delete/remove all the strings/objects equal to "a" in the array?
ramayac
81
votes
3 answers

Arranging rows in custom order using dplyr

With the arrange function in dplyr, we can arrange rows in ascending or descending order. I wonder how to arrange rows in a custom order. Please see the MWE. Reg <- rep(LETTERS[1:3], each = 2) Res <- rep(c("Urban", "Rural"), times = 3) set.seed(12345) Pop <-…
MYaseen208
70
votes
10 answers

Read a CSV from github into R

I am trying to read a CSV from GitHub into R: latent.growth.data <- read.csv("https://github.com/aronlindberg/latent_growth_classes/blob/master/LGC_data.csv") However, this gives me: Error in file(file, "rt") : cannot open the connection In…
histelheim
42
votes
7 answers

Converting String Array to an Integer Array

So basically the user enters a sequence from a Scanner input: 12, 3, 4, etc. It can be of any length and it has to be integers. I want to convert the string input to an integer array, so int[0] would be 12, int[1] would be 3, etc. Any tips and…
Mario
37
votes
2 answers

pandas reset_index after groupby.value_counts()

I am trying to group by a column and compute value counts on another column. import pandas as pd dftest = pd.DataFrame({'A':[1,1,1,1,1,1,1,1,1,2,2,2,2,2], 'Amt':[20,20,20,30,30,30,30,40, 40,10, 10, 40,40,40]}) dftest looks like A…
muon
36
votes
5 answers

Extract non null elements from a list

I have a list like this:

x = list(a = 1:4, b = 3:10, c = NULL)
x
#$a
#[1] 1 2 3 4
#
#$b
#[1]  3  4  5  6  7  8  9 10
#
#$c
#NULL

and I want to extract all elements that are not null. How can this be done?…
qed
31
votes
2 answers

remove row with nan value

Let's say, for example, I have this data:

data <- c(1,2,3,4,5,6,NaN,5,9,NaN,23,9)
attr(data,"dim") <- c(6,2)
data
     [,1] [,2]
[1,]    1  NaN
[2,]    2    5
[3,]    3    9
[4,]    4  NaN
[5,]    5   23
[6,]    6    9

Now I want to remove the…
Sir Ksilem
24
votes
9 answers

Efficient recursive random sampling

Imagine a df in the following format:

   ID1 ID2
1    A   1
2    A   2
3    A   3
4    A   4
5    A   5
6    B   1
7    B   2
8    B   3
9    B   4
10   B   5
11   C   1
12   C   2
13   C   3
14   C   4
15   C   5

The problem is to randomly select…
tmfmnk
21
votes
4 answers

Convert string to dict, then access key:values??? How to access data in a for Python?

I am having issues accessing data inside a dictionary. Sys: Macbook 2012 Python: Python 3.5.1 :: Continuum Analytics, Inc. I am working with a dask.dataframe created from a csv. Edit Question How I got to this point Assume I start out with…
Linwoodc3
21
votes
3 answers

Subsetting a matrix by row.names

I have a matrix with the following row.names: "X1" "X5" "X33" "X37" "X52" "X566" Now I want to select only the rows which match the entries of a list, say: include_list <- c("X1", "X5", "X33") I imagine I'd do something like…
histelheim
20
votes
3 answers

How to run tapply() on multiple columns of data frame using R?

I have a data frame like the following:

a b1 b2 b3 b4 b5 b6 b7 b8 b9
D  4  6  9  5  3  9  7  9  8
F  7  3  8  1  3  1  4  4  3
R  2  5  5  1  4  2  3  1  6
D  9  2  1  4  3  3  8  2  5
D  5  4  3  1 …
Jota
19
votes
3 answers

Get first and last values per group – dplyr group_by with last() and first()

The code below should group the data by year and then create two new columns with the first and last value of each year. library(dplyr) set.seed(123) d <- data.frame( group = rep(1:3, each = 3), year = rep(seq(2000,2002,1),3), value =…
phillyooo