How can I split a 2D array by a grouping variable and return a list of arrays? The order is also important.
To show the expected outcome, the equivalent in R can be done as:
> (A = matrix(c("a", "b", "a", "c", "b", "d"), nr=3, byrow=TRUE)) # input
     [,1] [,2]
[1,] "a"  "b"
[2,] "a"  "c"
[3,] "b"  "d"
> (split.data.frame(A, A[,1])) # output
$a
     [,1] [,2]
[1,] "a"  "b"
[2,] "a"  "c"

$b
     [,1] [,2]
[1,] "b"  "d"
EDIT: To clarify: I'd like to split the array/matrix, A, into a list of multiple arrays based on the unique values in the first column. That is, split A into one array where the first column has an "a", and another array where the first column has a "b".
I have tried the approach from "Python equivalent of R "split"-function", but this gives three arrays:
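import numpy as np
import itertools

A = np.array([["a", "b"], ["a", "c"], ["b", "d"]])
b = A[:, 0]  # grouping variable: the first column of A

def split(x, f):
    # first list: items of x whose selector in f is truthy; second list: the rest
    return list(itertools.compress(x, f)), list(itertools.compress(x, (not i for i in f)))

split(A, b)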
([array(['a', 'b'], dtype='<U1'),
array(['a', 'c'], dtype='<U1'),
array(['b', 'd'], dtype='<U1')],
[])
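As far as I can tell, the reason everything ends up in the first list is that itertools.compress only checks whether each selector is truthy, and every non-empty string is truthy. The following is a minimal sketch of that observation; the names labels, kept, rows_a and rows_b are only for illustration, and the boolean masks are hard-coded for this two-group example.

```python
import itertools
import numpy as np

A = np.array([["a", "b"], ["a", "c"], ["b", "d"]])
labels = A[:, 0]

# compress keeps an item when its selector is truthy;
# non-empty strings such as "a" and "b" are all truthy
truthy = [bool(s) for s in labels]
kept = list(itertools.compress(A, labels))  # all three rows survive

# an explicit boolean mask per label does pick out the intended rows
rows_a = A[labels == "a"]
rows_b = A[labels == "b"]
```

So compress would need one boolean selector per group rather than the raw labels, which does not obviously generalise to many groups.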
I also tried numpy.split, using np.split(A, b), but that needs integers. I thought I might be able to use "How to convert strings into integers in Python?" to convert the letters to integers, but even if I pass integers, it doesn't split as expected:
c = np.transpose(np.array([1,1,2]))
np.split(A, c) # returns 4 arrays
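My understanding is that np.split interprets the second argument as positions at which to cut along the first axis, not as group labels, which would explain the four pieces. A small sketch of what seems to happen (the names parts, shapes and two are only for illustration):

```python
import numpy as np

A = np.array([["a", "b"], ["a", "c"], ["b", "d"]])

# [1, 1, 2] is read as cut positions: A[:1], A[1:1], A[1:2], A[2:]
parts = np.split(A, [1, 1, 2])
shapes = [p.shape for p in parts]  # the second piece is empty

# a single cut at row 2 happens to give the two groups here,
# but only because the rows are already sorted by group
two = np.split(A, [2])
```

This would mean I need the row positions where the group changes, which only works if the rows are already ordered by group.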
Can this be done? Thanks.
EDIT: please note that this is a small example, and the number of groups may be greater than two and they may not be ordered.
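To make the requirement concrete, the behaviour I am after could be sketched roughly as below. This is only an illustration using one boolean mask per group, not necessarily an efficient or idiomatic solution; split_by_first_column is a hypothetical helper name, and I am assuming groups should come out in sorted label order (as R does with factor levels) with the original row order kept inside each group.

```python
import numpy as np

def split_by_first_column(arr):
    """Group the rows of a 2D array by the value in the first column."""
    keys = arr[:, 0]
    # np.unique returns the distinct labels in sorted order;
    # boolean indexing keeps the original row order within each group
    return {str(k): arr[keys == k] for k in np.unique(keys)}

# more than two groups, and rows not ordered by group
A2 = np.array([["b", "d"], ["a", "b"], ["c", "x"], ["a", "c"]])
groups = split_by_first_column(A2)
```

Here groups would map "a" to its two rows, then "b" and "c" to one row each.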