I've been using R for data analysis and am trying to learn Python. In R, I can create vectors with c(), which gives me back a "column" built from whatever I pass it. I often use it to concatenate sequences or repeated values. Something like this:
> test <- c(rep(1:2, each = 2), seq(5, 10, by = 2), runif(3))
> test
[1] 1.0000000 1.0000000 2.0000000 2.0000000 5.0000000 7.0000000 9.0000000
[8] 0.9237168 0.5051230 0.2367923
What is the Pythonic way to do this (guessing with pandas or numpy)?
This question is the closest I've found, but it only combines range() objects. Trying to do the above in Python, storing the output as a pd.Series, I tried:
import numpy as np
import pandas as pd

test = pd.Series([np.repeat([1, 2], 2),
                  np.arange(5, 10, 2),
                  np.random.random_sample(3)])
That gets me a sort of nested thing:
0 [1, 1, 2, 2]
1 [5, 7, 9]
2 [0.989736164378, 0.558979301843, 0.385354683044]
dtype: object
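As far as I can tell, the nesting happens because pd.Series treats each array in the outer list as a single element rather than unpacking it. A small sketch to check that (the name nested is just illustrative):

```python
import numpy as np
import pandas as pd

# Same construction as above: a list holding three separate arrays.
nested = pd.Series([np.repeat([1, 2], 2),
                    np.arange(5, 10, 2),
                    np.random.random_sample(3)])

print(len(nested))           # one element per array, not per number
print(nested.dtype)          # object, since each element is an ndarray
print(type(nested.iloc[0]))  # the first element is itself a numpy array
```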
I see that I could flatten the list manually, but that seems like overkill. I magically googled onto this question, which contained the potentially helpful tolist() function, which I hadn't heard of. While that's about getting some row of dataframes (??) into a pd.Series, the function seems like it might do the trick?
Combining the fact that I can use + to concatenate lists (gleaned from the first linked question) with the tolist() bit from the last one, I came up with this:
test1 = np.repeat([1, 2], 2).tolist()
test2 = np.arange(5, 10, 2).tolist()
test3 = np.random.random_sample(3).tolist()
test = pd.Series(test1 + test2 + test3)
0 1.000000
1 1.000000
2 2.000000
3 2.000000
4 5.000000
5 7.000000
6 9.000000
7 0.472650
8 0.077398
9 0.672734
dtype: float64
Hopefully what I'm trying to do is clear. I like that with c(), you pass in whatever you want and can elegantly string together a series of generated numbers in a desired pattern. I was surprised by how tough it was to do this with a pd.Series and infer from that that I'm doing it wrong!
How is this typically done in Python?
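For reference, a minimal sketch of one possible approach, assuming numpy's np.concatenate as the closest analogue to c(): it joins the arrays directly, so no tolist() round trip is needed. The names rng, parts, and alt are illustrative, and the seeded generator is only there to make the sketch repeatable. np.r_ is shown as a terser alternative that also accepts slice notation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

parts = [
    np.repeat([1, 2], 2),  # like rep(1:2, each = 2)
    np.arange(5, 10, 2),   # like seq(5, 10, by = 2); the stop value is excluded
    rng.random(3),         # like runif(3)
]

# Join the arrays end to end; mixing ints and floats upcasts to float64.
test = pd.Series(np.concatenate(parts))
print(test)

# np.r_ does the same in one expression; 5:10:2 is shorthand for np.arange(5, 10, 2).
alt = np.r_[np.repeat([1, 2], 2), 5:10:2, rng.random(3)]
print(alt)
```

Either way the result is a flat, 10-element float vector, which is what c() returns in the R example.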