This sounds similar to my last question, but it is a different one.
I can create "incrementally growing" samples from a DataFrame like this:
import numpy as np
# df: a typical float DataFrame with 0.5-1 million rows and 20-50 columns ...
arr = np.asarray(df)
# res[i - 1] holds the first i rows of arr
res = [arr[:i] for i in range(1, df.shape[0] + 1)]
print(res)
>>>[
[
["2019-06-17 08:45:00", 12089.89, 12089.89, 12087.71, 12087.71 ]
],
[
["2019-06-17 08:45:00", 12089.89, 12089.89, 12087.71, 12087.71 ],
["2019-06-17 08:46:00", 12087.91, np.nan, 12087.71, 12087.91 ]
],
[
["2019-06-17 08:45:00", 12089.89, 12089.89, 12087.71, 12087.71 ],
["2019-06-17 08:46:00" , 12087.91 , np.nan, 12087.71, 12087.91 ],
["2019-06-17 08:47:00" , 12088.21 , 12088.21, 12084.21 , 12085.21 ]
],
[
["2019-06-17 08:45:00", 12089.89, 12089.89 , 12087.71 , 12087.71 ],
["2019-06-17 08:46:00" , 12087.91 , np.nan, 12087.71, 12087.91 ],
["2019-06-17 08:47:00" , 12088.21 , 12088.21 , 12084.21 , 12085.21 ],
["2019-06-17 08:48:00" , 12085.09 , 12090.21 , 12084.91 , 12089.41 ]
],
[
["2019-06-17 08:45:00", 12089.89, 12089.89 , 12087.71 , 12087.71 ],
["2019-06-17 08:46:00" , 12087.91 , np.nan, 12087.71, 12087.91 ],
["2019-06-17 08:47:00" , 12088.21 , 12088.21 , 12084.21 , 12085.21 ],
["2019-06-17 08:48:00" , 12085.09 , 12090.21 , 12084.91 , 12089.41 ],
["2019-06-17 08:49:00" , 12089.71 , 12090.21 , 12087.21 , 12088.21 ]
]
]
The samples are intentionally not equally shaped, so I want to pad them with np.nan rows.
Important: the np.nan rows can be anywhere in a sample, as long as they don't alter the original rows. They can sit randomly between the rows, but must not change them.
TL;DR: I need to keep the order of the original data rows and leave their values unchanged, but otherwise fill each sample with np.nan rows, at any position, until all samples have the same length as the longest one. How can I do this in a time-efficient manner?
The ideal result looks like this (further below is another possible outcome with random np.nan row positioning):
print(new_res)
>>>
[
[
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
["2019-06-17 08:45:00", 12089.89, 12089.89, 12087.71, 12087.71 ]
],
[
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
["2019-06-17 08:45:00", 12089.89, 12089.89, 12087.71, 12087.71 ],
["2019-06-17 08:46:00", 12087.91, np.nan, 12087.71, 12087.91 ]
],
[
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
["2019-06-17 08:45:00", 12089.89, 12089.89, 12087.71, 12087.71 ],
["2019-06-17 08:46:00" , 12087.91 , np.nan, 12087.71, 12087.91 ],
["2019-06-17 08:47:00" , 12088.21 , 12088.21, 12084.21 , 12085.21 ]
],
[
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
["2019-06-17 08:45:00", 12089.89, 12089.89, 12087.71 , 12087.71 ],
["2019-06-17 08:46:00" , 12087.91 , np.nan, 12087.71, 12087.91 ],
["2019-06-17 08:47:00" , 12088.21 , 12088.21, 12084.21 , 12085.21 ],
["2019-06-17 08:48:00" , 12085.09 , 12090.21, 12084.91 , 12089.41 ]
],
[
["2019-06-17 08:45:00", 12089.89, 12089.89 , 12087.71 , 12087.71 ],
["2019-06-17 08:46:00" , 12087.91 , np.nan, 12087.71, 12087.91 ],
["2019-06-17 08:47:00" , 12088.21 , 12088.21 , 12084.21 , 12085.21 ],
["2019-06-17 08:48:00" , 12085.09 , 12090.21 , 12084.91 , 12089.41 ],
["2019-06-17 08:49:00" , 12089.71 , 12090.21 , 12087.21 , 12088.21 ]
]
]
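For reference, the front-padded layout above could be produced roughly as follows. This is only a minimal sketch, not a tuned solution: the helper name `pad_front` and the small `demo` array are made up for illustration, and it assumes `arr` is a 2-D array that is either numeric or of object dtype (as when a timestamp column is present). The dense result has shape (n, n, cols), which grows quadratically with the number of rows, so at the sizes mentioned above it would probably have to be built in batches.

```python
import numpy as np

def pad_front(arr):
    """Hypothetical helper: sample i holds arr[:i + 1] at the bottom, NaN rows on top."""
    arr = np.asarray(arr)
    n, cols = arr.shape
    # Keep object dtype if the data contains e.g. timestamp strings (assumption).
    dtype = object if arr.dtype == object else float
    out = np.full((n, n, cols), np.nan, dtype=dtype)
    # Number of NaN rows placed in front of sample i.
    shift = n - 1 - np.arange(n)
    # Source row of arr for every (sample, slot) pair; negative means "padding".
    src = np.arange(n)[None, :] - shift[:, None]
    mask = src >= 0
    # One vectorized assignment copies all original rows into place.
    out[mask] = arr[src[mask]]
    return out

demo = np.arange(6, dtype=float).reshape(3, 2)  # illustrative data only
padded = pad_front(demo)
print(padded.shape)  # (3, 3, 2)
```

The original rows keep their order and values; only NaN rows are added in front of them.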
Sample with randomly placed np.nan rows:
print(new_res)
>>>
[
[
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
["2019-06-17 08:45:00", 12089.89, 12089.89, 12087.71, 12087.71 ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ]
],
[
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
["2019-06-17 08:45:00", 12089.89, 12089.89, 12087.71, 12087.71 ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
["2019-06-17 08:46:00", 12087.91, np.nan, 12087.71, 12087.91 ]
],
[
["2019-06-17 08:45:00", 12089.89, 12089.89, 12087.71, 12087.71 ],
["2019-06-17 08:46:00" , 12087.91 , np.nan, 12087.71, 12087.91 ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
["2019-06-17 08:47:00" , 12088.21 , 12088.21, 12084.21 , 12085.21 ]
],
[
["2019-06-17 08:45:00", 12089.89, 12089.89, 12087.71 , 12087.71 ],
["2019-06-17 08:46:00" , 12087.91 , np.nan, 12087.71, 12087.91 ],
[ np.nan, np.nan, np.nan, np.nan, np.nan ],
["2019-06-17 08:47:00" , 12088.21 , 12088.21, 12084.21 , 12085.21 ],
["2019-06-17 08:48:00" , 12085.09 , 12090.21, 12084.91 , 12089.41 ]
],
[
["2019-06-17 08:45:00", 12089.89, 12089.89 , 12087.71 , 12087.71 ],
["2019-06-17 08:46:00" , 12087.91 , np.nan, 12087.71, 12087.91 ],
["2019-06-17 08:47:00" , 12088.21 , 12088.21 , 12084.21 , 12085.21 ],
["2019-06-17 08:48:00" , 12085.09 , 12090.21 , 12084.91 , 12089.41 ],
["2019-06-17 08:49:00" , 12089.71 , 12090.21 , 12087.21 , 12088.21 ]
]
]
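The randomly positioned variant could be sketched like this. Again, this is just an illustration under assumptions: `pad_random`, its `seed` parameter and the `demo` data are hypothetical names, and the per-sample loop is written for clarity rather than speed. For each sample it draws as many distinct slots as there are original rows, sorts them so the row order is preserved, and leaves every other slot as a NaN row.

```python
import numpy as np

def pad_random(arr, seed=None):
    """Hypothetical helper: sample i holds arr[:i + 1] in order, at random slots."""
    arr = np.asarray(arr)
    n, cols = arr.shape
    rng = np.random.default_rng(seed)
    # Keep object dtype if the data contains e.g. timestamp strings (assumption).
    dtype = object if arr.dtype == object else float
    out = np.full((n, n, cols), np.nan, dtype=dtype)
    for i in range(n):
        k = i + 1  # number of original rows in sample i
        # Pick k distinct slots out of n; sorting keeps the original row order.
        slots = np.sort(rng.choice(n, size=k, replace=False))
        out[i, slots] = arr[:k]
    return out

demo = np.arange(8, dtype=float).reshape(4, 2)  # illustrative data only
scattered = pad_random(demo, seed=0)
print(scattered.shape)  # (4, 4, 2)
```

Dropping the all-NaN rows from any sample gives back exactly the original rows of that sample, in their original order.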