I want to write code that goes through the lists in the vals array one by one, using one list for each unique digit_vals value. Each digit_vals value gives a position in the expected output. Since the first value in digit_vals is 24, all positions before it are filled with zeroes and position 24 holds a value from vals. Because 24 appears twice in digit_vals, the 2nd element of the first list in vals ([-3.3, -4.3, 23.05, 23.08, 23.88, 3.72]), which is -4.3, goes at position 24 of the Expected Output. Because 27 appears four times, the 4th element of the 2nd list in vals goes at position 27, and so on. The gaps between the digit_vals values are filled with zeroes as well, so between 24 and 27 there are 2 zeroes, at positions 25 and 26. How can I write a function that produces the Expected Output below?

import pandas as pd 
import numpy as np 

digit_vals = np.array([24, 24, 27, 27, 27, 27,
                       28, 28, 28, 31])

vals = np.array([list([-3.3, -4.3, 23.05, 23.08, 23.88, 3.72]),
 list([2.3, 2.05, 3.08, -4.88, 4.72]),
 list([5.3, 2.05, 6.08, -13.88, -17.2]),
 list([9.05, 6.08, 3.88, -13.72])], dtype=object)

Expected Output:

array([  0.         ,   0.        ,   0.        ,   0.        ,
          0.        ,   0.        ,   0.        ,   0.        ,
          0.        ,   0.        ,   0.        ,   0.        ,
          0.        ,   0.        ,   0.        ,   0.        ,
          0.        ,   0.        ,   0.        ,   0.        ,
          0.        ,   0.        ,   0.        ,   0.        ,
         -4.3,      ,   0.        ,   0.        ,  -4.88      ,
          6.08,     ,   0         ,   9.05])
bull selcuk

1 Answer

First off, if I understand your question correctly, your output array should be one element longer, with one more zero between the 6.08 value and the 9.05 value, because the 9.05 should be at index position 31 (the other values match the index positions specified in digit_vals). The hardest part of this question is transforming the information in the digit_vals array into two index lists: one that selects the right element from each list in vals, and one that gives the matching position in the output array. Because you're already using numpy, I think this is a reasonable approach:

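val_ind = []  # index into each list in vals (count - 1)
out_ind = []  # position in the output array (the unique digit_vals value)
# np.bincount returns how often each integer 0..max(digit_vals) occurs
for ind, cnt in enumerate(np.bincount(digit_vals)):
    if cnt > 0:
        val_ind.append(cnt-1)
        out_ind.append(ind)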

Calculate the number of occurrences of each value in digit_vals and use that count (minus one for zero-based indexing) as the index into the corresponding list within the vals array. Each unique number in digit_vals is identified by capturing the index of each nonzero count. This assumes digit_vals is ordered, as in the question's example, so that the k-th unique value lines up with the k-th list in vals.
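To make the two index lists concrete, here is a small check on the question's digit_vals. As a sketch of an alternative (not part of the loop above), it uses np.unique with return_counts=True, which returns the sorted unique values and their counts in one call:

```python
import numpy as np

digit_vals = np.array([24, 24, 27, 27, 27, 27, 28, 28, 28, 31])

# Sorted unique values and how many times each one occurs.
uniq, counts = np.unique(digit_vals, return_counts=True)

out_ind = uniq.tolist()          # positions in the output array
val_ind = (counts - 1).tolist()  # zero-based index into each list in vals

print(out_ind)  # [24, 27, 28, 31]
print(val_ind)  # [1, 3, 2, 0]
```

These are the same values the bincount loop collects: 24 occurs twice, so index 1 of the first list is used, 27 occurs four times, so index 3 of the second list, and so on.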

Once you have the index lists built, it is straightforward to build the output array:

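# one slot per index 0..max(digit_vals), zero everywhere by default
out_arr = np.zeros(np.max(digit_vals)+1)
for r_ind, (v_ind, o_ind) in enumerate(zip(val_ind, out_ind)):
    # r_ind selects the list in vals, v_ind the element within that list
    out_arr[o_ind] = vals[r_ind][v_ind]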

Again, the enumeration provides the row index for extracting the correct row's data from the vals array. I've confirmed this reproduces the output array you provided, including the fix noted above. Hopefully I understood your question correctly, and made reasonable assumptions. If so, please update your question with a little more detail describing assumptions, etc.
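If you want this as a single reusable function, the two steps can be combined roughly as follows. This is only a sketch: the name scatter_by_count is made up for illustration, and it assumes digit_vals is a non-empty array of non-negative integers and that vals holds exactly one list per unique value, in ascending order of those values.

```python
import numpy as np

def scatter_by_count(digit_vals, vals):
    """Hypothetical helper: place one value per unique entry of digit_vals."""
    # Sorted unique values (output positions) and their occurrence counts.
    out_ind, counts = np.unique(digit_vals, return_counts=True)
    if len(out_ind) != len(vals):
        raise ValueError("need one list in vals per unique value in digit_vals")
    # Zero-filled output that is long enough to hold the largest position.
    out_arr = np.zeros(out_ind.max() + 1)
    for row, o_ind, cnt in zip(vals, out_ind, counts):
        out_arr[o_ind] = row[cnt - 1]  # count - 1 is the zero-based index
    return out_arr

digit_vals = np.array([24, 24, 27, 27, 27, 27, 28, 28, 28, 31])
vals = [
    [-3.3, -4.3, 23.05, 23.08, 23.88, 3.72],
    [2.3, 2.05, 3.08, -4.88, 4.72],
    [5.3, 2.05, 6.08, -13.88, -17.2],
    [9.05, 6.08, 3.88, -13.72],
]

result = scatter_by_count(digit_vals, vals)
print(result[24:].tolist())  # [-4.3, 0.0, 0.0, -4.88, 6.08, 0.0, 0.0, 9.05]
```

The result has 32 elements, with 9.05 at index 31 as discussed at the top of this answer. A row that is shorter than its count raises an IndexError, which is probably what you want for malformed input.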

  • Thank you, that works. – bull selcuk Jan 10 '22 at 23:20
  • Hi, could you take a look at this question as well? It is related to this one; it just fetches the max and min values as well. https://stackoverflow.com/questions/70660762/getting-the-max-min-and-last-index-of-formatted-arrays-numpy-python – bull selcuk Jan 11 '22 at 01:47