1

From the list below, how can I select elements by index: indices 1 and 2, then 5 and 6, then 9 and 10? The numbers and text are not relevant; only the order is. The basic idea is as follows: suppose you have features a, b, c, d, and for each of them you have the mean, standard deviation, min and max. If you are only interested in showing features b and c, how do you show them?

column = []
for i in range(1, 4):
    for j in 'abcd':
        column.append(j + str(i))
print(column)
# ['a1', 'b1', 'c1', 'd1', 'a2', 'b2', 'c2', 'd2', 'a3', 'b3', 'c3', 'd3']

How can I extract the values at indices 1, 2, 5, 6, 9, 10 so that the result is:

['b1', 'c1', 'b2', 'c2', 'b3', 'c3']
Aran-Fey
mile.d
  • 1
    Can you update with what the pattern is specifically? – Grant Williams Apr 26 '18 at 18:16
  • do you want to get them by index, or by matching? – RishiG Apr 26 '18 at 18:16
  • by index, so index 1,2, then 5,6 then 9,10 – mile.d Apr 26 '18 at 18:19
  • Looking at patterns from [oeis](https://oeis.org/), I don't see much that looks relevant. If you are indexing from 1, then it could be the numbers such that x^2 + y^2 + z^2 = the index, or it could be 2 or 3 mod 5 if it is from index 0. That is still not very helpful :/ – Grant Williams Apr 26 '18 at 18:20
  • okay so you want all of the b's and c's in order of their number? – Grant Williams Apr 26 '18 at 18:21
  • 2
    Please [edit] that information about the indices into the question. – Aran-Fey Apr 26 '18 at 18:21
  • In the title you say you want "every nth subset", then in the comments you say you want to get the items by index, and the two answers down there are filtering the list based on the first letter of the string. Seeing how there are 3 different problems being asked/solved here, I'm voting to close this question as unclear until we have a proper problem statement. – Aran-Fey Apr 26 '18 at 18:26
  • Do you mean something like this? `idx = [1, 2, 5, 6, 9, 10]; print([column[i] for i in idx])` – PM 2Ring Apr 27 '18 at 09:16
  • Related: [Select list elements based on indices](https://stackoverflow.com/questions/2621674/how-to-extract-elements-from-a-list-using-indices-in-python). Also related: [Select list elements based on a boolean mask](https://stackoverflow.com/questions/18665873/filtering-a-list-based-on-a-list-of-booleans) – Aran-Fey Apr 27 '18 at 10:20

3 Answers

3

One way is to use itertools.compress and itertools.cycle. It basically uses a mask to repeatedly select the elements at indices 1 and 2 for every 4-element chunk.

import itertools as it

print(list(it.compress(column, it.cycle([0, 1, 1, 0]))))
# ['b1', 'c1', 'b2', 'c2', 'b3', 'c3']
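
The same mask idea can be wrapped in a small helper for any chunk width and any set of positions. This is only a sketch: `select_positions`, `width`, and `keep` are hypothetical names chosen for illustration, not part of `itertools`.

```python
import itertools as it

def select_positions(items, width, keep):
    """Keep the items whose position within each width-sized chunk is in keep."""
    # Build the mask for one chunk, e.g. width=4, keep={1, 2}
    # gives [False, True, True, False].
    mask = [pos in keep for pos in range(width)]
    # cycle() repeats the mask indefinitely; compress() stops when items runs out.
    return list(it.compress(items, it.cycle(mask)))

# Rebuild the list from the question so the example is self-contained.
column = [letter + str(n) for n in range(1, 4) for letter in 'abcd']
print(select_positions(column, 4, {1, 2}))
# ['b1', 'c1', 'b2', 'c2', 'b3', 'c3']
```

Because `compress` stops at the end of the shorter input, a trailing incomplete chunk still contributes whichever selected positions it contains.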
Jonathan
0

You can use regular expressions to find all columns that start with the features you want to identify.

import re

def get_columns(arr, sw):
  return re.findall(r'(?:{})\d+'.format('|'.join(sw)), ''.join(arr))

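The function takes the list of column names (`arr`) and a list of feature prefixes to match (`sw`). It joins the column names into one string and returns every occurrence of one of the prefixes followed by one or more digits.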

In action:

In [6]: y = ['featureA1', 'featureB1', 'featureC1', 'featureA2', 'featureB2', 'featureC2']

In [7]: def get_columns(arr, sw):
   ...:   return re.findall(r'(?:{})\d+'.format('|'.join(sw)), ''.join(arr))

In [8]: get_columns(y, ['featureA'])
Out[8]: ['featureA1', 'featureA2']
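
For the list in the question, the same filter by leading feature name can also be sketched without a regular expression, because `str.startswith` accepts a tuple of prefixes. `select_features` is a hypothetical helper name, and the sketch assumes every entry begins with its feature name.

```python
def select_features(columns, features):
    """Return the entries of columns that begin with any of the given feature names."""
    prefixes = tuple(features)  # str.startswith needs a tuple, not a list
    # The list comprehension keeps the original order of columns.
    return [name for name in columns if name.startswith(prefixes)]

# Rebuild the list from the question so the example is self-contained.
column = [letter + str(n) for n in range(1, 4) for letter in 'abcd']
print(select_features(column, ['b', 'c']))
# ['b1', 'c1', 'b2', 'c2', 'b3', 'c3']
```

Note that this is a plain prefix test: a feature name that is itself a prefix of another (for example `a` and `ab`) would match both.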
user3483203
  • I'm curious to know whether the numbers are always 1, 2, 3 in order, or whether that also needs to be flexible. I appreciate that your approach allows for changes in the starting letter, though. – Grant Williams Apr 26 '18 at 18:24
  • This will find the columns in the same order they appear in the string. If he wants the columns in order of their number, he could sort by the number for each column. – user3483203 Apr 26 '18 at 18:25
  • No, numbers are not relevant, the order is relevant. The basic idea behind is as follows: Suppose you have feature a,b,c,d and for all of them you have mean, standard deviation, min and max. If you only interested in showing feature b and c, how to show them? – mile.d Apr 26 '18 at 18:28
  • 2
    @mile.d Put the clarification in the question, not a comment. – Barmar Apr 26 '18 at 18:29
  • @mile.d then my second approach should work, you can pass a list of "features", i.e., `['a', 'b']`, and you will get all entries in your list starting with `a` or `b` and ending with a number. – user3483203 Apr 26 '18 at 18:33
0

You can use a range with a step of 4, then access the two list elements starting from that index.

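result = []
# Start at index 1 and advance 4 at a time: i = 1, 5, 9, ...
for i in range(1, len(column) - 1, 4):
    result.append(column[i])      # first element of the pair
    result.append(column[i + 1])  # second element of the pair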

Caveat: this will only return elements where both elements of each index pair exist. E.g. it won't return element 9 if there's no 10.
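
If a trailing unpaired element should be kept instead, one possible variant uses slicing, which never raises on a short tail. This is a sketch under that assumption; `take_pairs` and its parameters are hypothetical names, not part of the answer above.

```python
def take_pairs(items, start=1, step=4, size=2):
    """Collect up to `size` items beginning at every `step`-th index from `start`."""
    result = []
    for i in range(start, len(items), step):
        # Slicing past the end simply returns fewer items instead of raising.
        result.extend(items[i:i + size])
    return result

# Rebuild the list from the question so the example is self-contained.
column = [letter + str(n) for n in range(1, 4) for letter in 'abcd']
print(take_pairs(column))
# ['b1', 'c1', 'b2', 'c2', 'b3', 'c3']
print(take_pairs(column[:10]))
# ['b1', 'c1', 'b2', 'c2', 'b3']
```

The second call shows the difference in behavior: with only ten elements, index 9 is still returned even though index 10 does not exist.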

Barmar
  • Keep in mind that this will omit some results if the length of the input is 2 or 6 or 10, etc. An input of length 3 gives 2 numbers as output, but an input of length 2 gives 0. – Aran-Fey Apr 27 '18 at 09:00
  • Yes, this code only returns the paired elements that he described. It won't return item 9 if there's no 10 to go with it. – Barmar Apr 27 '18 at 19:46