I am working with list data and want to filter it. Basically, I want to remove all the recurring elements from a list:

data = ['0260','1234','02/03/2020','1245','1.1','','','112233','abcd','','','',
        '0260','1235','02/03/2021','1215','1.2','','','112233','abcd','','','',
        '0260','1224','02/03/2023','1225','3.1','','','112233','abcd','','','',
        '0260','1114','02/03/2024','1235','6.1','','','112233','abcd','','','',
        '0260','1224','02/03/2025','1265','17.1','','','112233','abcd','','','',
        '0260','1564','02/03/2026','1275','19.1','','','112233','abcd','','','',
        '0260','1904','02/03/2027','1295','11.1','','','112233','abcd','','','',
       ]


desired_result = ['0260','1234','02/03/2020','1245','1.1',
                '0260','1235','02/03/2021','1215','1.2',
                '0260','1224','02/03/2023','1225','3.1',
                '0260','1114','02/03/2024','1235','6.1',
                '0260','1224','02/03/2025','1265','17.1',
                '0260','1564','02/03/2026','1275','19.1',
                '0260','1904','02/03/2027','1295','11.1',
       ]

I tried multiple approaches, but none of them seems to work. I tried a list comprehension and a while loop:

while 2 in x:
    x.remove(2)

However, it only removes one element at a time. I want to go through all the list elements, check each one, and then delete multiple elements accordingly.

cconsta1
Ravi Dawade
  • With cconsta1's edit reformatting the code, it doesn't look like you really want to eliminate duplicates, since `'0260'` appears repeatedly in your desired output, and several repeated elements at the end of the lines don't appear at all any more. Can you be clearer about what determines which elements should remain versus which are removed? – Blckknght Mar 21 '23 at 22:48
  • So I want to go through the list and keep '0260','1234','02/03/2020','1245','1.1'; after this I want to delete all the elements until '0260' comes again – Ravi Dawade Mar 21 '23 at 22:51
  • I'm not sure of the problem description. Why should `'112233'` be removed? Because it appears again **elsewhere in the list**? But in that case, why should `'0260'` **not** be removed? That also appears repeatedly in the list - with the same spacing pattern, even. – Karl Knechtel Mar 21 '23 at 23:17
  • I can see three possibilities: 1) you want to remove values **that are duplicate and adjacent** - i.e., remove all the `''` because they only appear in groups of two or more consecutively. That's the duplicate I linked. 2) You want to remove **anything that appears more than once** - we have other duplicates for this, and it will depend on whether you need to preserve the order. 3) The **real** requirement is more complicated than this, and you haven't properly figured it out yet. – Karl Knechtel Mar 21 '23 at 23:19
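
The first two readings in the comment above can be sketched roughly as follows. This is only an illustration: the function names `drop_adjacent_runs` and `drop_repeated_values` are made up here, and the sample is the first two rows of the question's list. Neither reading reproduces the desired result, which supports the third possibility.

```python
from collections import Counter
from itertools import groupby


def drop_adjacent_runs(items):
    """Reading 1: drop every value that sits in a run of two or more equal neighbours."""
    result = []
    for value, run in groupby(items):
        # groupby yields one group per run of consecutive equal values
        if len(list(run)) == 1:
            result.append(value)
    return result


def drop_repeated_values(items):
    """Reading 2: drop every value that occurs more than once anywhere, keeping order."""
    counts = Counter(items)
    return [value for value in items if counts[value] == 1]


# First two rows of the list from the question
sample = ['0260', '1234', '02/03/2020', '1245', '1.1', '', '', '112233', 'abcd', '', '', '',
          '0260', '1235', '02/03/2021', '1215', '1.2', '', '', '112233', 'abcd', '', '', '']

# Reading 1 removes the '' runs but keeps '112233' and 'abcd'
print(drop_adjacent_runs(sample))

# Reading 2 removes '', '112233' and 'abcd', but also removes '0260'
print(drop_repeated_values(sample))
```

Reading 1 leaves `'112233'` and `'abcd'` in place, and reading 2 also strips the `'0260'` markers, so the actual rule has to depend on position within each record rather than on duplication.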

1 Answer

Following the comments, here is code that defines a list of data and then filters it to keep only the start of each record. A record begins at each occurrence of the string '0260', and the filtered data consists of five elements per record: the '0260' marker itself plus the four elements that follow it. Everything else up to the next '0260' is dropped. The filtered data is then printed:

data = ['0260','1234','02/03/2020','1245','1.1','','','112233','abcd','','','',
        '0260','1235','02/03/2021','1215','1.2','','','112233','abcd','','','',
        '0260','1224','02/03/2023','1225','3.1','','','112233','abcd','','','',
        '0260','1114','02/03/2024','1235','6.1','','','112233','abcd','','','',
        '0260','1224','02/03/2025','1265','17.1','','','112233','abcd','','','',
        '0260','1564','02/03/2026','1275','19.1','','','112233','abcd','','','',
        '0260','1904','02/03/2027','1295','11.1','','','112233','abcd','','','',
       ]

# Positions where each record begins (every '0260' marker).
start_indexes = [i for i, x in enumerate(data) if x == '0260']

# Keep the marker plus the four fields after it; skip the rest of each record.
filtered_data = []
for start in start_indexes:
    filtered_data.extend(data[start:start + 5])

print(filtered_data)

cconsta1
  • It won't work if my data changes. Can I use indexing, so that after every 5th element the remaining elements are deleted until it starts again from '0260'? – Ravi Dawade Mar 21 '23 at 22:58
  • If these elements_to_remove = ['', '112233', 'abcd'] change all the time in my data, then I cannot hard-code them. I wanted to index the nth elements so that, whatever data comes, it removes everything until the '0260' code starts again – Ravi Dawade Mar 21 '23 at 23:00
  • You can modify the code to make it more dynamic by finding the indexes of '0260' and then deleting the required elements after the 5th element until the next '0260' is encountered. Check my updated code – cconsta1 Mar 21 '23 at 23:02
  • This helped, I can make it more dynamic. Thanks a lot for the quick help – Ravi Dawade Mar 21 '23 at 23:06
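
One way the "more dynamic" variant discussed in these comments might look is sketched below. It is not the answer author's code: the function name `keep_record_heads` and its parameters `marker` and `keep` are hypothetical, and it assumes the marker value never occurs among the fields that should be kept. Nothing about the discarded values is hard-coded; only the marker and the number of elements to keep are supplied.

```python
def keep_record_heads(items, marker='0260', keep=5):
    """Return the first `keep` items of every record that starts at `marker`.

    Items before the first marker, and items past the first `keep`
    positions of a record, are skipped.
    """
    result = []
    remaining = 0  # how many more items of the current record to keep
    for item in items:
        if item == marker:
            remaining = keep  # a new record starts here
        if remaining > 0:
            result.append(item)
            remaining -= 1
    return result


# Two records whose trailing values differ, to show nothing is hard-coded
data = ['0260', '1234', '02/03/2020', '1245', '1.1', '', '', '112233', 'abcd', '', '', '',
        '0260', '1235', '02/03/2021', '1215', '1.2', '', '', 'x', 'y']

print(keep_record_heads(data))
# ['0260', '1234', '02/03/2020', '1245', '1.1', '0260', '1235', '02/03/2021', '1215', '1.2']

print(keep_record_heads(data, keep=3))
# ['0260', '1234', '02/03/2020', '0260', '1235', '02/03/2021']
```

Because it walks the list once and counts down from each marker, this version does not need the list of start indexes and works on any iterable, at the cost of being a little less direct than slicing.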