This is what I have: a list that contains sublists.

A= [['filename.yaml','0001'],['filename.yaml','0001'],['filename.yaml','0001'], ['fname.yaml','0002'], ['fname.yaml','0002']]

What I want is to rename the first element of each sublist when the sublist is present more than once. The output should be:

[['filename_0.yaml','0001'],['filename_1.yaml','0001'],['filename_2.yaml','0001'], ['fname_0.yaml','0002'], ['fname_1.yaml','0002']]

This is my code:

def asso_name_id(A):
    for sublist in A:
        if A.count(sublist)>1:
            for i in range(A.count(sublist)):
                base=os.path.splitext(os.path.basename(sublist[0]))[0]
                sublist[0]=base+"_"+str(i)+'.yaml'

This is what I get with this code:

[['filename_0_1_2.yaml', '0001'], ['filename_0_1.yaml', '0001'], ['filename.yaml', '0001'], ['fname_0_1.yaml', '0002'], ['fname.yaml', '0002']]

What am I doing wrong and how can I fix it?

esenes221
  • You need to start by getting the indices of the duplicate elements. There is a nice suggestion for doing that [here](https://stackoverflow.com/a/5419576/3254859). – Chris Mueller Jun 09 '17 at 12:53

4 Answers

You can try this:

from itertools import chain
A= [['filename.yaml','0001'],['filename.yaml','0001'],['filename.yaml','0001'], ['fname.yaml','0002'], ['fname.yaml','0002']]

flattened = list(chain(*A))

new_dict = {}

for i in A:
    if i[0] not in new_dict:
        new_dict[i[0]] = 1

    else:
        new_dict[i[0]] += 1

final_list = []

for i in A:
    first = i[0].split(".")
    new = first[0]+"_"+str(abs(new_dict[i[0]]-flattened.count(i[0])))
    # split(".") drops the dot, so put it back before the extension
    final_list.append([new+"."+first[1], i[1]])

    new_dict[i[0]] -= 1

print(final_list)

Output:

[['filename_0.yaml', '0001'], ['filename_1.yaml', '0001'], ['filename_2.yaml', '0001'], ['fname_0.yaml', '0002'], ['fname_1.yaml', '0002']]
Ajax1234

For each sublist, you count how many identical sublists there are and then repeat the rename that many times on that same sublist. That is why the second sublist is renamed only twice: by then the first one has already been renamed and is no longer identical, so only two identical sublists are detected. Instead, try this:

#!/usr/bin/python3 
import os
A= [['filename.yaml','0001'],['filename.yaml','0001'],['filename.yaml','0001'], ['fname.yaml','0002'],['fname.yaml','0002']]

def asso_name_id(A):
    for sublist in A:
        if A.count(sublist) > 1:
            # remember the original name before any sublist is renamed
            sublist_name = sublist[0]
            count = 0
            # rename every sublist that still carries the original name
            for s_list in A:
                if s_list[0] == sublist_name:
                    base = os.path.splitext(os.path.basename(s_list[0]))[0]
                    s_list[0] = base + "_" + str(count) + ".yaml"
                    count += 1
    return A

print(asso_name_id(A))

Output:

[['filename_0.yaml', '0001'], ['filename_1.yaml', '0001'], ['filename_2.yaml', '0001'], ['fname_0.yaml', '0002'], ['fname_1.yaml', '0002']]
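
A variation on the same idea, as a sketch rather than a drop-in replacement: count every sublist once up front, then number the duplicates in a single pass, so the counts cannot change while renaming. The function name `rename_duplicates` and the use of `collections.Counter` are my own choices here, and it assumes both elements of each sublist are strings. It returns a new list instead of modifying `A` in place.

```python
import os
from collections import Counter, defaultdict

def rename_duplicates(pairs):
    # Hypothetical helper: count each (name, id) pair once, before renaming anything.
    totals = Counter(tuple(pair) for pair in pairs)
    seen = defaultdict(int)  # copies of each pair emitted so far
    result = []
    for name, ident in pairs:
        key = (name, ident)
        if totals[key] > 1:
            base, ext = os.path.splitext(name)
            result.append([base + "_" + str(seen[key]) + ext, ident])
            seen[key] += 1
        else:
            # Sublists that occur only once keep their original name.
            result.append([name, ident])
    return result

A = [['filename.yaml', '0001'], ['filename.yaml', '0001'], ['filename.yaml', '0001'],
     ['fname.yaml', '0002'], ['fname.yaml', '0002'], ['single.yaml', '0003']]
print(rename_duplicates(A))
```

The extra `['single.yaml', '0003']` entry is only there to show that a sublist without duplicates is left unchanged.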
dheiberg
from collections import Counter

first = [name for name, _ in A]
second = [ident for _, ident in A]
list(zip([key + "_" + str(inc) for key, value in Counter(first).items() for inc in range(value)], second))

This is not a complete solution: you still need to split the file name from the .yaml extension. Order of operations:

1. Get just the file names.
2. Get just the '000X's.
3. Count the file names (using the Counter class).
4. For each file name, create new file names with _Y appended, where Y runs from 0 up to the number of occurrences minus one.
5. Zip this with the result of step 2.
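
The steps above can be sketched end to end as follows. This is one possible completion, not the original snippet: it uses `os.path.splitext` for the extension split, and it assumes identical file names sit next to each other in `A` (as in the question), because the new names are generated group by group and then zipped back against the ids in their original order.

```python
import os
from collections import Counter

A = [['filename.yaml', '0001'], ['filename.yaml', '0001'], ['filename.yaml', '0001'],
     ['fname.yaml', '0002'], ['fname.yaml', '0002']]

names = [name for name, _ in A]    # step 1: just the file names
ids = [ident for _, ident in A]    # step 2: just the '000X's
counts = Counter(names)            # step 3: keeps first-seen order on Python 3.7+

new_names = []
for name, total in counts.items():  # step 4: one numbered name per occurrence
    base, ext = os.path.splitext(name)
    new_names.extend(base + "_" + str(n) + ext for n in range(total))

result = [list(pair) for pair in zip(new_names, ids)]  # step 5
print(result)
```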

Andrew Cassidy

A simple solution using a list comprehension:

import itertools as IT

A= [['filename.yaml','0001'],['filename.yaml','0001'],['filename.yaml','0001'], ['fname.yaml','0002'], ['fname.yaml','0002']]

counter1 = IT.count(0)
counter2 = IT.count(0)

A = [['filename_{0}.yaml'.format(next(counter1)),sub_list[1]] 
       if sub_list[0]=='filename.yaml' else ['fname_{0}.yaml'.format(next(counter2)),sub_list[1]] 
       for sub_list in A ]

print(A)

Output:

[['filename_0.yaml', '0001'], ['filename_1.yaml', '0001'], ['filename_2.yaml', '0001'], ['fname_0.yaml', '0002'], ['fname_1.yaml', '0002']]

List comprehension explained

For each list (sub_list) in A: if sub_list[0], i.e. the first element of sub_list, is 'filename.yaml', format it as 'filename_{0}.yaml'; otherwise format it as 'fname_{0}.yaml'. Here {0} is replaced by the current value of the counter.

Use a counter starting from 0 and use next() to increment the counter.

Note: use two counters.
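
If the file names are not known in advance, the two hard-coded counters could be replaced by one counter per name, created on demand. A minimal sketch, assuming `collections.defaultdict` with `itertools.count` as the factory; the names `counters`, `numbered` and `renamed` are just for illustration. Note that it numbers every name, including names that occur only once.

```python
import os
from collections import defaultdict
from itertools import count

A = [['filename.yaml', '0001'], ['filename.yaml', '0001'], ['filename.yaml', '0001'],
     ['fname.yaml', '0002'], ['fname.yaml', '0002']]

# One independent counter per distinct file name; count() starts at 0.
counters = defaultdict(count)

def numbered(name):
    # Hypothetical helper: insert the next index for this name before the extension.
    base, ext = os.path.splitext(name)
    return '{0}_{1}{2}'.format(base, next(counters[name]), ext)

renamed = [[numbered(sub_list[0]), sub_list[1]] for sub_list in A]
print(renamed)
```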

void
  • doesn't this limit the input pretty heavily? what if you input something with sublists where sublist[0] is 'filepath.yaml'? – dheiberg Jun 09 '17 at 13:28
  • Thanks for the answer. But the solution should be generic. I mean it has to work regardless of the contents of the sublists. – esenes221 Jun 09 '17 at 13:31