I'm getting a bit more used to Python now, and my professor has started teaching us about lists, string manipulation, and list slicing. He proposed an exercise that I'm fairly sure can be solved with slices alone, without any specialized built-in functions, but I have no idea how to start thinking about the problem. It goes as follows:

"Create a function, let's call it "generate_n_gaps", that takes as parameters a string 'dna' of any length, made up of any combination of the characters in the string DNA = 'ATCG' (just like the DNA sequences we see in biology) and a gap that we will denote as GAP = '_', and an integer parameter 'n'. The function must return a list with all variations of 'dna' containing up to n extra gaps, without repetition."

Here are a couple of examples:

In [1]: generate_n_gaps( 'T', 2 )
Out[1]: ['T', '_T', 'T_', '__T', '_T_', 'T__']

In [2]: generate_n_gaps( 'CA', 2 )
Out[2]: ['CA', '_CA', 'C_A', 'CA_', '__CA', '_C_A', '_CA_', 'C__A', 'C_A_', 'CA__']

In [3]: generate_n_gaps( 'C_A', 2 )
Out[3]: ['C_A', '_C_A', 'C__A', 'C_A_', '__C_A', '_C__A', '_C_A_', 'C___A', 'C__A_', 'C_A__']

The function should be defined as follows:

def generate_n_gaps( dna, n = 1 ):

As requested, I have worked on the problem a bit and written code that generates the variations needed. I first wrote a helper function that inserts a single gap, and used it in the main function.

def generate_n_gaps( dna, n = 1 ):
    last=generate_gaps(dna)
    a=len(last)
    for i in range(a):
        b=generate_gaps(last[i])
        last.append(b)
    return last


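GAP = '_'  # gap character, as defined in the exercise

def generate_gaps( dna ):
    # Insert one GAP at every possible position of dna.
    comb=[]
    for i in range(0 , len(dna)+1):
        partial=dna[:i]+GAP+dna[i:]
        comb.append(partial)
    # Drop duplicates while keeping the original order.
    last=[]
    for i in comb:
        if i not in last:
            last.append(i)
    return last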

This gets me the variations I need, but the list my function returns is a bit messy. How would I go about cleaning it up? That is, is there a way to 'combine' the lists so that the inner lists are merged into the main list? I reckon that once I manage that, removing the duplicates won't really be an issue.

This is what my function returns for the first example:

In [1]: generate_n_gaps( 'T', 2 )
Out[1]: ['_T', 'T_', ['__T', '_T_'], ['_T_', 'T__']]
  • This is a very broad question, and we ask that questions include a [mcve] showing code for what you've tried so far based on your own research. Regarding what functions, if any, to use: that falls under recommendations for tools and libraries and is opinion-based, both of which are explicitly off topic. It might be worth just starting with `for i in range(n):` and seeing what you come up with, then asking a more specific question if you hit a more specific problem. – G. Anderson Jun 16 '22 at 17:52
  • If it were me I would, however, look into [itertools](https://docs.python.org/3/library/itertools.html) as a way to make things easier. – G. Anderson Jun 16 '22 at 17:52
  • Wouldn't say it's that broad, just need a direction pretty much, the code I was using was pretty messy, and was only just a draft, which is the reason I didn't include it. As for using itertools, it's off limits, was explicitly told not to use import. – Nicholas Boscolo Jun 16 '22 at 19:01
  • "just need a direction pretty much" unfortunately, this is a question and answer site, not a general discussion forum or code-writing service, and we ask for _specific_ questions so that we can provide specific answers. Your messy draft code is great and very welcome, as it gives potential answerers a place to start knowing how to help. On the face of it, the best we can really do is point you toward an existing question like [Insert some string into string at given index](https://stackoverflow.com/questions/4022827/insert-some-string-into-given-string-at-given-index) – G. Anderson Jun 16 '22 at 20:21

1 Answer

This will flatten your list of lists by one level. Hope it helps.

l = ['_T', 'T_', ['__T', '_T_'], ['_T_', 'T__']]

flat_l = []
for item in l:
    if isinstance(item, str):
        flat_l.append(item)
    else:
        for j in item:
            flat_l.append(j)

print(flat_l)
# ['_T', 'T_', '__T', '_T_', '_T_', 'T__']
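
Note that the flattened list above still contains '_T_' twice. If you would rather avoid both the nesting and the duplicates in the first place, one option is to build the result level by level and only add strings you have not seen yet. Below is a minimal sketch of that idea, not necessarily what your professor has in mind. It assumes GAP = '_' as in the exercise and uses only slicing, with no imports. The helper name `insert_one_gap` and the variables `current` and `next_level` are my own, not from the question.

```python
GAP = '_'  # gap character, as given in the exercise


def insert_one_gap(dna):
    """Return every distinct string obtained by inserting one GAP into dna."""
    variants = []
    for i in range(len(dna) + 1):
        candidate = dna[:i] + GAP + dna[i:]
        if candidate not in variants:  # keep first occurrence only
            variants.append(candidate)
    return variants


def generate_n_gaps(dna, n=1):
    """Return dna plus all variations with 1..n extra gaps, without repeats."""
    result = [dna]   # zero extra gaps
    current = [dna]  # strings produced in the previous round
    for _ in range(n):
        next_level = []
        for s in current:
            for candidate in insert_one_gap(s):
                if candidate not in next_level:
                    next_level.append(candidate)
        # extend() adds the strings themselves; append() would nest the list
        result.extend(next_level)
        current = next_level
    return result


print(generate_n_gaps('T', 2))
# ['T', '_T', 'T_', '__T', '_T_', 'T__']
```

The key difference from the code in the question is `extend` instead of `append`, plus looping `n` times rather than a fixed two rounds. The `not in` checks are slow for long inputs, but they keep the order shown in the exercise's examples.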
Ritwick Jha