Filling the missing values in the specified format - Python

Question

I'm given a problem that explicitly asks me not to use numpy or pandas.

Problem:

Given a string of digits and '_' (missing value) symbols, you have to replace the '_' symbols as explained below.

Ex 1: _, _, _, 24 ==> 24/4, 24/4, 24/4, 24/4 i.e. we have distributed the 24 equally to all 4 places

Ex 2: 40, _, _, _, 60 ==> (60+40)/5,(60+40)/5,(60+40)/5,(60+40)/5,(60+40)/5 ==> 20, 20, 20, 20, 20 i.e. the sum (60+40) is distributed equally to all 5 places

Ex 3: 80, _, _, _, _  ==> 80/5,80/5,80/5,80/5,80/5 ==> 16, 16, 16, 16, 16 i.e. the 80 is distributed equally to all 5 places: its own and the 4 missing values to its right

Ex 4: _, _, 30, _, _, _, 50, _, _  
==> we will fill the missing values from left to right 
    a. first we will distribute the 30 to left two missing values (10, 10, 10, _, _, _, 50, _, _)
    b. now distribute the sum (10+50) over the missing values in between (10, 10, 12, 12, 12, 12, 12, _, _) 
    c. now we will distribute the 12 to the right-side missing values (10, 10, 12, 12, 12, 12, 4, 4, 4)
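
The arithmetic behind Ex 4 can be checked directly. This is only a sketch of the three steps above; the variable names (left, middle, right, filled) are made up for illustration and are not part of the problem statement.

```python
# Step a: 30 is shared by the two blanks on its left plus its own slot.
left = 30 / 3               # 10.0
# Step b: the new 10 and the 50 are shared across the 5 slots from index 2 to 6.
middle = (left + 50) / 5    # 12.0
# Step c: the new 12 is shared by its own slot plus the two blanks on its right.
right = middle / 3          # 4.0

# Two slots keep the left value, four keep the middle value, three get the right value.
filled = [left] * 2 + [middle] * 4 + [right] * 3
print(filled)
```

Note that each step reuses the value produced by the previous one, which is why the order (left to right) matters.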

For a given string of comma-separated values, which will contain both missing values and numbers, e.g. "_, _, x, _, _, _", you need to fill in the missing values.

Q: Your program reads a string like "_, _, x, _, _, _" and returns the filled sequence. Ex:

Input1: "_,_,_,24"
Output1: 6,6,6,6

Input2: "40,_,_,_,60"
Output2: 20,20,20,20,20

Input3: "80,_,_,_,_"
Output3: 16,16,16,16,16

Input4: "_,_,30,_,_,_,50,_,_"
Output4: 10,10,12,12,12,12,4,4,4

I'm trying to split the string into a list using the split function. I then scan for blanks on the left and count them; once I encounter a non-blank, I divide that number by the total count (the number of blanks encountered before the number, plus the number itself), spread the result, and replace the blanks to the left of the number.

Then I check for blanks between two numbers and apply the same logic, after which I do the same for the blanks on the right.

However, the code I shared below is throwing all sorts of errors, and I believe there are gaps in the logic described above, so I would appreciate insights on solving this issue.

def blanks(S):

  a= S.split()
  count = 0
  middle_store = 0
  #left
  for i in range(len(a)):
    if(a[i]=='_'):
      count = count+1  #find number of blanks to the left of a number
    else:
      for j in range(0,i+1):
        #if there are n blanks to the left of the number speard the number equal over n+1 spaces
        a[j] = str((int(a[i])/(count+1)))
        middle_store= i
    break  

  #blanks in the middle
  denominator =0
  flag = 0
  for k in len(middle_store+1,len(a)):
    if(a[k] !='_'):
      denominator = (k+1-middle_store)
      flag=k
    break

  for p in len(middle_store,flag+1):
    a[p] = str((int(a[p])/denominator))

  #blanks at the right 
  for q in len(flag,len(a)):
    a[q] = str((int(a[q])/(len(a)-flag+1)))

S=  "_,_,30,_,_,_,50,_,_"
print(blanks(S))

What errors? Have you tried debugging your code? — h4z3, Jul 24 '19 at 10:03

score 5 · Answer 1 · answered Jul 24 '19 at 10:35

Modular solution

# takes an array x and two indices a,b. 
# Replaces all the _'s with (x[a]+x[b])/(b-a+1)
def fun(x, a, b):
    if a == -1:
        v = float(x[b])/(b+1)
        for i in range(a+1,b+1):
            x[i] = v
    elif b == -1:
        v = float(x[a])/(len(x)-a)
        for i in range(a, len(x)):
            x[i] = v
    else:
        v = (float(x[a])+float(x[b]))/(b-a+1)
        for i in range(a,b+1):
            x[i] = v
    return x

def replace(text):
    # Create array from the string
    x = text.replace(" ","").split(",")
    # Get all the pairs of indices having number
    y = [i for i, v in enumerate(x) if v != '_']
    # Starting with _ ?
    if y[0] != 0:
        y = [-1] + y
    # Ending with _ ?
    if y[-1] != len(x)-1:
        y = y + [-1]    
    # run over all the pairs
    for (a, b) in zip(y[:-1], y[1:]): 
        fun(x,a,b)          
    return x

# Test cases
tests = [
    "_,_,_,24",
    "40,_,_,_,60",
    "80,_,_,_,_",
     "_,_,30,_,_,_,50,_,_"]

for i in tests:
    print (replace(i))
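
To see how the pairs of indices are built before fun is called, the first half of replace can be pulled out on its own. This is a sketch only: index_pairs is a hypothetical helper name, not part of the answer, and it assumes the input contains at least one number.

```python
def index_pairs(text):
    # Hypothetical helper: mirrors the first half of replace() above.
    x = text.replace(" ", "").split(",")
    # Positions of the known numbers.
    y = [i for i, v in enumerate(x) if v != "_"]
    # A leading -1 marks "blanks before the first number".
    if y[0] != 0:
        y = [-1] + y
    # A trailing -1 marks "blanks after the last number".
    if y[-1] != len(x) - 1:
        y = y + [-1]
    # Consecutive positions form the (a, b) pairs handed to fun().
    return list(zip(y[:-1], y[1:]))

print(index_pairs("_,_,30,_,_,_,50,_,_"))  # [(-1, 2), (2, 6), (6, -1)]
print(index_pairs("40,_,_,_,60"))          # [(0, 4)]
```

Each pair shares one index with the next, so the value written at that index by one call to fun is the value the next call reads.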

score 2 · Answer 2 · answered Jul 24 '19 at 10:15

First of all, you should pass a delimiter as an argument to the split method; by default, it splits on whitespace.

So "_,_,x,_,_,y,_".split() gives you ['_,_,x,_,_,y,_']

while "_,_,x,_,_,y,_".split(',') will give you ['_', '_', 'x', '_', '_', 'y', '_'].
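
The examples in the problem statement have a space after each comma, so it may also be worth stripping each token. A small sketch, assuming input in that spaced form (the names s and tokens are illustrative):

```python
s = "_, _, 30, _"

# Default split() breaks on whitespace, so the commas stay attached.
print(s.split())       # ['_,', '_,', '30,', '_']

# Splitting on ',' separates the values but keeps the leading spaces.
print(s.split(","))    # ['_', ' _', ' 30', ' _']

# Stripping each piece gives clean tokens that compare equal to '_'.
tokens = [t.strip() for t in s.split(",")]
print(tokens)          # ['_', '_', '30', '_']
```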

Secondly, in the "middle" and "right" loops, you need to replace len with range.

Because of the division, you are better off using float instead of int.
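
A short illustration of why float is the safer choice here, assuming Python 3 (where / is true division). The variable names are made up for the example.

```python
true_div = 50 / 4       # 12.5, true division always returns a float
floor_div = 50 // 4     # 12, floor division discards the remainder

# After one division the list holds strings such as '10.0'.
stored = str(30 / 3)    # '10.0'
as_float = float(stored)        # 10.0, parses fine
try:
    as_int = int(stored)
except ValueError:
    as_int = None               # int() cannot parse '10.0'

print(true_div, floor_div, stored, as_float, as_int)
```

So once a filled value has been written back as a string, reading it again with int fails, while float keeps working.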

Since you use it for division, you should initialize denominator to 1.

In the last loop, a[q] = str((int(a[q])/(len(a)-flag+1))) (and likewise the a[p] line) will raise an error because a[q] is "_". You need a variable to hold the a[flag] value.

Each break should be inside the if or else block; otherwise, the loop body runs only once.
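
The effect of the break placement can be seen in a stripped-down loop. Both functions below are hypothetical and only search for the first number; they are not taken from the question's code.

```python
def first_number_index_wrong(tokens):
    found = None
    for i in range(len(tokens)):
        if tokens[i] != "_":
            found = i
        break            # at loop level: runs on the first pass no matter what
    return found

def first_number_index(tokens):
    found = None
    for i in range(len(tokens)):
        if tokens[i] != "_":
            found = i
            break        # inside the if: leave only once a number is found
    return found

tokens = ["_", "_", "30", "_"]
print(first_number_index_wrong(tokens))  # None
print(first_number_index(tokens))        # 2
```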

Finally, for efficiency, you can move the middle_store assignment out of the j loop, to avoid assigning it on every iteration.

TL;DR: Try this:

def blanks(S):
    a = S.split(',')
    count = 0
    middle_store = 0
    middle_store_value = 0.0
    # left
    for i in range(len(a)):
        if a[i] == '_':
            count = count + 1  # find number of blanks to the left of a number
        else:
            # if there are n blanks to the left of the number, spread the number equally over n+1 spaces
            middle_store_value = float(a[i]) / (count + 1)
            for j in range(i + 1):
                a[j] = str(middle_store_value)
            middle_store = i
            break

    # blanks in the middle: repeat for every later number
    denominator = 1
    flag = middle_store
    for k in range(middle_store + 1, len(a)):
        if a[k] != '_':
            denominator = k + 1 - flag
            # keep the averaged value in a variable, since a[flag] gets overwritten
            middle_store_value = (middle_store_value + float(a[k])) / denominator
            for p in range(flag, k + 1):
                a[p] = str(middle_store_value)
            flag = k

    # blanks at the right
    last_value = middle_store_value
    for q in range(flag, len(a)):
        a[q] = str(last_value / (len(a) - flag))

    return a

S=  "_,_,30,_,_,_,50,_,_"
print(blanks(S))

PS: Did you even try to fix the errors, or are you just waiting for someone to solve your math problem?

score 0 · Answer 3 · answered Sep 23 '19 at 18:17

The problem in the question can also be solved in the following way. The code is not optimized or simplified, but it is written from a different perspective:

import re 
def curve_smoothing(S): 

    pattern = r'\d+'
    ls_num=re.findall(pattern, S)   # list of numerals present in the string
    spaces = re.split(pattern, S)  # split the string to separate the '_' runs

    if len(spaces[0])==0 and len(ls_num)==1:
        Space_num=len(re.findall('_',  S))
        sums=int(ls_num[0])
        repl_int=round(sums/(Space_num+1))
        S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
        S=re.sub('_', str(repl_int),S, Space_num)
        return S

    elif len(spaces[0])==0 and len(ls_num)>1:
        for i in range(1,len(spaces)):
            if i==1:
                Space_num=len(re.findall('_',  spaces[i]))
                sums=int(ls_num[i-1])+int(ls_num[(i)])
                repl_int=round(sums/(Space_num+2))
                S=re.sub(str(ls_num[i-1]), str(repl_int),S)
                S=re.sub('_', str(repl_int),S, Space_num)
                S=re.sub(str(ls_num[i]), str(repl_int),S,1)
                ls_num[i]=repl_int
            elif i<len(ls_num):
                Space_num=len(re.findall('_',  spaces[i]))
                sums=int(ls_num[i-1])+int(ls_num[(i)])
                repl_int=round(sums/(Space_num+2))
                S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
                S=re.sub('_', str(repl_int),S, Space_num)
                S=re.sub(str(ls_num[i]), str(repl_int),S,1)
                ls_num[i]=repl_int
            elif len(spaces[-1])!=0:
                Space_num=len(re.findall('_',  spaces[i]))
                repl_int=round(ls_num[(i-1)]/(Space_num+1))
                S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
                S=re.sub('_', str(repl_int),S, Space_num)
        return S


    else:
        for i in range(len(spaces)):
            if i==0:
                Space_num=len(re.findall('_',  spaces[i]))
                sums=int(ls_num[(i)])
                repl_int=round(sums/(Space_num+1))
                S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
                S=re.sub('_', str(repl_int),S, Space_num)
                S=re.sub(str(ls_num[i]), str(repl_int),S, 1)
                ls_num[i]=repl_int
            elif i>=1 and i<len(ls_num):
                Space_num=len(re.findall('_',  spaces[i]))
                sums=int(ls_num[i-1])+int(ls_num[(i)])
                repl_int=round(sums/(Space_num+2))
                S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
                S=re.sub('_', str(repl_int),S, Space_num)
                S=re.sub(str(ls_num[i]), str(repl_int),S,1)
                ls_num[i]=repl_int
            elif len(spaces[-1])!=0:
                Space_num=len(re.findall('_',  spaces[i]))
                repl_int=round(ls_num[(i-1)]/(Space_num+1))
                S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
                S=re.sub('_', str(repl_int),S, Space_num)
        return S

S1="_,_,_,24"
S2="40,_,_,_,60"
S3=  "80,_,_,_,_"
S4="_,_,30,_,_,_,50,_,_"
S5="10_,_,30,_,_,_,50,_,_"
S6="_,_,30,_,_,_,50,_,_20"
S7="10_,_,30,_,_,_,50,_,_20"

print(curve_smoothing(S1))
print(curve_smoothing(S2))
print(curve_smoothing(S3))
print(curve_smoothing(S4))
print(curve_smoothing(S5))
print(curve_smoothing(S6))
print(curve_smoothing(S7))
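
For readers following the regex approach above, it may help to see what the two re calls return for one of the inputs. This is a sketch; the names numbers, gaps and blanks_per_gap are illustrative and do not appear in the answer.

```python
import re

S = "_,_,30,_,_,_,50,_,_"

# Every run of digits, in order of appearance.
numbers = re.findall(r"\d+", S)
print(numbers)          # ['30', '50']

# The text between (and around) the numbers, still containing the commas.
gaps = re.split(r"\d+", S)
print(gaps)             # ['_,_,', ',_,_,_,', ',_,_']

# Counting '_' in each piece gives the number of blanks per gap.
blanks_per_gap = [g.count("_") for g in gaps]
print(blanks_per_gap)   # [2, 3, 2]
```

When the string starts or ends with a number, the first or last piece is an empty string, which is what the len(spaces[0])==0 and len(spaces[-1])!=0 checks in the answer rely on.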

score 0 · Answer 4 · answered Feb 27 '20 at 08:43

# _, _, 30, _, _, _, 50, _, _ 
def replace(string):
    lst=string.split(',')
    # Left: spread the first number over itself and the blanks before it.
    for i in range(len(lst)):
        if lst[i].isdigit():
            for j in range(i+1):
                lst[j]=int(lst[i])//(i+1)
            new_index=i
            new_value=int(lst[i])
            break
    # Middle: for every later number, average it with the previous filled value
    # over all the slots from the previous number up to this one.
    for i in range(new_index+1,len(lst)):
        if lst[i].isdigit():
            temp=(new_value+int(lst[i]))//(i-new_index+1)
            for j in range(new_index,i+1):
                lst[j]=temp
            new_index=i
            new_value=int(lst[i])
    # Right: spread the last value over itself and any trailing blanks.
    count=len(lst)-1-new_index  # number of blanks after the last number
    if count>0:
        temp1=new_value//(count+1)
        for i in range(new_index,len(lst)):
            lst[i]=temp1
    return lst

Welcome to SO. While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value. — Jack O'Neill, Feb 27 '20 at 09:28
Thank you sir. I will keep that in mind and try better approaches for future codes. — maahi, Feb 28 '20 at 13:42
Why only for the future? You can [edit] your answer and improve it right now... — Tomerikoo, Jun 02 '20 at 18:13

score 0 · Answer 5 · answered Feb 27 '20 at 11:24

def replace(string):
    lst=string.split(',')
    if lst[0].strip().isdigit():
        index0=0
        while True:
            index1=index0
            value1=int(lst[index0].strip())
            index2=index1
            for i in range((index1+1),len(lst)):
                if lst[i].strip().isdigit():
                    index2=i
                    break
            value2=0
            if index2>index1:
                value2=int(lst[index2].strip())
            else:
                index2=len(lst)-1
            value=str(int((value1+value2)/((index2-index1)+1)))
            for i in range(index1,index2+1):
                lst[i]=value
            index0=index2

            if index0>=(len(lst)-1):
                break

    else:
        index0=0
        while True:
            index1=index0
            value1=0
            if lst[index0].strip().isdigit():
                value1=int(lst[index0].strip())
            index2=index1
            for i in range((index1+1),len(lst)):
                if lst[i].strip().isdigit():
                    index2=i
                    break
            value2=0
            if index2>index1:
                value2=int(lst[index2].strip())
            else:
                index2=len(lst)-1
            value=str(int((value1+value2)/((index2-index1)+1)))
            for i in range(index1,index2+1):
                lst[i]=value
            index0=index2

            if index0>=(len(lst)-1):
                break

    return lst

string = "20,_,_,30, _, _,10,_,_,_,_,110"
print(replace(string))

score 0 · Answer 6 · answered Mar 19 '23 at 16:32

# write your python code here
# you can take the above example as sample input for your program to test
# it should work for any general input try not to hard code for only given input strings
#run your code in the function for each of the inputs mentioned above and make sure that you get the same results
def appendvalues(value,startIndex,endIndex,values_list):
  #values_list=[]
  for i in range(startIndex,endIndex):
    values_list[i]=value
    #.append(value)
  #return values_list

def calculate_missing_values(values_list):
  filled_values=[]
  filled_positions=[]
  for i in range(len(values_list)):
    if(values_list[i].isdigit()):
      if(len(filled_positions) ==0):      
        missingvalues= int(int(values_list[i]) / (i+1))
        appendvalues(missingvalues,0,i+1,values_list)
      else:
        missingvalues= int((int(filled_values[len(filled_values)-1])+int(values_list[i])) / ((i+1)-filled_positions[len(filled_positions)-1]))
        appendvalues(missingvalues,filled_positions[len(filled_positions)-1],i+1,values_list)
      filled_positions.append(i)
      filled_values.append(int(values_list[i]))
  if(filled_positions[len(filled_positions)-1] != len(values_list)-1):  # blanks remain after the last number
    missingvalues= int(int(values_list[filled_positions[len(filled_positions)-1]])/(len(values_list)- filled_positions[len(filled_positions)-1]))
    appendvalues(missingvalues,filled_positions[len(filled_positions)-1],len(values_list),values_list)
  return values_list

# you can free to change all these codes/structure
def curve_smoothing(string):
    # your code
    values_list = string.split(',')
    filled_values=calculate_missing_values(values_list)
    return filled_values#list of values

S=  "_,_,30,_,_,_,50,_,_"
smoothed_values= curve_smoothing(S)
print(smoothed_values)

Saran Koundinya · Answer 7 · 2020-06-03T04:38:13.913

"Check this; it works for all the inputs."

def replace(s):
    lst=s.split(",")

    if lst[0].isdigit():
        # start from the first number so index/value are set even if no other number follows
        index=0
        value=int(lst[0])
        for i in range(1,len(lst)):
            if lst[i].isdigit():
                value=(int(lst[0])+int(lst[i]))//((i+1))
                for j in range(0,i+1):
                    lst[j]=value
                index=i
                break
    else:
        for i in range(len(lst)):
            if lst[i].isdigit():
                for j in range(i+1):
                    lst[j]=(int(lst[i]))//(i+1)
                index=i
                value=int(lst[i])
                break
    # remaining numbers: average each one with the previous filled value
    for i in range(index+1,len(lst)):
        if lst[i].isdigit():
            temp=(value+int(lst[i]))//(i-index+1)
            for j in range(index,i+1):
                lst[j]=temp
            index=i
            value=int(lst[i])

    try :
        for i in range(index+1,len(lst)):
            if not(lst[i].isdigit()):
                count=lst.count('_')
                break
        temp1=value//(count+1)
        for i in range(index,len(lst)):
            lst[i]=temp1
    except UnboundLocalError as e:
        # no trailing blanks: count was never assigned
        print (e)
    return lst

Using a bare except like that is bad practice, see, for example https://stackoverflow.com/questions/54948548/what-is-wrong-with-using-a-bare-except. — AMC, Jun 03 '20 at 01:00
You can use this instead of a bare except: except UnboundLocalError as e: print (e) — Saran Koundinya, Jun 03 '20 at 04:35
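
Another option is to avoid the exception altogether by counting the trailing blanks from the list length. The sketch below assumes the position and value of the last number are already known (as index and value are in the answer above); fill_trailing is a hypothetical helper name.

```python
def fill_trailing(lst, last_index, last_value):
    # Hypothetical helper: spreads last_value over its own slot
    # and the blanks that follow it.
    trailing = len(lst) - 1 - last_index   # number of blanks after the last number
    if trailing > 0:
        share = last_value // (trailing + 1)
        for i in range(last_index, len(lst)):
            lst[i] = share
    return lst

print(fill_trailing([10, 10, 12, 12, 12, 12, 12, "_", "_"], 6, 12))
# [10, 10, 12, 12, 12, 12, 4, 4, 4]
print(fill_trailing([6, 6, 6, 6], 3, 6))
# [6, 6, 6, 6]
```

With an explicit check there is nothing to catch, so no try/except is needed for the case where the sequence ends in a number.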
