I wrote a simple function that processes a list of links and extracts useful information from them. Inside the function, I call print to show which element is currently being processed, but the output is not what I expected. Here are my list and my code.

links = ['https://www.theguardian.com/world/2020/nov/18/test-and-trace',
 'https://www.theguardian.com/world/2000/jan/27/3',
 'https://www.theguardian.com/world/2020/nov/14/israeli-agents-in-iran-kill',
 'https://www.theguardian.com/world/2020/nov/10/nagorno-karabakh-peace-deal',
 'https://www.theguardian.com/world/2020/dec/06/professor-neil-ferguson',
 'https://www.theguardian.com/world/2020/nov/15/south-australia-records-three',
 'https://www.theguardian.com/world/2000/feb/28/gender.uk2']

and my code:

import pandas as pd

def tidy_links(links):
    # make an empty dataframe for putting all links in a tidy manner
    df = pd.DataFrame(columns=['cat', 'year', 'month', 'day', 'url', 'name'])

    # loop over links
    for i in range(len(links)):
        print('Processing link number ', i, 'out of', len(links), end = '\r')

        # add the data to the dataframe
        s = links[i].split('/')
        name = s[-5] + '_' + s[-4] + '_' + s[-3] + '_' + s[-2] + '_' + s[-1]
        df.loc[len(df)] = [s[-5], s[-4], s[-3], s[-2], links[i], name]
    return df

and this is the output:

df = tidy_links(links)
Processing link number  3060 out of 3061 0108 out of 3061 0238 out of 3061 0494 out of 3061 0802 out of 3061 1186 out of 3061 2265 out of 3061
Mehdi Abbassi
  • `the output is not what I expected` What did you expect? The print function call is doing exactly what you have coded it to do. If you want the prints on separate lines, then remove the `end="\r"` keyword arg. –  Jul 26 '21 at 09:47
  • I expected to have `Processing link number 3060 out of 3061` but with the first number changing. – Mehdi Abbassi Jul 26 '21 at 10:03

2 Answers


I believe this is what you're after:

def tidy_links(links):
    # make an empty dataframe for putting all links in a tidy manner
    df = pd.DataFrame(columns=['cat', 'year', 'month', 'day', 'url', 'name'])

    # loop over links
    for i in range(len(links)):
        print('Processing link number %d out of %d \r' % (i, len(links)), end = '\r')

        # add the data to the dataframe
        s = links[i].split('/')
        name = s[-5] + '_' + s[-4] + '_' + s[-3] + '_' + s[-2] + '_' + s[-1]
        df.loc[len(df)] = [s[-5], s[-4], s[-3], s[-2], links[i], name]
    return df

Source:

How do I write output in same place on the console?
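
As a rough illustration of the same-line technique, here is a minimal sketch that puts the carriage return at the start of each message and flushes the stream after every write. The helper names `progress_message` and `show_progress` and the placeholder list are assumptions made for this example, not part of the original code, and how `\r` is rendered still depends on the console or notebook in use.

```python
import sys
import time


def progress_message(done, total):
    # A leading '\r' moves the cursor back to the start of the line,
    # so each message is written over the previous one.
    return '\rProcessing link number %d out of %d' % (done, total)


def show_progress(done, total, stream=sys.stdout):
    # write() adds no newline; flush() pushes the text out immediately
    # instead of leaving it in the output buffer.
    stream.write(progress_message(done, total))
    stream.flush()


if __name__ == '__main__':
    links = ['a', 'b', 'c']  # placeholder data for the demo
    for i, link in enumerate(links, start=1):
        show_progress(i, len(links))
        time.sleep(0.1)  # stand-in for the real per-link work
    print()  # finish the progress line with a newline
```

Because the counter only grows, each message is at least as long as the one before it, so no leftover characters remain. A message that can shrink would need padding with spaces.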

Sameet

I found the solution! I just replaced end = '\r' with end = ''!
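
To see what the two `end` values actually send to the output, the following sketch captures the printed text and shows it with `repr`. The helper name `captured` and the short `'link'` message are assumptions used only for illustration. How a terminal or notebook then renders a bare `\r` varies between environments.

```python
import io
from contextlib import redirect_stdout


def captured(end):
    # Collect everything print() writes for three iterations.
    buf = io.StringIO()
    with redirect_stdout(buf):
        for i in range(3):
            print('link', i, end=end)
    return buf.getvalue()


print(repr(captured('\r')))  # 'link 0\rlink 1\rlink 2\r'
print(repr(captured('')))    # 'link 0link 1link 2'
```

With `end=''` nothing separates the messages, so any return to the start of the line has to come from the message text itself.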

Mehdi Abbassi