3

I want to save multiple specific links (image URLs) into a txt file (or any file where all the links can be listed one below the other).

I get them, but the code writes each link over the previous one, so in the end only one link remains :(. I also don't want repeated URLs.

def dlink(self, image_url):
        r = self.session.get(image_url, stream=True)
        with open('Output.txt','w') as f:
            f.write(image_url + '\n')
developer_hatch
TTT

4 Answers

3

The issue is most simply that opening a file with mode 'w' truncates any existing file. You should change 'w' to 'a' instead. This will open an existing file for writing, but append instead of truncating.

More fundamentally, the problem may be that you are opening the file over and over in a loop. This is very inefficient. The only time the approach you use could be really useful is if your program is approaching the OS-imposed limit on the number of open files. If this is not the case, I would recommend putting the loop inside the with block, keeping the mode as 'w' since you now open the file just once, and passing the open file to your dlink function.
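The restructuring described above could look roughly like the following. This is only a sketch: `save_links`, the temporary output path, and the example.com URLs are made up for illustration, and `dlink` is shown as a plain function rather than the asker's method.

```python
import os
import tempfile


def dlink(image_url, out_file):
    """Write one URL per line to an already-open text file."""
    out_file.write(image_url + '\n')


def save_links(image_urls, path):
    """Open the output file once and write every URL inside the with block."""
    with open(path, 'w') as f:
        for url in image_urls:
            dlink(url, f)


# Hypothetical URLs, used only for illustration.
urls = ['https://example.com/a.jpg', 'https://example.com/b.jpg']
out_path = os.path.join(tempfile.mkdtemp(), 'Output.txt')
save_links(urls, out_path)

# Read the file back to check that every link ended up on its own line.
with open(out_path) as f:
    written = f.read().splitlines()
```

Because the file is opened a single time, mode 'w' is safe here: the truncation happens once, before the first link is written.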

Mad Physicist
2

Edit

Huge mistake on my part: since this is a method and you will call it several times, opening the file in write mode ('w') or similar overwrites the existing file on every call. If you use 'a' instead, you get this behavior:

Opens a file for appending. The file pointer is at the end of the file if the file exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing.

The other problem is that image_url is a list, so you need to write it line by line:

def dlink(self, image_url):
        # image_url is assumed to be a list of URLs; set() drops duplicates
        # (the original order is not preserved)
        with open('Output.txt','a') as f:
            for url in set(image_url):
                f.write(url + '\n')

Another way to do it:

your_file = open('Output.txt', 'a')
# image_url is assumed to be a list of URLs
for url in set(image_url):
  your_file.write("%s\n" % url)
your_file.close()  # don't forget to close it :)
developer_hatch
  • Downvoter, could you please explain why you downvoted, so I can see the mistake and learn from it? – developer_hatch Jun 29 '17 at 00:26
  • No need for `+` in the mode – Mad Physicist Jun 29 '17 at 00:35
  • @MadPhysicist Yes, thank you. I noticed that after reading a little more. I had never been in this kind of situation and didn't take into account that it was a method, so it could be called more than once, causing those issues. I edited the answer. – developer_hatch Jun 29 '17 at 00:37
  • I did not downvote any answer; I appreciate all of them. And now it works as it should :) – TTT Jun 29 '17 at 00:37
  • @TTT Haha, no, I know you didn't :), and you couldn't even if you wanted to; for now you can only upvote and accept answers. I was addressing the user who did, but I don't know who it is, and there is no need to. – developer_hatch Jun 29 '17 at 00:39
  • @TTT Don't forget to accept the answer that fixed the problem! You can accept only one; it is good practice. Use the check mark at the top left of the answer. Good luck! I have always loved to help ;) – developer_hatch Jun 29 '17 at 00:42
  • @DamianLattenero Just one thing: in the newly created txt file the same link appears several times. How can I exclude that, so that each link appears only once? – TTT Jun 29 '17 at 00:48
  • @TTT You can use `for url in list(set(image_url)):`. Don't forget to vote and accept the answer :) – developer_hatch Jun 29 '17 at 01:01
  • Thank you so much for your time. I will accept your answer because it helped me the most. For cleaning the list, the answer there helped me more: https://stackoverflow.com/questions/1215208/how-might-i-remove-duplicate-lines-from-a-file – TTT Jun 29 '17 at 01:15
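Note that `set(image_url)` only removes duplicates within a single call; if the method is called once per URL, repeats can still reach the file. One possible way to clean the file afterwards, in the spirit of the linked question, is sketched below. The helper name `dedupe_lines`, the temporary path, and the sample file content are assumptions made for illustration.

```python
import os
import tempfile


def dedupe_lines(path):
    """Rewrite the file at `path`, keeping only the first copy of each line."""
    with open(path) as f:
        lines = f.read().splitlines()
    # dict.fromkeys keeps insertion order (Python 3.7+), unlike set().
    unique = list(dict.fromkeys(lines))
    with open(path, 'w') as f:
        f.writelines(line + '\n' for line in unique)
    return unique


# Hypothetical file content with repeated links.
path = os.path.join(tempfile.mkdtemp(), 'Output.txt')
with open(path, 'w') as f:
    f.write('a.jpg\nb.jpg\na.jpg\nc.jpg\nb.jpg\n')
result = dedupe_lines(path)
```

This reads the whole file into memory, which should be fine for a list of links but may not suit very large files.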
2

The file open mode is wrong: 'w' mode overwrites the file every time you open it instead of appending to it. Replace it with 'a' mode.

See https://stackoverflow.com/a/23566951/8178794 for more detail.
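The difference between the two modes can be checked with a small experiment. This is a minimal sketch; the temporary path and the placeholder strings stand in for real links.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'links.txt')

# Mode 'w' truncates on every open, so only the last write survives.
for url in ['first', 'second']:
    with open(path, 'w') as f:
        f.write(url + '\n')
with open(path) as f:
    after_w = f.read().splitlines()

# Mode 'a' keeps the existing content and adds to the end of the file.
for url in ['third', 'fourth']:
    with open(path, 'a') as f:
        f.write(url + '\n')
with open(path) as f:
    after_a = f.read().splitlines()
```

After the first loop the file holds a single line; after the second loop the appended lines follow it.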

laixiong
0

Opening a file with mode w overwrites the file if it exists; use mode a to append data to an existing file.

Try :

import requests
from os.path import splitext


# use mode='a' to append the result without erasing the existing file
def dlink(url, filename, mode='w'):
    r = requests.get(url)
    if r.status_code != 200:
        return
    # here the link is valid
    with open(filename, mode) as desc:
        # end each link with a newline so links end up on separate lines
        desc.write(url + '\n')


def dimg(img_url, img_name):
    r = requests.get(img_url, stream=True)
    if r.status_code != 200:
        return
    _, ext = splitext(img_url)
    with open(img_name + ext, 'wb') as desc:
        for chunk in r:
            desc.write(chunk)


dlink('https://image.flaticon.com/teams/slug/freepik.jpg', 'links.txt')
dlink('https://image.flaticon.com/teams/slug/freepik.jpg', 'links.txt', 'a')

dimg('https://image.flaticon.com/teams/slug/freepik.jpg', 'freepik')
glegoux