
I want to download the images from a Wikipedia page, so I wrote this program. The txt file is saved with all of the links, but I don't know how to continue the program so it downloads the files. Can someone help me?

from urllib.request import urlopen
from bs4 import BeautifulSoup
import wikipedia
import re

title = input("Title: ")
link = wikipedia.page(title).url
html = urlopen(link)
bs = BeautifulSoup(html, 'html.parser')
# collect every <img> whose src contains ".jpg"
images = bs.find_all('img', {'src': re.compile(r'\.jpg')})
with open("cache.txt", "w") as f:
    for image in images:
        f.write('https:' + image['src'] + '\n')

3 Answers


You can use the wget module to download files.

pip install wget

To download a file using wget:

wget.download(url)

You then have to go through each line of your txt file and download the file with wget.

Python code:

import wget
import csv

with open("cache.txt", "r") as f:
    reader = csv.reader(f)        # csv.reader strips the trailing newline from each line
    for row in reader:
        wget.download(row[0])     # saves the image into the current directory

I found this, maybe it helps... It downloads one image, but for the rest I get urllib.error.HTTPError: HTTP Error 404: Not Found:

import wget
import csv

with open('cache.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    for row in spamreader:
        wget.download(', '.join(row))
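
The 404s are probably not wget's fault; common culprits are stray whitespace left in the URLs or the server rejecting requests that lack a User-Agent header. Here is a minimal sketch of a hardened download loop, assuming the links live in cache.txt as in the question; the User-Agent value and the skip-on-error handling are my own additions, not part of the original code:

import urllib.request
import urllib.error

# Send an explicit User-Agent, since some servers reject Python's default one.
opener = urllib.request.build_opener()
opener.addheaders = [('User-Agent', 'Mozilla/5.0 (image downloader script)')]
urllib.request.install_opener(opener)

with open('cache.txt') as f:
    for line in f:
        url = line.strip()           # a trailing '\n' in the URL breaks the request
        if not url:
            continue
        filename = url.rsplit('/', 1)[-1]
        try:
            urllib.request.urlretrieve(url, filename)
        except urllib.error.HTTPError as e:
            print(f'{url}: {e}')     # report and keep going instead of aborting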

I solved it. This is the code:

from urllib.request import urlopen, urlretrieve
from bs4 import BeautifulSoup
import wikipedia
import re

title = input("Title: ")
link = wikipedia.page(title).url
html = urlopen(link)
bs = BeautifulSoup(html, 'html.parser')
images = bs.find_all('img', {'src': re.compile(r'\.jpg')})
with open("cache.txt", "w") as f:
    for image in images:
        f.write('https:' + image['src'] + '\n')

with open('cache.txt') as f:
    for line in f:
        url = line.strip()                    # drop the trailing newline
        path = 'image' + url.split('/')[-1]   # e.g. imagePhoto.jpg in the current directory
        urlretrieve(url, path)
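
If you don't actually need the intermediate cache.txt, you can also download each image as soon as you find it. A minimal sketch using requests (which your code already imports but never uses); the filename scheme and the error reporting here are my own assumptions:

import re
import requests
import wikipedia
from bs4 import BeautifulSoup

title = input("Title: ")
html = requests.get(wikipedia.page(title).url).text
bs = BeautifulSoup(html, 'html.parser')

for image in bs.find_all('img', {'src': re.compile(r'\.jpg')}):
    url = 'https:' + image['src']
    filename = 'image' + url.rsplit('/', 1)[-1]
    response = requests.get(url)
    if response.ok:
        with open(filename, 'wb') as out:   # image data is binary, so write in 'wb' mode
            out.write(response.content)
    else:
        print(f'skipped {url}: HTTP {response.status_code}')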