
I know this question has been asked before, but I'm struggling to get the code to work. The output of my scrape contains '\n' characters, which I need to remove.

Here's the code I am using to scrape:

import bs4 as bs
import urllib.request

source = urllib.request.urlopen('https://en.wikipedia.org/wiki/List_of_motorway_service_areas_in_the_United_Kingdom#:~:text=Only%2020%20motorway%20services%20in,leases%20to%20private%20operating%20companies.').read()
soup = bs.BeautifulSoup(source,'lxml')
table = soup.table
table_rows = table.find_all('tr')

for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)

And here is the output:

['\n', 'Abington services\n', 'Welcome Break[2]\n', 'M74\n', 'South Lanarkshire\n', 'The service station is one of fourteen for which large murals were commissioned from artist David Fisher in the 1990s, designed to reflect the local area and history.[3]\n']
['\n', 'Annandale Water services\n', 'RoadChef\n', 'A74(M)\n', 'Dumfriesshire\n', '[4]\n']
['\n', 'Baldock services\n', 'Extra\n', 'A1(M)\n', 'Hertfordshire\n', '[5]\n']
Teege

1 Answer

for tr in table_rows:
    td = tr.find_all('td')
    # strip() trims leading/trailing whitespace (including '\n') from each cell;
    # the `if` clause drops cells that are empty once stripped
    row = [i.text.strip() for i in td if i.text.strip()]
    print(row)
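
To see what the list comprehension does without hitting the network, here is a minimal sketch that applies the same strip-and-filter step to two rows copied from the output in the question. The helper name `clean_row` is only for illustration and is not part of Beautiful Soup.

```python
def clean_row(cells):
    """Strip surrounding whitespace from each cell and drop cells left empty."""
    return [text.strip() for text in cells if text.strip()]


# Sample rows copied from the output shown in the question.
rows = [
    ['\n', 'Annandale Water services\n', 'RoadChef\n', 'A74(M)\n', 'Dumfriesshire\n', '[4]\n'],
    ['\n', 'Baldock services\n', 'Extra\n', 'A1(M)\n', 'Hertfordshire\n', '[5]\n'],
]

cleaned = [clean_row(row) for row in rows]
for row in cleaned:
    print(row)
```

Note that the `if` filter removes the empty first cell, so the remaining values shift one position to the left. If column positions matter, keep the `strip()` call but drop the `if` clause so that empty cells stay in the list as `''`.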
Vishal Singh