
I'm writing a Python script that should return a list of the departure times of all of today's trains from a certain stop (see the POST parameters), but for some reason it only returns the last train.

Current code:

import requests
from bs4 import BeautifulSoup
import datetime
import calendar


def get_todays_trains():
    now = datetime.datetime.now()

    url = 'https://www.cp.pt/sites/passageiros/en/train-times/Train-time-results'

    r = requests.post(url, allow_redirects=False, data={
        'arrival': 'Porto - Campanha',
        'depart': 'Aguas Santas - Palmilheira',
        'departDate': str(now.year) + '-' + str(now.month) + '-' + str(now.day),
        'Date': str(now.day) + ' ' + calendar.month_name[now.month] + ', ' + str(now.year)
    })

    html = r.text
    soup = BeautifulSoup(html, 'html.parser')

    for row in soup.findAll('tbody')[1].tbody.findAll('tr'):
        depart = row.findAll('td')[2]

    print(depart)
    print('departDate: ' + str(now.year) + '-' + str(now.month) + '-' + str(now.day))
    print('Date: ' + str(now.day) + ' ' + calendar.month_name[now.month] + ', ' + str(now.year))

    return depart


get_todays_trains()

If you don't want to go to the page, here's a stripped down version of the HTML from the page:

https://pastebin.com/bfkAr6sH

O Tal Antiquado
  • It's because you're overwriting the value of `depart` each time you go through the loop. Depending on your use case, you probably want to do something like put the values together into a list, and return that. – Robin Zigmond Oct 18 '18 at 14:38
  • Possible duplicate of [python BeautifulSoup parsing table](https://stackoverflow.com/questions/23377533/python-beautifulsoup-parsing-table) – stovfl Oct 18 '18 at 15:33

1 Answer


As Robin says, you have to collect the temporary values in a list and return that. My suggestion would be to use a dictionary that holds all the values, such as the departure date and any other data you need. For example:

train_data = dict()
train_data['departing_date'] = str(now.year) + '-' + str(now.month) + '-' + str(now.day)
train_data['other_data'] = 'something you need'
train_data['departing_trains'] = []
# Append every row's departure cell instead of overwriting a single variable
for row in soup.find_all('tbody')[1].tbody.find_all('tr'):
    depart = row.find_all('td')[2]
    train_data['departing_trains'].append(depart)
# Return the whole dictionary, not the loop variable
return train_data

The returned dictionary will be easy to parse and more pythonic too.
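
To try the idea without hitting the site, here is a minimal sketch that runs against a small inline table. The markup, the train numbers, and the `parse_departures` helper name are all made up for illustration; only the position of the departure time (third cell of each row in the second table) follows the selectors used in this thread, so adjust them to the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the results page: two tables,
# with the departure time in the third cell of each row of the second one.
SAMPLE_HTML = """
<table><tbody><tr><td>search form</td></tr></tbody></table>
<table>
  <tbody>
    <tr><td>U</td><td>15501</td><td>06h12</td><td>06h25</td></tr>
    <tr><td>U</td><td>15503</td><td>07h42</td><td>07h55</td></tr>
    <tr><td>U</td><td>15599</td><td>23h52</td><td>00h05</td></tr>
  </tbody>
</table>
"""


def parse_departures(html):
    """Collect the departure time of every row instead of only the last one."""
    soup = BeautifulSoup(html, 'html.parser')
    train_data = {'departing_trains': []}
    for row in soup.find_all('table')[1].tbody.find_all('tr'):
        cells = row.find_all('td')
        # get_text(strip=True) returns the cell's text without the <td> tag
        train_data['departing_trains'].append(cells[2].get_text(strip=True))
    # Return the accumulated dictionary, not the loop variable
    return train_data


print(parse_departures(SAMPLE_HTML))
# {'departing_trains': ['06h12', '07h42', '23h52']}
```

Storing the text rather than the `Tag` object keeps the result a plain list of strings, which is easier to print or serialize.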

Hope this helps! Cheers!

SanthoshSolomon
  • Nice one, I hadn't thought of dictionaries being useful here. But even if I append to the dictionary, it still prints only the last departing train: `train_data = dict() train_data['departing_date'] = str(now.year) + '-' + str(now.month) + '-' + str(now.day) train_data['departing_trains'] = [] for row in soup.findAll('table')[1].tbody.findAll('tr'): depart = row.findAll('td')[2].text.split() train_data['departing_trains'].append(depart) print(depart) return depart` The console returns: `['23h52']` – O Tal Antiquado Oct 18 '18 at 21:26
  • Hi there! You are returning the temporary variable `depart` instead of the dict `train_data`, which is what is actually supposed to be returned. Because of this, you get only the last value assigned to it. P.S.: you should use `find_all` instead of `findAll`, as the latter is deprecated. – SanthoshSolomon Oct 19 '18 at 05:51
  • I already fixed the code and it's up and running. I noticed that I was printing the wrong variable. Also, I didn't know it was deprecated! I'll have to switch all of my other projects to `find_all`, then. Thank you! – O Tal Antiquado Oct 19 '18 at 10:49