
I'm building a table with pandas that contains only the rows matching a certain value. For example, I want to pass in the links to the Premier League tables for different years and get one row per year showing how a particular team did that season. I would also like a column holding the link each row came from.

import requests
import pandas as pd

url = 'https://www.skysports.com/premier-league-table'
html = requests.get(url).content
df_list = pd.read_html(html)
df = df_list[-1]

contain = df[df["Team"].str.contains("Liverpool")]

print(contain)

This is my first approach, which works for a single year: it shows how Liverpool is doing this season. However, I would also like to see how Liverpool fared in other years, for example 21/22 (https://www.skysports.com/premier-league-table/2021).

So I would like to add another row with the data for 21/22, 20/21, and so on. In the end there should be several rows, each with the season's data and its source link.

At the moment I get this:

    #       Team  Pl  W  ...   A  GD  Pts  Last 6
9  10  Liverpool   8  2  ...  12   8   10     NaN

I would like to get this:

    #       Team  Pl   W  ...   A  GD  Pts  Last 6  Link
9  10  Liverpool   8   2  ...  12   8   10     NaN  https://www.sky...
1   2  Liverpool   8  28  ...  12  68   92     NaN  https://www.sky...
...
Vitalizzare
TestUser
  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. – Community Oct 09 '22 at 19:39
  • What is holding you back from doing the same operation for multiple links of the same type? – Vitalizzare Oct 09 '22 at 20:17
  • @Vitalizzare Because then I only get one line each. I would like to get a complete table with all contents. Also, I could make it easier for all the other teams. – TestUser Oct 09 '22 at 20:22
  • Why not to make a list of records for each year and apply `pd.concat` to it? – Vitalizzare Oct 09 '22 at 20:26

1 Answer


You can create a one-column DataFrame that holds the URL and merge it with the filtered row on the index. After reset_index, both frames have the default index 0:

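# One-row frame holding the source URL; its default index is 0
urldf = pd.DataFrame([url], columns=["Link"])
# Reset the index so the filtered row is also at index 0 (the old index becomes an "index" column)
contain = contain.reset_index()
contain = pd.merge(contain, urldf, left_index=True, right_index=True)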

Here is a related question: "Merge two dataframes by index".
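To make the index-based merge easier to try without hitting the website, here is a minimal offline sketch. The small `contain` frame is a hand-built stand-in for the filtered row (the `#`, `Team` and `Pts` values are copied from the output shown in the question, the other columns are left out). It also shows `DataFrame.assign` as an alternative that is not part of the answer above: it adds the same `Link` column while keeping the original row index.

```python
import pandas as pd

# Stand-in for the filtered row from the question (only three columns kept).
contain = pd.DataFrame(
    {"#": [10], "Team": ["Liverpool"], "Pts": [10]},
    index=[9],
)
url = "https://www.skysports.com/premier-league-table"

# Variant 1: merge on a fresh 0-based index. drop=True discards the old
# index instead of turning it into an extra "index" column.
urldf = pd.DataFrame([url], columns=["Link"])
merged = pd.merge(
    contain.reset_index(drop=True), urldf, left_index=True, right_index=True
)

# Variant 2: assign broadcasts the scalar URL to every row and keeps
# the original index (9 in this example).
assigned = contain.assign(Link=url)

print(merged)
print(assigned)
```

Both variants produce the same columns; they differ only in whether the original row label survives.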

You can repeat this for each year and then use pandas.concat to combine the per-year results into the desired DataFrame.
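The per-year loop with `pandas.concat` could look roughly like the sketch below. The helper names `fetch_table` and `team_history` are hypothetical, as is the `fetch` parameter, which exists only so the loop can be tried offline. The `Link` column is added with `assign` rather than a merge. The demo tables are cut-down stand-ins: only the Liverpool `Pts` values come from the question, the other rows are placeholders.

```python
from io import StringIO

import pandas as pd


def fetch_table(url):
    """Download one league table page, following the steps in the question."""
    import requests  # imported here so the offline demo below needs no network

    html = requests.get(url, timeout=30).text
    # As in the question, the league table is assumed to be the last table.
    return pd.read_html(StringIO(html))[-1]


def team_history(urls, team, fetch=fetch_table):
    """Collect the rows for `team` from every URL and record their source."""
    frames = []
    for url in urls:
        table = fetch(url)
        rows = table[table["Team"].str.contains(team, na=False)]
        frames.append(rows.assign(Link=url))
    # One DataFrame with a row per season, original row labels preserved.
    return pd.concat(frames)


# Offline demo: placeholder tables standing in for the downloaded pages.
fake_pages = {
    "https://www.skysports.com/premier-league-table": pd.DataFrame(
        {"Team": ["Placeholder FC", "Liverpool"], "Pts": [24, 10]}
    ),
    "https://www.skysports.com/premier-league-table/2021": pd.DataFrame(
        {"Team": ["Placeholder FC", "Liverpool"], "Pts": [93, 92]}
    ),
}
history = team_history(fake_pages, "Liverpool", fetch=fake_pages.__getitem__)
print(history)
```

With real pages, calling `team_history(list_of_urls, "Liverpool")` would use the default `fetch_table`; passing `ignore_index=True` to `pd.concat` is an option if a clean 0-based index is preferred over the original row labels.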

Dronakuul