I am trying to scrape the starting lineups from this page: https://www.cbssports.com/nhl/teams/BOS/boston-bruins/depth-chart/
I am using the following code, but the table it prints contains information I do not want, such as each player's short name and the player news. I only want to extract the text from the CellPlayerName--long elements, but I am unsure how to do that.
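import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://www.cbssports.com/nhl/teams/BOS/boston-bruins/depth-chart"
data = requests.get(url).text
soup = BeautifulSoup(data, 'html.parser')
# read_html parses every table it is given and returns a list of DataFrames
df = pd.read_html(str(soup.find_all('table')))
df[0]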
It prints the following:
POS | Starter | Second | Third+ |
---|---|---|---|
Center | P. Bergeron Bruins' Patrice Bergeron: Pots winner in NJ Patrice Bergeron Bruins' Patrice Bergeron: Pots winner in NJ | D. KrejciDavid Krejci | C. CoyleCharlie CoyleT. Nosek Bruins' Tomas Nosek: Returning Monday Undisclosed: Expected to be out until at least Jan 2 Tomas Nosek Bruins' Tomas Nosek: Returning Monday Undisclosed: Expected to be out until at least Jan 2 M. Filipe Lower Body: IR. Expected to be out until at least Jan 29 Matt Filipe Lower Body: IR. Expected to be out until at least Jan 29 |
Left Wing | B. Marchand Bruins' Brad Marchand: Two points against Buffalo Brad Marchand Bruins' Brad Marchand: Two points against Buffalo | P. ZachaPavel Zacha | T. HallTaylor HallN. FolignoNick FolignoA. GreerA.J. Greer |
Right Wing | J. DeBruskJake DeBrusk | D. Pastrnak Bruins' David Pastrnak: Another two-point performance David Pastrnak Bruins' David Pastrnak: Another two-point performance | T. Frederic Bruins' Trent Frederic: Scores goal Wednesday Trent Frederic Bruins' Trent Frederic: Scores goal Wednesday C. SmithCraig Smith |
Left Defenseman | H. LindholmHampus Lindholm | M. GrzelcykMatt Grzelcyk | D. ForbortDerek ForbortJ. ZborilJakub Zboril |
Right Defenseman | C. McAvoyCharlie McAvoy | B. CarloBrandon Carlo | C. CliftonConnor Clifton |
Goalie | L. Ullmark Bruins' Linus Ullmark: Staring in Winter Classic Linus Ullmark Bruins' Linus Ullmark: Staring in Winter Classic | J. Swayman Bruins' Jeremy Swayman: Falls short in OT Jeremy Swayman Bruins' Jeremy Swayman: Falls short in OT | — |
Edit: This is the desired output:
POS | Starter | Second | Third |
---|---|---|---|
Center | Patrice Bergeron | David Krejci | Charlie Coyle Tomas Nosek Matt Filipe |
Left Wing | Brad Marchand | Pavel Zacha | Taylor Hall Nick Foligno A.J. Greer |
Right Wing | Jake DeBrusk | David Pastrnak | Trent Frederic Craig Smith |
Left Defenseman | Hampus Lindholm | Matt Grzelcyk | Derek Forbort Jakub Zboril |
Right Defenseman | Charlie McAvoy | Brandon Carlo | Connor Clifton |
Goalie | Linus Ullmark | Jeremy Swayman | |
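One possible way to get this shape is to skip `pd.read_html` and build the rows by hand, keeping only the text from the `CellPlayerName--long` elements. The sketch below is untested against the live page and rests on a few assumptions: that `CellPlayerName--long` is a CSS class on a `span`, that the player's name is the first link inside that span (with the news text following it), and that the first cell of each row holds the position. The function name `depth_chart_from_html` is made up for illustration.

```python
from bs4 import BeautifulSoup
import pandas as pd


def depth_chart_from_html(html: str) -> pd.DataFrame:
    """Build a depth-chart table that keeps only the long player names."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        raise ValueError("no <table> found in the HTML")

    # Column titles, assuming they sit in <th> cells inside <thead>.
    header = [th.get_text(strip=True) for th in table.select("thead th")]

    rows = []
    for tr in table.find_all("tr"):
        cells = tr.find_all("td")
        if not cells:
            continue  # header rows have no <td> cells
        row = [cells[0].get_text(strip=True)]  # position label
        for td in cells[1:]:
            names = []
            for span in td.select("span.CellPlayerName--long"):
                # Assumption: the first link in the span is the player's name;
                # taking only that link leaves the news text behind.
                link = span.find("a")
                if link is not None:
                    names.append(link.get_text(strip=True))
            row.append(" ".join(names))
        rows.append(row)

    # Only use the scraped header when it matches the row width.
    columns = header if rows and len(header) == len(rows[0]) else None
    return pd.DataFrame(rows, columns=columns)
```

With the code from the question, this would be called as `depth_chart_from_html(data)`, where `data` is the page text returned by `requests.get(url).text`. Cells with no long-name span (such as the goalie's third slot) come back as empty strings. If the page's markup differs from the assumptions above, the selector and the `span.find("a")` step are the parts to adjust.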