Hi, I have this HTML in which each listing is an element with `class="app-search-result"`. I'm trying to print every search result with just the title, description, closing date and link. However, whenever I loop over the listings and pull that information out of each one, all I get is the first instance, i.e. the first search result.

Here is the html:

    <li class="app-search-result">
        <h2 class="govuk-heading-s govuk-!-margin-bottom-1">
            <a class="govuk-link" href="/digital-outcomes-and-specialists/opportunities/14646">Provision of End User Compute Services – PS/21/54</a>
        </h2>

        <ul class="govuk-list govuk-!-margin-top-0 govuk-!-margin-bottom-0">
            <li class="govuk-!-font-weight-bold govuk-!-font-size-16 govuk-!-margin-bottom-0">
                <span class="govuk-visually-hidden">Organisation: </span>Driver and Vehicle Licensing Agency (DVLA)
            </li>
            <li class="govuk-!-font-weight-bold govuk-!-font-size-16">
                <span class="govuk-visually-hidden">Location: </span>Wales
            </li>
        </ul>

        <ul class="govuk-list app-search-result__metadata">
            <li class="govuk-!-display-inline-block">
                Digital outcomes
            </li>
            
        </ul>

        <ul class="govuk-list app-search-result__metadata">
            
                <li>
                    Published: Thursday 22 April 2021
                </li>
                <li>
                    Deadline for asking questions: Thursday 29 April 2021
                </li>
                <li>
                    Closing: Thursday 6 May 2021
                </li>
            
        </ul>

        <p class="govuk-body govuk-!-font-size-16 govuk-!-margin-bottom-0 govuk-!-margin-top-1">
            DVLA requires a supplier to provide a supply of specialised resource to support the delivery of the departments ambitious IT Transformation programme in 2021&#x2F;22. The programme broadly looks to renew existing infrastructure, devices and services, across a user base of c.6000.
        </p>
    </li>
    
    <li class="app-search-result">
        <h2 class="govuk-heading-s govuk-!-margin-bottom-1">
            <a class="govuk-link" href="/digital-outcomes-and-specialists/opportunities/14643">WP1964: GOV.UK account and personalisation</a>
        </h2>

        <ul class="govuk-list govuk-!-margin-top-0 govuk-!-margin-bottom-0">
            <li class="govuk-!-font-weight-bold govuk-!-font-size-16 govuk-!-margin-bottom-0">
                <span class="govuk-visually-hidden">Organisation: </span>Government Digital Service
            </li>
            <li class="govuk-!-font-weight-bold govuk-!-font-size-16">
                <span class="govuk-visually-hidden">Location: </span>Off-site
            </li>
        </ul>

        <ul class="govuk-list app-search-result__metadata">
            <li class="govuk-!-display-inline-block">
                Digital outcomes
            </li>
            
        </ul>

        <ul class="govuk-list app-search-result__metadata">
            
                <li>
                    Published: Thursday 22 April 2021
                </li>
                <li>
                    Deadline for asking questions: Thursday 29 April 2021
                </li>
                <li>
                    Closing: Thursday 6 May 2021
                </li>
            
        </ul>

        <p class="govuk-body govuk-!-font-size-16 govuk-!-margin-bottom-0 govuk-!-margin-top-1">
            Through research, experimentation and prototyping help us to test our assumptions and hypotheses relating to GOV.UK account and personalisation. With a particular focus on user experience of bringing together multiple interactions with Government services into one account and the opportunities and challenges of personalising GOV.UK.
        </p>
    </li>

Here is my code:

import requests 
from bs4 import BeautifulSoup

res = requests.get('https://www.digitalmarketplace.service.gov.uk/digital-outcomes-and-specialists/opportunities?q=&statusOpenClosed=open')
soup = BeautifulSoup(res.text, 'html.parser')
opps = soup.select('.app-search-result')

for idx, item in enumerate(opps):
  custom_ot = {}
  custom_ot["Title"] = item.find("h2").getText()
  custom_ot["Description"] = item.find("p").getText()
  custom_ot["Link"] = item.find("a").get("href")
  custom_ot["Deadline"] = item.find("ul").find("li").getText()

print(custom_ot)

This prints:

{'Title': '\nStrategic Partner for the digital transformation of Eye Care and other specialities\n', 'Description': '\n            NHSX are working with national and local NHS organisations to support a bold but grounded modern digital approach to transformation.\r\n\r\nWe are looking for a supplier who can provide the necessary knowledge, skills and experience to provide external support and delivery on strategy, digital transformation, capability building and stakeholder management.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14528', 'Deadline': '\nOrganisation: NHSX on behalf of the Department of Health and Social Care\n            '}

Any help would be much appreciated. Thanks :)

KWL
  • You should be getting the last one, not the first one. You replace the variable `custom_ot` each time through the loop, and then print it after the loop. – Barmar Apr 23 '21 at 14:41
  • 1
    Move `print(custom_ot)` inside the loop to see all of them. – Barmar Apr 23 '21 at 14:41
  • Your sample output doesn't match the sample HTML. That title doesn't appear anywhere in the HTML. – Barmar Apr 23 '21 at 14:47
  • Why are you using `enumerate()`? You never use `idx`. – Barmar Apr 23 '21 at 14:47
  • Putting `print(custom_ot)` inside the loop fixed it. Thank you so much, I appreciate it. – KWL Apr 23 '21 at 14:52

2 Answers

You made an indentation mistake: your `print` statement is outside the `for` loop, so it runs only once, after the loop has finished.

import requests 
from bs4 import BeautifulSoup

res = requests.get('https://www.digitalmarketplace.service.gov.uk/digital-outcomes-and-specialists/opportunities?q=&statusOpenClosed=open')
soup = BeautifulSoup(res.text, 'html.parser')
opps = soup.select('.app-search-result')

for item in opps:
  custom_ot = {}
  custom_ot["Title"] = item.find("h2").getText()
  custom_ot["Description"] = item.find("p").getText()
  custom_ot["Link"] = item.find("a").get("href")
  custom_ot["Deadline"] = item.find("ul").find("li").getText()
  # print is now inside the loop, so every listing is printed.
  # Previously it ran once after the loop and showed only the last listing.
  print(custom_ot)

result:

{'Title': '\nMETIS FTEP - SaaS Enhancement with CoP and Integration Enhancement\n', 'Description': '\n            Metis Enhancement partner to deliver innovative solutions and enhancements to meet complex requirements and deliver future business outcomes and benefits aligned to the departments vision covering SaaS configuration changes, PaaS/IaaS integrations and onboarding other government agencies onto the existing solution.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14588', 'Deadline': '\nOrganisation: UK Home Office\n            '}
{'Title': '\nCCT 984-Security Assurance Support to Application Services and Development Team services\n', 'Description': '\n            We are looking for support to develop and deliver packages of work to build our digital Security Assurance capability and capacity. The Supplier will work with our teams, delivering outcomes across our services.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14570', 'Deadline': '\nOrganisation: Defence Digital, Ministry of Defence Corsham\n            '}
{'Title': '\nPortfolio, Programme and Project Management Maturity Model (P3M3) for Major Projects\n', 'Description': '\n            The aim of this commission is to provide the clear roadmap of activities required to move from our current level of maturity to a recognised P3M3 level 4 maturity organisation.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14649', 'Deadline': '\nOrganisation: Highways England\n            '}
{'Title': '\nProvision of End User Compute Services – PS/21/54\n', 'Description': '\n            DVLA requires a supplier to provide a supply of specialised resource to support the delivery of the departments ambitious IT Transformation programme in 2021/22. The programme broadly looks to renew existing infrastructure, devices and services, across a user base of c.6000.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14646', 'Deadline': '\nOrganisation: Driver and Vehicle Licensing Agency (DVLA)\n            '}
{'Title': '\nWP1964: GOV.UK account and personalisation\n', 'Description': '\n            Through research, experimentation and prototyping help us to test our assumptions and hypotheses relating to GOV.UK account and personalisation. With a particular focus on user experience of bringing together multiple interactions with Government services into one account and the opportunities and challenges of personalising GOV.UK.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14643', 'Deadline': '\nOrganisation: Government Digital Service\n            '}
{'Title': "\nDelivery Partner for the FCDO's Future Service Management (FSM) Programme\n", 'Description': '\n            Define the Operating Model, Organisation Design and toolset strategy for the Service organisation.  Document the rationale for make or buy recommendations.  Create the Outline Business Case, ensuring approval.  Then develop the Full Business Case enabling the programme to implement, ensuring successful transition to the new model.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14626', 'Deadline': '\nOrganisation: The Foreign, Commonwealth & Development Office (FCDO)\n            '}
{'Title': '\nCCT 991 Security Assurance Coordinator to SMOps Interoperability Deployed (Radio) in Defence Digital\n', 'Description': '\n            The Security Assurance Coordinator (SAC) will be the main focal point for all Security Assurance related support tasks; dependant on the business need. Checks and balances must be maintained and monitored in accordance with policy and standards and supported by production of a formal document set to achieve accreditation.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14627', 'Deadline': '\nOrganisation: Ministry of Defence\n            '}
{'Title': '\nDBS User Research Participants 2021\n', 'Description': '\n            Disclosure and Barring Service (DBS) requires participant recruitment for regular user research activities.\r\nThe objective of the advert is to procure participants for regular user research days for services being developed by DBS.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14624', 'Deadline': '\nOrganisation: Disclosure and Barring Service\n            '}
{'Title': '\n102423 - App development across various app stores to identify and map security and privacy guidance\n', 'Description': '\n            The Department for Digital, Culture, Media and Sport requires a Supplier with experience in developing apps for Android or iOS devices to identify security and privacy guidance provided to developers by app stores through creating a basic app for 12 app stores.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14622', 'Deadline': '\nOrganisation: Department for Digital, Culture, Media and Sport\n            '}
{'Title': '\nHRA Digital Directorate: Supporting the HRA’s Digital Transformation Programme\n', 'Description': '\n            Providing services to support the management, analysis, development and testing of the HRA’s digital transformation programmes including the RSP (Research Systems Programme).\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14617', 'Deadline': '\nOrganisation: Health Research Authority (HRA)\n            '}
{'Title': '\nPHE Talk to FRANK website technical development\n', 'Description': '\n            To deliver technical development of the Talk to FRANK web site including performing user testing and user research, maintenance and providing new functionality where appropriate based on user needs; in order to support the Government Drug Strategy: https://www.gov.uk/government/publications/drug-strategy-2017\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14390', 'Deadline': '\nOrganisation: Public Health England (PHE)\n            '}
{'Title': "\n'Get Help With Tech' Remote Education Tech Support for Schools - Digital Delivery Specialists copy\n", 'Description': '\n            Development of the DfE customer facing service hub, providing remote technology for users.\r\n\r\nSpecifically focusing on designing, developing, and maintenance of a service that fits together as a coherent, consistent offer for users, as part of a longer-term aim to provide a single, seamless service with a common entry point.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14605', 'Deadline': '\nOrganisation: Department for Education\n            '}
{'Title': '\nPRE-TENDER MARKET ENGAGEMENT  - Open Regulation Platform Alpha\n', 'Description': '\n            This is an early market engagement for the Open Regulation Platform - a project around building an enriched, machine readable dataset of regulations as well as an open API to release this data to the public.\r\n\r\nPlease contact us at openregulationplatform@beis.gov.uk for the full specification and proposal templates.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14600', 'Deadline': '\nOrganisation: Department for Business, Energy and Industrial Strategy (BEIS), Better Regulation Executive (BRE)\n            '}
{'Title': '\nIPO Delivery of Common Technology Components (CTC) - Phase 3\n', 'Description': '\n            IPO has started a Transformation programme (OneIPO) to deliver a range of new Digital Services.\r\n\r\nCommon Technology Components (CTC) is a delivery workstream within the overall IPO Transformation Programme.\r\n\r\nCTC delivers reusable building blocks similar to microservices that will be assembled by other suppliers into end to end Digital Services.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14552', 'Deadline': '\nOrganisation: Intellectual Property Office\n            '}
{'Title': '\nFrontend Developer (Semantic HTML, SASS, ITCSS, Bootstrap 4, Accessibility)\n', 'Description': '\n            Front-end developer required with extensive experience in developing and delivering complex UX designs using HTML5, CSS3, Bootstrap, and fully meeting WCAG 2.1 AA accessibility guidelines. This is an exciting opportunity to help deliver phase 2 of a redesign project for the Managed Learning Environment, providing mobile capability to all users.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14587', 'Deadline': '\nOrganisation: College of Policing Ltd\n            '}
{'Title': '\nExpert SAS Technical Assurance Services\n', 'Description': "\n            HMRC requires support to assure that the solution being developed meets its' requirements in a cost effective manner.\n        ", 'Link': '/digital-outcomes-and-specialists/opportunities/14571', 'Deadline': '\nOrganisation: HMRC Customer Compliance Group\n            '}
{'Title': '\nAutomated Processing and Triage of RADAR Signals\n', 'Description': '\n            Produce a MVP for increased efficiency for bulk data sifting, that can identify, match and create new intercept records.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14474', 'Deadline': '\nOrganisation: Ministry of Defence Joint Electronic Warfare Operations Support Centre (JEWOSC)\n            '}
{'Title': '\nMedical Information Services (MedIS) Specialist Business Delivery Support\n', 'Description': '\n            Medical Information Services (Med IS) (Programme CORTISONE and Defence Healthcare Delivery Optimisation Programme (DHDO)) require specialist Business Delivery support for the delivery of an Eco-System of Med IS capability to meet the requirements of the Defence Medical Services.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14150', 'Deadline': '\nOrganisation: Ministry of Defence, Medical Information Services\n            '}
{'Title': '\nFBC and Procurement Specialist Support for BSW AHA EPR\n', 'Description': '\n            Take the emerging vision for clinical service provision and progress the FBC for the preferred option concluded in the OBC, of a shared EPR. Procurement support with a partnership approach to produce an Output Based Specification, inline with the procurement strategy, and leading the tendering activities through to contract award.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14558', 'Deadline': '\nOrganisation: Bath, Swindon and Wiltshire Acute Healthcare Alliance\n            '}
{'Title': '\nUX / UI audit\n', 'Description': '\n            Audit Wokingham Borough Council public facing digital estate from a user experience point of view and produce a report detailing where usability can be improved across sites and pages and levels of priority for identified improvements.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14035', 'Deadline': '\nOrganisation: Wokingham Borough Council\n            '}
{'Title': '\nC20972 Delivery Partner required for Discovery as a Service (DaaS)\n', 'Description': '\n            The Innovation - Law Enforcement (I-LE) function within Police and Public Protection Technology Portfolio (PPPT) is developing a ‘Discovery as a Service’ (DaaS) function. DaaS is a centralised capability, as part of Demand Management (DM), with a vision of standardising and scaling the delivery of Discovery projects across PPPT.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14551', 'Deadline': '\nOrganisation: Home Office\n            '}
{'Title': '\nHelp to Grow Subsidy Discovery\n', 'Description': '\n            BEIS require a team to conduct a discovery for the voucher element of the Help to Grow Digital Scheme in line GDS standards. We expect visuals to be produced to illustrate the various user journeys throughout the phase. Potential for extension for alpha.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14516', 'Deadline': '\nOrganisation: Department for Business, Energy and Industrial Strategy\n            '}
{'Title': '\nStrategic Partner for the digital transformation of Eye Care and other specialities\n', 'Description': '\n            NHSX are working with national and local NHS organisations to support a bold but grounded modern digital approach to transformation.\r\n\r\nWe are looking for a supplier who can provide the necessary knowledge, skills and experience to provide external support and delivery on strategy, digital transformation, capability building and stakeholder management.\n        ', 'Link': '/digital-outcomes-and-specialists/opportunities/14528', 'Deadline': '\nOrganisation: NHSX on behalf of the Department of Health and Social Care\n            '}
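To see why the original version showed only one listing, here is a minimal sketch using made-up data. The list `listings` is hypothetical and simply stands in for `opps`. It also shows one possible variant, appending each dict to a list, in case the results are needed after the loop rather than just printed.

```python
# Hypothetical stand-in data; in the real script these would be the items in `opps`.
listings = ["first", "second", "third"]

# Original pattern: the dict is rebuilt on every pass, so only the last one survives.
for item in listings:
    custom_ot = {"Title": item}
after_loop = custom_ot  # what a print placed after the loop would show

# Variant: keep every dict by appending to a list inside the loop.
results = []
for item in listings:
    results.append({"Title": item})

print(after_loop)
print(len(results))
```

After the loop, `custom_ot` only refers to the dict built on the final pass, which is why the printed listing in the question is the last one on the page.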

Alternatively, you can use `soup.find_all` instead of `soup.select`. Here is a slightly modified version of your code:

import requests 
from bs4 import BeautifulSoup

res = requests.get('https://www.digitalmarketplace.service.gov.uk/digital-outcomes-and-specialists/opportunities?q=&statusOpenClosed=open')
soup = BeautifulSoup(res.text, 'html.parser')
opps = soup.find_all('li',{'class':'app-search-result'}) 


for item in opps:
    title = item.find('h2').text
    
    print(f'title: {title}')

result:

title: 
Portfolio, Programme and Project Management Maturity Model (P3M3) for Major Projects

title: 
Provision of End User Compute Services – PS/21/54

title: 
WP1964: GOV.UK account and personalisation

title: 
Delivery Partner for the FCDO's Future Service Management (FSM) Programme

title: 
CCT 991 Security Assurance Coordinator to SMOps Interoperability Deployed (Radio) in Defence Digital

title: 
DBS User Research Participants 2021

title: 
102423 - App development across various app stores to identify and map security and privacy guidance

title:
HRA Digital Directorate: Supporting the HRA’s Digital Transformation Programme

title:
PHE Talk to FRANK website technical development

title:
'Get Help With Tech' Remote Education Tech Support for Schools - Digital Delivery Specialists copy

title:
PRE-TENDER MARKET ENGAGEMENT  - Open Regulation Platform Alpha

title:
IPO Delivery of Common Technology Components (CTC) - Phase 3

title:
Frontend Developer (Semantic HTML, SASS, ITCSS, Bootstrap 4, Accessibility)

title:
Expert SAS Technical Assurance Services

title:
Automated Processing and Triage of RADAR Signals

title:
Medical Information Services (MedIS) Specialist Business Delivery Support

title:
FBC and Procurement Specialist Support for BSW AHA EPR

title:
UX / UI audit

title:
C20972 Delivery Partner required for Discovery as a Service (DaaS)

title:
Help to Grow Subsidy Discovery

title:
Strategic Partner for the digital transformation of Eye Care and other specialities
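
One more thing worth noting: in the dict output above, the 'Deadline' key actually holds the organisation, because `item.find("ul").find("li")` returns the first `li` of the first `ul`. The following is a rough sketch of one way to pick out the closing date instead. It assumes the closing date always sits in an `li` whose text starts with "Closing:", as in the HTML from the question; the helper `closing_date` and the trimmed `SAMPLE_HTML` are made up for illustration, and `get_text(strip=True)` is used to drop the surrounding whitespace.

```python
from bs4 import BeautifulSoup

# Trimmed copy of one listing from the question's HTML (description shortened).
SAMPLE_HTML = """
<li class="app-search-result">
  <h2 class="govuk-heading-s">
    <a class="govuk-link" href="/digital-outcomes-and-specialists/opportunities/14646">Provision of End User Compute Services – PS/21/54</a>
  </h2>
  <ul class="govuk-list">
    <li><span class="govuk-visually-hidden">Organisation: </span>Driver and Vehicle Licensing Agency (DVLA)</li>
  </ul>
  <ul class="govuk-list app-search-result__metadata">
    <li>Digital outcomes</li>
  </ul>
  <ul class="govuk-list app-search-result__metadata">
    <li>Published: Thursday 22 April 2021</li>
    <li>Deadline for asking questions: Thursday 29 April 2021</li>
    <li>Closing: Thursday 6 May 2021</li>
  </ul>
  <p class="govuk-body">DVLA requires a supplier.</p>
</li>
"""


def closing_date(item):
    """Hypothetical helper: return the text after 'Closing:', or None if absent."""
    for li in item.select("ul.app-search-result__metadata li"):
        text = li.get_text(strip=True)
        if text.startswith("Closing:"):
            return text[len("Closing:"):].strip()
    return None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
results = []
for item in soup.select("li.app-search-result"):
    results.append({
        "Title": item.h2.get_text(strip=True),
        "Description": item.p.get_text(strip=True),
        "Link": item.h2.a["href"],
        "Closing": closing_date(item),
    })

for row in results:
    print(row)
```

If the page layout changes so that no `li` starts with "Closing:", the helper returns `None` rather than raising an error.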
boyenec

This is just one of many possible variants when using BeautifulSoup. Personally, I prefer to work with XPath whenever possible, but BeautifulSoup itself does not support XPath, so this example navigates by tag name instead.

import requests
from bs4 import BeautifulSoup

page = requests.get('https://www.digitalmarketplace.service.gov.uk/digital-outcomes-and-specialists/opportunities?q=&statusOpenClosed=open')
soup = BeautifulSoup(page.text, 'html.parser')

for result in soup.find_all("li", attrs={"class": "app-search-result"}):
    print(50 * '*')
    print(f"Title:       {result.h2.a.text}")
    print(f"Description: {(result.p.text.strip())[0:80]}...")
    print(f"Link:        {result.h2.a['href']}")
    print()
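
The `href` values printed above are relative paths. If absolute links are wanted, a minimal sketch with the standard library's `urljoin` could look like this. It assumes the site root taken from the request URL is the right base; `BASE_URL` and `relative_link` are illustrative names, and the path is one of the links from the question's HTML.

```python
from urllib.parse import urljoin

# Assumption: the site root from the request URL is the base for the relative hrefs.
BASE_URL = "https://www.digitalmarketplace.service.gov.uk"

# One of the relative links from the question's HTML.
relative_link = "/digital-outcomes-and-specialists/opportunities/14646"

absolute_link = urljoin(BASE_URL, relative_link)
print(absolute_link)
```

Inside the loop, the same call could be applied to `result.h2.a['href']` before printing.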
IODEV