I have searched Stack Overflow but could not find an answer, or could not tell whether the answers in other posts apply to my problem.

What I would like to do is load a webpage that has an input box, enter data into that input box, and get the returned result.

How can I do this with Python? I believe I saw someone build a similar scraper using JSON or Node, but I would like to use Python if that is doable.

Right now I have the following code:

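from bs4 import BeautifulSoup
import requests

# Download the page and parse it (the 'lxml' parser requires the lxml package)
source = requests.get('https://somewebsitehere.org').text
soup = BeautifulSoup(source, 'lxml')

# Locate the <div> that wraps the input box
receipt_box = soup.find('div', class_='filed-box')
print(receipt_box)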

which gives me this:

<div class="filed-box">
<input class="form-control textbox initial-focus" id="receipt_number" maxlength="13" name="appReceiptNum" type="text"/>
</div>

I think I need to use the appReceiptNum field name and, from there, enter my receipt number into the input box.
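To make that idea concrete, here is a minimal sketch of how the `name` attribute becomes the key of the submitted form data. It uses only the standard library (`html.parser` instead of BeautifulSoup) so it runs without extra installs. The receipt number `ABC1234567890` is a made-up placeholder, and the snippet does not reveal where the form is sent (its action URL and method), so nothing is actually submitted here.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# The markup printed by the code above, copied in as a string.
SNIPPET = """<div class="filed-box">
<input class="form-control textbox initial-focus" id="receipt_number" maxlength="13" name="appReceiptNum" type="text"/>
</div>"""


class InputNameCollector(HTMLParser):
    """Collect the name attribute of every <input> tag it sees."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            name = dict(attrs).get("name")
            if name:
                self.names.append(name)


collector = InputNameCollector()
collector.feed(SNIPPET)

# The browser submits the value under the input's name, not its id.
field_name = collector.names[0]

# Made-up receipt number, for illustration only.
payload = {field_name: "ABC1234567890"}

# This is the URL-encoded body a form submission would carry.
encoded = urlencode(payload)
print(encoded)  # prints: appReceiptNum=ABC1234567890
```

The same `payload` dictionary is the kind of data a form submission would carry once the form's real target is known.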

I saw that Postpy2 may be able to help me with this, but I don't really know.

Any help is appreciated.

EDIT: Using Selenium, this is my idea so far for accessing the page and sending the desired info.

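from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# In Selenium 4 the chromedriver path is passed through a Service object
driver = webdriver.Chrome(service=Service('PATH to my chromedriver.exe'))

driver.get("https://egov.uscis.gov/casestatus/landing.do")

# Locate the input box by its name attribute
elem = driver.find_element(By.NAME, "appReceiptNum")

# Type the receipt number into the box
elem.send_keys("case number")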

How does this look? I haven't gotten to sending the information yet.

John
  • can you confirm the url, is it publicly accessible? – Barry the Platipus Aug 14 '22 at 21:17
  • https://egov.uscis.gov/casestatus/landing.do – John Aug 14 '22 at 21:31
  • You won't use requests.get to send data. If requests is used, it would be a POST, not a GET. Do you have a number to test with so I can take a look? – chitown88 Aug 15 '22 at 09:46
  • I had a look at that page: there is a form POSTing that information .. somewhere, I couldn't clearly see any XHR call in Network tab, maybe @chitown88 can figure it out. In any case, it would probably be less complex to use Selenium for this job. – Barry the Platipus Aug 15 '22 at 10:45
  • @platipus_on_fire, ya Selenium is always an option for things like this. There is a nifty plug-in called tamper, that lets you debug/pause the requests being made, so that might be able to help narrow down the url and form needed to return the data. Like I said, I'd need a valid input to send to check to see how/if it returns the data needed. – chitown88 Aug 15 '22 at 13:12
  • @chitown88 I don't have a number, but it is typically a three-character prefix followed by ten digits. You could probably make up any combination and still get a result from the POST. Try IOE0812469845; it should throw an invalid receipt number error. But that would be good enough to test whether the POST is working, correct? – John Aug 15 '22 at 14:32
  • @platipus_on_fire I will look into the Selenium use case thanks – John Aug 15 '22 at 14:32
  • Sure @Walt, let me know if you need an example on how to access that page & input the case # – Barry the Platipus Aug 15 '22 at 14:38
  • @Walt, ya I actually tried a made up number before your comment. I did not see/find where it was posted. I think Selenium here is your best bet. – chitown88 Aug 15 '22 at 14:53
  • Onto the Selenium documentation then, thanks for the pointers – John Aug 15 '22 at 22:07
  • @platipus_on_fire check out my EDIT if you would like to and let me know if I am heading in the correct direction. Thanks :) – John Aug 22 '22 at 21:47
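
The receipt number format described in the comments (three characters followed by ten digits, which matches the input's `maxlength="13"`) can be checked before anything is typed into the page. A minimal sketch, assuming the three leading characters are ASCII letters; the function name `looks_like_receipt_number` is made up for this example, and passing the check says nothing about whether the number is a real case.

```python
import re

# Assumption: three ASCII letters followed by ten digits, 13 characters in total.
RECEIPT_PATTERN = re.compile(r"[A-Za-z]{3}[0-9]{10}")


def looks_like_receipt_number(value: str) -> bool:
    """Return True if value has the shape of a receipt number."""
    # fullmatch requires the whole string to fit the pattern, not just a prefix.
    return RECEIPT_PATTERN.fullmatch(value.strip()) is not None


print(looks_like_receipt_number("IOE0812469845"))  # prints: True
print(looks_like_receipt_number("12345"))          # prints: False
```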

1 Answer

This is one way of entering the case ID into that page and clicking the submit button, using Selenium:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-notifications")
chrome_options.add_argument("--window-size=1280,720")

# Path to where you saved the chromedriver binary
webdriver_service = Service("chromedriver/chromedriver")
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)


url = 'https://egov.uscis.gov/casestatus/landing.do'

browser.get(url) 

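# Wait up to 10 seconds for the receipt number input to become clickable, then type into it
caseno_input = WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[id='receipt_number']")))
caseno_input.send_keys('WAC1234567890')
# Wait for the CHECK STATUS button to become clickable, then click it
WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[title='CHECK STATUS']"))).click()
print('clicked the button')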

The above setup uses Chrome/chromedriver on Linux, but you can adapt it to your own setup: keep the imports and the code that follows the browser/driver definition. The Selenium documentation can be found at: https://www.selenium.dev/documentation/

Barry the Platipus
  • Thank you for the explanation. Classes just started and between work and class I was going to give it a more thorough go on the weekends. – John Aug 23 '22 at 00:45
  • I do have a question though. Do we need to update the URL when, say, we input our information and proceed to the next page? Once on that next page I want to grab some information and then use it to send notifications to my spouse's and my phones. – John Aug 24 '22 at 13:12
  • I believe I found the answer here: https://stackoverflow.com/questions/39626759/how-to-continue-to-fill-the-data-in-next-page-by-selenium Please feel free to suggest a better way, though. – John Aug 24 '22 at 13:14
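
On the follow-up question about the next page: a Selenium driver object follows the browser through navigation, so after the click the same `browser` can be queried without loading a URL by hand, and `browser.current_url` reports the page it ended up on. Below is a minimal sketch of reading the result and turning it into a notification text. The `result_selector` argument and both function names are hypothetical: the real selector has to be read from the results page with the browser's developer tools, and how the message is actually delivered is left out.

```python
def build_message(receipt_number: str, status_text: str) -> str:
    """Format the text that would be sent as a notification."""
    return f"Case {receipt_number}: {status_text.strip()}"


def read_case_status(browser, result_selector: str, timeout: int = 10) -> str:
    """Wait for the result element on the page the browser is now on and return its text.

    result_selector is a placeholder CSS selector; it is not taken from the real site.
    """
    # Imported here so build_message can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # The same driver object is reused; it already points at the page loaded by the click.
    result = WebDriverWait(browser, timeout).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, result_selector))
    )
    return result.text
```

After the click in the answer's script, calling `build_message('WAC1234567890', read_case_status(browser, '<selector for the result element>'))` would produce the text to send.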