I once wrote a Python script that scrapes a web page, searches it for a particular text, and reports the number of times that text appears on the page.
Now, I want to turn the same script into a web app.
My app will take two string variables: a date (which I will split into day, month, and year) and a name.
The first variable will be used to generate the unique web URL (it's a date-based list), and the page at this URL will be parsed to collect information using text search (if possible using RegEx rather than only simple Python search functions).
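To make those two steps concrete, here is a rough sketch of what I mean. The URL pattern is the one from my script; the helper names `build_url` and `count_name` are made up for this illustration, and I am assuming the date arrives as DD/MM/YYYY.

```python
import re
from datetime import datetime

# URL pattern taken from my script; everything else here is illustrative.
BASE_URL = ("http://keralaadministrativetribunal.gov.in/ciskat/pages/"
            "cause_list_home.php?type=search&dte=%s/%s/%s&court=%s")


def build_url(date_text, court):
    """Turn a DD/MM/YYYY string (assumed format) into the URL for one court."""
    # strptime raises ValueError if the text is not a valid date.
    parsed = datetime.strptime(date_text, "%d/%m/%Y")
    return BASE_URL % (parsed.strftime("%d"), parsed.strftime("%m"),
                       parsed.strftime("%Y"), court)


def count_name(name, page_text):
    """Count whole-word, case-sensitive occurrences of name in page_text."""
    pattern = r"\b%s\b" % re.escape(name)
    return len(re.findall(pattern, page_text))
```

Parsing the date with `datetime.strptime` instead of slicing the string would also reject malformed dates before any request is made.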
Now, I want to set up a web page with two input elements (for the date and the name). I want the script to run on these two variables and its output to be shown on the web page (on the same page or on a new one).
Simple.
With my limited knowledge, I think both Flask and Django will be too heavy for this.
How do you think I would be able to do it?
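To show how small I would like this to be, here is a rough sketch using only the standard library's `wsgiref`. I am not sure this is the right approach. `run_search` is a made-up placeholder for the scraper, and the form field names `date` and `name` are my own choice.

```python
from html import escape
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server

# The two input elements; the field names are assumptions for this sketch.
FORM = """<form method="get">
  Date (DD/MM/YYYY): <input name="date">
  Name: <input name="name">
  <input type="submit" value="Search">
</form>"""


def run_search(date_text, name):
    """Hypothetical placeholder; the real scraper would return its findings."""
    return "Would search the cause list of %s for %s." % (date_text, name)


def app(environ, start_response):
    """Show the form and, once both fields are filled in, the result below it."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    date_text = query.get("date", [""])[0].strip()
    name = query.get("name", [""])[0].strip()
    body = FORM
    if date_text and name:
        # Escape the output so user input cannot inject HTML into the page.
        body += "<p>%s</p>" % escape(run_search(date_text, name))
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body.encode("utf-8")]


def serve(port=8000):
    """Serve the app on the given local port until interrupted."""
    make_server("", port, app).serve_forever()
```

Calling `serve()` and opening http://localhost:8000 in a browser should show the form, with the output appearing on the same page after submitting. As far as I understand, `wsgiref` is meant for development and testing rather than for real hosting.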
EDIT: Here's my code (which I essentially thought out and grabbed from different places):
# KATscrape is a script that Basil Ajith (https://twitter.com/basilajith) wrote way back
# in 2016-2017 in order to search and parse the KAT
# cause lists. Now, it is being re-written to be hosted
# as a web application online.
# Parsing web page learnt from:
# https://stackoverflow.com/questions/25067580/passing-web-data-into-beautiful-soup-empty-list#25068054
# Developed by https://stackoverflow.com/users/2141635/padraic-cunningham
# Printing List without quotes learnt from:
# https://stackoverflow.com/questions/11178061/print-list-without-brackets-in-a-single-row#11178075
# Developed by https://stackoverflow.com/users/1172428/fatalerror
# Edited by https://stackoverflow.com/users/6451573/jean-fran%c3%a7ois-fabre
# Finding occurrences of advocate's name in the cause list learnt from:
# https://stackoverflow.com/questions/17268958/finding-occurrences-of-a-word-in-a-string-in-python-3#17268979
# Developed by https://stackoverflow.com/users/148870/amber
# Dependencies
from sys import argv
from bs4 import BeautifulSoup
import requests
import re # I don't know pandas (neither do I know RegEx much); but I think RegEx would serve our purpose.
filename, date, adv_name = argv
# Short Lists
court_numbers = ["1", "7", "8", "4"]
# The parser function
def katscrape():
    day = date[0:2]
    month = date[3:5]
    year = date[6:14]
    base_url = "http://keralaadministrativetribunal.gov.in/ciskat/pages/cause_list_home.php?type=search&dte=%s/%s/%s&court=%s"
    # Starting to parse
    for i in court_numbers:
        cl_current = base_url % (day, month, year, i)
        the_page = requests.get(cl_current)
        soup = BeautifulSoup(the_page.content, "lxml")
        da_stuff = str(soup)
        judges_list = ["Mr. Justice T.R. Ramachandran Nair",
                       "Mr. V. Somasundaran", "Mr. V.Rajendran", "Mr. Rajesh Dewan", "Mr. Benny Gervacis"]
        sitting = []
        for x in judges_list:
            if x in da_stuff:
                sitting.append(x)
        # Printing court number and presiding members.
        print("Court No. %s:" % i)
        print("Presiding: ", (", ".join(sitting)), " \n")
        # Checking for advocate's name in the cause list:
        count = sum(1 for _ in re.finditer(r'\b%s\b' % re.escape(adv_name), da_stuff))
        print("%s has %d matters in this court.\n" % (adv_name, count))
print("Matters for %s on %s:" % (adv_name, date) + "\n")
katscrape()
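For the web version I assume the function would have to return its findings instead of printing them. A rough sketch of that change, with the page text passed in so it can be tried without the network; `summarise_court` is a made-up name and the sample text is invented, not real cause-list data.

```python
import re

# Same names as judges_list in my script.
JUDGES = ["Mr. Justice T.R. Ramachandran Nair", "Mr. V. Somasundaran",
          "Mr. V.Rajendran", "Mr. Rajesh Dewan", "Mr. Benny Gervacis"]


def summarise_court(court, page_text, adv_name):
    """Return what the script prints for one court as a dictionary."""
    sitting = [judge for judge in JUDGES if judge in page_text]
    matters = len(re.findall(r"\b%s\b" % re.escape(adv_name), page_text))
    return {"court": court, "presiding": sitting, "matters": matters}


# Invented sample text standing in for one downloaded cause list.
sample = "Court No. 1 Mr. Rajesh Dewan ... A. Kumar ... A. Kumar"
result = summarise_court("1", sample, "A. Kumar")
```

A web page could then loop over such dictionaries and render them, while the command-line script could keep printing them as before.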