
I have written the following code:

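    import requests
    from bs4 import BeautifulSoup

    # Fetch the page and collect every <li> element
    page = requests.get("http://3.85.131.173:8000/random_company")
    soup = BeautifulSoup(page.content, "html.parser")
    info_list = soup.find_all("li")
    print(info_list)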

and `print` gives the following output:

[<li>Name: Walker, Meyer and Allen</li>, <li>CEO: David Pollard</li>, <li>CTO: Sandra Boyd</li>, <li>Address: 275 Jones Station Suite 008
Bradburgh, UT 24369</li>, <li>Investment Round: C</li>, <li>Purpose: Reduced logistical contingency for whiteboard end-to-end applications</li>]

I want to extract the name and position. Earlier I was using indexing, but the list is dynamic. Could anyone advise how to extract the name and purpose?

My edited code after feedback:

    page=requests.get("http://3.85.131.173:8000/random_company")
    soup=BeautifulSoup(page.content,"html.parser")
    info_list=soup.find_all("li")
    print(info_list)
    name=[]
    purpose=[]

I am now able to print name and location successfully. It gives the following output: `['Name: Burnett and Sons']`. If I want only `Burnett and Sons`, what should I do? Could anyone advise?

cmpunk
  • loop over the list (`for item in info_list:`) and check `if "Name" in item:`, similarly do for the position – Matiiss Oct 25 '21 at 23:22
  • Do you want to extract all names with their respective positions? Or only the item tagged `'Name'`? – chickity china chinese chicken Oct 26 '21 at 00:03
  • Which is it you "want to extract name and position", or "to extract name and purpose" ? – chickity china chinese chicken Oct 26 '21 at 00:04
  • It's unclear what exactly you want to scrape. Please [edit] your question to show the expected output. – MendelG Oct 26 '21 at 00:06
  • @matiiss I want to extract the name and purpose. I tried the code below: `page=requests.get("http://3.85.131.173:8000/random_company") soup=BeautifulSoup(page.content,"html.parser") info_list=soup.find_all("li") for item in info_list: if("Name" in item): print(item)` I am getting only one output, which is the purpose. I need the name and purpose. Also, I looped for the name, but it gives me the purpose as output? – cmpunk Oct 26 '21 at 02:29

2 Answers

    if 'Name' in item.text:
        name=name.append(item)        # <-- Wrong: assigns None to name
    if 'Purpose' in item.text:
        purpose=purpose.append(item)  # <-- Wrong: assigns None to purpose

The two lines marked above are the problem: `list.append()` modifies the list in place and returns `None`, so each assignment replaces the list with `None`.
(See further explanation: Why does append() always return None in Python?)

To get your expected output, remove the `name=` and `purpose=` parts and let `list.append()` add to your lists in place, as below:

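    name = []
    purpose = []

    # item_list is the result of soup.find_all("li")
    for item in item_list:
        if 'Name' in item.text:
            name.append(item.text)     # append() changes the list in place
        if 'Purpose' in item.text:
            purpose.append(item.text)

    print(name, purpose)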

should print:

['Name: Ward and Sons'] ['Purpose: User-friendly mission-critical algorithm for visualize killer e-business']
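
The lists above still carry the `Name: ` and `Purpose: ` labels, which the follow-up in the question asks to drop. Here is a minimal sketch of one way to do that, assuming every entry has the form `Label: value`. The sample strings stand in for the `item.text` values collected by the loop, and the helper name `strip_label` is hypothetical.

```python
def strip_label(text):
    """Drop a leading 'Label:' prefix, e.g. 'Name: Ward and Sons' -> 'Ward and Sons'."""
    # partition() splits on the first colon only, so colons inside the value survive
    _label, sep, value = text.partition(":")
    # if there is no colon at all, return the text unchanged (minus outer whitespace)
    return value.strip() if sep else text.strip()


# Stand-ins for the item.text strings collected by the loop above
name = ["Name: Ward and Sons"]
purpose = ["Purpose: User-friendly mission-critical algorithm for visualize killer e-business"]

clean_name = [strip_label(t) for t in name]
clean_purpose = [strip_label(t) for t in purpose]

print(clean_name)     # ['Ward and Sons']
print(clean_purpose)
```

Splitting on the first colon only, rather than on every colon, keeps values that themselves contain a colon in one piece.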
 
  • I do not want the position; I need the name and purpose. Kindly see my code. I am unable to extract the name and purpose from the `info_list` variable in my code. Also, I don't want to use an index for the extraction. For example, `info_list[0]` will give me the name, but I don't want to hard-code it. – cmpunk Oct 26 '21 at 02:35
  • Got it, thanks for explaining, @cmpunk, please see my updated code – chickity china chinese chicken Oct 26 '21 at 03:45
  • The problem with your code was searching only the Tag element, not the ***text*** within the Tag. – chickity china chinese chicken Oct 26 '21 at 03:55
  • Thank you both for your responses. I wanted the same thing, but when I ran the full code `page=requests.get("http://3.85.131.173:8000/random_company") soup=BeautifulSoup(page.text,"html.parser") item_list=soup.find_all("li") for item in info_list: if 'Name' in item.text or 'Purpose' in item.text: print(item)` it gives me an error in the `if` statement: `'NavigableString' object has no attribute 'text'` – cmpunk Oct 26 '21 at 04:18
  • @cmpunk See how it runs here no errors, press the Blue (>>> Run) button at top-right of page: https://www.pythonanywhere.com/user/downshift/files/home/downshift/69715704/script.py?edit – chickity china chinese chicken Oct 26 '21 at 04:36
  • The `NavigableString` error may be because you're using `page.content`; try using `page.text` instead, I've included it in my answer. Also, see [BeautifulSoup: AttributeError: 'NavigableString' object has no attribute 'name'](https://stackoverflow.com/questions/7591535/beautifulsoup-attributeerror-navigablestring-object-has-no-attribute-name) – chickity china chinese chicken Oct 26 '21 at 04:44
  • Thanks, I have one more question. I need to put the name and purpose in separate lists and then transfer them to a dataframe. I tried the code below: `soup=BeautifulSoup(page.text,"html.parser") name=[] purpose=[] item_list=soup.find_all("li") for item in item_list: { if 'Name' in item.text: name=name.append(item) if 'Purpose' in item.text: purpose=purpose.append(item) }` Could you advise how to fix the code? Thanks for your help once again. – cmpunk Oct 28 '21 at 00:21
  • How about two lines: `df['Name'] = name; df['Purpose'] = purpose` ? You've got the two lists, just add them to the DataFrame – chickity china chinese chicken Oct 28 '21 at 00:38
  • `soup=BeautifulSoup(page.text,"html.parser") name=[] purpose=[] item_list=soup.find_all("li") for item in item_list: { if 'Name' in item.text: name=name.append(item) if 'Purpose' in item.text: purpose=purpose.append(item) }` I understand the code that you wrote. My issue is that I am unable to separate the name and purpose using the `if` statement. It gives me an error: `File "", line 9 if 'Name' in item.text:` – cmpunk Oct 28 '21 at 01:43
  • You didn't include what kind of error it is. What is the full text of the traceback? – chickity china chinese chicken Oct 28 '21 at 02:18
  • My code: `soup=BeautifulSoup(page.text,"html.parser") name=[] purpose=[] item_list=soup.find_all("li") for item in item_list: { if 'Name' in item.text: name=name.append(item) if 'Purpose' in item.text: purpose=purpose.append(item) }` Error: `File "", line 9 if 'Name' in item.text: ^ SyntaxError: invalid syntax` – cmpunk Oct 28 '21 at 02:39
  • You have an error in your syntax, [edit] and post your code in the question above, the code is unclear in comments. – chickity china chinese chicken Oct 28 '21 at 02:46
  • I have edited my post. If possible, kindly advise. – cmpunk Oct 28 '21 at 03:04
  • You edited ***my*** post. I needed you to put the code into ***your*** post at the top of page. – chickity china chinese chicken Oct 28 '21 at 03:50
  • But the problem is that you are mixing Java syntax with Python syntax. Python does not use curly braces `{`, `}` to delimit `if`/`for` blocks; it uses indentation. Remove these from your code: `{`, `}` – chickity china chinese chicken Oct 28 '21 at 03:52
  • Otherwise, your code should run fine. Remove the curly-braces and try again, and post your results. – chickity china chinese chicken Oct 28 '21 at 03:55
  • I have edited my post and pasted the code after removing the brackets. Now it gives `None` as output. Kindly advise. – cmpunk Oct 28 '21 at 10:02
  • Thanks @cmpunk, see my updated answer – chickity china chinese chicken Oct 28 '21 at 22:07

I think you're looking for something like this:

    targets = ["Name","Purpose"]
    for item in info_list:
        if item.text.split(":")[0] in targets:
            print(item.text)

Output (in this case):

Name: Jimenez LLC
Purpose: Mandatory context-sensitive approach for leverage compelling communities
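
A related sketch, assuming each `<li>` holds exactly one `Label: value` pair: collect every field into a dictionary keyed by its label, so a value can be looked up by name rather than by position. The strings below are stand-ins for `item.text`, taken from the sample output in the question, and `fields` is a hypothetical name.

```python
# Stand-ins for [item.text for item in info_list]
texts = [
    "Name: Walker, Meyer and Allen",
    "CEO: David Pollard",
    "CTO: Sandra Boyd",
    "Address: 275 Jones Station Suite 008\nBradburgh, UT 24369",
    "Investment Round: C",
    "Purpose: Reduced logistical contingency for whiteboard end-to-end applications",
]

fields = {}
for text in texts:
    # split on the first colon only; the rest of the string is the value
    label, sep, value = text.partition(":")
    if sep:  # skip entries that have no 'Label:' prefix
        fields[label.strip()] = value.strip()

print(fields["Name"])     # Walker, Meyer and Allen
print(fields["Purpose"])  # Reduced logistical contingency for whiteboard end-to-end applications
```

Because the lookup is by label, the result does not depend on the order in which the page lists its items.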
Jack Fleeting