
I'm using pandas and requests to scrape IPs from https://free-proxy-list.net/. How do I change this code

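import requests
import pandas as pd
from io import StringIO

resp = requests.get('https://free-proxy-list.net/')
# Wrap the HTML in StringIO; newer pandas versions deprecate passing a literal HTML string
df = pd.read_html(StringIO(resp.text))[0]

# Keep only the rows marked as elite proxies
df = df[df['Anonymity'] == 'elite proxy']

print(df.to_string(index=False))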

so that the output is a list of IPs and nothing else? I managed to remove the index and keep only the elite proxies, but I can't create a variable that is a list containing only the IPs, without the index.

  • Does this answer your question? [Get list from pandas dataframe column or row?](https://stackoverflow.com/questions/22341271/get-list-from-pandas-dataframe-column-or-row) – Nick ODell Feb 20 '22 at 19:25

3 Answers


You can use loc to select the column directly for the matching rows, and to_list to convert the result to a list:

df.loc[df['Anonymity'].eq('elite proxy'), 'IP Address'].to_list()

output: ['134.119.xxx.xxx', '173.249.xxx.xxx'...]
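For trying this without hitting the site, here is a minimal sketch on a small hand-made frame. The sample rows are hypothetical (the addresses are from the reserved documentation range); only the column names 'IP Address' and 'Anonymity' come from the question.

```python
import pandas as pd

# Hypothetical sample rows standing in for the scraped table.
df = pd.DataFrame({
    "IP Address": ["192.0.2.10", "192.0.2.20", "192.0.2.30"],
    "Anonymity": ["elite proxy", "anonymous", "elite proxy"],
})

# The boolean mask picks the rows, the column label picks the column.
mask = df["Anonymity"].eq("elite proxy")
elite_ips = df.loc[mask, "IP Address"].to_list()

print(elite_ips)
```

The result is a plain Python list of strings, so it can be assigned to a variable and iterated over like any other list.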

mozway

To get the contents of the 'IP Address' column, select that column and call .to_list().

Here's how:

print(df['IP Address'].to_list())
Nick ODell

It looks like you are trying to accomplish something like below:

print(df['IP Address'].to_string(index=False))
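Note that to_string returns one block of text rather than a list. A small sketch of the difference, using a hypothetical two-row Series in place of the scraped column:

```python
import pandas as pd

# Hypothetical values standing in for the filtered 'IP Address' column.
ip_column = pd.Series(["192.0.2.10", "192.0.2.30"], name="IP Address")

as_text = ip_column.to_string(index=False)  # a single str, one value per line
as_list = ip_column.to_list()               # a list with one str per address

print(type(as_text).__name__)
print(type(as_list).__name__)
```

If the goal is only to print the addresses, the text form is enough; if they need to go into a variable for later use, the list form is probably the better fit.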

It would also be a good idea to reset the index after filtering your DataFrame, like below:

df = df.reset_index(drop=True)
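To see what the reset changes, here is a short sketch on a hypothetical three-row frame (the rows are made up; the column names match the question). After filtering, the surviving rows keep their original labels until the index is reset.

```python
import pandas as pd

# Hypothetical sample rows standing in for the scraped table.
proxies = pd.DataFrame({
    "IP Address": ["192.0.2.10", "192.0.2.20", "192.0.2.30"],
    "Anonymity": ["elite proxy", "anonymous", "elite proxy"],
})

filtered = proxies[proxies["Anonymity"] == "elite proxy"]
print(filtered.index.tolist())    # [0, 2] -> labels from the original frame

# drop=True discards the old labels instead of keeping them as a new column.
renumbered = filtered.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1]
```

The values themselves are unchanged; only the row labels are renumbered from zero.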

So the code snippet would be something like this:

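import requests
import pandas as pd
from io import StringIO

resp = requests.get('https://free-proxy-list.net/')
# Wrap the HTML in StringIO; newer pandas versions deprecate passing a literal HTML string
df = pd.read_html(StringIO(resp.text))[0]

# Keep only the elite proxies, then renumber the rows from zero
df = df[df['Anonymity'] == 'elite proxy']
df = df.reset_index(drop=True)
print(df['IP Address'].to_string(index=False))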