
I have a data frame like:

df['website']

I want to check that df['website'] contains only website names in URL form. If the column contains other text, such as sentences, instead of URLs, it should display a warning message.

Atom Store
  • Please include some sample [`reproducible`](https://stackoverflow.com/a/20159305/4985099) input along with expected output. – sushanth Apr 22 '21 at 04:19

2 Answers

1

You can use the validators package. If you want to know more about it, follow this link.

Once you have a function that returns whether a URL is valid, you can use apply() on the column to run that function on every URL in the data frame. The function can return True/False for each value, and it can also print a warning whenever it finds an invalid one.

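import validators

def isUrlValid(url):
    # validators.url returns True for a valid URL and a falsy failure object otherwise
    return bool(validators.url(url))

# Run the check on every value in the column and store the result
df['isURLValid'] = df['website'].apply(isUrlValid)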

Output:

                      website  isURLValid
0  https://stackoverflow.com/        True
1                          no       False

Lastly, if you don't want to add the results as a column in the data frame, you can loop over the values in df['website'].tolist(), call the function on each value, and print a warning from inside the function whenever a value is invalid.
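As a rough sketch of that loop, something like the following might work. To keep it runnable without extra packages, is_url_valid here is a stand-in built on the standard library's urllib.parse rather than the validators-based function above, and warn_on_invalid_urls is a hypothetical helper name, so treat both as assumptions:

```python
import warnings
from urllib.parse import urlparse

def is_url_valid(value):
    # Stand-in check (assumption): requires an http(s) scheme and a host part.
    # Swap in the validators-based function if that package is installed.
    if not isinstance(value, str):
        return False
    try:
        parts = urlparse(value.strip())
    except ValueError:
        # urlparse can raise on malformed input such as a broken IPv6 host
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)

def warn_on_invalid_urls(values):
    """Emit a warning for every value that is not a URL and return those values."""
    invalid = [v for v in values if not is_url_valid(v)]
    for v in invalid:
        warnings.warn(f"Not a valid URL: {v!r}")
    return invalid

# With a data frame, the argument would be df['website'].tolist()
bad = warn_on_invalid_urls(["https://stackoverflow.com/", "no"])
```

Returning the invalid values as well as warning about them makes it easy to inspect or drop those rows afterwards.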

Shubham Periwal
0

I don't know about the alert.

But to check the "URL formatting", you could write a function that looks for the usual URL elements such as "http" or ".com", or even just checks whether the value has a "." in it.

It really depends on your data...
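A minimal sketch of such a function, assuming the values are plain strings; looks_like_url is a hypothetical name, and the exact rules (no whitespace, plus either an "http" prefix or a dot) are only one possible choice:

```python
def looks_like_url(text):
    # Heuristic only: this does not prove the value is a real, reachable URL.
    if not isinstance(text, str):
        return False
    text = text.strip().lower()
    if not text or any(ch.isspace() for ch in text):
        # Empty values and multi-word sentences are rejected outright
        return False
    has_scheme = text.startswith(("http://", "https://"))
    has_dot = "." in text
    return has_scheme or has_dot

# Quick check on a few sample values
samples = ["https://stackoverflow.com/", "stackoverflow.com", "no", "This is a sentence."]
flags = [looks_like_url(s) for s in samples]
```

A check this loose will let through single tokens such as "e.g." and will miss nothing that merely contains a dot, so it is only worth using when the column is known to hold either URLs or ordinary sentences.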

hbrandao