I have a data frame like:
df['website']
I want a condition that checks that df['website'] contains only website names in URL form. If the column contains other text, such as sentences, rather than URLs, it should display a warning message.
You can use the validators package. If you want to know more about it, see the package's documentation.
Once you have a function that returns whether a URL is valid, you can use apply() to run that function on every URL in the dataframe. The function can return True/False depending on whether the value is valid. In addition, you can print a warning inside the function when it finds an invalid value.
import validators

def isUrlValid(url):
    # validators.url() returns True for a valid URL and a falsy failure object otherwise
    return bool(validators.url(url))

df['isURLValid'] = df['website'].apply(isUrlValid)
Output:
                      website  isURLValid
0  https://stackoverflow.com/        True
1                          no       False
Lastly, if you don't want to add the results as a column in the dataframe, you can loop through all values in df['website'].tolist(), call the function for each value, and print a warning inside the function.
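A minimal sketch of that loop, using only the standard library so it runs on its own. The names starts_like_url and warn_on_invalid are made up for illustration, and the prefix check is just a stand-in: in practice you would pass isUrlValid from above as the is_valid argument and df['website'].tolist() as the values.

```python
import warnings


def starts_like_url(value):
    """Stand-in check (an assumption): only looks at the http/https prefix."""
    return str(value).startswith(("http://", "https://"))


def warn_on_invalid(values, is_valid=starts_like_url):
    """Warn about every value that fails is_valid and return those values."""
    invalid = [v for v in values if not is_valid(v)]
    for v in invalid:
        warnings.warn(f"Not a valid URL: {v!r}")
    return invalid


# With a dataframe, this list would be df['website'].tolist()
websites = ["https://stackoverflow.com/", "no"]
bad = warn_on_invalid(websites)
```

Returning the offending values as well as warning makes it easy to decide afterwards whether to drop those rows or stop processing.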
I don't know about the alert.
But to check the URL formatting, you could write a function that checks for the usual URL elements, such as "http" or ".com", or even just whether the data has a "." in it.
It really depends on your data...
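One way such a check might look, as a rough sketch rather than a full validator. It uses urllib.parse from the standard library and a hypothetical helper name, has_url_shape; the rule (http/https scheme plus a "." in the host) is an assumption you should adjust to your data.

```python
from urllib.parse import urlparse


def has_url_shape(text):
    """Rough heuristic: http/https scheme and a dot in the host part."""
    parts = urlparse(text.strip())
    return parts.scheme in ("http", "https") and "." in parts.netloc


samples = ["https://stackoverflow.com/", "no", "just a sentence.", "http://localhost"]
results = {s: has_url_shape(s) for s in samples}
```

Note that this rule also rejects hosts without a dot, such as http://localhost, so it may be too strict or too loose depending on what your column holds.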