I am trying to modify a DataFrame by formatting each column in a specific way. My issue is that I can't get conditional formatting to work for one specific column. More specifically, here are my 3 conditions:
- If the string starts with 'http://o', then put it between <>
- If the string starts with a capital letter, do some formatting such as '"' + df['object'] + '"@en .'
- If the string starts with 'http://w', replace it with 'owl:Class'
I tried .iterrows(), .itertuples(), and even .apply(), but nothing seems to work. Thank you in advance.
Input df:
[something] | [something] | http://www.w3.org/2002/07/owl#Class |
[something] | [something] | A part of something |
[something] | [something] | http://oaei.ontologymatching.org/tests/101/onto.rdf#R |
Expected output:
[something] | [something] | owl:Class |
[something] | [something] | "A part of something"@en |
[something] | [something] | <http://oaei.ontologymatching.org/tests/101/onto.rdf#Reference> |
Here is my code:
import pandas as pd

def to_turtle(df):
    df['subject'] = '<' + df['subject'] + '>'
    df['predicate'] = '<' + df['predicate'] + '>'
    for row in df.itertuples():
        if df.loc[row.Index,'object'].str.startswith('http://o', na=False):
            df.at[row.Index, 'object'] = "<" + df['object'] + "> ;"
        elif df.loc[row.Index,'object'].str.contains('[A-Z]',na=False,regex=True):
            df.at[row.Index, 'object'] = '"' + df['object'] + '"@en .'
        else:
            df.at[row.Index, 'object'] = df['object'].str.replace('http://www.w3.org/2002/07/owl#Class', 'owl:Class')

ont1 = pd.read_csv('1.tsv',sep='\t',names=['subject','predicate','object'])
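
For comparison, here is a minimal sketch of one way the three conditions could be applied without a row loop, using boolean masks on the whole column. It rests on a few assumptions: the trailing ' ;' and ' .' from the code above are left out so the result matches the expected output table, every value starting with 'http://w' becomes 'owl:Class' as the third condition states, and the "s1"/"p1" values in the sample frame are hypothetical placeholders for the real subject and predicate columns. In the loop above, `df.loc[row.Index, 'object']` is a single string rather than a Series, so it has no `.str` accessor, and the right-hand sides assign a whole column to one cell. Working on the full column avoids both problems.

```python
import pandas as pd


def to_turtle(df):
    """Return a copy of df with Turtle-style formatting applied column-wise."""
    out = df.copy()
    out["subject"] = "<" + out["subject"] + ">"
    out["predicate"] = "<" + out["predicate"] + ">"

    # Build all three masks from the untouched values before assigning anything.
    obj = out["object"].copy()
    is_onto = obj.str.startswith("http://o", na=False)
    is_w3 = obj.str.startswith("http://w", na=False)
    is_label = obj.str.match(r"[A-Z]", na=False)  # match() anchors at the start

    # Assumption: no trailing ' ;' or ' .', to match the expected output table.
    out.loc[is_onto, "object"] = "<" + obj[is_onto] + ">"
    out.loc[is_label, "object"] = '"' + obj[is_label] + '"@en'
    out.loc[is_w3, "object"] = "owl:Class"
    return out


# Hypothetical sample rows; "s1"/"p1" stand in for the real subject/predicate values.
sample = pd.DataFrame(
    {
        "subject": ["s1", "s2", "s3"],
        "predicate": ["p1", "p2", "p3"],
        "object": [
            "http://www.w3.org/2002/07/owl#Class",
            "A part of something",
            "http://oaei.ontologymatching.org/tests/101/onto.rdf#R",
        ],
    }
)
result = to_turtle(sample)
print(result["object"].tolist())
```

The function returns a new frame rather than modifying its argument, so it would be called as `ont1 = to_turtle(ont1)` after reading the file.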