
I am trying to modify a DataFrame by formatting each column in a specific way. My issue is that I can't get conditional formatting to work for one specific column. More specifically, here are my 3 conditions:

  1. If the string starts with 'http://o', then put it between <>
  2. If the string starts with a capital letter, do some formatting such as '"' + df['object'] + '"@en .'
  3. If the string starts with 'http://w' replace it with 'owl:Class'
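For reference, the three conditions can be sketched as a plain function over a single string. This is only a minimal sketch: the name `format_object` is hypothetical, the output follows the expected output shown further down (no trailing ' ;' or ' .'), and leaving non-string values such as NaN untouched is an assumption.

```python
def format_object(value):
    """Format one 'object' value according to the three conditions."""
    if not isinstance(value, str):
        return value  # assumption: leave NaN / non-string values untouched
    if value.startswith("http://o"):
        return "<" + value + ">"  # condition 1: wrap in angle brackets
    if value.startswith("http://w"):
        return "owl:Class"  # condition 3: replace with the owl:Class shorthand
    if value[:1].isupper():
        return '"' + value + '"@en'  # condition 2: quoted literal tagged @en
    return value  # anything else is returned unchanged
```

Such a function could then be applied element-wise with something like `df['object'] = df['object'].map(format_object)`.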

I tried .iterrows(), .itertuples(), and even .apply(), but nothing seems to work. Thank you in advance.

input df

[something] | [something] | http://www.w3.org/2002/07/owl#Class |
[something] | [something] | A part of something |
[something] | [something] | http://oaei.ontologymatching.org/tests/101/onto.rdf#R |

expected output

[something] | [something] | owl:Class |
[something] | [something] | "A part of something"@en |
[something] | [something] | <http://oaei.ontologymatching.org/tests/101/onto.rdf#Reference> |

Here is my code:

import pandas as pd

def to_turtle(df):
    df['subject'] = '<' + df['subject'] + '>'
    df['predicate'] = '<' + df['predicate'] + '>'
    for row in df.itertuples():
        if df.loc[row.Index,'object'].str.startswith('http://o', na=False):
            df.at[row.Index, 'object'] = "<" + df['object'] + "> ;"
        elif df.loc[row.Index,'object'].str.contains('[A-Z]',na=False,regex=True):
            df.at[row.Index, 'object'] = '"' + df['object'] + '"@en .'
        else:
            df.at[row.Index, 'object'] = df['object'].str.replace('http://www.w3.org/2002/07/owl#Class', 'owl:Class')

ont1 = pd.read_csv('1.tsv',sep='\t',names=['subject','predicate','object'])
  • If you can add a sample df, your conditions, and the expected df, that will fetch you more answers. – anky Feb 27 '19 at 16:11
  • See https://stackoverflow.com/a/20159305/463796 on how to make a great pandas question. Add an example `df` and the intended results, so we can run your code locally. – w-m Feb 28 '19 at 13:28
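
One likely reason the loop in the question fails: with a unique index, `df.loc[row.Index, 'object']` returns a single Python string, which has no `.str` accessor, and the right-hand sides such as `"<" + df['object'] + "> ;"` are whole columns rather than single values. A possible alternative, sketched here with boolean masks instead of a row loop, is shown below. The sample frame, the placeholder values `s1`/`p1` and so on, and the choice to follow the expected output rather than the ' ;' and ' .' suffixes in the code are all assumptions.

```python
import pandas as pd

def to_turtle(df):
    """Return a formatted copy of df; the input frame is left unchanged."""
    out = df.copy()
    out["subject"] = "<" + out["subject"] + ">"
    out["predicate"] = "<" + out["predicate"] + ">"

    obj = out["object"]
    # One boolean mask per condition, computed on the whole column at once.
    is_onto = obj.str.startswith("http://o", na=False)
    is_owl = obj.str.startswith("http://w", na=False)
    is_label = obj.str.match(r"[A-Z]", na=False)  # starts with a capital letter

    # Assign only the rows selected by each mask.
    out.loc[is_onto, "object"] = "<" + obj[is_onto] + ">"
    out.loc[is_owl, "object"] = "owl:Class"
    out.loc[is_label, "object"] = '"' + obj[is_label] + '"@en'
    return out

# Hypothetical sample rows standing in for the contents of 1.tsv.
sample = pd.DataFrame({
    "subject": ["s1", "s2", "s3"],
    "predicate": ["p1", "p2", "p3"],
    "object": [
        "http://www.w3.org/2002/07/owl#Class",
        "A part of something",
        "http://oaei.ontologymatching.org/tests/101/onto.rdf#R",
    ],
})
result = to_turtle(sample)
```

With the real file, the same function could be called as `to_turtle(ont1)` after the `pd.read_csv` line. Note that `str.match` anchors at the start of the string, whereas the `str.contains('[A-Z]')` call in the question matches a capital letter anywhere.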
