I have the DataFrame below:

    JOB_Command Parent_path
0   /data/ingestao_gpdb_nextdm_rep/execucao/call_r...   /data/ingestao_gpdb_nextdm_rep/execucao/call_r...
1   /data/processos/current/ingestao/ciclico/BACEN...   /data/processos/current/ingestao/
2   /data/processos/current/ingestao/ciclico/BACEN...   /data/processos/current/ingestao/
3   /data/processos/current/ingestao/ciclico/BACEN...   /data/processos/current/ingestao/
4   /data/processos/current/ingestao/ciclico/BACEN...   /data/processos/current/ingestao/
5   /data/processos/current/ingestao/ciclico/Inges...   /data/processos/current/ingestao/

And I'm using this piece of code to get the last update time of a folder/file.

import datetime
from pathlib import Path

import numpy as np


def get_fourth_elem(file_path):
    """Helper function.

    Args:
        file_path: file path as a string.

    Returns:
        absolute path to the fourth element (or last one if shorter) as a Pathlib object.
    """
    file_path_length = len(file_path.strip("/").split("/"))
    file_path = Path(file_path)
    if file_path_length > 4:
        for _ in range(file_path_length - 4):
            file_path = Path(file_path.parent)
        return file_path
    else:
        return file_path

df_test["Last_Update"] = df_test["JOB_Command"].apply(
    lambda x: datetime.datetime.fromtimestamp(
        get_fourth_elem(x).stat().st_mtime
    ).strftime("%Y-%m-%d %H:%M:%S")
    if Path(x).exists()
    else np.nan
)
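
For reference, the truncation done by get_fourth_elem can also be expressed with the path's parts tuple. This is only a minimal sketch, assuming POSIX-style paths; first_four_components is a hypothetical name, and it uses PurePosixPath so it never touches the filesystem.

```python
from pathlib import PurePosixPath


def first_four_components(file_path: str) -> PurePosixPath:
    """Keep at most the first four named components of a POSIX path."""
    path = PurePosixPath(file_path)
    # For absolute paths, parts includes the root "/" as its own entry,
    # so allow one extra slot when an anchor is present.
    keep = 4 + (1 if path.anchor else 0)
    return PurePosixPath(*path.parts[:keep])


short = first_four_components("/a/b")
deep = first_four_components("/data/processos/current/ingestao/ciclico/x.sh")
```

Here x.sh is a made-up file name used only for illustration; deep ends up as /data/processos/current/ingestao and short stays /a/b.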

But I'm getting an error when I try to analyze paths that I don't have access to. Would it be possible to handle this and show some kind of "permission denied" message?

Here is a sample of the error I'm getting: PermissionError: [Errno 13] Permission denied: '/data/processos/current/ingestao/ciclico/IngestionManager_2_0_EXP/sbin/call_ingestion_manager_ciclico_exp.sh'

I tried adding some ifs, but that didn't work. Now I'm studying exceptions, but I couldn't make them fit yet.

The output should be this:

    JOB_Command Parent_path Last_Update
0   /data/ingestao_gpdb_nextdm_rep/execucao/call_r...   /data/ingestao_gpdb_nextdm_rep/execucao/call_r...   2022-09-26 11:30:00
1   /data/processos/current/ingestao/ciclico/BACEN...   /data/processos/current/ingestao/                   2022-09-20 19:20:00             
2   /data/processos/current/ingestao/ciclico/BACEN...   /data/processos/current/ingestao/                   2022-09-20 19:20:00
3   /data/processos/current/ingestao/ciclico/BACEN...   /data/processos/current/ingestao/                   2022-09-20 19:20:00
4   /data/processos/current/ingestao/ciclico/BACEN...   /data/processos/current/ingestao/                   2022-09-20 19:20:00
5   /data/processos/current/ingestao/ciclico/Inges...   /data/processos/current/ingestao/                   NaN

Could you guys help me?

gfernandes
  • I am curious which line exactly is raising the permission denied error. We can try to introduce a try-except block there. – kgkmeekg Oct 14 '22 at 15:47
  • Hi @kgkmeekg, thanks for your reply. The error happens at index 5; the "Last_Update" value is NaN due to the permission denied error when accessing the JOB_Command path – gfernandes Oct 14 '22 at 15:50

1 Answer

How about this: we introduce a new function that does the file operation and wraps it in a try-except block.

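def new_function(x):
    """Return the formatted mtime for x, NaN if x is missing, or an error message."""
    return_value = ''

    try:
        if Path(x).exists():
            return_value = datetime.datetime.fromtimestamp(
                get_fourth_elem(x).stat().st_mtime
            ).strftime("%Y-%m-%d %H:%M:%S")
        else:
            return_value = np.nan
    except PermissionError:
        return_value = "Permission Denied"
    except Exception:
        # Catch anything else; an `else:` clause here would run on success
        # and overwrite the timestamp.
        return_value = "An unknown error occurred."
    return return_value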

Then you can pass the new function to apply in place of your lambda:

df_test["Last_Update"] = df_test["JOB_Command"].apply(new_function)
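
If the goal is to record NaN rather than a message when a path is missing or unreadable, something along these lines might work. It is only a sketch under a few assumptions: safe_last_update is a hypothetical name, it stats the path itself rather than going through get_fourth_elem, and it uses float("nan") (the same kind of float value as np.nan) so that it runs with the standard library alone.

```python
import datetime
import tempfile
from pathlib import Path

NAN = float("nan")  # stand-in for np.nan so the sketch needs only the stdlib


def safe_last_update(path_str, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the formatted mtime of path_str, or NaN if missing or unreadable."""
    try:
        mtime = Path(path_str).stat().st_mtime
    except (FileNotFoundError, PermissionError):
        # Both "does not exist" and "permission denied" are recorded as NaN.
        return NAN
    return datetime.datetime.fromtimestamp(mtime).strftime(fmt)


# Small demonstration on a temporary directory (file names are made up).
with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / "job.sh"
    existing.write_text("echo hi\n")
    results = [
        safe_last_update(str(existing)),
        safe_last_update(str(Path(tmp) / "missing.sh")),
    ]
```

With a DataFrame, the function could be passed straight to apply on the JOB_Command column, the same way as any other one-argument function.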

This, this and this might be helpful to look at. In case you are getting the unknown error, this might help you get the name of the exception.
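
One way to see which exception is behind an "unknown error" is to bind it with `as` and report its class name. The snippet below is a small illustration; describe_error is a hypothetical helper, and int is used only as a function that is easy to make fail.

```python
def describe_error(func, *args):
    """Call func(*args); return its result, or 'Name: message' on failure."""
    try:
        return func(*args)
    except Exception as exc:
        # type(exc).__name__ gives the exception class name, e.g. "ValueError".
        return f"{type(exc).__name__}: {exc}"


message = describe_error(int, "not a number")
value = describe_error(int, "7")
```

Returning that string instead of a fixed "unknown error" text would put the exception name directly in the column.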

Note: Since there is no reproducible code available the above snippet might need some debugging.

kgkmeekg
  • the thing is, it's not about printing to the screen, it's about "saving" the column value as NaN. So I need to "access" the path and, if I don't have permission, record NaN as the column value – gfernandes Oct 14 '22 at 15:57
  • You are already doing this in the lambda function, I assume. if Path(x).exists() is what is causing the error. How about we define another function with the try-except block and use that in your lambda, instead of defining the whole thing there? – kgkmeekg Oct 14 '22 at 16:01
  • yes, agreed, could you do that? I'm new to Python and still studying the things that I need to deliver – gfernandes Oct 14 '22 at 16:05
  • just tried your approach. It gave me the "permission denied" message in the Last_Update column for the sample I mentioned, but it didn't give me the last update for any path found in JOB_Command (and I should have some there). Could that be because of the datatype? – gfernandes Oct 14 '22 at 16:32
  • basically I had 3 Permission Denied and all the rest were "An unknown error occurred." – gfernandes Oct 14 '22 at 16:37
  • inside the if it gave me the dates I was expecting (didn't count them); inside the else it gave me "nan" values; in the except it gave 3 permission denied; in the last else it gave me "An unknown error occurred" – gfernandes Oct 14 '22 at 16:43
  • I am a bit confused. So did it work or not? I suggest posting reproducible code if you want more clarification. – kgkmeekg Oct 14 '22 at 16:49
  • not as it should. It worked for the sample I mentioned where I was getting the permission denied, but there are paths that I have rights to access, and it should give me the date, yet it gives me the message An unknown error occurred – gfernandes Oct 14 '22 at 16:51
  • Maybe this will help ? https://stackoverflow.com/q/18176602/4168707 – kgkmeekg Oct 14 '22 at 17:00
  • sure, I'll have a look, so far, thanks man... – gfernandes Oct 14 '22 at 17:04
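
The symptom described in these comments (paths that succeed still ending up as "An unknown error occurred.") is what happens when the message is assigned in the else clause of a try statement: that clause runs when no exception was raised. The toy functions below are only an illustration, with int standing in for the file operation.

```python
def with_else(value):
    try:
        result = int(value)
    except ValueError:
        result = "bad input"
    else:
        # Runs only when the try block raised nothing,
        # so it overwrites the good result.
        result = "An unknown error occurred."
    return result


def with_broad_except(value):
    try:
        result = int(value)
    except ValueError:
        result = "bad input"
    except Exception:
        # Runs only for exceptions not matched by the clause above.
        result = "An unknown error occurred."
    return result


overwritten = with_else("3")
kept = with_broad_except("3")
```

In with_else the successful conversion is replaced by the message, while with_broad_except keeps it and reserves the message for exceptions that were not matched earlier.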