Step Functions are AWS structures that control the flow of lambdas (or other events). All my lambdas use Python (but Lambdas can use most major languages). Throughout the process my step function sends status updates back to the client (the client triggered it via API). Let's say it progresses through these updates: Started -> In Progress -> Finishing -> Done. For handled errors it will send an 'Error' status back to the client. So the client could see a timeline like this: Started -> In Progress -> Errored. This is ideal - so the user knows the process has stopped.
But when there are unexpected/unhandled errors the client never really knows and the timeline might sit at 'In Progress' indefinitely - the user doesn't know what happened. So I started looking into the built-in Step Function error handling. I like this option because I can create a 'Catch' function for each lambda or event where I can communicate back to the client if there is an error. The downside to this was that it really made the step function template/design messy see the before/after screenshots below.
AFTER---------------
The template code that generates these graphs doesn't look much better. So I considered an alternative which seems similarly messy. I could add a single try/except block within each lambda for the entire lambda - to catch any/all errors. For example:
def lambda_handler(event, context):
try:
#Execute function tasks
except:
#Communicate back to client that there was an error
Similar to the step function 'Catch' functions this would ensure that I catch and communicate any error. But this seems like a bad idea just because of what it is (adding blanket/blind try/except).
So right now I'm stuck between messy/repeated code and try/except-ing everything. Am I implementing step function 'Catch' incorrectly? Am I missing a better way to handle unknown Python errors? Is there another approach entirely?