Related question on SO (by myself earlier today): Why does error traceback show edited script instead of what actually ran? Now that I know why it happens, I want to know how I can deal with it.

I see some questions like How do I debug efficiently with spyder in Python? and How do I print debug messages in the Google Chrome JavaScript Console? well received, so I suppose asking about debugging practices is on-topic, right?

Background

I write a script that raises an exception at line n, run it from the terminal, then add a line in the middle while the script is still running and save the modified file. So the script file is modified while the interpreter is running it; in particular, the line number of the very line that will raise the exception has changed. The error traceback reported by the Python interpreter shows me line n of the "modified" version of the script, not of the actual "running" version.

Minimal Example

Let's say I run a script:

import time

time.sleep(5)
raise Exception

and while the interpreter is stuck at time.sleep(5), I add a line after that one.

So now I have:

import time

time.sleep(5)
print("Hello World")
raise Exception

Then the interpreter wakes up from sleep, the next command, raise Exception is executed, and the program terminates with the following traceback.

Traceback (most recent call last):
  File "test/minimal_error.py", line 4, in <module>
    print("Hello World")
Exception

So it correctly reports the line number (from the original script, so actually useless if we only have the modified script) and the error message ("Exception"). But the line of code it shows is not the one that actually raised the error; to be of any help, it should display raise Exception, not print("Hello World"), which wasn't even executed by the interpreter.
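For reference, the effect can be reproduced inside a single process. This is only a minimal sketch, assuming (as CPython does) that the source line shown in a traceback is read from the file on disk when the traceback is formatted, not when the code is compiled. The file contents mirror the example above, except that the sleep is shortened to 0 seconds; the temporary directory and the variable names are just for illustration.

```python
import os
import tempfile
import traceback

# The script as it was when it started running (the exception is on line 4).
original = "import time\n\ntime.sleep(0)\nraise Exception\n"
# The script after the edit: a new line 4 pushes the raise down to line 5.
modified = 'import time\n\ntime.sleep(0)\nprint("Hello World")\nraise Exception\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "minimal_error.py")
    with open(path, "w") as f:
        f.write(original)

    # Compile once, like the interpreter does when the script is launched.
    code = compile(original, path, "exec")

    # Simulate saving an edit while the compiled code is still running.
    with open(path, "w") as f:
        f.write(modified)

    try:
        exec(code, {"__name__": "__main__"})
    except Exception:
        # The line number comes from the compiled code (the original script),
        # but the source text is looked up in the file as it is on disk now.
        report = traceback.format_exc()

print(report)
```

The printed traceback points at line 4 of minimal_error.py and shows the print("Hello World") line next to it, the same mismatch as in the terminal session above.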

Why this matters

In real practice, I implement one part of a program, run it to see if that part is doing fine, and while it is still running, I move on to the next thing I have to implement. And when the script throws an error, I have to find which actual line of code caused the error. I usually just read the error message and try to deduce the original code that caused it.

Sometimes it isn't easy to guess, so I copy the script to the clipboard, roll back the code by undoing what I've written since running the script, check the line that caused the error, and paste back from the clipboard. Sometimes this is very annoying because it isn't always possible to remember the exact state of the script when I ran it. ("Do I need to undo more to roll back? Or is this the exact script I ran?")

Sometimes the script will run for more than 10 minutes, or even an hour, before it raises an exception. In such a case, "rollback by undo" is practically impossible. Sometimes I don't even know how long the script will run before actually running it. I obviously can't just sit and keep my script unmodified until it terminates.

Question

By what practice can I correctly track down the command that caused the exception?

One hypothetical solution is to copy the script to a new file every time I want to run it, run the copied version, and keep editing the original one. But I think this is too bothersome to do every ten minutes, whenever I need to run a script to see if it works well.

Another way is to git-commit every time I want to run it, so that I can come back and see the original version when I need to, but this will make the commit history very dirty, so I think this is even worse than the other one.

I also tried python -m pdb script.py, but it shows the same "modified version of line n", just as the plain traceback does.

So is there any practical solution I can practice, say, every ten minutes?

Ignatius
  • You don't get that traceback from the code you posted, at least not from a freshly imported module or if you run it as a script. The line that raises the exception is `raise Exception`. – chepner Apr 03 '19 at 16:53
  • @chepner, I run the script (without the `print(...)` line) from terminal like `python test.py`, then add the `print(...)` line while the interpreter is stuck at `time.sleep(5)`, save the script, the interpreter is done with sleeping and reaches `raise`, so the exception is raised, and the traceback shows the line with `print(...)`. I think I made it clear in the "Minimal Example" but sorry if I didn't. How do you think I can make it clearer? – Ignatius Apr 03 '19 at 16:58
  • I'm pretty sure the practical solution is "don't do this". Don't modify running code. – chepner Apr 03 '19 at 16:59
  • You could write something to automate your hypothetical solution. `mktemp` might be useful. – wjandrea Apr 03 '19 at 17:26

4 Answers

Instead of committing every time you run the script, simply use git stash. This way you will not add dirty commits to your history.

So before you run a script, git stash your local changes, inspect the error, then git stash pop.

Read more about git stash here.

This solution assumes that the running script is at the HEAD of the current branch.

Another solution, if the above condition doesn't apply, is to create an arbitrary branch (call it running-script), git stash your local changes that are not yet committed, check out this new branch, git stash apply, and run the script. Then check out your original branch again, re-apply the stash, and resume your work.

You could simply write a bash script file that automates this process as follows

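git stash                        # save the uncommitted local changes
git checkout -b running-script   # potential param
git stash apply                  # restore the changes on the new branch
RUN script                       # placeholder: replace with the actual command to run the script in the background
git checkout original-branch     # potential param
git stash apply                  # re-apply the stash on the original branch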

You could have running-script and original-branch passed to the bash script as params.

Ahmed Ragab

@chepner's comment is valid:

I'm pretty sure the practical solution is "don't do this". Don't modify running code.

As a relatively simple workaround, you could accomplish this with a bash script (or similar scripted approach in whatever environment you are using if bash isn't available).

For bash, a script like the one below would work. It takes the filename as a parameter and uses date to create a unique temporary filename, then copies the file to it and executes it. In this manner, you always have a static copy of the running code and you can use aliasing to make it trivial to use:

filename=$1

# split the file name into base name and extension
extension="${filename##*.}"
filename="${filename%.*}"

# create a unique temporary name (using date)
today=$(date +%Y-%m-%d-%H:%M:%S) # or whatever pattern you desire
newname="$filename-$today.$extension"

# copy and run the python script (quoted so paths with spaces work)
cp "$1" "$newname"
echo "Executing from $newname..."
/path/to/python "$newname"

# clean it up when done, if you care to
rm "$newname"

You could then alias this to python if you want so you don't have to think about doing it, with something like this in your .bashrc or .bash_aliases:

alias python="source path/to/copy_execute.sh"

Although it may be better to give it a different name, like

alias mypy="source path/to/copy_execute.sh"

Then, you can run your script, modify, and run some more with mypy myscript.py and you won't ever be editing the currently executing code.

One drawback is that while this script will clean up and delete the files after it is done running, it will create a lot of temp files that will be around while it runs. To get around this, you could always copy to somewhere in /tmp or elsewhere where the temporary files won't get in the way. Another issue is that this gets more complicated for large code bases which you may not want to copy all over the place. I'll leave that one to you.

A similar approach could be crafted for Windows with powershell or cmd.
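If you would rather not maintain a separate script per shell, the same copy-and-run idea can be sketched in Python itself. This is not a drop-in replacement for the bash script above, just a rough equivalent under a few assumptions: the function name run_snapshot is made up for illustration, the copy goes into a temporary directory instead of next to the original, and the original directory is added to PYTHONPATH so that imports of sibling modules still resolve (those modules are not copied, so editing them while the script runs can still confuse the traceback).

```python
import os
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_snapshot(script, args=(), **run_kwargs):
    """Copy `script` to a temporary directory and run the copy.

    Extra keyword arguments are passed through to subprocess.run.
    """
    script = Path(script).resolve()

    # sys.path[0] of the child will be the temporary directory, so keep the
    # original directory importable through PYTHONPATH.
    env = dict(os.environ)
    env["PYTHONPATH"] = os.pathsep.join(
        p for p in (str(script.parent), env.get("PYTHONPATH")) if p
    )

    with tempfile.TemporaryDirectory(prefix="snapshot-") as tmp:
        copy = Path(tmp) / script.name
        shutil.copy2(script, copy)
        print(f"Executing from {copy}...", file=sys.stderr)
        # The copy is removed when the with-block exits, i.e. after the
        # child process has finished and printed any traceback.
        return subprocess.run(
            [sys.executable, str(copy), *args], env=env, **run_kwargs
        )


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_snapshot(sys.argv[1], sys.argv[2:]).returncode)
```

Saved under a name of your choice (say run_snapshot.py, also hypothetical), it would be used as python run_snapshot.py myscript.py. Tracebacks then refer to the frozen copy, so editing myscript.py while it runs no longer changes what they show.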

Salvatore

I'm probably going to give an oversimplified answer, and it may not be applicable in every scenario.

Use PyCharm

I usually work with code that takes from minutes to hours to finish and I need to constantly run it to see how it is performing and I continue coding while it runs. If it fails, I receive the original line that threw the error.

I also have to run it on a GUI-less Ubuntu server, so this is how I do it to receive the right error every time:

  1. Code in PyCharm.
  2. Test in PyCharm and continue coding. (I get the right error if it fails.)
  3. Once I'm comfortable with performance, I move it to the server and run it again. (I also get the right error here.)
bleand

I am not saying that this will avoid the problem completely, but it may reduce it. If you are coding all your logic in a single file, stop doing that.

Here are a few recommendations:

  1. Split your code logic into multiple files, for example:

    • utility,
    • helper,
    • model,
    • component,
    • train,
    • test,
    • feature
  2. Keep each function as small as 10 lines (if possible).

  3. If you are using a class, it should not be more than 125 lines.
  4. A file should not exceed 150 lines.

Now if an exception occurs, its traceback may be spread across several files, and I guess not all of those files will have been modified at once to implement your changes. The good news is that if the exception originates from a file you have not changed, it is easy to find that line and fix it; otherwise it will take only minimal effort to find the exact line.

If you are also using git and have not committed yet, you can compare against the last revision to find the exact code that might be causing the error.
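As a rough illustration of the git comparison, the line number from the traceback can be looked up in the committed version of the file instead of the edited one. This sketch assumes that the version that actually ran is the one stored at HEAD, that git is on your PATH, and that the path is given relative to the current directory; the helper names source_at_revision and line_at are made up for this example.

```python
import subprocess


def source_at_revision(path, rev="HEAD"):
    """Return the contents of `path` as stored in git revision `rev`.

    `path` is taken relative to the current directory (the "./" prefix).
    """
    result = subprocess.run(
        ["git", "show", f"{rev}:./{path}"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git reports a failure
    )
    return result.stdout


def line_at(source, lineno):
    """Return the 1-based line `lineno` of `source`, or None if out of range."""
    lines = source.splitlines()
    if 1 <= lineno <= len(lines):
        return lines[lineno - 1]
    return None
```

For the traceback in the question, line_at(source_at_revision("test/minimal_error.py"), 4) would return the line that stood at position 4 in the committed file, provided that commit is what was running.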

Hope this minimizes your problem.