
I have this code:

SP500 = pd.read_csv("http://trading.chrisconlan.com/SPstocks_current.csv", header=-1, names=['Symbol']).copy()
SP500=list(SP500.Symbol)
SP500=['AVGO', 'GM', 'FDX', 'goog']

threads = []
lock = threading.Lock()
offset = 1
multiply = 1
num_of_threads = 4
for i in range(0, num_of_threads):
    t = threading.Thread(target=digest_distros, args=(SP500, i * multiply, i * multiply + offset))
    t.start()
    threads.append(t)
for t in threads:
    t.join()

This is the function:

def digest_distros(SP500, start, finish):
    for stock in SP500[start:finish]:

        daily, stock_distro = get_daily_adjusted(stock)
        if daily is None:
            continue
        monthly_adjusted_close=get_monthly_adjusted(stock)
        if monthly_adjusted_close is None:
            continue

        with lock:
            print "\n"
            print "##############   " + stock + "   ##############"
            print daily[['low', 'high', 'open', 'adjusted close']].tail(1)
            print "\n"

            curr_monthly_adjusted=monthly_adjusted_close[-1]
            print "##########################"
            print "current monthly adjusted close is: {}".format(curr_monthly_adjusted)
            required_value_for_signal=find_min_signal_value(monthly_adjusted_close)
            print "Required value for signal tommorow is : {}".format(required_value_for_signal)
            print "##########################"

            print "\n"

            spans = [0.3, 0.5, 1, 2,3,5]
            for span in spans:
                mean=stock_distro[span][0]
                std=stock_distro[span][1]
                if abs(curr_monthly_adjusted-required_value_for_signal) < 3:
                    print "Time span is {:.3f} years, daily change mean {:.3f}, daily change std {:.3f}".format(span,mean,std)
                    z_value=calculate_z_value(required_value_for_signal, curr_monthly_adjusted, mean, std)
                    # if z_value>0.3:
                    print "Probability is: {:.3f}".format(z_value)

When the code reaches the inner `for` loop or the `if` statement (I think), I seem to lose my lock: output from different threads gets mixed together.

I can't understand why.

Example of the interleaved output:

######## GM #

current monthly adjusted close is: 37.64
Required value for signal tommorow is : 38.5

#

Time span is 0.300 years, daily change mean -0.083, daily change std 0.692
Probability is: 0.869
Time span is 0.500 years, daily change mean 0.004, daily change std 0.663
Probability is: 0.904
Time span is 1.000 years, daily change mean 0.009, daily change std 0.531
Probability is: 0.949

Time span is 2.000 years, daily change mean 0.018, daily change std 0.512############## AVGO ##############

Probability is: 0.957
Time span is 3.000 years, daily change mean 0.005, daily change std 0.495
Probability is: 0.960
Time span is 5.000 years, daily change mean 0.011, daily change std 0.477
Probability is: 0.966

Gil Hamilton
  • If you run this with unbuffered stdout (the `-u` flag on the command line), is the output still interleaved? Or, alternatively, if you add an explicit `flush` after the `for` loop, is the output still interleaved? If the answer to either one is no, you don't actually have any problem with locking; you just have unflushed writes when you give up the lock. – abarnert Mar 12 '18 at 19:07
  • I'm running this in Jupyter. How do I "inject" the commands you are talking about? – koren maliniak Mar 12 '18 at 20:16
  • To inject the `flush`, just edit your source code to add `sys.stdout.flush()` at the end of the `with` block (still indented within it, but after the rest of the code). To inject the `-u`… I'm not sure. I found [this question](https://stackoverflow.com/questions/37534440/passing-command-line-arguments-to-argv-in-jupyter-ipython-notebook), but you'd think there must be an easier way? Alternatively, you can search SO for how to programmatically switch `sys.stdout` to an unbuffered stream and edit your source—much less ideal, but maybe a usable fallback if needed? – abarnert Mar 12 '18 at 20:22
  • Why is it so complicated? Isn't `Lock` supposed to work out of the box for the whole block? – koren maliniak Mar 12 '18 at 21:43
  • Lock doesn’t lock every file, stream, etc. in the universe and force them all to flush their buffers. That would be terrible. – abarnert Mar 12 '18 at 22:04
  • Looks like the stdout flush did it. I didn't try the other thing. Can you explain what exactly in the code is not locked? Why does it fail only at the final `for` loop? – koren maliniak Mar 13 '18 at 10:05

1 Answer


The simplest fix to get what you want is to flush the output buffers right before you exit the lock. Like this:

import sys

with lock:
    # ... stuff
    print(stuff)
    # etc.
    sys.stdout.flush()  # push buffered output out while the lock is still held

Why is this necessary? Because print doesn't actually put anything on the terminal. What it does is put something in the sys.stdout buffer, so that it will eventually get to the screen. That "eventually" could well be after you've released the lock.
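To make the buffering visible, here is a minimal sketch (Python 3) that swaps the terminal for a hypothetical `RecordingSink` class, which simply records every chunk that actually reaches it. The stream is layered the way regular CPython layers `sys.stdout` (text wrapper over a byte buffer over the device); Jupyter substitutes its own stream class, so treat this as an analogy rather than a reproduction of the notebook's behavior.

```python
import io


class RecordingSink(io.RawIOBase):
    """Hypothetical stand-in for the terminal: records each chunk that arrives."""

    def __init__(self):
        super().__init__()
        self.received = []

    def writable(self):
        return True

    def write(self, data):
        # Called only when the buffer above us decides to hand data down.
        self.received.append(bytes(data))
        return len(data)


def demo():
    sink = RecordingSink()
    # Text layer -> byte buffer -> "device", as an assumption-level model of stdout.
    stream = io.TextIOWrapper(
        io.BufferedWriter(sink), encoding="utf-8", newline="\n"
    )

    stream.write("Probability is: 0.957\n")
    before_flush = b"".join(sink.received)  # what the "terminal" has seen so far

    stream.flush()
    after_flush = b"".join(sink.received)  # what it has seen once we flush
    return before_flush, after_flush


before, after = demo()
print("before flush:", before)
print("after flush: ", after)
```

The write returns immediately, yet the sink stays empty until `flush()` is called. If a lock were released between those two steps, another thread could get its own output delivered first.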

Locking works just fine, and there's nothing in your code that isn't locked; the problem is that locking doesn't force every file—or other buffered type—in the universe to become unbuffered. Think about it this way: if you send data over the network from inside a lock, there's no way to guarantee that the other side has received the data by the time you release the lock (except to wait, still holding the lock, for some acknowledgement).
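Putting the pieces together, a minimal sketch of the pattern (Python 3) might look like the following. The names `print_report`, `run_all`, and `make_placeholder_reports` are hypothetical, and the report lines are placeholders rather than real stock data; the only point is that every write for one report, plus the flush, happens before the lock is released.

```python
import sys
import threading

lock = threading.Lock()


def print_report(symbol, lines, stream=None):
    """Write one report as a unit; `stream` defaults to sys.stdout."""
    out = stream if stream is not None else sys.stdout
    with lock:
        out.write("##############   {}   ##############\n".format(symbol))
        for line in lines:
            out.write(line + "\n")
        # Flush while still holding the lock, so nothing from this report
        # is left sitting in the buffer when the next thread starts writing.
        out.flush()


def make_placeholder_reports():
    """Placeholder data only: three dummy lines per symbol."""
    symbols = ("AVGO", "GM", "FDX", "GOOG")
    return {s: ["{} line {}".format(s, n) for n in range(3)] for s in symbols}


def run_all(reports, stream=None):
    """Start one thread per report and wait for all of them."""
    threads = [
        threading.Thread(target=print_report, args=(symbol, lines, stream))
        for symbol, lines in reports.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    run_all(make_placeholder_reports())
```

The order in which the reports appear still depends on thread scheduling; the lock and the flush only keep each report's lines together.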

abarnert