Improve the implementation of worldquant 101 alpha factors using numpy

Question

I was trying to implement 101 quant trading factors that were published by WorldQuant (https://arxiv.org/pdf/1601.00991.pdf).

A typical factor is about processing stocks' price and volume information along with both time dimension and stock dimension. Take the example of alpha factor #4: (-1 * Ts_Rank(rank(low), 9)). This is a momentum alpha signal. low is a panel of stocks' low price within a certain time period. rank is a cross-sectional process of ranking panel’s each row (a time snapshot). Ts_Rank is a time-series process of moving_rank panel’s each column (a stock) with a specified window.

Intuitively, the Pandas dataframe or NumPy matrix should fit the implementation of 101 alpha factors. Below is the best implementation using NumPy I got so far. However, the performance was too low. On my Intel core i7 windows machine, it took around 45 seconds to run the alpha #4 factor with a 5000 (trade dates) by 200 (stocks) matrix as input.

I also came across DolphinDB, a time series database with built-in analytics features (https://www.dolphindb.com/downloads.html). For the same factor Alpha#4, DolphinDB ran for mere 0.04 seconds, 1000 times faster than the NumPy version. However, DolphinDB is commercial software. Does anybody know better python implementations? Or any tips to improve my current python code to achieve performance comparable to DolphinDB?

Numpy implementation (based on https://github.com/yli188/WorldQuant_alpha101_code)

import numpy as np

def rankdata(a, method='average', *, axis=None):
    # this rankdata refer to scipy.stats.rankdata (https://github.com/scipy/scipy/blob/v1.9.1/scipy/stats/_stats_py.py#L9047-L9153)
    if method not in ('average', 'min', 'max', 'dense', 'ordinal'):
        raise ValueError('unknown method "{0}"'.format(method))

    if axis is not None:
        a = np.asarray(a)
        if a.size == 0:
            np.core.multiarray.normalize_axis_index(axis, a.ndim)
            dt = np.float64 if method == 'average' else np.int_
            return np.empty(a.shape, dtype=dt)
        return np.apply_along_axis(rankdata, axis, a, method)

    arr = np.ravel(np.asarray(a))
    algo = 'mergesort' if method == 'ordinal' else 'quicksort'
    sorter = np.argsort(arr, kind=algo)

    inv = np.empty(sorter.size, dtype=np.intp)
    inv[sorter] = np.arange(sorter.size, dtype=np.intp)

    if method == 'ordinal':
        return inv + 1

    arr = arr[sorter]
    obs = np.r_[True, arr[1:] != arr[:-1]]
    dense = obs.cumsum()[inv]

    if method == 'dense':
        return dense

    # cumulative counts of each unique value
    count = np.r_[np.nonzero(obs)[0], len(obs)]

    if method == 'max':
        return count[dense]

    if method == 'min':
        return count[dense - 1] + 1

    # average method
    return .5 * (count[dense] + count[dense - 1] + 1)

def rank(x):
    return rankdata(x,method='min',axis=1)/np.size(x, 1)

def rolling_rank(na):
    return rankdata(na.transpose(),method='min',axis=0)[-1].transpose()    

def ts_rank(x, window=10):
    a_rolled = np.lib.stride_tricks.sliding_window_view(x, window,axis = 0)
    return np.append(np.full([window-1,np.size(x, 1)],np.nan),rolling_rank(a_rolled),axis = 0)


def alpha004(data):
    return -1 * ts_rank(rank(data), 9)

import time

# The input is a 5000 by 200 matrix, where the row index represents trade date and the column index represents security ID. 
data=np.random.random((5000, 200))
start_time = time.time()
alpha004(data)
print("--- %s seconds ---" % (time.time() - start_time))

--- 44.85099506378174 seconds ---

DolphinDB implementation

def WQAlpha4(low){
    return -mrank(rowRank(low, percent=true), true, 9)
}

// The input is a 5000 by 200 matrix, where the row index represents trade date and the column index represents security ID.
low = rand(1000.0,5000:200);
timer WQAlpha4(low);

Time elapsed: 44.036 ms (0.044s)

Something like DolphinDB is a specialised product and can focus on pulling out all of the stops, probably using a low-level programming language. It will be impractical to beat unless you also drop down to a better language. — Reinderien, Nov 05 '22 at 14:33

score 3 · Answer 1 · answered Nov 28 '22 at 23:35

This part of the code:

return np.apply_along_axis(rankdata, axis, a, method)

...is going to be quite slow. Function application like this means more of the computation runs in Python, and relatively little of it runs in C.

There's a much faster solution available here, if you're okay with a slight change in how your rank function is defined. Specifically, the below code is equivalent to changing from method='min' to method='ordinal'. On a test dataset of random numbers, it agrees with your method 95% of the time, and only disagrees by 1 where it is different.

By using argsort along the axis, numpy can do the entire calculation without dropping into Python.

def rank(x):
    return (data.argsort(axis=1).argsort(axis=1) + 1) / np.size(x, 1)


def ts_rank(x, window=10):
    a_rolled = np.lib.stride_tricks.sliding_window_view(x, window, axis = 0)
    rolling_rank_fast = (a_rolled.argsort(axis=2).argsort(axis=2) + 1)[:, :, -1]
    # Fill initial window - 1 rows with nan
    initial_window = np.full([window-1,np.size(x, 1)],np.nan)
    return np.append(initial_window,rolling_rank_fast,axis = 0)


def alpha004(data):
    return -1 * ts_rank(rank(data), 9)

Benchmarking this, I find it runs roughly 100x faster.

This is a great solution! On my laptop, it takes only 0.37201786 seconds. It's a rocket-style speedup! — Good Will, Jan 16 '23 at 13:39

Improve the implementation of worldquant 101 alpha factors using numpy

1 Answers1