Say I have a memory buffer containing a vector of `std::decimal::decimal128` (IEEE 754R) elements. Can I wrap and expose that as a NumPy array and do fast operations on those decimal vectors, for example computing variance or auto-correlation over the vector? How would I best do that?

1 Answer
NumPy does not support such a data type yet (at least on mainstream architectures). Only float16, float32, float64 and the non-standard native extended double (generally 80 bits) are supported; put shortly, only floating-point types natively supported by the target architecture. If the target machine supports 128-bit floating-point numbers, you could try the numpy.longdouble type, but I do not expect this to be the case. In practice, neither x86 nor ARM processors support it yet. IBM processors like POWER9 support it natively, but I am not sure they (fully) support the IEEE 754R standard. For more information, please read this. Note that you could theoretically wrap binary data in NumPy types, but you would not be able to do anything (really) useful with it. The NumPy code can theoretically be extended with new types, but note that NumPy is written in C, not C++, so adding `std::decimal::decimal128` to the source code will not be easy.
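As a quick check of what numpy.longdouble actually maps to on a given machine (a minimal sketch; the reported bit width and precision depend entirely on the platform):

```python
import numpy as np

# On x86-64 numpy.longdouble is typically the 80-bit x87 extended format
# padded to 128 bits (~18 decimal digits), not true IEEE quad precision
# and certainly not a decimal format.
info = np.finfo(np.longdouble)
print(np.longdouble, info.bits, info.precision)
```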
Note that if you really want to wrap such a type in a NumPy array without changing/rebuilding the NumPy code, you could wrap your type in a pure-Python class. However, be aware that the performance will be very bad, since using pure-Python objects prevents all the optimizations done in NumPy (e.g. SIMD vectorization, use of fast native code, algorithms specialized for a given type, etc.).
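A minimal sketch of that approach, using Python's standard `decimal.Decimal` as the per-element object (an assumption standing in for a real decimal128 wrapper; the values are made up):

```python
import numpy as np
from decimal import Decimal

# Hold the values as Python Decimal objects in an object-dtype array.
# Arithmetic stays exact in base 10, but every element goes through slow
# per-object Python calls instead of SIMD-vectorized native loops.
values = np.array([Decimal("10.10"), Decimal("10.30"), Decimal("10.20")],
                  dtype=object)

mean = values.sum() / len(values)
variance = sum((x - mean) ** 2 for x in values) / len(values)
print(mean, variance)
```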

- Mmh, ok. FWIW, I'm not talking about (double-double) binary floating points like `float128` (base 2), but about decimal floating points (base 10). Both GCC and LLVM support that at the compiler level (on all archs!), C++ and Rust support it at the language level, and MongoDB and Apache Arrow support it at the data level. At the CPU machine-instruction level, the only machines I know that support it are the IBM z Series and Digital VAX. How do people process financial/money values in NumPy then? – oberstet Feb 02 '22 at 19:17
- IOW, e.g.: how do I store a vector of money values and compute its sum in NumPy correctly? I also found this, not sure, it is all very confusing: https://stackoverflow.com/questions/9062562/what-is-the-internal-precision-of-numpy-float128 – oberstet Feb 02 '22 at 19:20
- Ok. Well, NumPy does not really target financial systems; it is meant to be a [numerical computing tool for scientists](https://numpy.org/) (it was even designed mainly for physicists in the first place). People in these fields do not need decimal-based numbers but native floating-point for relatively high-performance applications. Thus, I am not sure NumPy is the right tool for you. – Jérôme Richard Feb 03 '22 at 00:55
- Note that the answer still applies: neither decimal-based types nor large floating-point ones are supported. You can see that directly in the [NumPy code](https://github.com/numpy/numpy/blob/main/numpy/typing/_dtype_like.py). Thus, your only solution so far is to use a pure-Python object type wrapped in NumPy arrays. Note that there is a `Decimal` type in Python in the standard module `decimal`. I advise you to use Cython to wrap your vector of type `std::decimal::decimal128` in a dedicated module if you want fast code. This may require a significant development effort, though, depending on your needs. – Jérôme Richard Feb 03 '22 at 01:09
- Thanks for your hints and tips! I guess you are right. I won't (obviously) undertake such a huge effort (like trying to add it as a scalar type), but will try to use either plain `float64`, or use counting and multiply with `Decimal` (on the count results). And yeah, because of SIMD: either plain NumPy or Numba, and the latter doesn't support it either: https://numba.readthedocs.io/en/stable/reference/types.html#basic-types – oberstet Feb 03 '22 at 01:23
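For reference, a minimal sketch of the counting approach mentioned in the last comment, assuming money is kept as scaled 64-bit integers (the scale factor and values below are made up for illustration):

```python
import numpy as np
from decimal import Decimal

# Store money as scaled integers (here: 1 unit = 1/10000 of a currency
# unit), run the fast native integer arithmetic in NumPy, and only convert
# the final result back to Decimal. Beware of int64 overflow on very large
# sums.
SCALE = Decimal("0.0001")
amounts = np.array([101_000, 103_000, 102_000], dtype=np.int64)  # 10.10, 10.30, 10.20

total = Decimal(int(amounts.sum())) * SCALE
mean = Decimal(int(amounts.sum())) / len(amounts) * SCALE
print(total, mean)
```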