Convert Numpy array of floats to ints proportionately (balancing chemical equation)

Question

I have code that balances chemical equations. The only problem is that I want to convert the final solution, i.e. a 1D NumPy array of floats, to integers. Obviously I cannot simply round to the nearest integers, since that would break the balance. One way is to multiply the array by a number that turns all the floats into integers (the type does not matter). See the example below.

>>> coeffs=equation_balancer(reactants=["H2","O2"], products=["H2O"])
>>> coeffs
{'H2': 1.0, 'O2': 0.5, 'H2O1': 1.0}
>>> import numpy as np
>>> np.asarray([i for i in coeffs.values()])
array([1. , 0.5, 1. ])

If the final array is multiplied by 2, the fractions (floats) disappear.

PS: for the example above I converted the result back to a NumPy array, since equation_balancer uses scipy.linalg.solve to balance the equation.

>>> np.asarray([i for i in coeffs.values()]) * 2
array([2., 1., 2.])

How do I find the number that, when multiplied with the array, gives an integer-valued array? The actual dtype of the array does not matter.

One way would be to multiply the array by a power of 10 large enough to clear the decimals, and then divide by the greatest common divisor:

>>> c = np.asarray([i for i in coeffs.values()]) * 10
>>> factor = np.gcd.reduce(c.astype(int))
>>> factor
5
>>> c/factor
array([2., 1., 2.])

In the above approach, finding the power 10**n, which is determined by the largest number of decimal places, is crucial, and I don't know how to code that at the moment. Is there another approach that would be more suitable? Any help is appreciated.

fountainhead · Accepted Answer · 2021-02-02T05:03:00.410

This seems to work:

(Credit to this SO answer on how to convert a floating-point number into a tuple of "minimal" integer numerator and integer denominator, rather than some freakishly large numerator and denominator.)

import numpy as np
from fractions import Fraction

# A configurable parameter.
# Keep this small to avoid freakishly large results.
# Increase it only in rare cases where the coeffs
# span a "huge" scale.
MAX_DENOM = 100

fractions = [Fraction(val).limit_denominator(MAX_DENOM)
             for val in coeffs.values()]
ratios = np.array([(f.numerator, f.denominator) for f in fractions])
# As an alternative to the above two statements, uncomment and use
# below statement for Python 3.8+
# ratios = np.array([Fraction(val).limit_denominator(MAX_DENOM).as_integer_ratio()
#                    for val in coeffs.values()])

factor = np.lcm.reduce(ratios[:,1])
result = [round(v * factor) for v in coeffs.values()]

# print
result

Output for coeffs = {"H2": 1.0, "O2": 0.5, 'H2O1': 1.0}:

[2, 1, 2]

Output for coeffs = {"H2": 0.5, "N2":0.5, "O2": 1.5, "H1N1O3":1.0}:

[1, 1, 3, 2]

Output for coeffs = {"H2": 1.0, "O3": (1/3), "H2O1":1.0}:

[3, 1, 3]

Output for coeffs = {"H4": 0.5, "O7": (1/7), "H2O1":1.0}:

[7, 2, 14]

Output for coeffs = {"H2": .1, "O2": 0.05, 'H2O1': .1}:

[2, 1, 2]
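
For reuse, the same idea can be wrapped in a small function that needs only the standard library. This is just a sketch: the name `to_integer_coeffs` and the `max_denom` argument are illustrative, not part of the answer above, and the final division by a shared factor is an extra step the answer does not perform. It builds the integers from the fractions directly instead of rounding `v * factor`.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def to_integer_coeffs(values, max_denom=100):
    """Scale float coefficients to the smallest proportional integers.

    Hypothetical helper; `max_denom` plays the role of MAX_DENOM above.
    """
    fracs = [Fraction(v).limit_denominator(max_denom) for v in values]
    # Least common multiple of all denominators, written with gcd so it
    # also runs on Python versions older than 3.9 (no math.lcm there).
    common = reduce(lambda a, b: a * b // gcd(a, b),
                    (f.denominator for f in fracs), 1)
    ints = [f.numerator * (common // f.denominator) for f in fracs]
    # Extra step (an assumption about what is wanted): divide out any
    # shared factor, so that inputs such as [2.0, 4.0] become [1, 2].
    shared = reduce(gcd, ints, 0)
    return [i // shared for i in ints] if shared else ints


coeffs = {"H2": 1.0, "O2": 0.5, "H2O1": 1.0}
print(to_integer_coeffs(coeffs.values()))  # [2, 1, 2]
```

Working with exact integers here avoids the `round` call, so the result does not depend on floating-point error in `v * factor`.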

nice answer +1, never heard of the `fractions` module before, but where did you find `H4` or `O7`, certainly not on this Earth :) — Stef, Feb 01 '21 at 14:37

Maxwell Redacted · Answer 2 · 2021-02-01T14:03:50.870

I am not entirely happy with my solution, but it seems to work all right; let me know what you think. I essentially convert each float to a string and count the number of characters after the decimal point. It will work as long as the values are always floats.

import numpy as np

coeffs = {"H2": .1, "O2": 0.05, 'H2O1': .1}

n = max([len(str(i).split('.')[1]) for i in coeffs.values()])

c=np.array([i for i in coeffs.values()])*10**n
factor = np.gcd.reduce(c.astype(np.uint64))

print((c/factor).astype(np.uint64))

Source and other solutions: Easy way of finding decimal places
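
One caveat with counting characters after the '.': very small floats print in scientific notation, so the string has no '.' to split on. A possible workaround, sketched here with the standard-library `decimal` module, reads the exponent of the printed value instead. The helper name `decimal_places` is made up for this example, and it assumes finite inputs.

```python
from decimal import Decimal


def decimal_places(x):
    """Count digits after the decimal point in repr(x) (hypothetical helper).

    Assumes x is a finite float; nan and inf are not handled.
    """
    exponent = Decimal(repr(x)).as_tuple().exponent
    return max(0, -exponent)


# The split-based count fails when the float prints as '1e-05'.
try:
    len(str(1e-05).split('.')[1])
except IndexError:
    print("no '.' in", str(1e-05))

print(decimal_places(0.05))   # 2
print(decimal_places(1e-05))  # 5

# Drop-in use for the power of ten in the code above.
coeffs = {"H2": .1, "O2": 0.05, "H2O1": .1}
n = max(decimal_places(v) for v in coeffs.values())
print(n)  # 2
```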

Testing: running some potentially difficult cases and converting back

primes = [3,5,7,11,13,17,19,23,29,79] ## some prime numbers 

primes_over_1 = [1/i for i in primes]

for i in range(1, len(primes_over_1) - 1):
  coeffs = {"H2": primes_over_1[i-1], "O2": primes_over_1[i], 'H2O1': primes_over_1[i+1]}

  print('coefs: ', [a for a in coeffs.values()])

  n = max([len(str(a).split('.')[1]) for a in coeffs.values()])

  c=np.array([a for a in coeffs.values()])*10**n
  factor = np.gcd.reduce(c.astype(np.uint64))

  coeffs_asInt = (c/factor).astype(np.uint64)

  print('as int:', coeffs_asInt)

  coeffs_back = coeffs_asInt.astype(np.float64)*(factor/10**n)

  coeffs_back_str = ["{0:.16g}".format(a) for a in coeffs_back] 
  print('back:  ', coeffs_back_str)

  print('########################################################\n')

output:

coefs:  [0.3333333333333333, 0.2, 0.14285714285714285]
as int: [8333333333333333 5000000000000000 3571428571428571]
back:   ['0.3333333333333334', '0.2', '0.1428571428571428']
########################################################

coefs:  [0.2, 0.14285714285714285, 0.09090909090909091]
as int: [5000000000000000 3571428571428571 2272727272727273]
back:   ['0.2', '0.1428571428571428', '0.09090909090909093']
########################################################

coefs:  [0.14285714285714285, 0.09090909090909091, 0.07692307692307693]
as int: [14285714285714284  9090909090909092  7692307692307693]
back:   ['0.1428571428571428', '0.09090909090909093', '0.07692307692307694']
########################################################

coefs:  [0.09090909090909091, 0.07692307692307693, 0.058823529411764705]
as int: [2840909090909091 2403846153846154 1838235294117647]
back:   ['0.09090909090909091', '0.07692307692307693', '0.05882352941176471']
########################################################

coefs:  [0.07692307692307693, 0.058823529411764705, 0.05263157894736842]
as int: [2403846153846154 1838235294117647 1644736842105263]
back:   ['0.07692307692307693', '0.05882352941176471', '0.05263157894736842']
########################################################

coefs:  [0.058823529411764705, 0.05263157894736842, 0.043478260869565216]
as int: [1838235294117647 1644736842105263 1358695652173913]
back:   ['0.05882352941176471', '0.05263157894736842', '0.04347826086956522']
########################################################

coefs:  [0.05263157894736842, 0.043478260869565216, 0.034482758620689655]
as int: [6578947368421052 5434782608695652 4310344827586207]
back:   ['0.05263157894736842', '0.04347826086956522', '0.03448275862068966']
########################################################

coefs:  [0.043478260869565216, 0.034482758620689655, 0.012658227848101266]
as int: [21739130434782608 17241379310344828  6329113924050633]
back:   ['0.04347826086956522', '0.03448275862068966', '0.01265822784810127']
########################################################
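
The 16-digit integers in this output come from the printed digits of floats such as 1/3, which are not exact thirds. As a small illustration using the standard-library `fractions` module (the bound of 100 is only an example value), the exact ratio stored in the float has a huge denominator, while `limit_denominator` recovers the small one:

```python
from fractions import Fraction

x = 1 / 3

# The exact binary value held by the float: a ratio with a power-of-two
# denominator, not 1/3.
exact = Fraction(x)
print(exact.denominator == 2**54)  # True

# The closest fraction whose denominator does not exceed 100.
small = exact.limit_denominator(100)
print(small)  # 1/3
```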

did you try your solution for a trivial case like `coeffs = {"H2": 1/3, "O2": 1/3, 'H2O1': 1/3}`? (which should give 1:1:1) — Stef, Feb 01 '21 at 11:35
Does that case exist? EDIT: I see what you mean, I interpreted you meant 1/3 as a string which would mean there was no '.' — Maxwell Redacted, Feb 01 '21 at 11:38
I ran my code with coeffs = {"H2": 1/3, "O2": 1/3, 'H2O1': 1/3} and it does produce [1,1,1], but that experiment does not prove it will work for all cases.... — Maxwell Redacted, Feb 01 '21 at 11:47
no, I meant the case when `equation_balancer` returns `1/3` for each component which will be **printed** as `0.3333333333333333`. I get `[-1552204.2910258 -1552204.2910258 -1552204.2910258]` with your code. — Stef, Feb 01 '21 at 11:53
I am getting [1,1,1] when I run it with coeffs = {"H2": 1/3, "O2": 1/3, 'H2O1': 1/3} — Maxwell Redacted, Feb 01 '21 at 12:35
going for a run, back in 30 mins, take a look at my edit Stef — Maxwell Redacted, Feb 01 '21 at 12:46
I have been using Google Colab, which has Python 3.6.9 and numpy 1.19.5. I have just tested on my own machine, which has Python 3.8.6 and numpy 1.19.4, and was able to recreate the error you experienced, so it is affected by the Python version. I will diagnose. — Maxwell Redacted, Feb 01 '21 at 13:38
datatypes! I theorise we may both have python 32 bit installed - since it is the default when you download, therefore int is 32 bit, so therefore we need to specify np.uint64 instead of int (np.int32 also broken) - see edit — Maxwell Redacted, Feb 01 '21 at 14:02
it works with `np.gcd.reduce(c.astype(np.int64))` and it's clear to me now. `int` in `astype(int)` is equivalent to `np.int_`. This is `np.int32` on Windows (even for 64 bit) or `np.int64` for Linux, see [here](https://stackoverflow.com/a/36279321/3944322). (`c` was about 3e+15 which of course didn't fit into 32 bits), So +1 for all your time and effort :) — Stef, Feb 01 '21 at 14:28
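
The integer-width issue from these comments can be checked without relying on platform defaults. A minimal sketch, assuming a scaled value of roughly 3e15 as mentioned in the thread:

```python
import numpy as np

# 1/3 scaled by 10**16, about 3.3e15 as in the comment thread.
c = np.array([1 / 3]) * 10**16

# A value this large does not fit into a 32-bit integer ...
print(c[0] > np.iinfo(np.int32).max)  # True
# ... but is well within the 64-bit range.
print(c[0] < np.iinfo(np.int64).max)  # True

# Naming the width explicitly avoids depending on what `int` maps to.
as_int = c.astype(np.int64)
print(as_int.dtype)  # int64
```

Passing `np.int64` rather than `int` to `astype` keeps the behaviour the same across operating systems.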
