
I know that bit shifts are used for multiplication in fixed-point math. For example, to multiply two float values I should first multiply each of them by a scale factor (for example, 2^20 in this case), then multiply the scaled values as integers, and finally divide by the scale factor again to return to the normal representation of the numbers. How do I perform that with bit shift operations?

Based on this article: 5.4 Fixed-point arithmetic

I have tried the code example below. I expected floatResShift and floatResNormal to be the same, but they are different. What am I doing wrong?

        float mul1 = 18.579434f;
        float mul2 = 34.307951f;

        int shiftMul1 = (int)((2 ^ 32) * mul1);
        int shiftMul2 = (int)((2 ^ 32) * mul2);

        var resultMul = shiftMul1 * shiftMul2;
        float floatResShift = resultMul >> 32; // wrong value
        float floatResNormal = mul1 * mul2; //expected value

UPDATE:

Fixed-point arithmetic explanation (quoted from the article):

Using fixed-point arithmetic to calculate the result of A · B when A = 2.5 and B = 8.4 using 32-bit integers would involve the following operations:

Decide upon a scaling factor. This depends largely upon what kind of numbers are likely to be seen. As the numbers in this example are so low, it is less important, and 16 fractional bits (bits to the right of the radix point) are acceptable. The scaling factor will then be f = 2^16 = 65536. This format is known as Q15.16 (15 bits to the left of the radix point, 16 to the right and one bit for a sign).

Scale numbers with the scaling factor. In binary arithmetic this can be accomplished using bit shifts, but for simplicity we will use multiplication by the scaling factor. Ai = A · f = 2.5 · 65536 = 163840 and B · f = 8.4 · 65536 = 550502.4, which is then truncated to turn it into an integer, so Bi = 550502.

Multiply Ai and Bi using normal integer multiplication. Ri = Ai · Bi = 163840 · 550502 = 90194247680. The reason for such a large number is that both Ai and Bi were scaled into our Q15.16 format, so the number that results from the multiplication is essentially (A · f) · (B · f) = A · B · f^2.

In order to bring our result back into the Q15.16 format, the result must thus be divided by the scaling factor. This too can be done using bit shift arithmetic, but for simplicity's sake division is used here. Ri / f = 90194247680 / 65536 = 1376255, which is our result in Q15.16 format.

To turn the number back into a normal real number, one only needs to cast it into the format desired and divide by the scaling factor again, so: 1376255.0 / 65536.0 = 20.999985, which is near the expected number 21.
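The steps of the quoted explanation can be sketched in C# with shifts in place of the divisions. This is only a minimal sketch for the Q15.16 case; the class and method names (`Q16Example`, `ToFixed`, `Multiply`, `ToDouble`) are made up for illustration and do not come from the article.

```csharp
using System;

// Hypothetical helper illustrating the Q15.16 steps from the quoted article.
static class Q16Example
{
    public const int FractionalBits = 16; // Q15.16: scale factor f = 2^16 = 65536

    // Real -> fixed: multiply by 2^16 and truncate to an integer.
    public static int ToFixed(double value)
    {
        return (int)(value * (1 << FractionalBits));
    }

    // Fixed * fixed: widen to 64 bits so the product (scaled by f^2) does not
    // overflow, then shift right by 16 to divide by f once.
    public static int Multiply(int a, int b)
    {
        long product = (long)a * b;
        return (int)(product >> FractionalBits);
    }

    // Fixed -> real: divide by the scale factor one more time.
    public static double ToDouble(int fixedValue)
    {
        return fixedValue / (double)(1 << FractionalBits);
    }
}
```

With A = 2.5 and B = 8.4, `Q16Example.Multiply(Q16Example.ToFixed(2.5), Q16Example.ToFixed(8.4))` gives 1376255, the same Q15.16 value as in the explanation, and `ToDouble` turns it back into roughly 20.999985.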

How can I do the same as in the explanation above, but with bit shifts, and with float values that have many digits after the decimal point?

I have tried with the code above, but with no luck.

For example, I need to multiply the two values 18.579434f and 34.307951f using fixed-point arithmetic.

UPDATE:

I have tried this with a smaller scale factor, but with no luck.

SOLUTION:

Maybe I didn't explain the question clearly, but I fixed the problem and found a solution.

Thanks to all; the question is closed. Here is the complete code with fixed-point multiplication:

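    float mul1 = 18.579434f;
    float mul2 = 34.307951f;

    const int fractionalBits = 20;
    int scaleFactor = 1 << fractionalBits; // 2^20

    // Scale both operands into fixed point; long keeps the product from overflowing.
    long shiftMul1 = (long)(scaleFactor * mul1);
    long shiftMul2 = (long)(scaleFactor * mul2);

    // The raw product is scaled by 2^40; shifting right by 20 brings it back to 2^20.
    // (Shifting by 40 in one step would keep only the integer part.)
    long resultMul = (shiftMul1 * shiftMul2) >> fractionalBits;
    float floatResShift = (float)resultMul / scaleFactor; // back to a real number, fraction included
    float floatResNormal = mul1 * mul2; // floatResShift is almost the same as floatResNormal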
testCoder
  • Look at the intermediate values. –  Dec 24 '12 at 18:15
  • What is wrong with the intermediate values? – testCoder Dec 24 '12 at 18:20
  • I have updated it! Look at comments in last two lines. – testCoder Dec 24 '12 at 18:22
  • My notes: `int` is "too small" given the [intended?] factor, `2 ^ 32` does not mean "to the power of", and try expressing `>> 32` in terms of a division. –  Dec 24 '12 at 18:23
  • you multiply twice by 2^32... and as pst says, it should be `1L << 32` – edeboursetty Dec 24 '12 at 18:23
  • I don't get any overflow exception, so I thought it was fine. I have replaced int with long, but the result still does not equal `floatResNormal` – testCoder Dec 24 '12 at 18:25
  • what i am saying is `mul1*(2^32)*mul2*(2^32) = mul1*mul2*(2^64)` – edeboursetty Dec 24 '12 at 18:26
  • @d--b How do I do the multiplication with shift operations? – testCoder Dec 24 '12 at 18:30
  • Assuming you fix these problems and make the code do what you seem to have in mind, you'd be using 0.32 fixpoint on numbers that are outside that range. Also, why? A floating point multiplication is only a tiny bit slower than an integer multiplication, all the other stuff you're doing immediately destroys the advantage. – harold Dec 24 '12 at 18:30
  • The scale factor doesn't have to be 32; that was just an example. It would be nice to get code that works with any scale factor. Can someone explain how to accomplish that? – testCoder Dec 24 '12 at 18:33
  • @testCoder: to be honest, I have no clue what you're trying to do – edeboursetty Dec 24 '12 at 18:33
  • @testCoder If you fix the issue(s) listed in my comment and note d--b's comment then .. it works (but is really just an exercise in math). Try with a smaller integral literal scale factor, say, of 10 first to avoid issues with overflow, improper shifting, and the need to force wider types. –  Dec 24 '12 at 18:34
  • Have you looked at the [`decimal`](http://msdn.microsoft.com/en-us/library/364x0z75(v=vs.100).aspx) data type? What you are trying to do with `float` values is not bit shifting, but adjusting the exponent. – HABO Dec 24 '12 at 18:56
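
To make the points from the comments above concrete, here is a small sketch of how the three expressions behave in C#. The class and method names are hypothetical and exist only for this illustration.

```csharp
using System;

// Hypothetical demo of the pitfalls mentioned in the comments.
static class ShiftPitfalls
{
    // In C#, ^ is bitwise XOR, not exponentiation: 2 ^ 32 evaluates to 34.
    public static int XorNotPower()
    {
        return 2 ^ 32;
    }

    // For an int operand only the low 5 bits of the shift count are used,
    // so shifting by 32 behaves like shifting by 0.
    public static int IntShift(int count)
    {
        return 1 << count;
    }

    // A long operand uses the low 6 bits of the count, so 1L << 32 is 2^32.
    public static long LongShift(int count)
    {
        return 1L << count;
    }
}
```

So `ShiftPitfalls.XorNotPower()` is 34, `ShiftPitfalls.IntShift(32)` is 1, and `ShiftPitfalls.LongShift(32)` is 4294967296. Even with a correct 2^32 scale factor, the product of two scaled values carries a factor of 2^64, which no longer fits in a `long`, so a smaller number of fractional bits is needed.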

2 Answers

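    var k = 20;        // number of fractional bits
    var k_2 = k/2;     // half of them, used to pre-shift each operand
    var p = 1 << k;    // scale factor 2^20

    float mul1 = 18.579434f;
    float mul2 = 34.307951f;

    int shiftMul1 = (int)(p * mul1);
    int shiftMul2 = (int)(p * mul2);

    // fixed point multiplication: each operand is pre-shifted by k/2 so that the
    // product stays small enough for int here and is scaled by 2^k again;
    // the price is the low-order fractional bits of both operands
    var resultMul = ((shiftMul1 >> k_2) * (shiftMul2 >> k_2));

    float floatResShift = ((float)resultMul)/p;
    float floatResNormal = mul1 * mul2;

    Console.WriteLine("{0} {1}", floatResNormal, floatResShift);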

Output:

637,4223 637,4043
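
The gap between the two printed values comes from shifting each operand before the multiplication. A possible variant, sketched here under the assumption that a 64-bit intermediate is acceptable, multiplies first and shifts once afterwards. `WideMultiply` and its parameter names are made up for this example, and `k` has to be small enough that the product scaled by 2^(2k) still fits in a `long`.

```csharp
using System;

// Hypothetical variant: multiply in 64 bits, then shift once by k.
static class WideMultiply
{
    public static float Multiply(float a, float b, int k)
    {
        long scale = 1L << k;               // scale factor 2^k

        long fa = (long)((double)a * scale); // operands in fixed point
        long fb = (long)((double)b * scale);

        long product = fa * fb;              // scaled by 2^(2k)
        long fixedResult = product >> k;     // scaled by 2^k again

        return (float)((double)fixedResult / scale);
    }
}
```

Calling `WideMultiply.Multiply(18.579434f, 34.307951f, 20)` should land much closer to `mul1 * mul2` than the pre-shifted version, because no fractional bits are discarded before the multiplication.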
Serj-Tm

Check this code

/*
Fixed Point Arithmetic structure and relevant methods. Simple fixed point structure included as well.

Created from information and code gathered here: http://stackoverflow.com/questions/605124/fixed-point-math-in-c

May be used for anything without permission.

To quote the original author (x4000 of stackoverflow.com):
"The accuracy of these functions as they are coded here is more than enough for my purposes, but if you need more you can increase the SHIFT AMOUNT on FInt.
Just be aware that if you do so, the constants on [trigonomic] functions will then need to be divided by 4096 and then multiplied by whatever your new SHIFT AMOUNT requires.
You're likely to run into some bugs if you do that and aren't careful, so be sure to run checks against the built-in Math functions to make sure that your results aren't
being put off by incorrectly adjusting a constant."

Code credit: x4000 of stackoverflow.com

Compiled into a usable source file by: Paul Bergeron

Date: 7/1/2009

More fixed point functions can be found written in Java here: http://home.comcast.net/~ohommes/MathFP/

*/

using System;
using System.Drawing; // Point, used by FPoint at the end of this file

public struct FInt
{
    public long RawValue;
    public const int SHIFT_AMOUNT = 12; //12 is 4096

    public const long One = 1 << SHIFT_AMOUNT;
    public const int OneI = 1 << SHIFT_AMOUNT;
    public static FInt OneF = new FInt( 1, true );

    #region Constructors
    public FInt( long StartingRawValue, bool UseMultiple )
    {
        this.RawValue = StartingRawValue;
        if ( UseMultiple )
            this.RawValue = this.RawValue << SHIFT_AMOUNT;
    }
    public FInt( double DoubleValue )
    {
        DoubleValue *= (double)One;
        this.RawValue = (long)Math.Round( DoubleValue ); // RawValue is a long, so do not narrow to int
    }
    #endregion

    public int IntValue
    {
        get { return (int)( this.RawValue >> SHIFT_AMOUNT ); }
    }

    public int ToInt()
    {
        return (int)( this.RawValue >> SHIFT_AMOUNT );
    }

    public double ToDouble()
    {
        return (double)this.RawValue / (double)One;
    }

    public FInt Inverse
    {
        get { return new FInt( -this.RawValue, false ); }
    }

    #region FromParts
    /// <summary>
    /// Create a fixed-int number from parts.  For example, to create 1.5 pass in 1 and 500.
    /// </summary>
    /// <param name="PreDecimal">The number above the decimal.  For 1.5, this would be 1.</param>
    /// <param name="PostDecimal">The number below the decimal, to three digits.
    /// For 1.5, this would be 500. For 1.005, this would be 5.</param>
    /// <returns>A fixed-int representation of the number parts</returns>
    public static FInt FromParts( int PreDecimal, int PostDecimal )
    {
        FInt f = new FInt( PreDecimal );
        if ( PostDecimal != 0 )
            f.RawValue += ( new FInt( PostDecimal ) / 1000 ).RawValue;

        return f;
    }
    #endregion

    #region *
    public static FInt operator *( FInt one, FInt other )
    {
        return new FInt( ( one.RawValue * other.RawValue ) >> SHIFT_AMOUNT, false );
    }

    public static FInt operator *( FInt one, int multi )
    {
        return one * (FInt)multi;
    }

    public static FInt operator *( int multi, FInt one )
    {
        return one * (FInt)multi;
    }
    #endregion

    #region /
    public static FInt operator /( FInt one, FInt other )
    {
        return new FInt( ( one.RawValue << SHIFT_AMOUNT ) / ( other.RawValue  ), false );
    }

    public static FInt operator /( FInt one, int divisor )
    {
        return one / (FInt)divisor;
    }

    public static FInt operator /( int divisor, FInt one )
    {
        return (FInt)divisor / one;
    }
    #endregion

    #region %
    public static FInt operator %( FInt one, FInt other )
    {
        return new FInt( ( one.RawValue ) % ( other.RawValue ), false );
    }

    public static FInt operator %( FInt one, int divisor )
    {
        return one % (FInt)divisor;
    }

    public static FInt operator %( int divisor, FInt one )
    {
        return (FInt)divisor % one;
    }
    #endregion

    #region +
    public static FInt operator +( FInt one, FInt other )
    {
        return new FInt( one.RawValue + other.RawValue, false );
    }

    public static FInt operator +( FInt one, int other )
    {
        return one + (FInt)other;
    }

    public static FInt operator +( int other, FInt one )
    {
        return one + (FInt)other;
    }
    #endregion

    #region -
    public static FInt operator -( FInt one, FInt other )
    {
        return new FInt( one.RawValue - other.RawValue, false );
    }

    public static FInt operator -( FInt one, int other )
    {
        return one - (FInt)other;
    }

    public static FInt operator -( int other, FInt one )
    {
        return (FInt)other - one;
    }
    #endregion

    #region ==
    public static bool operator ==( FInt one, FInt other )
    {
        return one.RawValue == other.RawValue;
    }

    public static bool operator ==( FInt one, int other )
    {
        return one == (FInt)other;
    }

    public static bool operator ==( int other, FInt one )
    {
        return (FInt)other == one;
    }
    #endregion

    #region !=
    public static bool operator !=( FInt one, FInt other )
    {
        return one.RawValue != other.RawValue;
    }

    public static bool operator !=( FInt one, int other )
    {
        return one != (FInt)other;
    }

    public static bool operator !=( int other, FInt one )
    {
        return (FInt)other != one;
    }
    #endregion

    #region >=
    public static bool operator >=( FInt one, FInt other )
    {
        return one.RawValue >= other.RawValue;
    }

    public static bool operator >=( FInt one, int other )
    {
        return one >= (FInt)other;
    }

    public static bool operator >=( int other, FInt one )
    {
        return (FInt)other >= one;
    }
    #endregion

    #region <=
    public static bool operator <=( FInt one, FInt other )
    {
        return one.RawValue <= other.RawValue;
    }

    public static bool operator <=( FInt one, int other )
    {
        return one <= (FInt)other;
    }

    public static bool operator <=( int other, FInt one )
    {
        return (FInt)other <= one;
    }
    #endregion

    #region >
    public static bool operator >( FInt one, FInt other )
    {
        return one.RawValue > other.RawValue;
    }

    public static bool operator >( FInt one, int other )
    {
        return one > (FInt)other;
    }

    public static bool operator >( int other, FInt one )
    {
        return (FInt)other > one;
    }
    #endregion

    #region <
    public static bool operator <( FInt one, FInt other )
    {
        return one.RawValue < other.RawValue;
    }

    public static bool operator <( FInt one, int other )
    {
        return one < (FInt)other;
    }

    public static bool operator <( int other, FInt one )
    {
        return (FInt)other < one;
    }
    #endregion

    public static explicit operator int( FInt src )
    {
        return (int)( src.RawValue >> SHIFT_AMOUNT );
    }

    public static explicit operator FInt( int src )
    {
        return new FInt( src, true );
    }

    public static explicit operator FInt( long src )
    {
        return new FInt( src, true );
    }

    public static explicit operator FInt( ulong src )
    {
        return new FInt( (long)src, true );
    }

    public static FInt operator <<( FInt one, int Amount )
    {
        return new FInt( one.RawValue << Amount, false );
    }

    public static FInt operator >>( FInt one, int Amount )
    {
        return new FInt( one.RawValue >> Amount, false );
    }

    public override bool Equals( object obj )
    {
        if ( obj is FInt )
            return ( (FInt)obj ).RawValue == this.RawValue;
        else
            return false;
    }

    public override int GetHashCode()
    {
        return RawValue.GetHashCode();
    }

    public override string ToString()
    {
        return this.RawValue.ToString();
    }

    #region PI, DoublePI
    public static FInt PI = new FInt( 12868, false ); //PI x 2^12
    public static FInt TwoPIF = PI * 2; //radian equivalent of 360 degrees
    public static FInt PIOver180F = PI / (FInt)180; //PI / 180
    #endregion

    #region Sqrt
    public static FInt Sqrt( FInt f, int NumberOfIterations )
    {
        if ( f.RawValue < 0 ) //NaN in Math.Sqrt
            throw new ArithmeticException( "Input Error" );
        if ( f.RawValue == 0 )
            return (FInt)0;
        FInt k = f + FInt.OneF >> 1;
        for ( int i = 0; i < NumberOfIterations; i++ )
            k = ( k + ( f / k ) ) >> 1;

        if ( k.RawValue < 0 )
            throw new ArithmeticException( "Overflow" );
        else
            return k;
    }

    public static FInt Sqrt( FInt f )
    {
        byte numberOfIterations = 8;
        if ( f.RawValue > 0x64000 )
            numberOfIterations = 12;
        if ( f.RawValue > 0x3e8000 )
            numberOfIterations = 16;
        return Sqrt( f, numberOfIterations );
    }
    #endregion

    #region Sin
    public static FInt Sin( FInt i )
    {
        FInt j = (FInt)0;
        for ( ; i < 0; i += new FInt( 25736, false ) ) ;
        if ( i > new FInt( 25736, false ) )
            i %= new FInt( 25736, false );
        FInt k = ( i * new FInt( 10, false ) ) / new FInt( 714, false );
        if ( i != 0 && i != new FInt( 6434, false ) && i != new FInt( 12868, false ) &&
            i != new FInt( 19302, false ) && i != new FInt( 25736, false ) )
            j = ( i * new FInt( 100, false ) ) / new FInt( 714, false ) - k * new FInt( 10, false );
        if ( k <= new FInt( 90, false ) )
            return sin_lookup( k, j );
        if ( k <= new FInt( 180, false ) )
            return sin_lookup( new FInt( 180, false ) - k, j );
        if ( k <= new FInt( 270, false ) )
            return sin_lookup( k - new FInt( 180, false ), j ).Inverse;
        else
            return sin_lookup( new FInt( 360, false ) - k, j ).Inverse;
    }

    private static FInt sin_lookup( FInt i, FInt j )
    {
        if ( j > 0 && j < new FInt( 10, false ) && i < new FInt( 90, false ) )
            return new FInt( SIN_TABLE[i.RawValue], false ) +
                ( ( new FInt( SIN_TABLE[i.RawValue + 1], false ) - new FInt( SIN_TABLE[i.RawValue], false ) ) /
                new FInt( 10, false ) ) * j;
        else
            return new FInt( SIN_TABLE[i.RawValue], false );
    }

    private static int[] SIN_TABLE = {
        0, 71, 142, 214, 285, 357, 428, 499, 570, 641,
        711, 781, 851, 921, 990, 1060, 1128, 1197, 1265, 1333,
        1400, 1468, 1534, 1600, 1665, 1730, 1795, 1859, 1922, 1985,
        2048, 2109, 2170, 2230, 2290, 2349, 2407, 2464, 2521, 2577,
        2632, 2686, 2740, 2793, 2845, 2896, 2946, 2995, 3043, 3091,
        3137, 3183, 3227, 3271, 3313, 3355, 3395, 3434, 3473, 3510,
        3547, 3582, 3616, 3649, 3681, 3712, 3741, 3770, 3797, 3823,
        3849, 3872, 3895, 3917, 3937, 3956, 3974, 3991, 4006, 4020,
        4033, 4045, 4056, 4065, 4073, 4080, 4086, 4090, 4093, 4095,
        4096
    };
    #endregion

    private static FInt mul( FInt F1, FInt F2 )
    {
        return F1 * F2;
    }

    #region Cos, Tan, Asin
    public static FInt Cos( FInt i )
    {
        return Sin( i + new FInt( 6435, false ) );
    }

    public static FInt Tan( FInt i )
    {
        return Sin( i ) / Cos( i );
    }

    public static FInt Asin( FInt F )
    {
        bool isNegative = F < 0;
        F = Abs( F );

        if ( F > FInt.OneF )
            throw new ArithmeticException( "Bad Asin Input:" + F.ToDouble() );

        FInt f1 = mul( mul( mul( mul( new FInt( 145103 >> FInt.SHIFT_AMOUNT, false ), F ) -
            new FInt( 599880 >> FInt.SHIFT_AMOUNT, false ), F ) +
            new FInt( 1420468 >> FInt.SHIFT_AMOUNT, false ), F ) -
            new FInt( 3592413 >> FInt.SHIFT_AMOUNT, false ), F ) +
            new FInt( 26353447 >> FInt.SHIFT_AMOUNT, false );
        FInt f2 = PI / new FInt( 2, true ) - ( Sqrt( FInt.OneF - F ) * f1 );

        return isNegative ? f2.Inverse : f2;
    }
    #endregion

    #region ATan, ATan2
    public static FInt Atan( FInt F )
    {
        return Asin( F / Sqrt( FInt.OneF + ( F * F ) ) );
    }

    public static FInt Atan2( FInt F1, FInt F2 )
    {
        if ( F2.RawValue == 0 && F1.RawValue == 0 )
            return (FInt)0;

        FInt result = (FInt)0;
        if ( F2 > 0 )
            result = Atan( F1 / F2 );
        else if ( F2 < 0 )
        {
            if ( F1 >= 0 )
                result = ( PI - Atan( Abs( F1 / F2 ) ) );
            else
                result = ( PI - Atan( Abs( F1 / F2 ) ) ).Inverse;
        }
        else
            result = ( F1 >= 0 ? PI : PI.Inverse ) / new FInt( 2, true );

        return result;
    }
    #endregion

    #region Abs
    public static FInt Abs( FInt F )
    {
        if ( F < 0 )
            return F.Inverse;
        else
            return F;
    }
    #endregion

}

public struct FPoint
{
    public FInt X;
    public FInt Y;

    public FPoint( FInt X, FInt Y )
    {
        this.X = X;
        this.Y = Y;
    }

    public static FPoint FromPoint( Point p )
    {
        FPoint f = new FPoint();
        f.X = (FInt)p.X;
        f.Y = (FInt)p.Y;
        return f;
    }

    public static Point ToPoint( FPoint f )
    {
        return new Point( f.X.IntValue, f.Y.IntValue );
    }
}
edgarmtze
  • Thanks for the code, but how do I use that class, and why is it so huge? The article I quoted says: `In binary arithmetic this can be accomplished using bit shifts`. Is there a shorter solution? – testCoder Dec 24 '12 at 18:54
  • Alright, that answer was meant to cover general fixed-point arithmetic; let me write a summary – edgarmtze Dec 24 '12 at 18:56
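
As a rough idea of what a shorter version could look like, here is a trimmed-down sketch that keeps only conversion and multiplication. `Fixed12` is a hypothetical type written for this example and is not part of the `FInt` source above; it uses the same 12 fractional bits and the same multiply-then-shift idea as `FInt`'s `operator *`.

```csharp
using System;

// Hypothetical minimal fixed-point type with 12 fractional bits.
public struct Fixed12
{
    public const int Shift = 12;      // 2^12 = 4096
    public readonly long Raw;         // value scaled by 2^12

    private Fixed12(long raw)
    {
        Raw = raw;
    }

    // Real -> fixed: scale by 2^12 and round to the nearest integer.
    public static Fixed12 FromDouble(double value)
    {
        return new Fixed12((long)Math.Round(value * (1L << Shift)));
    }

    // Fixed -> real: divide by the scale factor.
    public double ToDouble()
    {
        return Raw / (double)(1L << Shift);
    }

    // The raw product is scaled by 2^24, so one right shift by 12 restores the format.
    public static Fixed12 operator *(Fixed12 a, Fixed12 b)
    {
        return new Fixed12((a.Raw * b.Raw) >> Shift);
    }
}
```

Usage would be along the lines of `(Fixed12.FromDouble(2.5) * Fixed12.FromDouble(8.4)).ToDouble()`, which comes out close to 21. With only 12 fractional bits the result is less precise than with 16 or 20.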