11

I have a Qt C++ application where there is a GUI thread in which some floating point calculation happens. It also opens QWebView where there is a flash player with some video.

Closing the QWebView evidently interferes with the next floating-point operation, so pow(double, double) returns a definite but incorrect value.

In one case it returned a value 1000 times larger than the correct one. Another time it returned 1.#inf when called as pow(10.0, 2.0).

I have to mention that it is tested on different computers and is not specific to a particular CPU.

Do you have any suggestions on how to locate the place in WebKit that does something wrong with the floating-point coprocessor, and how to prevent it?

Sample (x64 only)

Environment: Qt 4.7.4, C++, HTML and flowplayer

cpp

wrongpow::wrongpow(QWidget *parent, Qt::WFlags flags)
: QMainWindow(parent, flags)
{
 QVBoxLayout* layout = new QVBoxLayout(0);  
 m_view = new QWebView(this);  
 m_view->setMinimumSize(400, 400);
 m_view->settings()->setAttribute(QWebSettings::PluginsEnabled, true);
 m_view->settings()->setAttribute(QWebSettings::LocalContentCanAccessRemoteUrls, true);
 layout->addWidget(m_view);
 QDir dir(QApplication::applicationDirPath());
 dir.cd("media");
 m_view->load(QUrl(QFileInfo(dir, "index.html").absoluteFilePath()));

 QPushButton* button = new QPushButton(QLatin1String("Click on video start"), this);
 layout->addWidget(button);
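 // Call connect() outside Q_ASSERT: the macro expands to nothing in release builds.
 bool connected = connect(button, SIGNAL(clicked()), this, SLOT(closeView()));
 Q_ASSERT(connected);
 Q_UNUSED(connected);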

 setLayout(layout);
 adjustSize();
}  
Q_SLOT void wrongpow::closeView()
{
 delete m_view;
 m_view = NULL;

 double wrongResult = pow(10.0, 2.0);
 Q_ASSERT(wrongResult == 100.0);
}

html

<div id='player' style='width:100%; height:100%;'>
    <object width='100%' height='100%' id='_494187117' name='_494187117' data='js/plugins/flowplayer-3.2.18.swf' type='application/x-shockwave-flash'>                      
        <param name='wmode' value='opaque'>         
        <param name='flashvars' value='config={&quot;clip&quot;:{&quot;url&quot;:&quot;mp4:vod/demo.flowplayer/buffalo_soldiers.mp4&quot;,&quot;scaling&quot;:&quot;fit&quot;,&quot;provider&quot;:&quot;hddn&quot;,&quot;live&quot;:true},&quot;plugins&quot;:{&quot;hddn&quot;:{&quot;url&quot;:&quot;js/plugins/flowplayer.rtmp-3.2.13.swf&quot;,&quot;netConnectionUrl&quot;:&quot;rtmp://r.demo.flowplayer.netdna-cdn.com/play&quot;}},&quot;canvas&quot;:{&quot;backgroundGradient&quot;:&quot;none&quot;}}'>
    </object>
</div>  

Here is a fully working program with sources: Download 15MB

Ezee
  • Please add code for reproducing the problem. How do you think does webkit interfere with pow()-function? – Sebastian Lange Jul 17 '14 at 06:31
  • @sebastian-lange It does interfere in some way. Look here: [Screenshot](http://i.imgur.com/FuquLxN.png) I found out that only the first call of pow returns wrong result. Also I'll do my best to make a working sample because there are thousands of code lines in the real application. – Ezee Jul 17 '14 at 07:33
  • @Ezee Your steps to reproduce are a screenshot of two lines of code in XCode? You completely misunderstand what these are for. We believe you. We are not asking to provide PROOF that it happens. We need to REPRODUCE the problem in order to give you the correct explanation. How are we supposed to go from a screenshot with two lines of codes to there? (Note that “we” is StackOverflow, not me in particular.) – Pascal Cuoq Jul 17 '14 at 07:36
  • Make a copy of your project, remove as much code as possible without losing the behaviour you have. Then when you have that, edit your question and add the full code of that example project. You might as well find the cause of the problem by doing that. – Tim Meyer Jul 17 '14 at 07:59
  • @pascal-cuoq Done. Reproduced the problem in a sample. Please let me know if I can add something else to help reproduce it. – Ezee Jul 17 '14 at 10:59
  • @Ezee This is a brilliant question and I will put a bounty on it if it is not answered in 48h. A general remark is that floating-point math is implemented inside a FPU that has a state (notably the current rounding mode). The state is changed in an imperative manner (to compute `2.0 + 3.0` rounded up, the sequence of instructions is “change to ‘round up’ mode, then compute `2.0 + 3.0`, then change rounding mode again, or not”). Some elementary floating-point functions (e.g. `sin` and `pow`) are implemented with subtle floating-point arithmetic that can go COMPLETELY wrong if the rounding-mode… – Pascal Cuoq Jul 17 '14 at 11:53
  • … has been left to a different value than usual by another part of the program (and this can be in a library or in a Framework such as webkit). This blog post shows an example of `sin` going completely wrong on Linux: http://blog.frama-c.com/index.php?post/2011/09/14/Linux-and-floating-point%3A-nearly-there – Pascal Cuoq Jul 17 '14 at 11:54
  • Have you tested without loading Flash? Flash Player is known for doing some strange things. – MrEricSir Jul 21 '14 at 19:11
  • @Ezee In your Zip file can you also provide us with your MSVCR90D.dll? I'm in need of a disassembly of your implementation of `pow()` to see what it is vulnerable to, IOW should I hunt for illicit alterations of the x87 control word or the SSE control word. – Iwillnotexist Idonotexist Jul 22 '14 at 01:57
  • MSDN states that both for the [x87](http://msdn.microsoft.com/en-us/library/ms235300.aspx) and [SSE](http://msdn.microsoft.com/en-us/library/yxty7t75.aspx) control words, _"A callee that modifies any of the fields within FPCSR must restore them before returning to its caller. Furthermore, a caller that has modified any of these fields must restore them to their standard values before invoking a callee unless by agreement the callee expects the modified values."_. The standard precision and rounding mode on entry/exit to any function is Double and Round-to-Nearest. Can you verify it's the case? – Iwillnotexist Idonotexist Jul 22 '14 at 03:13
  • @iwillnotexist-idonotexist Surprisingly, no. It isn't, if I did it correctly. I used `_controlfp` to get the control word and `_statusfp` to get the status word. The control word is always `0x8001F` which is OK. And the status word became `0x8001D` (overflow) after calculating `pow`. – Ezee Jul 23 '14 at 15:10
  • @iwillnotexist-idonotexist Added MSVCR90D.dll to the zip-file. – Ezee Jul 23 '14 at 15:22
  • @MrEricSir Without loading Flash it won't crash and it won't show video also. – Ezee Jul 23 '14 at 15:24
  • @iwillnotexist-idonotexist The registers window shows that the SSE registers stayed unchanged after removing m_view, but the x87 registers ST0-ST3 became 1#SNAN. When `pow` had executed, ST5-ST7 became 1#IND and MXCSR became `0x1FAF`. CTRL was always `0x027F`. – Ezee Jul 23 '14 at 15:43
  • @Ezee The bounty is about to expire. Would you gather your last comments about additional investigation in an answer so that I can accept that? – Pascal Cuoq Jul 25 '14 at 11:21
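
The save-and-restore rule quoted from MSDN in the comments above can be sketched in portable C++. This is only an illustration, assuming a C++11 toolchain that provides `<cfenv>` (newer than the compiler used in the question). `ScopedFloatEnv` and `roundingModeAfterMisbehavingCallee` are hypothetical names, and the `fesetround(FE_UPWARD)` call merely stands in for third-party code that changes the rounding mode and does not put it back. The guard covers only what the floating-point environment holds (rounding mode and exception flags); it is not claimed to repair any other register state.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical RAII guard: captures the floating-point environment on
// construction and restores it on destruction.
class ScopedFloatEnv {
public:
    ScopedFloatEnv() { std::fegetenv(&saved_); }
    ~ScopedFloatEnv() { std::fesetenv(&saved_); }
    ScopedFloatEnv(const ScopedFloatEnv&) = delete;
    ScopedFloatEnv& operator=(const ScopedFloatEnv&) = delete;

private:
    std::fenv_t saved_;
};

// Returns the rounding mode seen by the caller after a callee changed it
// without restoring it, with the guard wrapped around that callee.
inline int roundingModeAfterMisbehavingCallee()
{
    {
        ScopedFloatEnv guard;
        std::fesetround(FE_UPWARD);               // stand-in for foreign code
        assert(std::fegetround() == FE_UPWARD);   // the mode really did change
    }                                             // guard restores it here
    return std::fegetround();
}
```

With the guard in place the caller sees the rounding mode it started with, which is round-to-nearest in a freshly started program.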

3 Answers

6

The reason for the incorrect result of pow is that the x87 registers ST0-ST3 become 1#SNAN, and so TAGS = 0xFF. This happens during destruction of a QWebView containing a Flash video object. The x87 and SSE control words contain correct values. Tested Adobe Flash library: NPSWF64_14_0_0_125.dll

It happens when the WebCore::PluginView::stop method calls the destructor of the Adobe Flash plugin object.

NPError npErr = m_plugin->pluginFuncs()->destroy(m_instance, &savedData);

Here is the procedure (in NPSWF64.dll) that clobbers the registers (it actually uses the MMX registers, which are aliased to the x87 registers):

mov         qword ptr [rsp+8],rcx 
mov         qword ptr [rsp+10h],rdx 
mov         qword ptr [rsp+18h],r8 
push        rsi  
push        rdi  
push        rbx  
mov         rdi,qword ptr [rsp+20h] 
mov         rsi,qword ptr [rsp+28h] 
mov         rcx,qword ptr [rsp+30h] 
cmp         rcx,20h 
jle         000000002F9D8A2D 
sub         rcx,20h 

// writes wrong values to the registers:
movq        mm0,mmword ptr [rsi] 
movq        mm1,mmword ptr [rsi+8] 
movq        mm2,mmword ptr [rsi+10h] 
movq        mm3,mmword ptr [rsi+18h] 

add         rsi,20h 
movq        mmword ptr [rdi],mm0
movq        mmword ptr [rdi+8],mm1 
movq        mmword ptr [rdi+10h],mm2 
movq        mmword ptr [rdi+18h],mm3 
add         rdi,20h 
sub         rcx,20h 
jge         000000002F9D89F1 
add         rcx,20h 
rep movs    byte ptr [rdi],byte ptr [rsi] 
pop         rbx  
pop         rdi  
pop         rsi  
ret         

To prevent the wrong pow result caused by this bug, the register state has to be restored. I use the simplest way to do that: when the plugin is destroyed I call pow with some arguments, and that call restores the registers. The next call will be correct.
There is a more complicated (but probably more correct) way to do the same by writing new values to the registers using functions from float.h.
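
A minimal sketch of the simple workaround described above, in standard C++. The helper names `discardOneMathCall` and `powAfterPluginTeardown` are hypothetical and not part of Qt or Flash. The operands are `volatile` so that the compiler cannot evaluate `pow` at compile time and skip the library call that is supposed to absorb the stale register state.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: performs one throwaway math-library call so that a
// wrong result caused by stale x87 state lands here and not in real code.
inline void discardOneMathCall()
{
    // volatile operands force a real call to pow() at run time.
    volatile double base = 10.0;
    volatile double exponent = 2.0;
    volatile double ignored = std::pow(base, exponent);
    (void)ignored;  // the value may be wrong; it is never used
}

// Mirrors the calculation in closeView() from the question, with the
// throwaway call placed in front of the calculation that matters.
inline double powAfterPluginTeardown()
{
    discardOneMathCall();
    volatile double base = 10.0;
    volatile double exponent = 2.0;
    const double result = std::pow(base, exponent);
    assert(result == 100.0);
    return result;
}
```

In the sample from the question, `discardOneMathCall()` would go directly after `delete m_view;` in `closeView()`.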

Ezee
2

Just to add to this slightly, I think I've come across the same problem (in a 64 bit C# application that calls Math.Cos and also happens to show a Flash video in the .NET Windows Forms WebBrowser control).

Similarly, in my case, it seems that the following code in the 64 bit Flash OCX (16.0.0.305) leaves behind invalid values in the MM0, MM1, MM2, MM3 registers (this code appears to be some sort of fast memory copy):

C:\Windows\System32\Macromed\Flash\Flash64_16_0_0_305.ocx
00000000`2f9833c0 48894c2408      mov     qword ptr [rsp+8],rcx
00000000`2f9833c5 4889542410      mov     qword ptr [rsp+10h],rdx
00000000`2f9833ca 4c89442418      mov     qword ptr [rsp+18h],r8
00000000`2f9833cf 56              push    rsi
00000000`2f9833d0 57              push    rdi
00000000`2f9833d1 53              push    rbx
00000000`2f9833d2 488b7c2420      mov     rdi,qword ptr [rsp+20h]
00000000`2f9833d7 488b742428      mov     rsi,qword ptr [rsp+28h]
00000000`2f9833dc 488b4c2430      mov     rcx,qword ptr [rsp+30h]
00000000`2f9833e1 4881f920000000  cmp     rcx,20h
00000000`2f9833e8 7e43            jle     Flash64_16_0_0_305!DllUnregisterServer+0x3c2a2d (00000000`2f98342d)
00000000`2f9833ea 4881e920000000  sub     rcx,20h
00000000`2f9833f1 0f6f06          movq    mm0,mmword ptr [rsi]
00000000`2f9833f4 0f6f4e08        movq    mm1,mmword ptr [rsi+8]
00000000`2f9833f8 0f6f5610        movq    mm2,mmword ptr [rsi+10h]
00000000`2f9833fc 0f6f5e18        movq    mm3,mmword ptr [rsi+18h]
00000000`2f983400 4881c620000000  add     rsi,20h
00000000`2f983407 0f7f07          movq    mmword ptr [rdi],mm0
00000000`2f98340a 0f7f4f08        movq    mmword ptr [rdi+8],mm1
00000000`2f98340e 0f7f5710        movq    mmword ptr [rdi+10h],mm2
00000000`2f983412 0f7f5f18        movq    mmword ptr [rdi+18h],mm3
00000000`2f983416 4881c720000000  add     rdi,20h
00000000`2f98341d 4881e920000000  sub     rcx,20h
00000000`2f983424 7dcb            jge     Flash64_16_0_0_305!DllUnregisterServer+0x3c29f1 (00000000`2f9833f1)
00000000`2f983426 4881c120000000  add     rcx,20h
00000000`2f98342d f3a4            rep movs byte ptr [rdi],byte ptr [rsi]
00000000`2f98342f 5b              pop     rbx
00000000`2f983430 5f              pop     rdi
00000000`2f983431 5e              pop     rsi
00000000`2f983432 c3              ret

The above code in the Flash OCX is executed as the web browser navigates away from the page showing the Flash video.

Before this code executes the floating point registers were as follows (examined using WinDbg.exe):

fpcw=027f fpsw=3820 fptw=0080
st0= 6.00000000000000000000000...0e+0001 (0:4004:f000000000000000)
st1= 0.00000000000000000000000...0e+0000 (0:0000:0000000000000000)
st2= 0.00000000000000000000000...0e+0000 (0:0000:0000000000000000)
st3= 0.00000000000000000000000...0e+0000 (0:0000:0000000000000000)
st4= 0.00000000000000000000000...0e+0000 (0:0000:0000000000000000)
st5= 1.00000000000000000000000...0e+0000 (0:3fff:8000000000000000)
st6= 8.94231504669236176852000...0e-0001 (0:3ffe:e4ec5b1b9b742000)
st7= 0.00000000000000000000000...0e+0000 (0:0000:0000000000000000)
mxcsr=00001fa4

After the above code was executed the floating point registers were as follows:

fpcw=027f fpsw=0020 fptw=00ff
st0=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st1=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st2=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st3=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st4= 1.00000000000000000000000...0e+0000 (0:3fff:8000000000000000)
st5= 8.94231504669236176852000...0e-0001 (0:3ffe:e4ec5b1b9b742000)
st6= 0.00000000000000000000000...0e+0000 (0:0000:0000000000000000)
st7= 6.00000000000000000000000...0e+0001 (0:4004:f000000000000000)
mxcsr=00001fa4

At this point the floating point registers appear to be in a "corrupt" state.
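
One way to read the two dumps above, sketched in C++. It assumes that WinDbg's fptw is the abridged (FXSAVE-style) tag byte, in which a set bit marks a physical register as occupied; that is an assumption of this sketch, not something the dumps state. The helper names are hypothetical. TOP is taken from bits 13..11 of fpsw, and a push such as fld writes to physical register (TOP - 1) mod 8, faulting if that register is already tagged as occupied.

```cpp
// TOP (top-of-stack pointer) lives in bits 13..11 of the x87 status word.
constexpr unsigned topOfStack(unsigned fpsw)
{
    return (fpsw >> 11) & 0x7u;
}

// ASSUMPTION: abridged tag byte, bit i set means physical register i is occupied.
constexpr bool physicalRegisterOccupied(unsigned abridgedTag, unsigned index)
{
    return ((abridgedTag >> (index & 0x7u)) & 0x1u) != 0;
}

// Number of registers tagged as occupied.
constexpr unsigned occupiedCount(unsigned abridgedTag)
{
    return abridgedTag == 0 ? 0u
                            : (abridgedTag & 0x1u) + occupiedCount(abridgedTag >> 1);
}

// A push targets physical register (TOP - 1) mod 8, written here as (TOP + 7) mod 8.
constexpr bool pushWouldFault(unsigned fpsw, unsigned abridgedTag)
{
    return physicalRegisterOccupied(abridgedTag, topOfStack(fpsw) + 7u);
}

// Values copied from the two register dumps.
static_assert(occupiedCount(0x0080) == 1, "before: one register in use");
static_assert(!pushWouldFault(0x3820, 0x0080), "before: a push is safe");
static_assert(occupiedCount(0x00ff) == 8, "after: all eight tagged as in use");
static_assert(pushWouldFault(0x0020, 0x00ff), "after: the next push faults");
```

Under that assumption, only one register is in use before the Flash routine runs and a push is safe; afterwards all eight are tagged as in use, so the next push has nowhere to go.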

So later, back in the C# program, when performing Math.Cos the fld opcode fails as it attempts to push the value 2.0 onto the floating point stack (ie. as it attempts to prepare for computing the cosine of 2.0):

COMDouble::Cos:
000007FE`E1D01570  movsd   mmword ptr [rsp+8],xmm0
000007FE`E1D01576  fld     qword ptr [rsp+8]
000007FE`E1D0157A  fcos
000007FE`E1D0157C  fstp    qword ptr [rsp+8]
000007FE`E1D01580  movsd   xmm0,mmword ptr [rsp+8]
000007FE`E1D01586  ret

Immediately before executing the fld opcode the floating point registers were exactly as they were left by the Flash OCX (ie. as above):

fpcw=027f fpsw=0020 fptw=00ff
st0=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st1=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st2=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st3=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st4= 1.00000000000000000000000...0e+0000 (0:3fff:8000000000000000)
st5= 8.94231504669236176852000...0e-0001 (0:3ffe:e4ec5b1b9b742000)
st6= 0.00000000000000000000000...0e+0000 (0:0000:0000000000000000)
st7= 6.00000000000000000000000...0e+0001 (0:4004:f000000000000000)
mxcsr=00001fa4

Immediately after the fld opcode was executed the floating point registers were as follows (notice that st0 is #IND rather than the expected value of 2.00000000000000000000000...0e+0000):

fpcw=027f fpsw=3a61 fptw=00ff
st0=-1.#IND0000000000000000000...0e+0000 (1:7fff:c000000000000000)
st1=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st2=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st3=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st4=-1.#SNAN000000000000000000...0e+0000 (1:7fff:0000000000000000)
st5= 1.00000000000000000000000...0e+0000 (0:3fff:8000000000000000)
st6= 8.94231504669236176852000...0e-0001 (0:3ffe:e4ec5b1b9b742000)
st7= 0.00000000000000000000000...0e+0000 (0:0000:0000000000000000)
mxcsr=00001fa4

This then means that the fcos opcode attempts to compute the cosine of #IND. And so it ends up that the return value of Math.Cos in C# is Double.NaN instead of the expected -0.416146836547142.

Dissecting the floating point status word (FPSW) value of 0x3a61 above indicates that the problem is "an attempt to load a value into a register which is not free":

3a61: 0011 1010 0110 0001

    TOP (13,12,11):          111
    C3,C2,C1,C0 (14,10,9,8): 0 010  (ie. C1 is 1)  <-- loading a value into a register which is not free
    IR (7):                  0  Interrupt Request
    SF (6):                  1  Stack Fault  <-- loading a value into a register which is not free
    P (5):                   1  Precision
    U (4):                   0  Underflow
    O (3):                   0  Overflow
    Z (2):                   0  Zero Divide
    D (1):                   0  Denormalised
    I (0):                   1  Invalid Operation
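
The bit positions listed above can be checked mechanically. A minimal C++ sketch, with hypothetical names, that extracts the fields from a status word and applies the x87 rule that Invalid Operation together with Stack Fault and C1 = 1 indicates a stack overflow (C1 = 0 would indicate an underflow):

```cpp
// Fields of the x87 status word that matter for this diagnosis.
struct X87Status {
    unsigned top;           // bits 13..11
    bool c1;                // bit 9
    bool stackFault;        // bit 6
    bool precision;         // bit 5
    bool invalidOperation;  // bit 0
};

constexpr X87Status decodeStatusWord(unsigned fpsw)
{
    return X87Status{
        (fpsw >> 11) & 0x7u,
        ((fpsw >> 9) & 0x1u) != 0,
        ((fpsw >> 6) & 0x1u) != 0,
        ((fpsw >> 5) & 0x1u) != 0,
        (fpsw & 0x1u) != 0
    };
}

// Invalid Operation + Stack Fault + C1 set: a push onto a non-empty register.
constexpr bool isStackOverflow(unsigned fpsw)
{
    return decodeStatusWord(fpsw).invalidOperation
        && decodeStatusWord(fpsw).stackFault
        && decodeStatusWord(fpsw).c1;
}

// 0x3a61 is the status word recorded right after the fld instruction.
static_assert(decodeStatusWord(0x3a61).top == 7, "TOP is 111b");
static_assert(decodeStatusWord(0x3a61).precision, "P is set");
static_assert(isStackOverflow(0x3a61), "fld hit a register that was not free");
static_assert(!isStackOverflow(0x0020), "no fault recorded before the fld");
```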

Note that the problem only happens the first time that Math.Cos(2) is attempted (ie. soon after the navigation away from the Flash video page). Second and subsequent attempts to calculate Math.Cos(2) succeed.

The problem also occurs if the WebBrowser control is disposed while a Flash video is playing (or is paused).

So some potential workarounds are:

  1. Perform one dummy math computation and ignore the result.

  2. Write a 64 bit DLL which exports a function that executes the finit or emms opcodes (to reset the state of the floating point registers). Call that function from C#.

  3. Run as a 32 bit process.

Note: throwing a dummy exception in C# and then catching that exception did not help (this is suggested for other floating point corruption problems).

Here is some C# code that reproduces the problem (ensure this compiles and runs as 64 bit; click the Calculate button after the video has started playing).

using System;
using System.IO;
using System.Windows.Forms;

namespace FlashTest
{
    public partial class TestForm : Form
    {
        public TestForm()
        {
            // Windows 7 SP1 64 bit
            // Internet Explorer 11 (11.0.9600.17633)
            // Flash Player 16.0.0.305
            // Visual Studio 2013 (12.0.31101.00)
            // .NET 4.5 (4.0.30319.34209)

            InitializeComponent();
            addressTextBox.Text = "http://www.youtube.com/v/JVGdyC9CvFQ?autoplay=1";
            GoButtonClickHandler(this, EventArgs.Empty);
        }

        private void GoButtonClickHandler(object sender, EventArgs e)
        {
            string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".html");
            File.WriteAllText(path, string.Format(@"<html><body>
                <object classid=""clsid:D27CDB6E-AE6D-11CF-96B8-444553540000"" width=""100%"" height=""100%"" id=""youtubeviewer"">
                    <param name=""movie"" value=""{0}"">
                </object></body></html>", addressTextBox.Text));
            webBrowser.Navigate(path);
        }

        private void CalculateButtonClickHandler(object sender, EventArgs e)
        {
            webBrowser.DocumentCompleted += DocumentCompletedHandler;
            webBrowser.Navigate("about:blank");
        }

        private void DocumentCompletedHandler(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            webBrowser.DocumentCompleted -= DocumentCompletedHandler;
            MessageBox.Show("Math.Cos(2) returned " + Math.Cos(2));
        }
    }
}

namespace FlashTest
{
    partial class TestForm
    {
        /// <summary>
        /// Required designer variable.
        /// </summary>
        private System.ComponentModel.IContainer components = null;

        /// <summary>
        /// Clean up any resources being used.
        /// </summary>
        /// <param name="disposing">true if managed resources should be disposed; otherwise, false.</param>
        protected override void Dispose(bool disposing)
        {
            if (disposing && (components != null))
            {
                components.Dispose();
            }
            base.Dispose(disposing);
        }

        #region Windows Form Designer generated code

        /// <summary>
        /// Required method for Designer support - do not modify
        /// the contents of this method with the code editor.
        /// </summary>
        private void InitializeComponent()
        {
            this.webBrowser = new System.Windows.Forms.WebBrowser();
            this.addressTextBox = new System.Windows.Forms.TextBox();
            this.goButton = new System.Windows.Forms.Button();
            this.calculateButton = new System.Windows.Forms.Button();
            this.SuspendLayout();
            // 
            // webBrowser
            // 
            this.webBrowser.Anchor = ((System.Windows.Forms.AnchorStyles)((((System.Windows.Forms.AnchorStyles.Top | System.Windows.Forms.AnchorStyles.Bottom) 
            | System.Windows.Forms.AnchorStyles.Left) 
            | System.Windows.Forms.AnchorStyles.Right)));
            this.webBrowser.Location = new System.Drawing.Point(12, 41);
            this.webBrowser.MinimumSize = new System.Drawing.Size(20, 20);
            this.webBrowser.Name = "webBrowser";
            this.webBrowser.Size = new System.Drawing.Size(560, 309);
            this.webBrowser.TabIndex = 3;
            // 
            // addressTextBox
            // 
            this.addressTextBox.Anchor = ((System.Windows.Forms.AnchorStyles)(((System.Windows.Forms.AnchorStyles.Top | System.Windows.Forms.AnchorStyles.Left) 
            | System.Windows.Forms.AnchorStyles.Right)));
            this.addressTextBox.Location = new System.Drawing.Point(12, 14);
            this.addressTextBox.Name = "addressTextBox";
            this.addressTextBox.Size = new System.Drawing.Size(398, 20);
            this.addressTextBox.TabIndex = 0;
            // 
            // goButton
            // 
            this.goButton.Anchor = ((System.Windows.Forms.AnchorStyles)((System.Windows.Forms.AnchorStyles.Top | System.Windows.Forms.AnchorStyles.Right)));
            this.goButton.Location = new System.Drawing.Point(416, 12);
            this.goButton.Name = "goButton";
            this.goButton.Size = new System.Drawing.Size(75, 23);
            this.goButton.TabIndex = 1;
            this.goButton.Text = "&Go";
            this.goButton.UseVisualStyleBackColor = true;
            this.goButton.Click += new System.EventHandler(this.GoButtonClickHandler);
            // 
            // calculateButton
            // 
            this.calculateButton.Anchor = ((System.Windows.Forms.AnchorStyles)((System.Windows.Forms.AnchorStyles.Top | System.Windows.Forms.AnchorStyles.Right)));
            this.calculateButton.Location = new System.Drawing.Point(497, 12);
            this.calculateButton.Name = "calculateButton";
            this.calculateButton.Size = new System.Drawing.Size(75, 23);
            this.calculateButton.TabIndex = 2;
            this.calculateButton.Text = "&Calculate";
            this.calculateButton.UseVisualStyleBackColor = true;
            this.calculateButton.Click += new System.EventHandler(this.CalculateButtonClickHandler);
            // 
            // TestForm
            // 
            this.AutoScaleDimensions = new System.Drawing.SizeF(6F, 13F);
            this.AutoScaleMode = System.Windows.Forms.AutoScaleMode.Font;
            this.ClientSize = new System.Drawing.Size(584, 362);
            this.Controls.Add(this.webBrowser);
            this.Controls.Add(this.goButton);
            this.Controls.Add(this.addressTextBox);
            this.Controls.Add(this.calculateButton);
            this.Name = "TestForm";
            this.Text = "Adobe Flash Test";
            this.ResumeLayout(false);
            this.PerformLayout();

        }

        #endregion

        private System.Windows.Forms.WebBrowser webBrowser;
        private System.Windows.Forms.TextBox addressTextBox;
        private System.Windows.Forms.Button goButton;
        private System.Windows.Forms.Button calculateButton;
    }
}
1

Apparently you hit a bug in the WebKit version shipped with your Qt. I cannot reproduce it in Qt 5.3 with Qt Creator builds using MSVC 13 x64, in either release or debug.

There are already some bugs reported for QtWebKit involving floating point:

fjardon