2

Symbolic calculations performed manually or by a computer algebra system may be faulty or hold only subject to certain assumptions. A classical example is sqrt(x^2) == x which is not true in general but it does hold if x is real and non-negative.

Are there examples where proof assistants/checkers such as Coq, Isabelle, HOL, Metamath, or others are used to certify correctness of symbolic calculations? In particular, I am interested in calculus and linear algebra examples such as solving definite or indefinite integrals, differential equations, and matrix equations.

Update: To be more concrete, it would be interesting to know whether there are examples of undergraduate assignments in calculus and linear algebra that could be formally solved (possibly with the help of a proof assistant) such that the solution can be automatically verified by a proof checker. A very simple example assignment for Lean is here.

calcert
  • 33
  • 4

3 Answers3

5

For the Coq proof assistant there are several libraries to help with that. One matching your request quite well is Coquelicot (https://gitlab.inria.fr/coquelicot/coquelicot). The Coquelicot team made an exercise and participated in the French baccalauréat - I would say comparable more to a college than a high school math exam - and finished proofs for a good part of the exercises. The proofs can be found in the examples here (https://gitlab.inria.fr/coquelicot/coquelicot/-/tree/master/examples). I thought about translating the exercises and solutions to English.

But this was quite a few years ago and meanwhile there are very powerful tools for specific applications. E.g. there is coq-interval (https://gitlab.inria.fr/coqinterval/interval) which fully automatically does Coq proofs of rather complicated inequalities, say that a high order polynomial matches a sine function in a certain interval with a certain maximum deviation. It does this by Taylor decomposition and computing upper bounds for the residual. It can also do error proofs for a wide range of numerical integrals. A new feature added recently is the ability to do proven correct plots.

A tool for proving in Coq the error between infinite precision real and floating point computations is Gappa (https://gitlab.inria.fr/gappa/gappa).

Another very interesting Coq development is CoRN (https://github.com/coq-community/corn), a formalization of constructive reals in Coq. Constructive Reals are true real numbers which do compute. Essentially a constructive real number is an algorithm to compute a number to any desired precision together with a proof that this algorithm converges. One can prove that such numbers fulfill all usual properties of real numbers. An interesting side effect of constructive reals is that they need only LPO as axiom, while in classical reals the existence of the real numbers itself is an axiom. Any computation you do in CoRN, say pi>3, is automatically proven correct.

All these tools are included in Coq Platform, a common distribution of the Coq proof assistant.

There is more and this is steadily increasing. I would say it is not that far in the future that we have a usable proven correct CAS.

M Soegtrop
  • 1,268
  • 2
  • 8
3

The only thing that comes to my mind is that Isabelle/HOL can replay SMT proofs (as produced e.g. by Z3 or CVC4), e.g. involving integer and real arithmetic. For computer algebra systems, I don't know of any comparable examples.

The problem is that computer algebra systems tend not to be set up in a way where they can output a detailed certificate for their simplifications – if they were able to do that, one could attempt to replay that in a theorem prover. But it would have to go beyond purely equational reasoning, since many rules (such as your example) require proving inequalities as preconditions.

If computer algebra systems were able to output a trace of their computations as a list of rewrite rules that were used, including how to prove each of their preconditions, one could in principle replay such a trace in a theorem prover – but that would of course require that every rule used by the CAS has a corresponding rule in the theorem prover (this is roughly how replaying SMT proofs works in Isabelle). However, I do not know of any projects like this.

There are, on the other hand, various examples where CASs are used to compute some easily verifiable (but hard to compute) result, e.g. factoring a polynomial, isolating the roots of a real polynomial, Wilf–Zeilberger witnesses, and then verifying that this is really a valid result in a theorem prover. However, this does not involve certifying the computation process of the CAS, just the result.

Manuel Eberl
  • 7,858
  • 15
  • 24
  • Thanks for your elaborate reply. CAS was just an example. The justification could also be produced manually. I was more thinking of an applied perspective on proof assistants where I am wondering whether one could formalize undergrad assignments in calculus and linear algebra which would make use of more general theorems already formalized. – calcert Nov 15 '21 at 18:02
  • 1
    I don't really understand your question then. Of course one can formalise undergraduate assignments with a theorem prover. Much more complicated things have been formalised in theorem provers. Now, whether any particular problem can be formalised in a particular system with particular library (without adding lots of missing library material first), that is another question. But in general I'd say there's a good chance that it is possible. (although the person doing it should perhaps not be a total beginner and know the library) – Manuel Eberl Nov 15 '21 at 22:29
  • 1
    As an example, I formalised the majority of a textbook on analytic number theory as part of my PhD thesis, and I wrote an Isabelle method that will solve most concrete undergraduate-level limit problems on real numbers fully automatically. – Manuel Eberl Nov 15 '21 at 22:30
2

for demonstration purposes, I prepared a small "fake exercise" both to illustrate what it means to verify a calculation and to illustrate the most graphical approaches available in Coq (this shows some of the things you can do in Nov. 2021).

It can be seen on github at github.com:ybertot/osxp_demos_coq, especially the file sin_properties.v.

The demonstration follows this path:

  • Show that we can state and prove automatically a statement giving a "safe approximation" of PI (that's the name of the mathematical constant in the Coq library).

  • Show that Coq can be used to plot a known mathematical function, in this case the sin function between 0 and PI. This relies on a connexion to gnuplot for the graphical display. I am afraid that gnuplot will not be included in the Coq platform mentioned by M. Soegtrop in another answer.

  • Show that we can also plot the function sin(1/x) using Coq A correct plot of function sin(1/x) between 0.01 and 0.2 (the plot is actually preserved as a pdf file on the github repository)

  • Show that a generic function plotter actually returns a misleading result in that case (the generic function plotter is gnuplot). An incorrect plot of function sin(1/x), the line does not reach value 1 between 0.03 and 0.04 as it should The misleading plot is also given in the github repository as a pdf file.

  • The next step is to show that we can prove guaranties of intervals for some computations, sometimes automatically using the interval tactic, and sometimes the interval tactic fails to conclude. The important point here is that the command fails to conclude, instead of giving an answer that cannot be trusted. When this happens, users can rely on knowledge and mathematical reasoning to obtain the desired result. The demo shows how to prove that for any x in a certain range, the sin function is guaranteed to have a positive value.

  • The next step is about proving that sin x < x for every positive x, it shows that mathematical reasoning can rely on various techniques of mathematics:

    • decomposing the interval in two parts,
    • using the mean value theorem,
    • computing the derivative of x - sin x (and this can be done automatically in Coq),
    • relying on the fact that cos is known to be strictly decreasing between 0 and PI.

This is just a short demo, which is also meant to explain how a theorem prover has to be used differently from a pocket calculator, because just returning an approximation without qualification for the value of a mathematical formula is a process that cannot really be trusted.

The original question also includes questions about computing integrals. The interval package also contains facilities for this.

Yves
  • 3,808
  • 12
  • 12