Formal verification of containers

Discussion:

Joachim Breitner

2018-02-01 04:17:50 UTC

Hi containers maintainers,

you might have already read it between the lines, but some people here
at Penn are currently using hs-to-coq to translate the containers
library to Coq and verify it. We just reached a milestone by verifying
a portion of Data.IntSet large enough to instantiate Coqâs
FSetInterface:
https://github.com/antalsz/hs-to-coq/blob/master/examples/containers/theories/IntSetProofs.v

There is work on verifying Data.Set as well:
https://github.com/antalsz/hs-to-coq/blob/master/examples/containers/theories/SetProofs.v

Isnât that exciting?

It raises the question whether there is value in integrating these
proofs in some way into the development process of containers. One
could imagine adding a job to travis that translates containerâs master
into Coq and checks if the proofs go through.

On the one hand, thatâs of course great, because itâd make containers
one of the most thoroughly verified pieces of Haskell in use out there.

On the other hand, weâd have to choose the least bad of a bunch of bad
options when it comes to future contributions (I am listing unfeasible
options as well):

- we essentially freeze the code
- weâd lose the verification again quickly, once someone makes a non-
trivial change to the code
- weâd have to restrict contributions to those who are able to update
the proofs
- weâd still welcome contributions that break the proofs, but let them

sit in pull requests until someone updates the proofs on the PR and
then merge
- (maybe a bit far fetched, but who knows)
we get funding to pay someone to keep the proofs in sync, without
slowing down other contributions

I honestly donât know which of these are the most viable, or if none of
them make any sense and we should not hook up the verification with
containerâs development.

What do you think?

Cheers,
Joachim

--
Joachim Breitner
***@joachim-breitner.de
http://www.joachim-breitner.de/

David Feuer

2018-02-01 05:02:14 UTC

Permalink

I don't think we should limit containers development like that. I do
imagine it would be valuable to have a separate package, perhaps called
verified-containers, that tries to stay a few weeks or months behind
containers.

Post by Joachim Breitner
Hi containers maintainers,
you might have already read it between the lines, but some people here
at Penn are currently using hs-to-coq to translate the containers
library to Coq and verify it. We just reached a milestone by verifying
a portion of Data.IntSet large enough to instantiate Coqâs
https://github.com/antalsz/hs-to-coq/blob/master/examples/
containers/theories/IntSetProofs.v
https://github.com/antalsz/hs-to-coq/blob/master/examples/
containers/theories/SetProofs.v
Isnât that exciting?
It raises the question whether there is value in integrating these
proofs in some way into the development process of containers. One
could imagine adding a job to travis that translates containerâs master
into Coq and checks if the proofs go through.
On the one hand, thatâs of course great, because itâd make containers
one of the most thoroughly verified pieces of Haskell in use out there.
On the other hand, weâd have to choose the least bad of a bunch of bad
options when it comes to future contributions (I am listing unfeasible
- we essentially freeze the code
- weâd lose the verification again quickly, once someone makes a non-
trivial change to the code
- weâd have to restrict contributions to those who are able to update
the proofs
- weâd still welcome contributions that break the proofs, but let them
sit in pull requests until someone updates the proofs on the PR and
then merge
- (maybe a bit far fetched, but who knows)
we get funding to pay someone to keep the proofs in sync, without
slowing down other contributions
I honestly donât know which of these are the most viable, or if none of
them make any sense and we should not hook up the verification with
containerâs development.
What do you think?
Cheers,
Joachim
--
Joachim Breitner
http://www.joachim-breitner.de/
_______________________________________________
Libraries mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Henning Thielemann

2018-02-01 10:42:38 UTC

Permalink

Post by David Feuer
I don't think we should limit containers development like that. I do
imagine it would be valuable to have a separate package, perhaps called
verified-containers, that tries to stay a few weeks or months behind
containers.

I am happy to change all my dependencies from 'containers' to
'containers-verified' (for easier association in a lexicographically
ordered list).

Joachim Breitner

2018-02-01 14:15:01 UTC

Permalink

Post by Henning Thielemann

I am happy to change all my dependencies from 'containers' to
'containers-verified' (for easier association in a lexicographically
ordered list).

that is an interesting idea, and a nicely generalizable pattern,
David.

I guess there are a view variants:

Should containers-verified export only the verified subset of the API,
or all of it? The former is more honest, the latter easier to switch to
for users.

Should it
a) simply depend on a particular version of containers, and re-export
the functions, or
b) be a real code copy, or
c) be a code copy with the exception of the data types, so that you
can interoperate with the existing containers library?

b (maybe even with hiding the âŠ.Internals module) prevents users from
creating non-well-formed trees, but it would prevent you from using
containers-verified in a library that exports the types as part of its
API. Maybe one could add âtoVerifiedâ and âfromVerifiedâ functions that
convert between the two, with run-time checks?

Cheers,
Joachim

--
Joachim Breitner
***@joachim-breitner.de
http://www.joachim-breitner.de/

Henning Thielemann

2018-02-01 14:35:30 UTC

Permalink

Post by Joachim Breitner
Should containers-verified export only the verified subset of the API,
or all of it? The former is more honest, the latter easier to switch to
for users.

... then the question: Should containers-verified export Data.Map or
Data.Map.Verified? If it exports Data.Map but not all other modules of
'containers', then there would be no way to access all modules of
'containers'. So maybe containers-verified should export
Data.Map.Verified. But then you can as well add Data.Map.Verified to
'containers'.

Joachim Breitner

2018-02-01 16:09:46 UTC

Permalink

Hi,

Post by Henning Thielemann

Post by Joachim Breitner
Should containers-verified export only the verified subset of the API,
or all of it? The former is more honest, the latter easier to switch to
for users.

Valid point. I guess we should just verify all of it :-)

There is the option of using package-qualified imports!
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html?package-qualified-imports

The question would be whether we put a little burden on those users who
need to mix-and-match, or on those who would be happy with the portion
of the API provided by containers-verified.

Not renaming also makes it easier to merge changes from containers, but
not by a lot.

Post by Henning Thielemann
But then you can as well add Data.Map.Verified to
'containers'.

Not really, the main reason to split it off is to not impose the burden
of the verification infrastructure on containerâs maintainers and/or
contributors.

Cheers,
Joachim

--
Joachim ânomeataâ Breitner
***@joachim-breitner.de
https://www.joachim-breitner.de/

David Feuer

2018-02-01 16:58:23 UTC

Permalink

containers-verified should only export the verified part, for sure. I think
it should mostly re-export, but if it needs to use a different version of a
function (perhaps temporarily) then it can do so. For example, if
containers makes a function faster at the expense of some complication,
then containers-verified may want to stick with the old version until the
new one can be verified (or someone can come up with a simpler fast one).

Post by Joachim Breitner
Hi

Post by Henning Thielemann

I am happy to change all my dependencies from 'containers' to
'containers-verified' (for easier association in a lexicographically
ordered list).

that is an interesting idea, and a nicely generalizable pattern,
David.
Should containers-verified export only the verified subset of the API,
or all of it? The former is more honest, the latter easier to switch to
for users.
Should it
a) simply depend on a particular version of containers, and re-export
the functions, or
b) be a real code copy, or
c) be a code copy with the exception of the data types, so that you
can interoperate with the existing containers library?
b (maybe even with hiding the âŠ.Internals module) prevents users from
creating non-well-formed trees, but it would prevent you from using
containers-verified in a library that exports the types as part of its
API. Maybe one could add âtoVerifiedâ and âfromVerifiedâ functions that
convert between the two, with run-time checks?
Cheers,
Joachim
--
Joachim Breitner
http://www.joachim-breitner.de/
_______________________________________________
Libraries mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Tom Murphy

2018-02-02 02:50:06 UTC

Permalink

(My 2 cents, not as a containers developer)

First of all, bravo! This is great news.

Post by Joachim Breitner
- weâd still welcome contributions that break the proofs, but let them
sit in pull requests until someone updates the proofs on the PR and
then merge

If we imagine, as David does, that updating proofs will take a couple of
weeks (or months), I'd imagine an advantage of this option is that:
- The percent of containers users who only use features older than weeks
or months from bleeding edge: 95+%
- The percent of end-users who will realistically switch their code from
containers to containers-verified: certainly less than 20%

If we can seamlessly switch the vast majority of users to this new verified
code, and be able to say that a ubiquitous component of Haskell development
is formally verified, I'd say that's a huge accomplishment.

An alternative proposal would be to have a "containers-unverified"
*upstream* instead of a "containers-verified" downstream. This would also
neatly sidestep the problem of if "containers-verified" should export
unverified code: it's just "containers" as usual, just with some extra
verification.

Tom

David Feuer

2018-02-02 03:03:09 UTC

Permalink

It sounds nice, but keeping the verified version up to date requires a
steady supply of very highly skilled labor. I personally don't think it
reasonable to even consider making containers itself the verified version
unless the verifiers are able to

1. Stay fairly up to date for a good while (say five or six years, after
the current researchers have gotten their degrees/professorships), AND
2. At the end of that period, explain their plan for continuing the effort.

Post by Tom Murphy
(My 2 cents, not as a containers developer)
First of all, bravo! This is great news.
On Wed, Jan 31, 2018 at 11:17 PM, Joachim Breitner <

Post by Joachim Breitner
- weâd still welcome contributions that break the proofs, but let them
sit in pull requests until someone updates the proofs on the PR and
then merge

If we imagine, as David does, that updating proofs will take a couple of
- The percent of containers users who only use features older than weeks
or months from bleeding edge: 95+%
- The percent of end-users who will realistically switch their code from
containers to containers-verified: certainly less than 20%
If we can seamlessly switch the vast majority of users to this new
verified code, and be able to say that a ubiquitous component of Haskell
development is formally verified, I'd say that's a huge accomplishment.
An alternative proposal would be to have a "containers-unverified"
*upstream* instead of a "containers-verified" downstream. This would also
neatly sidestep the problem of if "containers-verified" should export
unverified code: it's just "containers" as usual, just with some extra
verification.
Tom
_______________________________________________
Libraries mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Gershom B

2018-02-02 03:17:43 UTC

Permalink

In the initial message, Joachim suggested âOne could imagine adding a job to travis that translates containerâs master into Coq and checks if the proofs go through.â

It would seem to me that an ideal situation would be if there were âproof-testsâ similar to doctests. Essentially, portions of code that have corresponding Coq proofs about them have an annotation that indicates the proofs, and indicates to the automated job that they are to be tested against those proofs. When changes are made that do not have corresponding updates to the proofs, then the corresponding annotations would need to be commented-out (or perhaps, replaced with some other annotation that explicitly indicates a skipped or missing proof).

This would let the proper containers core be verified, with mechanically-checked annotations indicating where the verification hasd and has not been performaed. Some sort of re-exported subset of only the verified stuff could then be automatically extracted from this setup for users that wanted to restrict themselves to only this subset.

âg

David Feuer

2018-02-02 03:25:21 UTC

Permalink

Oh, I'm perfectly fine with *checking* everything. But to give the users
who want verification a consistent API, the verified package will probably
have to include old or modified versions of some functions or even modules.

Post by Gershom B
In the initial message, Joachim suggested âOne could imagine adding a job
to travis that translates containerâs master into Coq and checks if the
proofs go through.â
It would seem to me that an ideal situation would be if there were
âproof-testsâ similar to doctests. Essentially, portions of code that have
corresponding Coq proofs about them have an annotation that indicates the
proofs, and indicates to the automated job that they are to be tested
against those proofs. When changes are made that do not have corresponding
updates to the proofs, then the corresponding annotations would need to be
commented-out (or perhaps, replaced with some other annotation that
explicitly indicates a skipped or missing proof).
This would let the proper containers core be verified, with
mechanically-checked annotations indicating where the verification hasd and
has not been performaed. Some sort of re-exported subset of only the
verified stuff could then be automatically extracted from this setup for
users that wanted to restrict themselves to only this subset.
âg

Joachim Breitner

2018-02-02 04:01:42 UTC

Permalink

Hi,

Post by David Feuer
Oh, I'm perfectly fine with *checking* everything. But to give the
users who want verification a consistent API, the verified package
will probably have to include old or modified versions of some
functions or even modules.

at the moment, we donât have to modify containers. To be precise, we
donât have to modify itâs source. The translation to Coq does a few
modification, such as highestBitMask, which uses raw pointers, which we
cannot translate.

I donât expect that this will change. But it is true that if containers
changes their code in a way that breaks the proofs (which is not every
change â we work on what GHC sees after parsing and renaming, so a fair
amount of changes might not even affect the proofs), then containers-
verified would have to stick with the old version.

A viable approach might be a containers-verified package that depends
on a precise version of containers and re-exports the verified subset
of containers. It might lag a few versions behind, but for something
stable like containers, that might still be useful for people who want
bragging rights about using verified code.

Practically, the verification has not uncovered any bugs yet (unless
one counts the incomplete implementation of `valid` for IntSet, but
that only affects the test suite), so users will not gain too muchâŠ

Cheers,
Joachim

--
Joachim Breitner
***@joachim-breitner.de
http://www.joachim-breitner.de/

Joachim Breitner

2018-02-08 23:21:02 UTC

Permalink

Hi,

Post by Joachim Breitner
A viable approach might be a containers-verified package that depends
on a precise version of containers and re-exports the verified subset
of containers. It might lag a few versions behind, but for something
stable like containers, that might still be useful for people who want
bragging rights about using verified code.

here is how this could look like:
https://hackage.haskell.org/package/containers-verified-0.5.11.0/candidate
https://github.com/nomeata/containers-verified

I hope that we can extend the covered API somemore in the next weeks :-)

Cheers,
Joachim

--
Joachim Breitner
***@joachim-breitner.de
http://www.joachim-breitner.de/

David Feuer

2018-02-08 23:50:28 UTC

Permalink

Interesting! I look forward to seeing more. One possible twist would be to
wrap the re-exported types in newtype wrappers to hide the unverified
instances, perhaps in a separate Wrapped module. But that might be more
trouble than it's worth. When you get to it, there are some hard-coded size
limits in Data.Map.alterF for word sizes below 61 bits that I would *love*
to see checked formally; I trust those less than pretty much anything else
in the package, and testing them is hard.

Post by Joachim Breitner
Hi,

https://hackage.haskell.org/package/containers-verified-0.5.11.0/candidate
https://github.com/nomeata/containers-verified
I hope that we can extend the covered API somemore in the next weeks :-)
Cheers,
Joachim
--
Joachim Breitner
http://www.joachim-breitner.de/
_______________________________________________
Libraries mailing list
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries