Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Advice on refreshing the ocaml extraction target? #16

Open
Armael opened this issue Sep 29, 2022 · 4 comments
Open

Advice on refreshing the ocaml extraction target? #16

Armael opened this issue Sep 29, 2022 · 4 comments

Comments

@Armael
Copy link
Contributor

Armael commented Sep 29, 2022

Hi, very nice project!

Next week I will be at the 11th MirageOS retreat; my project for the week would be to try to look at FSCQ and see how hard it would be to extract it to OCaml and package it as a MirageOS-compatible library.
I've only quickly looked at the makefile/build instructions currently; it looked to me that there there is some existing setup for OCaml extraction but that ExtractOcaml.v might be outdated?

More generally speaking, I wondered if you'd have any relevant advice that could be helpful for the task :-).
I'm also interested on any information on the structure of the development / what are the entrypoints of the Coq library / the API that should be exposed or not, if you have :-).

@zeldovich
Copy link
Member

Hi,

Thanks for your interest! This code base isn't actively maintained anymore, and from a cursory look, it appears to have bit-rotted: at least the first thing that fails is due to the fact that the latest version of Coq no longer supports the omega tactic, so FSCQ would need to be updated to use lia instead.

I think the Ocaml extraction did work at one point, but I don't remember what its state was when we last worked on FSCQ. The main thing to look at is whether the Ocaml native code (e.g., in mllib) is up-to-date with the Haskell equivalent (e.g., in hslib). There's also an Ocaml fuse wrapper in ocamlfuse, which we imported from a third party; I don't recall how robust it is compared to the Haskell fuse wrapper we used, but this part might not matter as much for your use case.

Our more recent work on verifying file systems is https://github.com/mit-pdos/daisy-nfsd and its underlying components like https://github.com/mit-pdos/go-journal but it's written in Go, so it might be less appealing for your Ocaml use case.

@Armael
Copy link
Contributor Author

Armael commented Sep 30, 2022

Thanks for the quick answer and for the details!
I'll keep you updated wrt what I manage to do regarding bitrot. Comparing the status of the current ocaml code with the haskell equivalent is also good advice, thanks.
(Your more recent work looks indeed interesting as well, but as you mention it is probably harder to directly compile it as an OCaml library.)

@Armael
Copy link
Contributor Author

Armael commented Oct 10, 2022

Hello again! Reporting back from the progress I made during the retreat :-).

  • I made the Coq proofs work again, after some minor tweaks. I submitted this as PR Repair proofs (coq 8.13) #17. This is still with Coq 8.13, and there are still warnings (related to uses of omega and hint declarations) -- I worked on fixing the warnings in another branch, but got stuck near the end; I'll try to revisit this at some point later.
  • I refreshed the OCaml wrapper code to be up-to-date with the Haskell equivalent, and fixed the bugs I could find through light testing (as always, fixing the bugs took more time than writing the bugs, especially ones caused by slightly wrong implementations in word.ml :-)). I've attached the current state of that work as PR Extraction to OCaml + OCaml wrapper code #18 (based on top of Repair proofs (coq 8.13) #17).
  • The main remaining issue seem to be performance. The main common cause seems to be differences between eager and lazy evaluation. Some cases I could easily fix using +/- ad-hoc ways; but there seem to be a more general issue of the code doing list manipulations that are (I assume?) efficient enough with lazy lists, but terribly inefficient with strict lists. This is e.g. repeatedly appending lists, or, a more surprising instance, calling repeat to create a list with ~200M elements (indread in BlockPtr.v).
  • Possible ways to address the performance issues could be to swap out at extraction time the lists with lazy lists (but I wouldn't be surprised if OCaml lazy lists are less efficient than Haskell lazy lists), or maybe a (lazy) data-structure with good complexity for e.g. append as well? But I would like to understand the code better before going that way blind. It would be helpful to get some intuition about whether one could "just" tweak the code to do list manipulations differently, in a way that would be efficient in OCaml... any clues? :-)

@zeldovich
Copy link
Member

Thanks for the two PRs!

The lack of lazy evaluation sounds familiar -- I think we also ran into issues there and stopped trying to use the Ocaml extraction. Lazy evaluation has its own performance overheads, and we were hoping to avoid some of these issues by switching to Ocaml, but our FSCQ implementation seems to effectively rely on it for reasonable performance in a number of places, so it was easier to instead improve performance in Haskell by adding strictness annotations, instead of getting the Ocaml implementation to run fast.

It's certainly a possibility to change the FSCQ implementation so that it doesn't generate huge data structures in the first place (like the giant repeat in indread, as you discovered), but it would require tweaking the proofs, which is time-consuming. Replacing the list with a more complex data structure at extraction time would avoid the need to tweak proofs, but my intuition is that you won't get great performance that way (but perhaps still better than the degenerate cases you're seeing now), and the complexity of these data structures might undermine the supposed verification guarantees.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants