Kunz, César (2009) Proof preservation and program compilation. PhD thesis Informatique temps réel, robotique et automatique, CMA- Centre de mathématiques appliquées, ENSMP p.165.
Full text available as:
|
|
Abstract
Software applications have gained a notable role in our everyday
activities, mobile code applications being a significant portion of
these software agents. The mobile code paradigm entails the
distribution of applications from the code producer to heterogeneous
client environments in which they are executed. An extended practice
of this paradigm consists in the development of third party
components, that are transferred across an untrusted network such as
Internet and finally integrated in a host execution environment such
as a PC or a cellular phone.
Naturally, this computational environment opens the door to the
deployment of malicious code in client workstations.
In some cases a potential misbehavior of the mobile code does not
constitute a serious risk, for example when the integrity of the data
affected by the execution is irrelevant, or when the execution
architecture imposes strong constraints on the computational
capabilities of the foreign application.
There still are, however, situations in which it is essential to
verify the functional correctness of the code before executing it, for
instance when
the confidentiality of critical data such as credit card
information could be compromised, or when the execution environment
does not have a special mechanism to monitor excessive resource
consumption.
George Necula proposed a technique to provide trust to code
consumers about the program correctness without trusting the code
producer side. The technique, Proof Carrying Code
(PCC), consists in deploying code together
with a formal proof of its correctness.
Correctness is an inherent property of the received code that cannot
be inferred from the identity of the code producer.
That naturally puts PCC in advantage
with respect to methods based on trusting a third party authority.
Indeed, a signature from a trusted authority it is not sufficient to
provide absolute trust on the execution of the received code.
From its origins, the typical mechanism of PCC to generate
certificates relies on a certifying compiler, an extension of a
standard compiler that automatically produces certificates for decidable
safety policies.
Extending the set of enforceable properties is challenging
and in the most general case it requires human interaction.
One possibility, is to verify the final executable code
from scratch. However, the lack of structure
makes low level code verification a less natural, and hence more
daunting, task. This, together with the fact that
verification environments target mostly high level code, makes more
appropriate the idea of moving the generation of certificates at the
source code level. The main drawback of producing certificates
ensuring source code correctness is that they do not entail
correctness of the final compiled code.
Several techniques may be proposed to transfer evidence of program
correctness from the source code to the executable counterpart. That
includes, for instance, deploying the source program and original
certificates in addition to the executable code and certifying the
correctness of the compilation process. However, this approach is not
satisfactory, since in addition to require availability of the source
code, a certificate ensuring compiler correctness can be prohibitively
large.
A more viable alternative is to provide a mechanism to generate
certificates for compiled code from certificates
of the original source program. Compilers are complex procedures
composed of several steps, in which the original program is
progressively transformed into intermediate lower level
representations. Barthe et al. and
Pavlova
have shown that original certificates are preserved, up to minor
differences, along with the first compiler phase: a nonoptimizing
compilation from source level to an unstructured intermediate
representation. However, compiler optimizations applied
to the intermediate representations represent a challenge since
a priori certificates cannot be reused. In this thesis, we analyze
how optimizations affect the validity of certificates and propose a
mechanism, Certificate Translation, to transform certificates in order
to overcome the effects of program transformations, rendering feasible
the generation of certificates for executable mobile code at the
source level.
Chapter 2 introduces certificate translation for
common compiler optimizations applied in the context of an
intermediate RTL language. The main difficulties are presented, and a
classification of program transformations is given in terms of the
effort required to transform the original certificates. A general
scheme is provided to cover transformations that performs arithmetic
simplifications such as constant propagation and common subexpression
elimination. In addition, ad-hoc techniques are proposed for other
standard optimizations that do not correspond to this classification.
Chapter 3 studies the existence of
certificate translators in a mild extension of an abstract
interpretation framework that includes a notion of certificates.
This abstract setting provides considerable advantages with respect to
the approach of Chapter 2 since it allows us to
extend certificate translation to diverse
programming languages and several analysis environments. Certificate
transformation is studied in this abstract setting for simple program
transformations which can be composed to represent a wide variety of
program optimizations, for instance those considered in
Chapter 2.
In a second part of the thesis, we extend certificate translation to
less typical settings, that can be of interest in PCC
scenarios. More precisely, the chapters in the second part consider
the aspect oriented paradigm, hybrid verification methods and parallel
programs executing in memory hierarchies.
For each setting, appropriate analysis and verification settings are
provided. Then, the effect of common transformations over verification
results are studied.
| Item Type: | PhD Thesis (PhD) |
|---|---|
| PhD Supervisor: | Barthe, Gilles |
| Date: | 03 February 2009 |
| Board of examiners: | Jensen, Thomas and Hofmann, Martin and Leroy, Xavier and Müller, Peter and Barthe, Gilles and Grégoire, Benjamin and Peña, Ricardo |
| Ecole Doctorale: | ED 084 SCIENCES ET TECHNOLOGIES DE L'INFORMATION ET DE LA COMMUNICATION |
| Discipline: | Informatique temps réel, robotique et automatique |
| Collection (Fonds): | Mines ParisTech (ENSMP) |
| Institution: | ENSMP |
| Department: | CMA- Centre de mathématiques appliquées |
| Subjects: | 1. Mathematics and Applications |
| Uncontrolled Keywords: | Mobile Code, Program Transformation, Program Verification, Static analysis, Abstract interprétation, Compiler, Computer security, Code mobile, Transformation de programme, Vérification de programme, Analyse statique, Interprétation abstraite, Compilateur, Sécurité informatique |
| ID Code: | 4940 |
| Deposited By: | César Kunz |
| Deposited On: | 17 April 2009 |
References
1. E. Albert, G. Puebla, and M. V. Hermenegildo. Abstraction-carrying code. In
F. and A. Voronkov, editors, Logic for Programming Artificial Intel ligence and
Reasoning, number 3452 in Lecture Notes in Computer Science, pages 380–397.
Springer-Verlag, 2005.
2. Alpern, B., Carter, L., and Ferrante, J. Modeling parallel computers as memory
hierarchies. In Proc. Programming Models for Massively Paral lel Computers,
1993.
3. F. Y. Bannwart and P. M¨
uller. A program logic for bytecode. Electronic Notes
in Theoretical Computer Science, 141:255–273, 2005.
4. M. Barnett, K. R. M. Leino, and W. Schulte. The Spec# programming system:
An overview. In G. Barthe, L. Burdy, M. Huisman, J.-L. Lanet, and T. Muntean,
editors, Construction and Analysis of Safe, Secure and Interoperable Smart De-
vices: Proceedings of the International Workshop CASSIS 2004, volume 3362 of
Lecture Notes in Computer Science, pages 151–171. Springer-Verlag, 2005.
5. C. W. Barrett, Y. Fang, B. Goldberg, Y. Hu, A. Pnueli, and L. D. Zuck. Tvoc:
A translation validator for optimizing compilers. In K. Etessami and S. K.
Ra jamani, editors, CAV, number 3576 in Lecture Notes in Computer Science,
pages 291–295. Springer-Verlag, 2005.
6. G. Barthe, B. Gr´egoire, C. Kunz, and T. Rezk. Certificate translation for opti-
mizing compilers. In K. Yi, editor, Static Analysis Symposium, number 4134 in
Lecture Notes in Computer Science, pages 301–317. Springer-Verlag, 2006.
7. G. Barthe, B. Gr´egoire, and M. Pavlova. Preservation of proof obligations for
Java. In International Joint Conference on Automated Reasoning, Lecture Notes
in Computer Science. Springer-Verlag, 2008. To appear.
8. G. Barthe and C. Kunz. Certificate translation in abstract interpretation. In
European Symposium on Programming, number 4960 in Lecture Notes in Com-
puter Science, pages 368–382. Springer-Verlag, 2008.
9. G. Barthe, D. Naumann, and T. Rezk. Deriving an information flow checker
and certifying compiler for Java. In Symposium on Security and Privacy. IEEE
Press, 2006.
10. G. Barthe, T.Rezk, and A. Saabas. Proof obligations preserving compilation.
In T. Dimitrakos, F. Martinelli, P. Ryan, and S. Schneider, editors, Workshop
on Formal Aspects in Security and Trust, number 3866 in Lecture Notes in
Computer Science, pages 112–126. Springer-Verlag, 2005.
11. N. Benton. Simple relational correctness proofs for static analyses and program
transformations. In N. D. Jones and X. Leroy, editors, Principles of Program-
ming Languages, pages 14–25. ACM Press, 2004.
12. Y. Bertot, B. Gr´egoire, and X. Leroy. A structured approach to proving compiler
optimizations based on dataflow analysis. In JC. Filliˆ
atre, C. Paulin-Mohring,
and B. Werner, editors, TYPES, volume 3839 of Lecture Notes in Computer
Science, pages 66–81. Springer, 2004.
13. F. Besson, T. Jensen, and D. Pichardie. Proof-Carrying Code from Certified Ab-
stract Interpretation and Fixpoint Compression. Theoretical Computer Science,
364(3):273–291, 2006. Extended version.
14. F. Besson, T. Jensen, D. Pichardie, and T. Turpin. Result certification for
relational program analysis. Research Report 6333, IRISA, September 2007.
15. Fr´ed´eric Besson, Thomas P. Jensen, and Tiphaine Turpin. Small witnesses for
abstract interpretation-based proofs. In Programming Languages and Systems:
Proceedings of the 16th European Symposium on Programming, ESOP 2007,
number 4421 in Lecture Notes in Computer Science, pages 268–283. Springer-
Verlag, 2007.
16. S. Blazy, Z. Dargaye, and X. Leroy. Formal verification of a c compiler front-end.
In J. Misra, T. Nipkow, and E. Sekerinski, editors, FM, volume 4085 of Lecture
Notes in Computer Science, pages 460–475. Springer, 2006.
17. L. Burdy and M. Pavlova. Java bytecode specification and verification. In
Symposium on Applied Computing, pages 1835–1839. ACM Press, 2006.
18. A. Chaieb. Proof-producing program analysis. In K. Barkaoui, A. Cavalcanti,
and A. Cerone, editors, ICTAC, volume 4281 of Lecture Notes in Computer
Science, pages 287–301. Springer-Verlag, 2006.
19. P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for
static analysis of programs by construction or approximation of fixpoints. In
Principles of Programming Languages, pages 238–252, 1977.
20. P. Cousot and R. Cousot. Systematic design of program analysis frameworks.
In Principles of Programming Languages, pages 269–282, 1979.
21. W. J. Dally, F. Labonte, A. Das, P. Hanrahan, J. Ho Ahn, J. Gummara ju,
M. Erez, N. Jayasena, I. Buck, T. J. Knight, and U. J. Kapasi. Merrimac:
Supercomputing with streams. In International Conference on Supercomputing,
page 35. ACM, 2003.
22. D. S. Dantas and D. Walker. Harmless advice. In J. Gregory Morrisett and
Simon L. Peyton Jones, editors, Principles of Programming Languages, pages
383–396. ACM Press, 2006.
23. K. Fatahalian, D. R. Horn, T. J. Knight, L. Leem, M. Houston, J. Y. Park,
M. Erez, M. Ren, A. Aiken, W. J. Dally, and P. Hanrahan. Sequoia: program-
ming the memory hierarchy. In Conference on Supercomputing, page 83. ACM
Press, 2006.
24. J. D. Guttman and M. Wand. Special issue on VLISP. Lisp and Symbolic
Computation, 8(1/2), March 1995.
25. Nicolas Halbwachs and Mathias P´eron. Discovering properties about arrays in
simple programs. In R. Gupta and S. P. Amarasinghe, editors, PLDI, pages
339–348. ACM, 2008.
26. M. V. Hermenegildo and F. Rossi. Strict and nonstrict independent and-
parallelism in logic programs: Correctness, efficiency, and compile-time condi-
tions. J. Log. Program., 22(1):1–45, 1995.
27. M. Houston, J. Young Park, M. Ren, T. Knight, K. Fatahalian, A. Aiken, W. J.
Dally, and P. Hanrahan. A portable runtime interface for multi-level memory
hierarchies. In M. L. Scott, editor, PPOPP. ACM, 2008.
28. U. J. Kapasi, S. Rixner, W. J. Dally, B. Khailany, J. Ho Ahn, P. R. Mattson, and
J. D. Owens. Programmable stream processors. IEEE Computer, 36(8):54–62,
2003.
29. T. J. Knight, J. Young Park, M. Ren, M. Houston, M. Erez, K. Fatahalian,
A. Aiken, W. J. Dally, and P. Hanrahan. Compilation for explicitly man-
aged memory hierarchies. In K. A. Yelick and J. M. Mellor-Crummey, editors,
PPOPP, pages 226–236. ACM, 2007.
30. Shriram Krishnamurthi, Kathi Fisler, and Michael Greenberg. Verifying as-
pect advice modularly. In Richard N. Taylor and Matthew B. Dwyer, editors,
SIGSOFT FSE, pages 137–146. ACM, 2004.
31. P. Laud, T. Uustalu, and V. Vene. Type systems equivalent to data-flow analyses
for imperative languages. Theoretical Computer Science, 364(3):292–310, 2006.
32. K. Rustan M. Leino. Specifying and verifying programs in spec#. In I. Virbit-
skaite and A. Voronkov, editors, Ershov Memorial Conference, volume 4378 of
Lecture Notes in Computer Science, page 20. Springer, 2006.
33. K. Rustan M. Leino and W. Schulte. Exception safety for c#. In SEFM, pages
218–227. IEEE Computer Society, 2004.
34. Sorin Lerner, Todd Millstein, Erika Rice, and Craig Chambers. Automated
soundness proofs for dataflow analyses and transformations via local rules. In
POPL ’05: Proceedings of the 32nd ACM SIGPLAN-SIGACT symposium on
Principles of programming languages, pages 364–377, New York, NY, USA, 2005.
ACM.
35. X. Leroy. Coinductive big-step operational semantics. In Programming Lan-
guages and Systems: Proceedings of the 15th European Symposium on Program-
ming, ESOP 2006, volume 3924 of Lecture Notes in Computer Science, pages
54–68. Springer-Verlag, 2006.
36. X. Leroy. Formal certification of a compiler back-end or: programming a com-
piler with a proof assistant. In J. G. Morrisett and S. L. Peyton Jones, editors,
Principles of Programming Languages, pages 42–54. ACM Press, 2006.
37. Francesco Logozzo and Manuel F¨
ahndrich. On the relative completeness of
bytecode analysis versus source code analysis. In Laurie Hendren, editor, CC,
volume 4959 of Lecture Notes in Computer Science, pages 197–212. Springer,
2008.
38. A. Min´e. Weakly Relational Numerical Abstract Domains. PhD thesis, ´
Ecole
Polytechnique, Palaiseau, France, December 2004. http://www.di.ens.fr/
~mine/these/these- color.pdf.
39. P. M¨
uller and M. Nordio. Proof-transforming compilation of programs with
abrupt termination. In SAVCBS ’07: Proceedings of the 2007 conference on
Specification and verification of component-based systems, pages 39–46, New
York, NY, USA, 2007. ACM.
40. G. C. Necula. Proof-carrying code. In Principles of Programming Languages,
pages 106–119, New York, NY, USA, 1997. ACM Press.
41. G. C. Necula and P. Lee. Safe kernel extensions without run-time checking.
In Operating Systems Design and Implementation, pages 229–243, Seattle, WA,
October 1996. USENIX Assoc.
42. G. C. Necula and P. Lee. The design and implementation of a certifying com-
piler. In Programming Languages Design and Implementation, volume 33, pages
333–344, New York, NY, USA, 1998. ACM Press.
43. G.C. Necula. Compiling with Proofs. PhD thesis, Carnegie Mellon University,
October 1998. Available as Technical Report CMU-CS-98-154.
44. George C. Necula. Translation validation for an optimizing compiler. ACM
SIGPLAN Notices, 35(5):83–94, 2000.
45. M. Nordio, P. M¨
uller, and B. Meyer. Formalizing proof-transforming compila-
tion of eiffel programs. Technical Report 587, ETH Zurich, 2008.
46. M. Nordio, P. M¨
uller, and B. Meyer. Proof-transforming compilation of eiffel
programs. In R. Paige, editor, TOOLS-EUROPE, Lecture Notes in Business
and Information Processing. Springer-Verlag, 2008.
47. S. Owicki and D. Gries. An axiomatic proof technique for parallel programs.
Acta Informatica Journal, 6:319–340, 1975.
48. M. Pavlova. Java bytecode verification and its applications. Th´ese de doctorat,
sp´ecialit´e informatique, Universit´e Nice Sophia Antipolis, France, January 2007.
49. A. Pnueli, E. Singerman, and M. Siegel. Translation validation. In B. Steffen,
editor, Tools and Algorithms for the Construction and Analysis of Systems,
volume 1384 of Lecture Notes in Computer Science, pages 151–166. Springer-
Verlag, 1998.
50. X. Rival. Symbolic Transfer Functions-based Approaches to Certified Compila-
tion. In Principles of Programming Languages, pages 1–13. ACM Press, 2004.
51. A. Saabas and T. Uustalu. A compositional natural semantics and Hoare logic
for low-level languages. Theoretical Computer Science, 373(3):273–302, 2007.
52. A. Saabas and T. Uustalu. Type systems for optimizing stack-based code. In
M. Huisman and F. Spoto, editors, Bytecode Semantics, Verification, Analysis
and Transformation, volume 190(1) of Electronic Notes in Theoretical Computer
Science, pages 103–119. Elsevier, 2007.
53. A. Saabas and T. Uustalu. Program and proof optimizations with type systems.
Journal of Logic and Algebraic Programming, 77(1–2):131–154, 2008.
54. S. Seo, H. Yang, and K. Yi. Automatic Construction of Hoare Proofs from
Abstract Interpretation Results. In A. Ohori, editor, Asian Programming Lan-
guages and Systems Symposium, volume 2895 of Lecture Notes in Computer
Science, pages 230–245. Springer-Verlag, 2003.
55. Z. Shao, V. Trifonov, B. Saha, and N. Papaspyrou. A type system for certified
binaries. ACM Trans. Program. Lang. Syst., 27(1):1–45, 2005.
56. M. Strecker. Formal Verification of a Java Compiler in Isabelle. In A. Voronkov,
editor, Conference on Automated Deduction, volume 2392 of Lecture Notes in
Computer Science, pages 63–77, London, UK, 2002. Springer-Verlag.
57. D. Tarditi, J. Gregory Morrisett, Perry Cheng, Chris Stone, Robert Harper, and
Peter Lee. TIL: A type-directed optimizing compiler for ML. In Programming
Languages Design and Implementation, pages 181–192. Association of Comput-
ing Machinery, 1996.
58. J. Tristan and X. Leroy. Formal verification of translation validators: a case
study on instruction scheduling optimizations. SIGPLAN Not., 43(1):17–27,
2008.
59. M. Wildmoser, A. Chaieb, and T. Nipkow. Bytecode analysis for proof carry-
ing code. In F. Spoto, editor, Bytecode Semantics, Verification, Analysis and
Transformation, volume 141 of Electronic Notes in Theoretical Computer Sci-
ence. Elsevier, 2005.
60. M. Wildmoser, T. Nipkow, G. Klein, and S. Nanz. Prototyping proof carrying
code. In J-J. Levy, E. W. Mayr, and J. C. Mitchell, editors, Theoretical Computer
Science, pages 333–347. Kluwer Academic Publishing, August 2004.
61. H. Xi and S. Xia. Towards array bound check elimination in java tm virtual
machine language. In CASCON ’99: Proceedings of the 1999 conference of the
Centre for Advanced Studies on Col laborative research, page 14. IBM Press,
1999.
62. L. D. Zuck, A. Pnueli, Y. Fang, and B. Goldberg. Voc: A translation validator
for optimizing compilers. Electronic Notes in Theoretical Computer Science,
65(2), 2002.
Table of content
Part I Foundations of Certificate Translation
1 Certificate Translation alongside Standard RTL
Optimizations - 3
1.0.1 Contents - 6
1.1 PCC Setting - 7
1.1.1 RTL Language - 7
1.1.2 Operational Semantics - 9
1.1.3 Verification Condition Generator - 9
1.1.4 Certified Programs - 11
1.1.5 Soundness of PCC Infrastructure - 12
1.2 Preservation of Proof Obligations - 14
1.2.1 Source Language - 14
1.2.2 Compilation - 16
1.2.3 Preservation of Proof Obligations - 17
1.3 Certificate Translation for Common Optimizations - 18
1.3.1 Overview - 18
1.3.2 Constant Propagation - 22
Description - 22
Certifying analyzer - 22
Certificate translation - 23
1.3.3 Loop Induction Variable Strength Reduction - 25
Description - 25
Certifying analyzer - 26
Certificate translation - 27
1.3.4 Common Subexpression Elimination - 29
Description - 29
1.3.5 Copy Propagation - 30
Description - 30
Certificate translation - 31
1.3.6 Dead Register Elimination - 32
Description - 32
Certificate translation - 34
1.3.7 Unreachable Code Elimination - 36
Description - 36
Certificate translation - 37
1.3.8 Register Allocation - 37
Description - 37
Certificate translation - 38
1.3.9 Function Inlining - 39
Description - 39
Certificate translation - 40
1.3.10 Summary - 41
2 An Abstract Interpretation Model of Certificate Translation 43
2.1 Introduction - 43
2.2 Program, Semantics, Abstract Interpretation - 45
2.2.1 Program, Semantics - 45
2.2.2 Abstract Interpretations of Program Semantics - 46
Solutions - 49
2.2.3 Certified Solutions - 50
2.3 Certifying Analyzers - 53
2.4 Certificate Translation - 56
2.4.1 Code Duplication - 56
2.4.2 Edge Transformation - 58
2.4.3 Program Representation Abstraction - 63
2.4.4 Second-Order Analysis-Based Optimizations - 68
2.4.5 Concurrency - 71
2.4.6 Example - 77
Part II Case Studies
3 Specification Preserving Advice Weaving - 87
3.1 Introduction - 87
3.2 A basic motivating example - 87
3.3 A simple AOP language - 89
3.3.1 Syntax - 89
3.3.2 Semantics - 89
3.4 Verification of baseline code - 91
3.5 Verifying AOP programs - 92
3.5.1 Specification-preserving advices - 92
3.5.2 Example - 93
3.5.3 Harmless advices - 94
3.5.4 Beyond harmless advices - 94
3.6 Compiling advices - 95
3.6.1 Target language - 95
3.6.2 Compiler - 96
3.7 Certificate translation - 97
3.7.1 Verification of SBL programs - 98
3.7.2 Preservation of validity - 99
3.8 Increasing the Power of Verification - 100
4 Preservation of Proof Obligations in Hybrid
Verification Environments - 105
4.1 Setting - 106
4.2 Preservation of solutions - 108
4.3 Preservation of proof obligations - 115
4.4 From hybrid VCgen to VCgen - 120
5 Certified Analysis in Hierarchical Memory Models - 123
5.1 A primer on Sequoia - 123
5.2 Analyzing and reasoning about Sequoia programs - 127
5.2.1 Program Analysis - 127
5.2.2 Program Safety - 129
5.2.3 Program Verification - 132
5.2.4 Example Program - 135
5.3 Certificate translation - 135
5.4 General framework - 137
5.4.1 Certifying analyzers - 137
5.4.2 Certificate translation - 139
5.4.3 Copy propagation for arrays - 140
5.5 Sequoia Specific Optimizations - 141
5.5.1 SPMD Distribution - 141
5.5.2 Exec Grouping - 143
5.5.3 Copy Grouping - 145
Part III Related Work and Conclusion
References - 161
Repository Staff Only: edit this item