CS350: Secure Compilation

 

Instructor: Marco Patrignani (follow link for website and MAIL)

Below you will find the following information for the course:

-- general course information,

-- prerequisites and interest,

-- location and timeslots,

-- evaluation and grading,

-- high-level syllabus outline,

-- class and lecture outline.

Some entries are still TBD; please contact the instructor if you have questions.



Piazza and CANVAS links:

Canvas: https://canvas.stanford.edu/courses/117207

Piazza: https://piazza.com/class/k8eqw4gsmfh2ug



General course information:

This course will explore the nascent field of secure compilation, which sits at the intersection of security and programming languages. The perspective from which we teach this field is that of formal methods and foundations of computer security.


The goal of a secure compiler is to compile programs so that source-level security properties, such as data confidentiality and code integrity, are preserved in the compiled code. This is challenging because attackers operating at the level of the compiler output are inherently more powerful than attackers in the source language. For example, target attackers can mount buffer overflows that bypass the security abstractions of compiled code, e.g., those that derive from source-level type annotations.
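The gap between source and target attackers can be illustrated with a minimal sketch. It is not taken from the course materials: `Program`, `run_source`, `compile_program` and `run_target` are hypothetical names, the "target" is just a flat Python list standing in for untyped memory, and the attack is a direct memory read rather than a real buffer overflow. Two source programs that differ only in a private value look the same to a source context, but a naive compiler makes them distinguishable in the target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    """Toy source program (hypothetical): holds one private value."""
    secret: int  # source contexts have no way to name this field


def run_source(program, context):
    """Source semantics: the context only receives the public operation get()."""
    def get():
        return 0  # the public result never depends on the secret
    return context(get)


def compile_program(program):
    """Naive compiler: cell 0 is the public result, cell 1 the unprotected secret."""
    return [0, program.secret]


def run_target(memory, context):
    """Target semantics: the context receives the whole memory and may read any cell."""
    return context(memory)


p1, p2 = Program(secret=1), Program(secret=2)

# A sample source context. Any source context only sees get(), which always
# returns 0, so no source context can tell p1 and p2 apart in this model.
source_ctx = lambda get: get()

# A target context that reads the cell holding the secret.
target_ctx = lambda memory: memory[1]

same_in_source = run_source(p1, source_ctx) == run_source(p2, source_ctx)
same_in_target = (run_target(compile_program(p1), target_ctx)
                  == run_target(compile_program(p2), target_ctx))

print(same_in_source, same_in_target)  # True False
```

In this toy model the compiler is correct (the public result is preserved) but not secure: confidentiality of the private value holds in the source and is lost in the target.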


This course will describe:

- the threat model that secure compilers consider,

- correctness criteria for secure compilation,

- specific instances of secure compilers,

- proof techniques for secure compilation.


Correctness criteria define what it means for a compiler to be secure. The course will explain why we can be sure that the presented criteria have a security meaning (i.e., which kinds of attacks they defend against and which they do not).

Secondly, the course will discuss specific instances of secure compilers and how they achieve security (i.e., which mechanisms, including types, crypto, security architectures, etc., these compilers exploit to realise one of the presented criteria, and what their tradeoffs are).

Finally, the course will cover proof techniques for secure compilation, i.e., how to formally prove that a compiler adheres to one of the presented criteria and is therefore secure.



Prerequisites and interests:

Students should have a basic understanding of programming and of how a compiler works (CS143, CS243). A background in formal languages and semantics is a plus (CS358, CS242).


Given the formal view on the subject, this course should appeal to students with an interest in security, formal methods, and foundations of computer science.



Location and Time Slots:

Location:

    on zoom, link on piazza

Time slot:

    T-TH 10:30 - 11:50



Evaluation and Grading:

Evaluation will be based on:

  - 6 optional assignments:

       [ ass1.pdf ] [ ass2.pdf ] [ ass3.pdf ] [ ass4.pdf ] [ ass6.pdf ] [ ass7.pdf ]

  - 1 non-optional student presentation

       [ ass5.pdf ] of a paper chosen from this list: [ paper-list-20.pdf ]

  - 1 final exam consisting of a written text + oral discussion (which we’ll run on zoom)

       [ exam.pdf ]


Depending on the student’s interest, part of the evaluation can also be based on a project (research- or educational-focussed) related to the topics discussed in class. This can be discussed with the instructor and will be adjudicated based on student motivation and interest.


Please note that, depending on the number of attendees, some modifications to the evaluation criteria may be necessary. All changes will be communicated to the students at the beginning of the class, when the number of attendees is known.



High-level Syllabus Outline:

Below is a more precise outline of what we will cover in class, grouped by notions. Please note that there is no direct mapping between notions and lectures, so covering one notion may take 1 lecture while covering another may take 5. Also, some topics here may vary, though only slightly.


Mostly we will follow my preliminary lecture notes:

  - 350-bt.pdf  main notes

   The original .tex files for this document are here [ bt.tex | cmds.tex ]

  - 350-asm-lang.pdf  assembly language formalisation.

And the following recommended reading (no deadline, do during the first 2 weeks):

    Formal Approaches to Secure Compilation. [ .pdf ]

    Secure Compilation and Hyperproperty Preservation. [ .pdf ]

    Journey Beyond Full Abstraction. [ .pdf ]

    Robustly Safe Compilation. [ .pdf ]

    Fully-Abstract Compilation by Approximate Back-Translation. [ .pdf ]

All these papers are available on my homepage -> publications.


  1. Notions 1: motivation behind secure compilation

  2. Notions 2: program equivalences for security; expressing security properties with equivalences, classic examples. Traces for security: expressing security properties and hyperproperties (HP) with traces, HP diagram (safety, liveness, hypersafety). Examples: confidentiality, integrity, noninterference, specific properties

  3. Notions 3: full abstraction (FAC) theory: definition, justification for security, attacks and threat model, violations of FAC, common mistakes. Proving FAC: approaches, what is a backtranslation, context and traces backtranslation, when to use what

  4. Notions 4: trace preserving compilation: definition, justification, attacks, analogy with FAC, threat model

  5. Notions 5: Proving FAC. Recap of PL semantics notions: syntax/semantics, reading rules. Source language, target language: syntax and semantics, examples of equivalent programs. Compiler definition, compiler correctness. Backtranslation definition, backtranslation correctness. Target extension with context inspection, target traces, change backtranslation from context-based to trace-based

  6. Notions 6: Robustness, robust property preservation, property-ful and property-free criteria and their equivalence, different backtranslation classes. Robustly safe compilation (RSP): RSP in detail.

  7. Notions 7: Proving RSP starting from what we saw in Notions 5, comparison with FAC proof

  8. Notions 8: Advanced Robust Criteria: RHSP in detail, example of preserving NI, RSCHP, RHP. Relational (hyper)properties and their reason to exist.
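
For orientation, two of the criteria named in the outline above are usually stated roughly as follows in the secure compilation literature. The notation is an assumption made here and may differ from the lecture notes: \llbracket P \rrbracket is the compiled program, \simeq_{ctx} is contextual equivalence in the source (S) or target (T) language, C[P] links program P with context C, and \rightsquigarrow m means "can produce the finite trace prefix m".

```latex
% Fully abstract compilation (FAC): the compiler preserves (=>) and
% reflects (<=) contextual equivalence.
\forall P_1, P_2.\quad
  P_1 \simeq_{ctx}^{S} P_2
  \iff
  \llbracket P_1 \rrbracket \simeq_{ctx}^{T} \llbracket P_2 \rrbracket

% Robust safety preservation in its property-free form (often written RSC):
% every finite trace prefix that a target context can trigger on the
% compiled program can also be triggered by some source context.
\forall P, C_T, m.\quad
  C_T[\llbracket P \rrbracket] \rightsquigarrow m
  \implies
  \exists C_S.\; C_S[P] \rightsquigarrow m
```

In the first statement the left-to-right direction is the security-relevant one, since it says that target contexts gain no distinguishing power. The existential C_S in the second statement is what a backtranslation has to construct.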



Class (and Lectures) Outline:


w1: apr 7 9

lecture 1:

introduction & details

recap PL: source lang with types, target language without types, compiler

Suggested reading:

https://xavierleroy.org/publi/compcert-CACM.pdf

https://xavierleroy.org/courses/EUTypes-2019/slides.pdf


lecture 2:

compiler correctness (CC)

       partial programs, contexts, compiler for partials, compositional CC

security props

limitations of CC

Suggested reading:

https://www.ccs.neu.edu/home/amal/papers/next700ccc.pdf


w2: apr 14 16

lecture 3:

contextual equivalence to express integrity and confidentiality

contextual equivalence formally

       formalising assembly languages


lecture 4:

fully abstract compilation (fac): general statement, motivation, preservation

fac proofs: deriving reflection from CC

fac reflection for our compiler


w3: apr 21 23

lecture 5:

fac proofs: proving preservation

fac preservation: need a context backtranslation (ctx bt)

fix the ctx bt: inject/extract

fac preservation via bt correctness

Suggested reading:

https://pdfs.semanticscholar.org/aa2e/a7d0f1dd1301e2fb7479acfb2e1f8e53cbc4.pdf


lecture 6:

add target reflection and try the BT

define trivial target traces, fac with traces, derive what is needed of the traces

fac preservation via traces, fa traces


w4: apr 28 30

lecture 7:

impure languages in source and target, ceq for stateful languages, correct compilation

formalising protection mechanisms: capabilities

fac for stateful languages and new encodable security properties

complexity of deriving a fac proof, high-level scheme


lecture 8:

traces as a generic security mechanism,

       properties and hyperproperties and relational properties

adding interaction traces to our languages as well as I/O,

robust property preservation as a criterion for secure compilation


w5: may 5 7

lecture 9:

different criteria that preserve different classes

equivalent characterisations and why they are equivalent


lecture 10:

safety preservation

liveness preservation

rsc definition, implications


w6: may 12 14

lecture 11:

rsc via correctness of trace-based BT and via cross-language relation

rfrsc via an extension of the trace-based BT

different forms of BT for our languages: can we make a context-based BT?

proving rtp, rhp, rrhp with our context-based BT


lecture 12:

stlc and ulc, erasing compiler

compiler correctness and cross-language logical relation

fa compiler, intuition, uval, emuldv


w7: may 19 21

lecture 13:

approximate bt, pseudotype LR

technicalities of the approx BT: in/case, in/dn, emulate, inject/extract

fac reflection and preservation via approx BT


lecture 14:

presentations


w8: may 26 28

lecture 15:

presentations


lecture 16:

presentations


w9: jun 2 4

oral exams:

discuss your exam, some assignments and some course questions