The Stanford Software Seminar is usually held on Mondays 3-4 pm in Gates 104. To subscribe to the seminar mailing list, send email to software-research-join@lists.stanford.edu from the email address you wish to subscribe. Likewise, to unsubscribe, send an email to software-research-leave@lists.stanford.edu from the subscribed email address. In either case, the subject and body of your email will be ignored.
| Upcoming Talks |
None.
| Previous Talks |
| Date | 3:00-4:00 pm, Monday, May 14, 2007 |
| Place | Gates 104 |
| Speaker | Manuvir Das, Microsoft |
| Title | Formal Specifications on Industrial-Strength Code - From Myth to Reality |
| Abstract |
The research community has long understood the value of formal specifications in building robust software. However, the adoption of any specifications beyond run-time assertions in industrial software has been limited. All of this has changed at Microsoft in the last few years. Today, formal specifications are a mandated part of the software development process in the largest Microsoft product groups. Millions of specifications have been added, and tens of thousands of bugs have been exposed and fixed in future versions of products under development. In addition, Windows public interfaces are formally specified and the Visual Studio compiler understands and enforces these specifications, meaning that programmers anywhere can now use formal specifications to make their software more robust. The goal of this talk is to share the technical, social and practical story of how we were able to move organizations with thousands of programmers to an environment where the use of formal specifications is routine. |
| Speaker Bio | Manuvir Das leads the Program Analysis research group in the Center for Software Excellence at Microsoft Corporation, and is an affiliate faculty member at the University of Washington. His research interests are in inventing and applying techniques from Programming Languages, Compilers, and Systems to the software engineering process. Manuvir holds a bachelor's degree in Computer Science from IIT Bombay, and a PhD in Computer Science from the University of Wisconsin-Madison. |
| Date | 4:00-5:00 pm, Thursday, April 26, 2007 |
| Place | Gates 463a (Theory Lounge) |
| Speaker | Philip Wadler, University of Edinburgh |
| Title | Faith, Evolution, and Programming Languages |
| Abstract | Faith and evolution provide complementary--and sometimes conflicting--models of the world, and they also can model the adoption of programming languages. Adherents of competing paradigms, such as functional and object-oriented programming, often appear motivated by faith. Families of related languages, such as C, C++, Java, and C#, may arise from pressures of evolution. As designers of languages, adoption rates provide us with scientific data, but the belief that elegant designs are better is a matter of faith. This talk traces one concept, second-order quantification, from its inception in the symbolic logic of Frege through to the generic features introduced in Java 5, touching on features of faith and evolution. The remarkable correspondence between natural deduction and functional programming informed the design of type classes in Haskell. Generics in Java evolved directly from Haskell type classes, and are designed to support evolution from legacy code to generic code. Links, a successor to Haskell aimed at AJAX-style three-tier web applications, aims to reconcile some of the conflict between dynamic and static approaches to typing. |
| Speaker Bio | Philip Wadler is Professor of Theoretical Computer Science at the University of Edinburgh. He holds a Royal Society-Wolfson Research Merit Fellowship and is a Fellow of the Royal Society of Edinburgh. Previously, he worked or studied at Avaya Labs, Bell Labs, Glasgow, Chalmers, Oxford, CMU, Xerox Parc, and Stanford, and has visited as a guest professor in Paris, Sydney, and Copenhagen. Prof. Wadler appears at position 70 on Citeseer's list of most-cited authors in computer science; is a winner of the POPL Most Influential Paper Award; and sits on the ACM Sigplan Executive Committee. He contributed to the designs of Haskell, Java, and XQuery, and is a co-author of XQuery from the Experts (Addison Wesley, 2004) and Generics and Collections in Java (O'Reilly, 2006). He has delivered invited talks in locations ranging from Aizu to Zurich. |
| Date | 3:00-4:00 pm, Monday, April 9, 2007 |
| Place | Gates 463A |
| Speaker | Corina Pasareanu, NASA Ames |
| Title | Abstract State Matching and Symbolic Execution for Software Model Checking |
| Abstract |
This talk describes abstraction based model checking techniques that compute under-approximations of the feasible behaviors of the system under analysis. The techniques perform model checking with explicit, or symbolic, execution and abstract state matching. Model checking explores the concrete program transitions while storing abstract versions of the explored states, as specified by an abstraction mapping. State matching determines whether an abstract state is being revisited, in which case the model checker backtracks. Applications include program error detection and test input generation. This is joint work with Willem Visser (SEVEN Networks), Saswat Anand (Georgia Institute of Technology), and Radek Pelanek (Masaryk University). |
| Date | 3:00-4:00 pm, Monday, March 5, 2007 |
| Place | Gates 104 |
| Speaker | Gary T. Leavens, Iowa State University |
| Title | JML: Expressive Modular Reasoning for Java |
| Abstract |
The Java Modeling Language (JML) is used to specify, check, and verify detailed designs for Java classes and interfaces. JML is an open, international, collaborative effort among 20 research groups and projects. This talk briefly describes background on the JML effort, including its tool support for runtime assertion checking, extended static checking, static verification, unit testing, etc. Subtyping and dynamic dispatch in object-oriented languages pose a challenge for modular reasoning, which is the key to practical tools. Moreover, a specification language must be clear and expressive to be used by practicing programmers. Modular reasoning is done by using supertype abstraction -- reasoning based on static type information. Expressiveness comes from a rich set of features for specifying methods. This talk describes how specification inheritance in JML forces behavioral subtyping, through a discussion of semantics and examples. Behavioral subtyping, together with a set of methodological restrictions, makes modular reasoning with supertype abstraction valid. This work was supported in part by NSF grant CCF-0429567. The work on behavioral subtyping in particular is based on joint work with Prof. David A. Naumann of Stevens University. See http://jmlspecs.org for more information about JML. |
| Date | 2:00-3:00 pm, Monday, February 26, 2007 |
| Place | Gates 104 |
| Speaker | David Monniaux, National Center for Scientific Research (France) |
| Title | Verification of device drivers: asynchronous "intelligent" peripherals |
| Abstract |
It is common in current embedded systems to use powerful off-the-shelf technologies such as programmable I/O controllers doing direct-to-memory access, handling worklists etc., for instance for USB busses. This poses the question of verifying properties of the driver, since the driver cannot be analyzed in isolation but must be asynchronously composed with a specification of the controller. We'll see some results of the verification of properties of a simulation of a USB OHCI controller asynchronously composed with an industrial device driver, using the Astree static analyzer. We'll then discuss future works and challenges. |
| Date | 3:00-4:00 pm, Monday, February 5, 2007 |
| Place | Gates 104 |
| Speaker | Michael Hicks, University of Maryland |
| Title | Modular Information Hiding and Type Safe Linking for C |
| Abstract |
The overarching goal of my research is to explore means to develop more flexible, reliable, and secure software systems. In this talk, I will present a brief overview of my work, and in particular focus on CMod, a novel tool that provides a sound module system for C. CMod works by enforcing a set of four rules that are based on principles of modular reasoning and on current programming practice. CMod's rules flesh out the convention that .h header files are module interfaces and .c source files are module implementations. Although this convention is well known, developing CMod's rules revealed there are many subtleties in applying the basic pattern correctly. We have proven formally that CMod's rules enforce both information hiding and type-safe linking. We evaluated CMod on a number of benchmarks, and found that most programs obey CMod's rules, or can be made to with minimal effort, while rule violations reveal brittle coding practices including numerous information hiding violations and occasional type errors. |
| Date | 3:00-4:00 pm, Monday, January 29, 2007 |
| Place | Gates 104 |
| Speaker | Radu Rugina, Cornell University |
| Title | Practical Shape Analysis |
| Abstract |
Shape analyses are aimed at extracting invariants that describe the "shapes" of heap-allocated recursive data structures. Although existing shape analyses have been successful at verifying complex heap manipulations, they have had limited success at being practical for larger programs. In this talk I will present two practical approaches to heap analysis and their application to error detection, program verification, and compiler transformations. First, I will present a reference counting shape analysis where the compiler uses local reasoning about individual heap cells, instead of global reasoning about the entire heap. Second, I will present a heap analysis by contradiction, where the analysis checks the absence of heap errors by disproving their presence. These techniques are both sufficiently precise to accurately analyze a large class of heap manipulation algorithms, and sufficiently lightweight to scale to larger programs. |
| Date | 3:00-4:00 pm, Monday, January 8, 2007 |
| Place | Gates 104 |
| Speaker | Michael Ernst, MIT |
| Title | Refactoring for parameterizing Java classes |
| Abstract |
Type safety and expressiveness of many existing Java libraries and their client applications would improve, if the libraries were upgraded to define generic classes. Efficient and accurate tools exist to assist client applications to use generics libraries, but so far the libraries themselves must be parameterized manually, which is a tedious, time-consuming, and error-prone task. We present a type-constraint-based algorithm for converting non-generic libraries to add type parameters. The algorithm handles the full Java language and preserves backward compatibility, thus making it safe for existing clients. Among other features, it is capable of inferring wildcard types and introducing type parameters for mutually-dependent classes. We have implemented the algorithm as a fully automatic refactoring in Eclipse. We evaluated our work in two ways. First, our tool parameterized code that was lacking type parameters. We contacted the developers of several of these applications, and in all cases where we received a response, they confirmed that the resulting parameterizations were correct and useful. Second, to better quantify its effectiveness, our tool parameterized classes from already-generic libraries, and we compared the results to those that were created by the libraries' authors. Our tool performed the refactoring accurately ? in 87% of cases the results were as good as those created manually by a human expert, in 9% of cases the tool results were better, and in 4% of cases the tool results were worse. |
| Date | 3:00-4:00 pm, Monday, December 4, 2006 |
| Place | Gates 104 |
| Speaker | Koushik Sen, University of California, Berkeley |
| Title | Concolic Testing of Sequential and Concurrent Programs |
| Abstract | Testing with manually generated test cases is the primary technique used in industry to improve reliability of software--in fact, such testing is reported to account for 50%-80% of the typical cost of software development. I will describe Concolic Testing, a systematic and efficient method which combines random and symbolic testing. Concolic testing enables automatic and systematic testing of large programs, avoids redundant test cases and does not generate false warnings. Experiments on real-world software show that concolic testing can be used to effectively catch generic errors such as assertion violations, memory leaks, uncaught exceptions, and segmentation faults. Combined with dynamic partial order reduction techniques and predictive analysis, concolic testing is effective in catching concurrency bugs such as data races and deadlocks as well as specification related bugs. I will describe my experience with building two concolic testing tools, CUTE for C programs and jCUTE for Java programs, and applying these tools to real-world software systems. I will briefly describe some ongoing projects to improve and extend concolic testing. |
| Date | 3:00-4:00 pm, Thursday, November 30, 2006 |
| Place | Gates 104 |
| Speaker | Rupak Majumdar, University of California, Los Angeles |
| Title | What's Next for Software Verification? |
| Abstract | Over the last few years, software verification based on predicate abstraction and counterexample-guided refinement has been a successful technique for performing automatic and precise static analysis of programs. Other than device driver protocols, though, software verifiers have so far mostly been applied to simple properties of systems. For these programs and properties, we show a simple syntactic algorithm to construct approximate program invariants that is surprisingly effective, suggesting perhaps that software verification tools are overkill for these applications. For deeper properties, we describe our recent attempts to verify a memory management subsystem and outline several challenges that must be met before the verification will succeed. We describe some partial progress. In particular, we describe a technique to infer predicates in the presence of data structures such as lists and sets, and a type system that extracts word-level relationships from code containing bitwise operations. |
| Date | 3:00-4:00 pm, Monday, November 6, 2006 |
| Place | Gates 104 |
| Speaker | Ranjit Jhala, University of California, San Diego |
| Title | Permissive Interfaces |
| Abstract |
A modular program analysis considers components independently and provides succinct summaries for each component, which can be used when checking the rest of the system. Consider a system comprising of two components, a library and a client. A temporal summary, or interface, of the library specifies legal sequences of library calls. The interface is safe if no call sequence violates the library's internal invariants; the interface is permissive if it contains every such sequence. Modular program analysis requires full interfaces, which are both safe and permissive: the client does not cause errors in the library if and only if it makes only sequences of library calls that are allowed by the full interface of the library. Previous interface-based methods have focused on safe interfaces, which may be too restrictive and thus reject good clients. We present an algorithm for automatically synthesizing software interfaces that are both safe and permissive. The algorithm generates interfaces as graphs whose vertices are labeled with predicates over the library's internal state, and whose edges are labeled with library calls. The interface state is refined incrementally until the full interface is constructed. In other words, the algorithm automatically synthesizes a typestate system for the library, against which any client can be checked for compatibility. We present an implementation of the algorithm which is based on the BLAST model checker, and we evaluate some case studies. Joint work with Thomas A. Henzinger and Rupak Majumdar. |
| Date | 2:30-3:30 pm, Tuesday, October 31, 2006 |
| Place | Gates 104 |
| Speaker | Guy L. Steele Jr., Sun Microsystems Laboratories |
| Title | Parallel Programming and Code Selection in Fortress |
| Abstract | As part of the DARPA program for High Productivity Computing Systems, the Programming Language Research Group at Sun Microsystems Laboratories is developing Fortress, a language intended to support large-scale scientific computation with the same level of portability that the Java programming language provided for multithreaded commercial applications. One of the design principles of Fortress is that parallelism be encouraged everywhere; for example, it is intentionally just a little bit harder to write a sequential loop than a parallel loop. Another is to have rich mechanisms for encapsulation and abstraction; the idea is to have a fairly complicated language for library writers that enables them to write libraries that present a relatively simple set of interfaces to the application programmer. Thus Fortress is as much a framework for language developers as it is a language for coding scientific applications. We will discuss ideas for using a rich parameterized polymorphic type system to organize multithreading and data distribution on large parallel machines. The net result is similar in some ways to data distribution facilities in other languages such as HPF and Chapel, but more open-ended, because in Fortress the facilities are defined by user-replaceable and -extendable libraries rather than wired into the compiler. A sufficiently rich type system can take the place of certain kinds of flow analysis to guide certain kinds of code selection and optimization, again moving policymaking out of the compiler and into libraries coded in the Fortress source language. |
| Date | 3:00-4:00 pm, Monday, October 16, 2006 |
| Place | Gates 104 |
| Speaker | Zhendong Su, University of California, Davis |
| Title | Scalable and Accurate Tree-based Detection of Code Clones |
| Abstract |
Detecting code clones has many software engineering applications. Existing approaches either do not scale to large code bases or are not robust against code modifications. In this talk, I will present an efficient algorithm for identifying similar subtrees. The algorithm is based on a novel characterization of subtrees with numerical vectors in the Euclidean space R^n and an efficient algorithm to cluster these vectors w.r.t. the Euclidean distance metric. Subtrees with vectors in one cluster are considered similar. We have implemented our tree similarity algorithm as a clone detection tool called Deckard and evaluated it on large code bases written in C and Java including the Linux kernel and JDK. Our experiments show that Deckard is both scalable and accurate. It is also language independent, applicable to any language with a formally specified grammar. Joint work with Lingxiao Jiang, Ghassan Misherghi, and Stephane Glondu. |
| Date | 3:00-4:00 pm, Thursday, June 8, 2006 |
| Place | Gates 104 |
| Speaker | Bowen Alpern, IBM T. J. Watson Research Center |
| Title | Early Ruminations on Anticipating Demand for Software Delivered as a Stream |
| Abstract | This is not, at least not primarily, a talk about the economic viability of a particular model of software distribution. Rather, it reports on some preliminary investigations into whether predictive caching techniques might be able to mitigate the first-time performance penalty incurred when software executes as it is being delivered. The context for this work is the PDS (Progressive Deployment System) project at IBM Research which uses application virtualization to deliver an application and its non operating system dependencies as a stream in order to avoid dependency problems that can occur when multiple applications are installed on the same machine. This is joint work with the PDS group. |
| Date | 2:00-3:00 pm, Thursday, May 18, 2006 |
| Place | Gates 104 |
| Speaker | Danny Dig |
| Title | Automated Detection of Refactorings in Evolving Components |
| Abstract |
One of the costs of reusing software components is upgrading applications to use the new version of the components. Upgrading an application can be error-prone, tedious, and disruptive of the development process. An important kind of change in OO components is a refactoring. Refactorings are program transformations that improve the internal design without changing the observable behavior (e.g., renamings, moving methods between classes, splitting/merging of classes). Our previous study showed that more than 80% of the disruptive changes in five different components were caused by refactorings. If the refactorings that happened between two versions of a component could be automatically detected, a refactoring tool could replay them on applications. I will present an algorithm that detects refactorings performed during component evolution. Our algorithm uses a combination of a fast syntactic analysis to detect refactoring candidates and a more expensive semantic analysis to refine the results. The experiments on components ranging from 17 KLOC to 352 KLOC show that our algorithm detects refactorings in real-world components with accuracy over 85%. Joint work with Can Comertoglu, Darko Marinov and Ralph Johnson (UIUC). |
| Date | 3:00-4:00 pm, Monday, May 15, 2006 |
| Place | Gates 104 |
| Speaker | Stephen Freund, Williams College |
| Title | Practical Hybrid Type Checking |
| Abstract |
Software systems typically contain large API's that are only informally and imprecisely specified and hence easily misused. Practical mechanisms for documenting and verifying precise specifications would significantly improve software reliability. The Sage programming language is designed to provide high-coverage checking of expressive specifications. The Sage type language is a synthesis of (unrestricted) refinement types and Pure Type Systems. Since type checking for this language is not statically decidable, Sage uses hybrid type checking, which extends static type checking with dynamic contract checking, automatic theorem proving, and a database of refuted subtype judgments. In this talk, I present the key ideas behind hybrid type checking, the Sage language, and preliminary experimental results suggesting that hybrid type checking of precise specifications is a promising approach for the development of reliable software. I will also discuss more recent work on extending Sage to include mutable objects. This is joint work with Cormac Flanagan, Jessica Gronski, Kenn Knowles, and Aaron Tomb at University of California, Santa Cruz. |
| Date | 3:00-4:00 pm, Monday, April 24, 2006 |
| Place | Gates 104 |
| Speaker | Kathy Yelick, UC Berkeley and Lawrence Berkeley National Lab |
| Title | Compilation Technology for Computational Science |
| Abstract |
The emergence of multicore processors marks the end of an era in computing: whereas hardware developers were largely responsible for exponential performance gains in the past decade, software developers will now be equally responsible if these gains are to continue. The notoriously difficult problem of writing parallel software will now be commonplace. Much of the experience using parallelism for performance resides in the scientific computing community, and while that group has focused on numerical simulations and very large scale parallelism, surely some lessons can be learned. In this talk, I will describe the class of partitioned global address space languages, which have recently received support from the scientific computing community as a possible alternative to message passing and threads. One of these languages, Titanium, is a modest extension of Java with domain-specific extensions for scientific computing and large-scale parallelism. Titanium has proven to be significantly more expressive than message passing and has been used for significant scientific problems, including a parallel simulation of blood flow in the heart and an elliptic solver based on adaptive mesh refinement. Several interesting computer science challenges have arisen from this work, including the need for program analysis techniques specialized to parallel languages, machine-independent optimization strategies, and runtime support for latency hiding. I will describe some recent work in compilation of parallel languages, including thread-aware pointer analysis, sequential consistency enforcement, and model-driven communication optimizations, and give an overview of some open questions in the field. |
| Date | 3:00-4:00 pm, Monday, April 17, 2006 |
| Place | Gates 104 |
| Speaker | Brad Chamberlain, Cray Inc. |
| Title | Chapel: Cray Cascade's High Productivity Language |
| Abstract |
In 2002, DARPA launched the High Productivity Computing Systems (HPCS) program, with the goal of improving user productivity on High-End Computing systems for the year 2010. As part of Cray's research efforts in this program, we have been developing a new parallel language named Chapel, designed to: 1. support a global view of parallel programming with the ability to tune for locality, 2. support general parallel programming including data- and task-parallel codes, as well as nested parallelism, and 3. help narrow the gulf between mainstream and parallel languages. In this talk I will introduce the motivations and foundations for Chapel, describe several core language concepts, and show some sample computations written in Chapel. |
| Date | 2:15-3:15 pm, Friday, March 10, 2006 |
| Place | Gates B8 |
| Speaker | Stephen Fink, IBM T. J. Watson Research Center |
| Title | Effective Typestate Verification in the Presence of Aliasing |
| Abstract |
We describe a novel framework for verification of typestate properties, including several new techniques to precisely treat aliases without undue performance costs. In particular, we present a flow-sensitive, context-sensitive, integrated verifier that utilizes a parametric abstract domain combining typestate and aliasing information. To scale to real programs without compromising precision, we present a staged verification system in which faster verifiers run as early stages which reduce the workload for later, more precise, stages. We have evaluated our framework on a number of real Java programs, checking correct API usage for various Java standard libraries. The results show that our approach scales to hundreds of thousands of lines of code, and verifies correctness for over 95% of the potential points of failure. |
| Date | 4:00-5:15 pm, Thursday, February 23, 2006 |
| Place | Gates 104 |
| Speaker | Vivek Sarkar, IBM T. J. Watson Research Center |
| Title | X10: An Object-Oriented Approach to Non-Uniform Cluster Computing |
| Abstract |
It is now well established that the device scaling predicted by Moore's Law is no longer a viable option for increasing the clock frequency of future uniprocessor systems at the rate that had been sustained during the last two decades. As a result, future systems are rapidly moving from uniprocessor to multiprocessor configurations, so as to use parallelism instead of frequency scaling as the foundation for increased compute capacity. The dominant emerging multiprocessor structure for the future is a Non-Uniform Cluster Computing (NUCC) system with nodes that are built out of multi-core SMP chips with non-uniform memory hierarchies, and interconnected in horizontally scalable cluster configurations such as blade servers. Unlike previous generations of hardware evolution, this shift will have a major impact on existing software. Current OO language facilities for concurrent and distributed programming are inadequate for addressing the needs of NUCC systems because they do not support the notions of non-uniform data access within a node, or of tight coupling of distributed nodes. We have designed a modern object-oriented programming language, X10, for high performance, high productivity programming of NUCC systems. A member of the partitioned global address space family of languages, X10 highlights the explicit reification of locality in the form of places; lightweight activities embodied in async, future, foreach, and ateach constructs; a construct for termination detection (finish); the use of lock-free synchronization (atomic blocks); and the manipulation of cluster-wide global data structures. We present an overview of the X10 programming model and language, experience with our reference implementation, and results from some initial productivity comparisons between the X10 and Java(TM) languages. This is joint work with other members of the X10 core team --- Vijay Saraswat, Raj Barik, Philippe Charles, Christopher Donawa, Christian Grothoff, Allan Kielstra, Igor Peshansky, and Christoph von Praun. |
| Date | 3:00-4:00 pm, Monday, February 6, 2006 |
| Place | Gates 104 |
| Speaker | Kathleen Fisher |
| Title | PADS: Processing Arbitrary Data Sources |
| Abstract |
Many high-volume data sources exist that can be mined very profitably, for example: call detail records, web server logs, network packets, network configuration and log files, provisioning records, credit card records, stock market data, etc. Unfortunately, many such data sources are in formats over which data consumers have no control. A significant effort is required to understand such a data source and write a parser for the data, a process that is both tedious and error-prone. Often, the hard-won understanding of the data ends up embedded in parsing code, making both sharing the understanding and maintaining the parser difficult. Typically, such parsers are incomplete, failing to specify how to handle situations where the data does not conform to the expected format. In this talk, I will describe the PADS project, which aims to provide languages and tools for simplifying the analysis of ad hoc data. We have designed a declarative data-description language, PADS/C, expressive enough to describe the data sources we see in practice at AT&T, including ASCII, binary, EBCDIC (Cobol), and mixed formats. From PADS/C we generate a C library with functions for parsing, manipulating, summarizing, querying, and writing the data. This work is joint with Bob Gruber, Mary Fernandez, David Walker, Yitzhak Mandelbaum, and Mark Daly. |
| Date | 4:15-5:30 pm, Tuesday, November 29, 2005 |
| Place | Gates 104 |
| Speaker | Sanjit Seshia, University of California, Berkeley |
| Title | SAT-Based Decision Procedures and Malware Detection |
| Abstract | SAT-based decision procedures operate by performing a satisfiability-preserving encoding of their input to a Boolean satisfiability (SAT) problem, on which a SAT solver is invoked. In this talk I will present UCLID, a verification tool based on SAT-based decision procedures, and describe an application to detecting malware (e.g., viruses and worms). UCLID's SAT-based decision procedures are for quantifier-free first-order logics involving arithmetic. These have been used within a malware detector that shows greater resilience to malware obfuscations than commercial tools. I will describe the notion of a "semantic signature," the detection algorithm, and experimental results. |
| Date | 3:30-4:30 pm, Tuesday, November 15, 2005 |
| Place | Gates 104 |
| Speaker | Norman Ramsey, Harvard University |
| Title | A Low-level Approach to Reuse for Programming-Language Infrastructure |
| Abstract |
New ideas in programming languages are best evaluated experimentally. But experimental evaluation is helpful only if there is an implementation that is efficient enough to encourage programmers to use the new features. Ideally, language researchers would build efficient implementations by reusing existing infrastructure, but existing infrastructures do not serve researchers well: in high-level infrastructures, many high-level features are built in and can't be changed, and in low-level infrastructures, it is hard to support important *run-time* services such as garbage collection, exceptions, and so on. I am proposing a different approach: for reuse with many languages, an infrastructure needs *two* low-level interfaces: a compile-time interface and a *run-time* interface. If both interfaces provide appropriate mechanisms, the mechanisms can be composed to build many high-level abstractions, leaving the semantics and cost model up to the client. In this talk, I will illustrate these ideas with examples drawn from two parts of the C-- language infrastructure: exception dispatch and procedure calls. I will focus on the mechanisms that make it possible for you to choose the semantics and cost model you want. For exceptions, these mechanisms are drawn from both compile-time and run-time interfaces, and together they enable you to duplicate all the established techniques for implementing exceptions. For procedure calls, the mechanisms are quite different; rather than provide low-level mechanisms that combine to form different kinds of procedure calls, I have found it necessary to extend the compile-time interface to enable direct control of the semantics and cost of procedure calls. I will also sketch some important unsolved problems regarding mechanisms to support advanced control features such as threads and first-class continuations. |
| Date | 4:15-5:30 pm, Tuesday, November 8, 2005 |
| Place | Gates 104 |
| Speaker | Terence Parr, University of San Francisco |
| Title | ANTLR and Computer Language Implementation |
| Abstract |
While parsing has been well understood for decades and a number of decent parser generators exist, anyone who has built an interpreter, translator, or compiler of consequence will tell you that the overall problem of supporting language development has not been adequately solved. Yes, parsing has been solved in theory but many of the strongest parsing strategies and systems are cumbersome in practice. Moreover, parsing is only one component of language implementation and the other pieces have either been studied purely from a compiler point of view or have resulted in powerful but inaccessible solutions that programmers in the trenches are unable or unwilling to use. My goal in this talk is twofold: (1) to convince you that there is still work to be done and interesting problems to solve in the realm of language tools such as IDEs tailored to language development, grammar reuse strategies, automatic grammar construction from sample inputs, and transformation systems that are accessible to the average programmer; (2) to demonstrate a few items from my ANTLR research program such as the LL(*) parsing algorithm, tree grammars, grammar rewrite rules, and ANTLRWorks grammar development environment. |
| Date | 4:15-5:30 pm, Tuesday, November 1, 2005 |
| Place | Gates 104 |
| Speaker | George Necula, University of California, Berkeley |
| Title | Data Structure Specifications via Local Equality Axioms |
| Abstract | We describe a program verification methodology for specifying global shape properties of data structures by means of axioms involving arbitrary predicates on scalar fields and pointer equalities in the neighborhood of a memory cell. We show that such local invariants are both natural and sufficient for describing a large class of data structures. We describe a complete decision procedure for such a class of axioms. The decision procedure is not only simpler and faster than in other similar systems, but has the advantage that it can be extended easily with reasoning for any decidable theory of scalar fields. |
| Date | 4:15-5:30 pm, Tuesday, October 18, 2005 |
| Place | Gates 104 |
| Speaker | Ulfar Erlingsson, Microsoft Research |
| Title | Principles and Applications of Software Control-Flow Integrity |
| Abstract |
Current software attacks often build on exploits that subvert machine-code execution. The enforcement of a basic safety property, Control-Flow Integrity (CFI), can prevent such attacks from arbitrarily controlling program behavior. CFI enforcement is simple, and its guarantees can be established formally, even with respect to powerful adversaries. Moreover, CFI enforcement is practical: it is compatible with existing software and can be done efficiently using software rewriting in commodity systems. Finally, CFI provides a useful foundation for enforcing further security policies. This talk describes CFI and an x86 implementation of CFI enforcement, assesses its security benefits against real-world attacks, and shows how the CFI guarantees can enable efficient software implementations of a protected shadow call stack and of access control for memory regions. This is joint work with Martin Abadi, Mihai Budiu, and Jay Ligatti. More information about the work can be found at http://research.microsoft.com/research/sv/gleipnir/ |
| Date | 2:30-3:45 pm, Monday, October 10, 2005 |
| Place | Gates 104 |
| Speaker | Dan Grossman, University of Washington |
| Title | Strong Atomicity for Today's Programming Languages |
| Abstract |
The data races and deadlocks that riddle threaded applications are an ever-greater impediment to reliable and responsive desktop applications. The lock-based approach to shared-memory concurrency (i.e., the approach taken in current programming languages such as Java) has software-engineering shortcomings compared to atomicity. An atomic construct is a concurrency primitive that executes code as though no other thread has interleaved execution. To ensure correctness, fair scheduling, and reasonable performance, we advocate a logging-and-rollback approach to implementing atomic. Moreover, we believe we can implement atomic well enough on today's commodity hardware to utilize atomicity and further investigate its usefulness. This talk will describe ongoing work designing and implementing languages with atomicity. After describing the advantages of atomicity, we will describe our experience with two prototypes: (1) AtomCaml is a working prototype that extends the mostly-functional language OCaml with atomicity. OCaml does not support true parallelism (it essentially assumes a uniprocessor), which lets us perform key optimizations for this common case. In particular, non-atomic code can run unchanged. (2) AtomJava is a Java extension currently under development. It implements atomicity in terms of locks (which are not visible to programmers) without potential deadlock. The AtomJava compiler produces Java source code that can run on any Java implementation. |