|
| Cynthia Dwork | (Compaq) |
| Sampath Kannan | (Penn) |
| Insup Lee | (Penn) |
| Patrick Lincoln | (SRI) |
| John Mitchell | (Stanford, PI) |
| Ronitt Rubinfeld | (Cornell) |
| Andre Scedrov | (Penn) |

Project Description
Semantic consistency is a critical problem in information systems that
are maintained over time, distributed over many locations, or composed
of separate subsystems that may be implemented in different ways or designed
according to different objectives or assumptions. The basic problem arises
both with static data, such as numbers and text strings, or active computational
entities, such as functions, procedures or objects.
This project addresses the problem of semantic consistency in the transfer
of active information, such as executable program components. To give a
simple example, an organization might have many sub-organizations, stationed
in different locations. If some activity summary is to be assembled, then
this may be done by sending a data-summary program to each site. Each site
then has the responsibility of running the summary program on its own data
and reporting the results. This has several advantages over sending all
the data to one centralized location, including reduced network transmission
and use of distributed computation to reduce the computational load on
any one site. However, there are several semantic consistency problems:
- The program, transmitted to several sites, might assume that the data
at each site is stored in specific ways, or has specific meaning. The semantic
assumptions incorporated into the transmitted code might be violated at
one or more of the receiving sites.
- The sites receiving code to be executed might assume that the code
will execute properly, on specific hardware under certain memory constraints,
without modifying aspects of the computer system other than specific files,
and without initiating network communication other than some specific limited
set of transmissions. These semantic assumptions at the receiving site
might be violated by the transmitted code.
The main thrust of the proposed research will be to develop mechanisms
for instrumenting transmitted active components so that a receiving site
can be assured of certain semantic properties.
The enabling technology for this work will come from two distinct research
areas. The first is logics of programs based on linear logic and the formulas-as-types
correspondence. Based on prior work by three of the participants, we plan
to augment executable code with assertions about the behavior and resource
use of programs. The second underlying area, drawn from complexity theory,
involves approximability of functions, randomized computation, cryptographic
primitives, and probabilistic dialogs for conveying information within
a specified tolerance for uncertainty. Out of the second area comes a probabilistic
approach to program testing, pioneered and developed by two of the participants.
We also expect to make use of techniques from model checking (finite-state
protocol checking and verification) and computational learning theory (allowing
participants in information exchange to use adaptive protocols). The challenges
of the next three to five years involve combining distinct areas that have
developed independently, and bringing theoretical possibilities to fruition
through development and demonstration-level implementation of realistic
and useful protocols and computational environments.
|