My research contributions fall into three areas: my work at AT&T Laboratories on data integration, workflow management, electronic document archiving, high-performance messaging infrastructures, and event notification services; my Ph.D. research in Computer Science; and my work on high-performance storage systems.
Integration of data stored with multiple heterogeneous sources has received a lot of attention in recent years. A common characteristic of existing integration approaches is the adoption of either a relational or a semi-structured data model. DataGW presents a novel approach to data integration by exporting an attribute-based data model for entities stored with relational databases, spreadsheets, property files, and LDAP directories.
DataGW exports a flat information model. This model consists of several entity classes. Each entity class specifies the required and optional attributes that can be present in an instance of the class. The values of these attributes may be stored with different storage managers, e.g., LDAP servers, database servers, file systems, and so on. The storage location of these values is specified in a meta-entry. Each entity instance can be identified by using one or more of its attribute values.
There is one meta-entry for each entity class. This meta-entry contains the required and optional class attributes, as well as attributes that can be used to improve performance, specify access rights, and so on. Meta-entries are stored in an LDAP directory. Note that the entity classes supported by DataGW do not necessarily correspond to the object classes supported by the LDAP model. This is a crucial distinction since LDAP entities may belong to multiple object classes and, thus, consist of attributes specified in these classes, including duplicate attribute names.
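As a rough illustration of the model described above, the following Python sketch shows how a meta-entry might list the required and optional attributes of an entity class and record which storage manager holds each attribute's values. All names here (MetaEntry, fetch, the "person" class, and the in-memory backends standing in for LDAP and database servers) are hypothetical and are not taken from DataGW itself.

```python
from dataclasses import dataclass, field


@dataclass
class MetaEntry:
    """Hypothetical meta-entry: one per entity class."""
    entity_class: str
    required: set
    optional: set
    # attribute name -> name of the storage manager that holds its values
    location: dict = field(default_factory=dict)

    def validate(self, instance):
        """True if all required attributes are present and none are unknown."""
        names = set(instance)
        missing = self.required - names
        unknown = names - self.required - self.optional
        return not missing and not unknown


def fetch(meta, backends, key_attr, key_value):
    """Assemble one flat entity instance from several storage managers.

    `backends` maps a storage-manager name to a list of records (dicts).
    Real backends would be LDAP servers, databases, files, and so on.
    """
    instance = {key_attr: key_value}
    for attr, store in meta.location.items():
        for record in backends[store]:
            if record.get(key_attr) == key_value and attr in record:
                instance[attr] = record[attr]
    return instance


person = MetaEntry(
    entity_class="person",
    required={"uid", "name"},
    optional={"phone"},
    location={"name": "ldap", "phone": "sql"},
)
backends = {
    "ldap": [{"uid": "u1", "name": "Ada"}],
    "sql": [{"uid": "u1", "phone": "555-0100"}],
}
entity = fetch(person, backends, "uid", "u1")
```

The caller sees a single flat instance identified by one of its attribute values, even though the values come from two different stores.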
Business processes consist of multiple activities that have to be performed in some sequence. Traditionally, the activities involved and the order in which they are performed were described by processing instructions and rules expressed in a textual format, if they were documented at all. For some business processes these rules were embedded in the logic of the software program or the human agent responsible for carrying out an activity. Today, organizations change their infrastructure to improve the efficiency of the business as a whole, and they reengineer their business processes to adapt to the changes that take place in their environment. Consequently, planning and managing all the activities and rules involved in a business process becomes more challenging. These challenges can be met by expressing the activities and rules involved in a business process in a concrete and systematic way so that they can be used by a software system and become fully automated.
Workflow management is closely related to the automation and reengineering of a business process in an organization. A workflow may describe business process activities at a conceptual level necessary for understanding, evaluating, and redesigning the business process. Workflow management systems (WFMSs) have been introduced to support the design, execution, and monitoring of business processes. However, existing WFMSs do not provide adequate support for handling deadlines, priorities, and transactional and non-transactional operations. In addition, they lack the ability to manage large volumes of activities, and they do not provide sustained availability to cope with system and logical failures. The main focus of my work is to improve the flexibility and robustness of WFMSs by providing enhanced execution and monitoring support.
The focus of the Workflow Infrastructure, Scheduling, and Enactment (WISE) project is to improve the flexibility and robustness of WFMSs by providing enhanced execution and monitoring support. In particular, WISE aims at an architecture that provides:
Today, organizations have to archive an ever-increasing number of documents that are both related to their core business and required to ensure institutional accountability. In addition, organizations have substantial investments in messaging technologies (email and groupware). SaveMe is a document archival system based on network-centric groupware such as Internet standards-based messaging systems. In SaveMe, the actions of archiving, retrieving, and classifying documents are similar to the actions of sending, retrieving, and classifying email into folders. SaveMe leverages existing messaging infrastructures -- the one common denominator sitting on every computer is email -- and, thus, it does not require individual users and IT personnel to learn a new technology. The resulting environment is not intrusive, easier to administer, and a lot easier to deploy.
Hermod is a high-performance, extensible architecture for messaging, based on the store and access paradigm. Hermod consists of multiple heterogeneous data stores: for depositing and retrieving message contents; for managing user folders; for representing and managing the semi-structured nature of messages; and for maintaining information about users and user groups. Both performance and functionality are maximized by using the most appropriate state-of-the-art technology, e.g., databases, file systems, LDAP directories, and text search engines, for each type of data store, and by carefully managing the interactions between the data stores. Hermod uses a simple and uniform interface for the various data stores, resulting in an "open" internal architecture, which allows new data stores to be plugged into the architecture cleanly. Hermod has an active component, wherein internal state information is exported using an event mechanism, which enables a plethora of value-added messaging services to be added in a modular fashion.
The proliferation of inexpensive workstations and networks has created a new era in distributed computing. In particular, most modern computer applications are distributed in nature, and they require support by distributed computing platforms. The main goal of existing distributed platforms is to provide an infrastructure that supports the rapid development of value-added services. This means it must be possible to build a new service based on events that occur at existing services without requiring major modifications to existing services. In addition, distributed system users, administrators, and developers require tools and services that enable them to monitor the behavior of the system as a whole.
An event notification service is a key enabling technology for meeting all of the above goals. A notification service accepts event descriptions from suppliers and delivers corresponding event notifications to consumers. As part of initiating contact with the service, a supplier specifies the kinds of events it will supply, while a consumer specifies the kinds of events it is interested in. In combination, these specifications allow the notification service to form an efficient event distribution plan. When a supplier hands an event to the notification service, the service must deliver a notification for the event only to interested consumers, so that important shared resources, such as network bandwidth, are not wasted.
READY is an event notification service that provides efficient, asynchronous, decoupled communication of event notifications. In READY, communication is asynchronous because the act of supplying an event completes as soon as READY receives the event; decoupled because a supplier need not know which processes will be consumers of its events, and a consumer need not know which processes will act as suppliers of interesting events for which it will receive notifications.
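The two properties named above can be illustrated with a minimal Python sketch. It is not READY's interface: the class NotificationService, its methods, and the event kinds are hypothetical, matching is by exact event kind only, and supplier-side declarations are omitted. Supplying an event returns as soon as the event is queued (asynchronous), and suppliers and consumers know only the service, never each other (decoupled).

```python
import queue
import threading


class NotificationService:
    """Hypothetical in-process notification service (not READY's API)."""

    def __init__(self):
        self._subs = {}                 # event kind -> consumer callbacks
        self._lock = threading.Lock()
        self._events = queue.Queue()
        worker = threading.Thread(target=self._dispatch, daemon=True)
        worker.start()

    def subscribe(self, kind, callback):
        """A consumer declares the kind of event it is interested in."""
        with self._lock:
            self._subs.setdefault(kind, []).append(callback)

    def supply(self, kind, payload):
        """Returns as soon as the event is queued; delivery happens later."""
        self._events.put((kind, payload))

    def _dispatch(self):
        while True:
            kind, payload = self._events.get()
            try:
                with self._lock:
                    consumers = list(self._subs.get(kind, []))
                # Only consumers that asked for this kind are notified.
                for callback in consumers:
                    callback(kind, payload)
            finally:
                self._events.task_done()

    def drain(self):
        """Block until every queued event has been dispatched."""
        self._events.join()


service = NotificationService()
received = []
service.subscribe("mail.arrived", lambda kind, payload: received.append(payload))
service.supply("mail.arrived", {"folder": "inbox"})
service.supply("disk.full", {"volume": "/tmp"})  # no interested consumer
service.drain()
```

The second event is dropped by the dispatcher because no consumer registered interest in it, which is the filtering behavior that saves shared resources in a distributed setting.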
The proliferation of inexpensive workstations and networks has created a new era in distributed computing. At the same time, non-traditional applications such as computer-aided design (CAD), computer-aided software engineering (CASE), geographic-information systems (GIS), and office-information systems (OIS) have placed increased demands for high-performance transaction processing on database systems. The combination of these factors gives rise to significant challenges in the design of modern database systems. I have developed novel techniques whose aim is to improve the performance and scalability of these new database systems. These techniques exploit client resources through client-based transaction management.
Client-based transaction management is realized by providing logging facilities locally even when data is shared in a global environment. My work consists of several recovery algorithms which utilize client disks for storing recovery related information (i.e., log records). My algorithms work with both coarse and fine-granularity locking and they do not require the merging of client logs at any time. Moreover, my algorithms support fine-granularity locking with multiple clients permitted to update different portions of the same database page at the same time. The database state is recovered correctly when there is a complex crash as well as when the updates performed by different clients on a page are not present on the disk version of the page, even though some of the updating transactions have committed.
In addition, I have implemented my algorithms in BeSS, and I have studied their performance characteristics using the OO1 database benchmark. The performance results show that client-based logging is superior to traditional server-based logging. This is because client-based logging is an effective way to reduce dependencies on server CPU and disk resources and, thus, prevents the server from becoming a performance bottleneck as quickly when the number of clients accessing the database increases.
My research on storage managers for database systems consists of two systems, EOS and BeSS. Both storage systems are used by numerous research organizations and universities around the world. In addition, BeSS is being productized by NCR as part of a content-based multimedia server for massively parallel architectures.
EOS is a storage manager that has been prototyped at AT&T Bell Laboratories as a vehicle for research into distributed storage architectures for database systems, especially those that integrate programming languages and databases. EOS' overall goal is to provide fast and transparent access to persistent objects independent of their size and their physical location in a distributed computing environment based on a client-server architecture. EOS objects are uninterpreted byte strings which can range in size from a few bytes to gigabytes. Large objects, spanning multiple pages, can be accessed and updated transparently as if they were small objects or via byte range operations. The byte range operations are important for very large objects, such as digital video and audio, because memory size constraints may make it impractical to build, retrieve, or update the whole object in one big step. EOS files collect related objects together and are stored in EOS databases. EOS databases are stored in one or more storage areas (UNIX files or raw disk partitions). Clustering hints for the physical placement of objects in pages, files, databases and areas are also provided. Any EOS object can be named and subsequently retrieved by its name.
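To make the idea of byte range operations concrete, here is a small Python sketch of a large object split into fixed-size pages, where a read or write touches only the pages covered by the requested range. This is an illustration under stated assumptions, not the EOS large-object layout: the class LargeObject, its methods, and the page size are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size, chosen for illustration only


class LargeObject:
    """Hypothetical paged byte string supporting byte range operations."""

    def __init__(self, page_size=PAGE_SIZE):
        self.page_size = page_size
        self.pages = {}   # page number -> bytearray of page_size bytes
        self.size = 0     # logical length of the object in bytes

    def write(self, offset, data):
        """Overwrite bytes starting at `offset`, touching only affected pages."""
        pos = offset
        view = memoryview(data)
        while len(view):
            page_no, start = divmod(pos, self.page_size)
            chunk = view[: self.page_size - start]
            page = self.pages.setdefault(page_no, bytearray(self.page_size))
            page[start:start + len(chunk)] = chunk
            pos += len(chunk)
            view = view[len(chunk):]
        self.size = max(self.size, offset + len(data))

    def read(self, offset, length):
        """Return up to `length` bytes from `offset`; unwritten bytes read as zero."""
        end = min(offset + length, self.size)
        out = bytearray()
        pos = offset
        while pos < end:
            page_no, start = divmod(pos, self.page_size)
            n = min(self.page_size - start, end - pos)
            page = self.pages.get(page_no)
            out += page[start:start + n] if page is not None else bytes(n)
            pos += n
        return bytes(out)


obj = LargeObject(page_size=8)
obj.write(5, b"hello world")   # spans pages 0 and 1 only
middle = obj.read(5, 11)
```

A caller that needs a few seconds of a video, for example, would issue a read for that byte range and never bring the rest of the object into memory.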
EOS offers extensible hashing supporting variable-size keys and user-defined hash and comparison functions. In addition, other index structures can be built by using page objects, that is, objects that span the entire available space of a page. EOS employs the multigranularity two-version two-phase locking protocol, which allows many readers and one writer to access the same item simultaneously. The option to switch to simple 2PL is also available. EOS uses a write-ahead redo-only logging scheme that offers short logs, fast recovery from system failures, and non-blocking checkpoints. Also, configuration files are provided that can be edited by users to customize and tune EOS performance. Finally, the EOS architecture has been designed to be extensible. Users may define hook functions to be executed when certain primitive events occur. This allows controlled access to a number of entry points in the system without compromising modularity.
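The hook mechanism can be pictured with a short Python sketch. The registry class, the event names, and the keyword arguments passed to hooks are all hypothetical and do not reflect the actual EOS interface; the point is only that user code attaches to a fixed set of entry points and cannot reach anything else.

```python
from collections import defaultdict


class HookRegistry:
    """Hypothetical registry of user hooks for a fixed set of primitive events."""

    def __init__(self, events):
        self._events = frozenset(events)   # the only permitted entry points
        self._hooks = defaultdict(list)

    def register(self, event, func):
        """Attach a user function to one of the predefined events."""
        if event not in self._events:
            raise ValueError(f"unknown event: {event}")
        self._hooks[event].append(func)

    def fire(self, event, **info):
        """Called by the system when a primitive event occurs."""
        for func in self._hooks[event]:
            func(**info)


hooks = HookRegistry({"page_fetch", "object_create"})
trace = []
hooks.register("page_fetch", lambda page_no: trace.append(page_no))
hooks.fire("page_fetch", page_no=7)
hooks.fire("object_create", oid=1)   # no hook registered, nothing happens
```

Restricting registration to a known set of events is what keeps the access controlled: a hook can observe or extend behavior at those points without depending on the internals of the storage manager.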
BeSS is a high-performance, configurable, and extensible object store based on a multi-client/multi-server architecture with support for distributed transactions. BeSS borrows some of the features of EOS, including extensive support for very large objects (images, video and audio), fast disk space allocation mechanisms, and extensibility. In BeSS, data items are cached in the client workstation to be used by transactions in the same or different processes running on the same workstation. Cache consistency is guaranteed by following the callback locking algorithm. BeSS prevents database corruption caused by bad pointers by storing control structures separately from data. The control structures are protected by ordinary mechanisms provided by the virtual memory management hardware. BeSS offers a number of operation modes for accessing data and control structures, on either the server or the client side, such as copy on access, shared memory, and virtual-memory mode for databases smaller than the available virtual memory. These modes allow applications to take advantage of current advances in hardware technology such as shared memory multiprocessors and virtual memory management hardware offering a huge address space. BeSS allows application programs to access and manipulate persistent objects directly on the segment on which they reside, without incurring in-memory copying costs. BeSS employs a fast object reference mechanism that is based on memory mapping and avoids greedy allocation of virtual memory addresses. Finally, the BeSS server is intended to be an open server, capable of supporting a wide range of applications. Sophisticated users can link a trusted piece of code with the BeSS server in order to build specialized servers, like SQL and multimedia servers.