Chapter 1 Introduction

We present a framework for building embeddable automatic theorem provers for customized logics. The framework defines logic functors as logic components; for instance, one component may be the propositional logic, another component may be the interval logic, also called intervals. Logic functors can be composed to form new logics, for instance, propositional logic on intervals.

Each logic functor has its own proof-theory, which can be implemented as a theorem prover. We desire that the proof-theory and the theorem prover of the composition of logic functors should result from the composition of the proof-theories and the theorem provers of the component logic functors.

All logic functors and their compositions implement a common interface. This makes it possible to construct generic applications that can be instantiated with a logic component. Conversely, customized logics built using the logic functors can be embedded in an application that comply with this interface.

Logic functors specify software components off-the-shelf (COTS), the validation of the composition of which reduces to a form of type-cheking, and their composition automatically results in an automatic theorem prover. Logic functors can be assembled by laymen, and used routinely in system-level programming, such as compilers, operating systems, file-systems, and information systems.

An implementation of all logic functors in the appendices (and more) is available as an Open Source software library, available at http://www.irisa.fr/lande/ferre/logfun. Properties of built logics can be checked automatically. Logic functors are implemented as OCaml module functors.

1.1 Logic-based information processing systems

In [FR04, Fer02], we have proposed Logical Information Systems that are built upon a variant of Formal Concept Analysis [GW99], namely Logical Concept Analysis [FR00]. The framework is generic in the sense that any logic whose deduction relation forms a lattice can be plugged-in. However, if one leaves the logic totally undefined, then one puts too much responsibility on the end-users or on a knowledge-base administrator. It is unlikely they can design such a logic component themselves. By using the framework developed in this article, one can design a toolbox of logic components, and the user has only the responsibility of composing those components. The design of these Logical Information Systems is the main motivation for this research.

However, we believe the application scope of this research goes beyond our Logical Information Systems. Several information processing domains have logic-based components in which logic plays a crucial role: e.g., logic-based information retrieval [SM83, vRCL98], logic-based diagnosis [Poo88], logic-based programming [Llo87, MS98], logic-based program analysis [SFRW98, AMSS98, CSS99]. These components model an information processing domain in logic, and also they bring to the front solutions in which logic is the main engine. This can be illustrated by the difference between using a logic of programs and programming in logic.

Even in information processing domains where traditionally logic does not play a crucial role it has been proposed to embed a logic component in otherwise not logic-based systems. For instance, in [IB96] the authors propose to model quality of service (QoS) conditions in logic, and to make applications check dynamically that the platform on which they run enforces the condition for a specified quality of service.

The logic in use in these system is often not defined by a single pure deduction system, but rather it combines several logics together. The designer of an application has to make an ad hoc proof of consistency and an ad hoc implementation (i.e., a theorem prover) every time he designs a new ad hoc logic. Since these logics are often variants of a more standard logic we call them customized logics.

In order to favour separation of concerns, it is important that the application that is based on a logic engine, and the logic engine itself, be designed separately. This implies that the interface of the logic engine should not depend on the logic itself. This is what we call embeddability of the logic component.

For instance, practical query-answering systems often use a mixture of logic and concrete computations. Queries are built with logical connectives, and with purely operational constructs like wild-cards. This usually causes no harm because the query-answering system is not actually a theorem prover, and thus does not actually implement a logic. Indeed, in most cases a boolean query is used to filter concrete strings that contain no wild-cards, and no boolean connectives.

However, one can imagine a logic-based query-answering systems in which queries and data actually use the same language. For instance, one may use the full language with connectives and wild-cards both for describing entries in an information system, and for querying them [FR04]. Then the query-answering system must decide something which can be written as description ⊑ query (where ⊑ means logical consequence of the logic used in descriptions and queries), and it must have the full capacity of a theorem prover for a logic whose syntax is the description/querying language. The choice of a particular logic depends on the application, but query-answering is generic and depends solely on the existence of ⊑.

If we need to separately design the application and its logic components, then who should develop the embedded logic components ?

1.2 The actors of the development of an information processing system

In this section, we present our views on the Actors of the development on an information processing system. Note that Actors are not necessarily incarnated in one person; each Actor may gather several persons possibly not living at the same time. In short, Actors are roles, rather than persons. Sometimes, Actors may even be incarnated in computer programs.

What follows is rather standard in the information system (especially data-base) community because it has adopted organisation standards of industry and administration, but we think it is not widely accepted by the academic community for other kinds of information processing systems, where it tends to follow more academic standards in which several Actors are collapsed into one, the Researcher.

The first Actor is the Theorist; he invents an abstract framework, like, for instance, relational algebra, lattice theory, or logic.

If the abstract framework has applications, then a second Actor, the System Programmer, implements (part of) the theory in a generic system for these applications. This results in systems like data-bases, static analysers, or logic programming systems.

Then the third Actor, the Application Designer, applies the abstract framework to a concrete objective by instantiating a generic system. This can be done by composing a data-base schema, or a program property, or a logic program.

Finally, the User, the fourth Actor, maintains and uses an application. He queries a data-base, he analyses programs, or he runs logic programs. The User is often incarnated in programs.

Certainly, the User could be analysed further in Administrators, End-Users, etc. However, we stop here because it is the relation between the System Programmer and the Application Designer that interests us; the first one creates a generic system, and the second one instantiates it.

1.3 Genericity and instantiation

Genericity is often achieved by designing a language: e.g., a data-base schema language, a lattice operation language, and a programming language. Correspondingly, instantiation is done by programming and composing: e.g., drawing a data-base schema, composing an abstract domain for static analysis, or composing a logic program.

We propose to do the same for logic-based tools. Indeed, on one hand the System Programmer is competent for building a logic subsystem, but he does not know the application; he only knows the range of applications. On the other hand the Application Designer knows the application, but is generally not competent for building a logic subsystem. In this article, we will act as System Programmers by providing elementary components for safely building a logic subsystem, and also as Theorists by giving formal results on the composition laws of these components.

We explore how to systematically build logics using basic components that we call logic functors. By “construction of a logic” we mean the definition of its syntax, its semantics, and its abstract implementation as a deduction system. All logic functors we describe in this article have also a concrete implementation as an actual module. As this implementation is based on module functors of OCaml, logic functors can be directly composed in OCaml itself, so there is no need to program a composer. In fact the language for composing functors is simple, and there is no need to learn the OCaml language in order to build customized logics. Examples are given in Section 4

1.4 Customized logics

The range of logic functors can be very large. In this article we consider for instance products and sums of logics, propositions (on arbitrary formulas), intervals, multisets, some concrete domains like integers or strings (e.g., “begin with”, “contains”), AIK (a modal epistemic logic [Lev90]). See the appendices for a full list.

The whole framework is geared towards manipulating logics as partial orderings, where the ordering is a specialization/generalization relation. This relation receives various names, depending on the context: deduction, entailment, subsumption. The latter, which is found in description logics, has our preference as it betters suggests the specialization/generalization ordering. This requirement for partial orderings excludes non-monotonic logics. Note that non-monotonicity is seldom a goal in itself, and that notoriously non-monotonic features have a monotonic rendering; e.g., Closed World Assumption can be reflected in the monotonic modal logic AIK.

We will consider as a motivating example an application for dealing with bibliographic entries. Each entry has a description made of its author name(s), title, type of cover, publisher, and date. The User navigates through a set of entries by comparing descriptions with queries that are written in the same language. The application answers navigation queries by lists of entries that match the queries, and lists of subqueries that can complement the current one to form a more precise query. So doing, we have a logic-based notion of navigation where matching a query is being in some place, and subqueries are links to other places. For instance, let us assume the following entry set:

descr(entry₁) = author: "Kipling", title: "The Jungle Book", paper-back, publisher: "Penguin", year: 1985,
descr(entry₂) = author: "Kipling", title: "The Jungle Book", hard-cover, publisher: "Century Co.", year: 1908,
descr(entry₃) = author: "Kipling", title: "Just So Stories", hard-cover, year: 1902.

An answer to the query:

title: contains "Jungle"

is:

`hard-cover`		`publisher: "Century Co."`		`year: 1900..1950`
`paper-back`		`publisher: "Penguin"`		`year: 1950..2000`

because several entries (entry₁ and entry₂) have a description that entails the query (i.e., they are possible answers), and the application asks the user to make his query more specific by suggesting some relevant refinements. Note that author is "Kipling" is not a relevant refinement because it is true of all matching entries. For every possible answer entry we have descr(entry)⊑query, and for every relevant refinement x the following holds

there exists a possible answer e₁ such that descr(e₁)⊑x, and
there exists a possible answer e₂ such that descr(e₂)¬ ⊑x.

In other words, a refinement must restrict the set of possible answers while avoiding to make it empty. We will not go any further in the description of this application (see [FR04]). We simply note that:

descriptions, queries, and answers belong to the same logical language, which combines logical symbols and expressions such as strings, numbers, or intervals, and
one can design a similar application with a different logic, e.g., for manipulating software components. Thus, it is important that all different logics share a common interface for being able to separately write programs for the navigation system and for the logic subsystem it uses.

1.5 Summary

We define tools for building automatic theorem provers for customized logics for allowing Users who are not sophisticated logic actors. Note also that the User may be a program itself: e.g., a mobile agent running on a host system [IB96]. This rules out interactive theorem provers.

Validating a theorem prover built by using our tools must be as simple as possible. Some kind of type-checking is ideal because the Application Designer, though it may be more sophisticated than the User, is not a logic actor.

Finally, the resulting theorem provers must have a common interface so that they can be embedded in generic applications. Subsumption algorithms terminate in all the logic components that we define. Thus, the logic components can be safely embedded in applications as black-boxes.

This article is organized as follows. Chapter 2 introduces the notions of logics and logic functors, which are based on definitions for syntax, semantics, operations, and logic properties. Chapter 3 details the concrete implementation of logic functors as ML functors. Chapter 4 examplifies the use of logic functors, and Chapter 5 presents related work, concludes and states future work. The numerous appendices give the full and detailed description of logic functors and combinators. They constitute a first toolbox of customized and embeddable logic components. A large part of these appendices is made of proofs of logic properties.