Description |
The chronicle formalism was proposed a few years ago by C. Dousson to monitor dynamic physical systems [55]. Since then, it has been widely used to monitor industrial and medical systems, in particular in the CALICOT system presented in section 5.2. However, chronicle-based approaches have mostly been developed in a centralized setting. Our main aim is to extend the chronicle-based diagnosis engine so that it can deal with distributed systems.
First, we extended the formalism proposed by C. Dousson by defining a distributed chronicle model [12]. The standard chronicle concept is enriched with synchronization events, which make it possible to express synchronization constraints between the different components of the system. Thus, a distributed chronicle associated with one component can be related to distributed chronicles in other components.
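As an illustration only, a distributed chronicle can be pictured as a set of local events, a set of synchronization events shared with other components, and temporal constraints between event dates. The sketch below is not the model of [12]: the class names, fields, event names and the interval form of the constraints are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TemporalConstraint:
    """Hypothetical constraint: lo <= date(later) - date(earlier) <= hi."""
    earlier: str
    later: str
    lo: float
    hi: float


@dataclass
class DistributedChronicle:
    """Illustrative structure, not the formal definition from the cited work."""
    name: str
    component: str
    events: set[str]        # events observed locally on the component
    sync_events: set[str]   # synchronization events shared with other components
    constraints: list[TemporalConstraint] = field(default_factory=list)

    def matches(self, timestamps: dict[str, float]) -> bool:
        """True if every event is dated and every temporal constraint holds."""
        needed = self.events | self.sync_events
        if not needed.issubset(timestamps):
            return False
        return all(
            c.lo <= timestamps[c.later] - timestamps[c.earlier] <= c.hi
            for c in self.constraints
        )

    def related_to(self, other: "DistributedChronicle") -> bool:
        """Two chronicles are related if they belong to different components
        and share at least one synchronization event."""
        return (
            self.component != other.component
            and bool(self.sync_events & other.sync_events)
        )


# Hypothetical chronicles for two web services.
shop = DistributedChronicle(
    name="slow_supplier",
    component="shop",
    events={"order_received"},
    sync_events={"call_supplier", "supplier_reply"},
    constraints=[TemporalConstraint("call_supplier", "supplier_reply", 0.0, 5.0)],
)
supplier = DistributedChronicle(
    name="late_reply",
    component="supplier",
    events={"stock_check"},
    sync_events={"call_supplier", "supplier_reply"},
)
```

Here the two chronicles are related only through the synchronization events they share, which is the information a global diagnoser would later rely on to combine local results.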
We then proposed a decentralized architecture for a monitoring system in which each component is equipped with a local diagnoser [14]. A global diagnoser (also called the broker) is in charge of merging the local diagnoses, using the synchronization constraints that appear in the chronicles recognized by the local diagnosers.
Each local diagnoser relies on a chronicle recognition engine based on the CRS developed by C. Dousson [52], and on a chronicle base representing the local scenarios of interest. When a chronicle is recognized, the local diagnoser may or may not send a message to the broker, according to its diagnosis policy. This policy depends both on the color (i.e. the degree of importance) of the recognized chronicle and on the filter of the local diagnoser, which specifies which chronicle colors trigger a message to the broker. When the broker receives a message from a local diagnoser, it updates its knowledge base. Then, if necessary, it sends requests to other local diagnosers in order to obtain complementary information and refine its global diagnosis.
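The interaction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implemented system: the class and method names, the color values ("red", "green") and the rule that the broker queries other diagnosers for recognitions sharing a synchronization event are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recognition:
    """A chronicle recognized by a local diagnoser (hypothetical record)."""
    component: str
    chronicle: str
    color: str
    sync_events: frozenset[str]


class Broker:
    """Global diagnoser: collects recognitions and asks for complements."""

    def __init__(self):
        self.knowledge = []    # recognitions known to the broker
        self.diagnosers = {}   # component name -> LocalDiagnoser

    def register(self, diagnoser):
        self.diagnosers[diagnoser.component] = diagnoser

    def notify(self, rec):
        # Update the knowledge base with the reported recognition.
        if rec not in self.knowledge:
            self.knowledge.append(rec)
        # Assumed refinement step: ask the other diagnosers for recognitions
        # that share a synchronization event with the reported one.
        for name, diagnoser in self.diagnosers.items():
            if name == rec.component:
                continue
            for extra in diagnoser.query(rec.sync_events):
                if extra not in self.knowledge:
                    self.knowledge.append(extra)


class LocalDiagnoser:
    def __init__(self, component, broker, filter_colors):
        self.component = component
        self.broker = broker
        self.filter_colors = set(filter_colors)  # colors that trigger the broker
        self.recognized = []
        broker.register(self)

    def on_recognized(self, chronicle, color, sync_events=()):
        """Record a recognition; report it only if the filter accepts its color."""
        rec = Recognition(self.component, chronicle, color, frozenset(sync_events))
        self.recognized.append(rec)
        if color in self.filter_colors:
            self.broker.notify(rec)
            return True
        return False

    def query(self, sync_events):
        """Answer a broker request with recognitions sharing a sync event."""
        return [r for r in self.recognized if r.sync_events & sync_events]


broker = Broker()
supplier = LocalDiagnoser("supplier", broker, filter_colors={"red"})
shop = LocalDiagnoser("shop", broker, filter_colors={"red"})

# A low-importance chronicle is kept locally and not reported.
sent_green = supplier.on_recognized("late_reply", "green", {"call_supplier"})
# An important chronicle triggers the broker, which then queries the supplier.
sent_red = shop.on_recognized("order_failed", "red", {"call_supplier"})
```

In this sketch the "green" recognition reaches the broker only as complementary information, after the "red" one has triggered a request, which mirrors the two-step behavior described in the text.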
This work finds a natural application in the WS-Diamond European project (http://wsdiamond.di.unito.it/, cf. section 7.1). The overall goal is to monitor and diagnose the processing of a request sent to a web service, which in turn usually calls other services to complete the task. A fault occurring in one service commonly propagates via communication links and causes its primary effects later, in another service that is not directly responsible for the failure. The distributed chronicles we have designed describe the normal and abnormal behaviors of each web service and its communication with the other services.
The next step, currently in progress, is the development of a test bed. This platform (cf. section 5.3) has been implemented following generic distributed diagnosis design principles, so that it can be reused for other projects in the future. |