A proposal of event correlation for distributed network fault management and its evaluation

Nei Kato, Kohei Ohta, Tomohiro Ika, Glenn Mansfield, Yoshiaki Nemoto

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

In a distributed network management environment, an NMS (Network Management Station) interacts with several agents in different sub-networks. In the network fault management context, the NMS detects symptoms that indicate some abnormality, e.g. a surge in ICMP traffic, which may be caused by a network malfunction or misuse. The occurrence of a symptom is an event. A large number of events may be detected by an NMS, and their sheer number makes it difficult, if not impossible, for the NMS to diagnose them. Generally, a fault may have a cascading effect which may, in turn, give rise to a very large number of events. The sequence of events and their correlation play an important role in fault management and diagnosis. In the distributed environment of today's networks, the absence of any uniform time reference makes this a challenging task. In the present network management framework of SNMP, a manager maintains a notion of the clock of the agent it interacts with. This mechanism is inadequate for determining the sequence of events and their correlation, all the more so in a distributed environment which may involve several managers. In this paper we propose a mechanism for ordering and correlating events detected in a large-scale network that is managed in a distributed manner within the SNMP framework. Our algorithm uses the concept of a Network Management Clock (NMC). The NMC is a virtual clock maintained by a manager based on sysUpTime readings from each SNMP agent. The algorithm, its implementation, and its evaluation are discussed.
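
The idea of placing agent-local timestamps on a single manager-side time line can be illustrated with a minimal sketch. This is not the paper's NMC algorithm, which the abstract does not spell out. It assumes only that the manager pairs its own clock reading with each agent's sysUpTime at poll time, and that sysUpTime counts hundredths of a second. The names ClockSample, Event, to_manager_time and order_events are hypothetical, and polling delay, clock drift, agent restarts and counter wrap-around are all ignored.

```python
from dataclasses import dataclass

# sysUpTime is an SNMP TimeTicks value: hundredths of a second since the
# agent's network management subsystem was last re-initialised.
TICKS_PER_SECOND = 100


@dataclass(frozen=True)
class ClockSample:
    """One poll: the manager's local time paired with the agent's sysUpTime."""
    manager_time: float      # seconds on the manager's own clock (assumed)
    sys_uptime_ticks: int    # sysUpTime reported by the agent in that poll


@dataclass(frozen=True)
class Event:
    """A symptom reported by an agent, stamped in that agent's sysUpTime."""
    agent: str
    symptom: str
    uptime_ticks: int


def to_manager_time(sample: ClockSample, event_ticks: int) -> float:
    """Project an agent-local tick count onto the manager's time line.

    Illustrative assumption: no polling delay, drift, restart or wrap-around.
    """
    offset_ticks = event_ticks - sample.sys_uptime_ticks
    return sample.manager_time + offset_ticks / TICKS_PER_SECOND


def order_events(events, samples):
    """Return (manager_time, event) pairs sorted on the common time line."""
    stamped = [(to_manager_time(samples[e.agent], e.uptime_ticks), e)
               for e in events]
    return sorted(stamped, key=lambda pair: pair[0])


# Two agents whose sysUpTime values are unrelated to each other.
samples = {
    "agent-a": ClockSample(manager_time=1000.0, sys_uptime_ticks=50000),
    "agent-b": ClockSample(manager_time=1000.5, sys_uptime_ticks=200),
}
events = [
    Event("agent-a", "icmp_surge", 50300),
    Event("agent-b", "link_down", 350),
]
ordered = order_events(events, samples)
for when, event in ordered:
    print(f"{when:.2f} {event.agent} {event.symptom}")
```

Although agent-a reports a much larger tick count, the agent-b event sorts first once both are projected onto the manager's clock. This shows why raw sysUpTime values from different agents cannot be compared directly.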

Original language: English
Pages (from-to): 859-867
Number of pages: 9
Journal: IEICE Transactions on Communications
Volume: E82-B
Issue number: 6
Publication status: Published - 1999 Jan 1

Keywords

  • Distributed network management
  • Event correlation
  • NMC (Network Management Clock)

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
  • Electrical and Electronic Engineering
