An expandable monitoring system for computing clusters

A.G. Tarasov

Authors

A.G. Tarasov, Computing Center of FEB RAS

Keywords:

computing cluster operation, monitoring systems, virtualization systems

Abstract

An architecture for a new system for monitoring computational resources is considered. Several concepts are proposed for extending the functionality of existing monitoring systems on the basis of event triggers and notification mechanisms. The developed monitoring system is implemented in the Java programming language. Its application to the problems of computing cluster operation, including use with the Xen virtualization system, is discussed. Some experimental results obtained with the developed system are given.

Author Biography

A.G. Tarasov

Computing Center of FEB RAS
• Researcher

