Outline
- Abstract
- Keywords
- I. Introduction
- II. Motivating Scenario
- III. Failure Characteristics
- IV. Fault Tolerance in Cloud Computing
- V. Fault Tolerance Mechanisms and Deployment Level Selection Methodology
- VI. Related Work
- VII. Conclusions
- References
Table of Contents
- Abstract
- Keywords
- 1. Introduction
- 2. Motivating Scenario
- 3. Failure Characteristics
  - A. Overview of the Cloud Infrastructure
  - B. Failure Behavior of System Components
- 4. Fault Tolerance in Cloud Computing
  - A. Fault Tolerance Mechanisms
  - B. Deployment Levels in Cloud Infrastructures
  - C. Examples of Behavior at Different Deployment Levels
- 5. Fault Tolerance Mechanisms and Deployment Level Selection Methodology
  - A. Matching and Selection Process
  - B. Illustrative Example of the Matching and Selection Process
- 6. Related Work
- 7. Conclusions
Abstract
Fault tolerance, reliability and availability in Cloud computing are critical to ensure correct and continuous system operation even in the presence of failures. In this paper, we present an approach to evaluate fault tolerance mechanisms that use virtualization technology to transparently increase the reliability and availability of applications deployed in virtual machines in a Cloud. In contrast to several existing solutions that assume independent failures, we take into account the failure behavior of server components, the network, and power distribution in a typical Cloud computing infrastructure, the correlation between individual failures, and the impact of each failure on users' applications. We use this evaluation to study fault tolerance mechanisms under different deployment contexts, and as the basis for a methodology to identify and select mechanisms that match users' fault tolerance requirements.
Keywords: Fault Tolerance as a Service - Fault Tolerance Management - Infrastructure Clouds
Conclusions
We presented a failure model comprising critical cloud infrastructure resources, namely server components (including the VM and VMM), the network, and power distribution, to analyze the impact of each failure on users' applications. Based on this failure model and on representative fault tolerance mechanisms that function transparently on applications deployed in VM instances, we discussed suitable deployment contexts and quantified the high-level reliability and availability properties of each mechanism. We also presented a methodology to select fault tolerance techniques based on users' requirements. Our future work will mainly focus on extending the models presented in this paper to a larger scale in order to adapt to dynamically changing Cloud computing system attributes.
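The idea of matching mechanisms to user requirements can be illustrated with a minimal sketch. It is not the selection procedure of the paper, which this summary does not reproduce. It assumes that each mechanism in a given deployment context is summarized by one reliability value and one availability value, and that a user states minimum thresholds for both. The mechanism names, the numeric values, and the helpers `availability_from_mtbf` and `select_mechanisms` are hypothetical; the only standard ingredient is the steady-state availability formula MTBF / (MTBF + MTTR).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mechanism:
    """A fault tolerance mechanism with illustrative (made-up) properties."""
    name: str
    reliability: float   # probability of failure-free operation, in [0, 1]
    availability: float  # fraction of time the service is usable, in [0, 1]


def availability_from_mtbf(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def select_mechanisms(candidates, min_reliability, min_availability):
    """Return the candidates meeting both thresholds, highest availability first."""
    matching = [
        m for m in candidates
        if m.reliability >= min_reliability and m.availability >= min_availability
    ]
    return sorted(matching, key=lambda m: m.availability, reverse=True)


# Hypothetical catalogue: all numbers are placeholders, not results from the paper.
CATALOGUE = [
    Mechanism("replication_same_rack", 0.95, availability_from_mtbf(1000.0, 10.0)),
    Mechanism("replication_cross_cluster", 0.99, availability_from_mtbf(5000.0, 5.0)),
    Mechanism("checkpoint_restart", 0.90, availability_from_mtbf(500.0, 20.0)),
]

# Hypothetical user requirement: reliability >= 0.94 and availability >= 0.99.
selected = select_mechanisms(CATALOGUE, min_reliability=0.94, min_availability=0.99)
print([m.name for m in selected])
```

With these placeholder values, the checkpoint-based entry is filtered out because it misses both thresholds, and the two replication entries are returned in order of availability. A fuller model would also have to account for correlated failures across servers, network, and power distribution, which this sketch ignores.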