
Real Time Fault Tolerance

Contents
1 INTRODUCTION
2 BASIC DEFINITIONS
3 FAULTS, ERRORS, AND FAILURES
4 FAULT DURATION
5 DESIGN TECHNIQUES
6 FAULT-TOLERANT TECHNIQUES
7 TYPES OF REDUNDANCY
8 FAULT-TOLERANT ARCHITECTURE
9 REAL-TIME FAULT-TOLERANT SYSTEMS
10 THE LATENCY PROBLEM
11 APPLICATION AREAS
12 SOFTWARE FAULTS
13 DEPENDABILITY MODELLING


1 INTRODUCTION
Welcome to CSE42RFS, Real-Time and Fault-Tolerant Systems!
Course Objectives
Historical Background


COURSE OBJECTIVES
It is assumed that students in this course have not previously been exposed to the terminology and techniques used in the fault-tolerant and real-time computing field. The principal aim of this course is therefore to give students an introduction to the design and analysis of fault-tolerant and real-time systems.

After completing this course, a student will be able to:
  • Comprehend the existing fault-tolerant and real-time computing literature.
  • Describe, explain, generalise, classify, adapt and assess the techniques currently available for designing and analysing reliable fault-tolerant and real-time computer systems.
  • Outline the methodologies available to combat system failures caused by hardware and/or software.
  • Recognise the analysis techniques that can be used to verify that a system has met its requirements.
  • Discuss the system design fundamentals of fault-tolerant and real-time systems used by Australia's leading companies.


HISTORICAL BACKGROUND
Through constant technological innovation, the vacuum tubes of the early computer systems have been replaced by chips with very large scale integration (VLSI) consisting of many thousands of gates. This has resulted in dramatic changes in the scale and complexity of computer systems, in both hardware and software aspects. Such changes have enabled certain tasks that were previously performed manually, or were even impossible, to be carried out by computers:


