Process Migration

DEJAN S. MILOJICIC†, FRED DOUGLIS‡, YVES PAINDAVEINE††, RICHARD WHEELER‡‡ and SONGNIAN ZHOU*
† HP Labs, ‡ AT&T Labs–Research, †† TOG Research Institute, ‡‡ EMC, and *University of Toronto and Platform Computing

Abstract
Process migration is the act of transferring a process between two machines. It enables dynamic load distribution, fault resilience, eased system administration, and data access locality. Despite these goals and ongoing research efforts, migration has not achieved widespread use. With the increasing deployment of distributed systems in general, and distributed operating systems in particular, process migration is again receiving more attention in both research and product development. As high-performance facilities shift from supercomputers to networks of workstations, and with the ever-increasing role of the World Wide Web, we expect migration to play a more important role and eventually to be widely adopted. This survey reviews the field of process migration by summarizing the key concepts and giving an overview of the most important implementations. Design and implementation issues of process migration are analyzed in general, and then revisited for each of the case studies described: MOSIX, Sprite, Mach and Load Sharing Facility. The benefits and drawbacks of process migration depend on the details of implementation and therefore this paper focuses on practical matters. This survey will help in understanding the potentials of process migration and why it has not caught on.

Categories and Subject Descriptors: C.2.4 [Computer-Communication Networks]: Distributed Systems - network operating systems; D.4.7 [Operating Systems]: Organization and Design - distributed systems; D.4.8 [Operating Systems]: Performance - measurements; D.4.2 [Operating Systems]: Storage Management - distributed memories.

Additional Key Words and Phrases: process migration, distributed systems, distributed operating systems, load distribution.
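As a rough illustration of the definition given in the abstract (transferring a process between two machines, here in the service of load distribution), the following toy sketch may help. It is not how MOSIX, Sprite, Mach, or Load Sharing Facility implement migration: `ToyProcess`, `Host`, `migrate`, and `balance_once` are hypothetical names introduced only for this example, the "machines" are in-memory objects, and serialization with `pickle` merely stands in for moving process state over a network.

```python
import pickle
from dataclasses import dataclass, field


@dataclass
class ToyProcess:
    """Hypothetical stand-in for a process: all of its state is explicit and picklable."""
    pid: int
    counter: int = 0  # plays the role of the process's memory and registers

    def step(self) -> None:
        """Advance the computation by one unit of work."""
        self.counter += 1


@dataclass
class Host:
    """Hypothetical machine holding a table of resident processes."""
    name: str
    processes: dict = field(default_factory=dict)

    def load(self) -> int:
        """Crude load metric (an assumption): the number of resident processes."""
        return len(self.processes)


def migrate(pid: int, source: Host, destination: Host) -> None:
    """Move one process: detach it at the source, transfer its state, rebuild it remotely."""
    proc = source.processes.pop(pid)   # the process no longer exists on the source
    payload = pickle.dumps(proc)       # serialized state; stands in for the network transfer
    destination.processes[pid] = pickle.loads(payload)  # reconstructed copy resumes here


def balance_once(hosts: list) -> bool:
    """Naive policy (an assumption): move one process from the busiest to the idlest host."""
    busiest = max(hosts, key=Host.load)
    idlest = min(hosts, key=Host.load)
    if busiest.load() - idlest.load() < 2:
        return False                   # a move would not improve the balance
    pid = next(iter(busiest.processes))
    migrate(pid, busiest, idlest)
    return True
```

In this sketch, a process that has already done some work on one host keeps its accumulated state after `balance_once` moves it, and can continue calling `step()` on the destination. Real systems must also deal with open files, communication channels, and residual dependencies on the source machine, none of which are modeled here.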

