Data Types
Dynamic Dependency Analysis of Ordinary Programs

Todd M. Austin and Gurindar S. Sohi
Computer Sciences Department, University of Wisconsin-Madison
1210 W. Dayton Street, Madison, WI 53706
{austin,sohi}@cs.wisc.edu
Abstract

A quantitative analysis of program execution is essential to the computer architecture design process. With the current trend in architecture of enhancing the performance of uniprocessors by exploiting fine-grain parallelism, first-order metrics of program execution, such as operation frequencies, are not sufficient; characterizing the exact nature of dependencies between operations is essential. This paper presents a methodology for constructing the dynamic execution graph that characterizes the execution of an ordinary program (an application program written in an imperative language such as C or FORTRAN) from a serial execution trace of the program. It then uses the methodology to study parallelism in the SPEC benchmarks. We see that the parallelism can be bursty in nature (periods of high parallelism followed by periods of little parallelism), but the average parallelism is quite high, ranging from 13 to 23,302 operations per cycle. Exposing this parallelism requires renaming of both registers and memory, though renaming registers alone exposes much of this parallelism. We also see that fairly large windows of dynamic instructions would be required to expose this parallelism from a sequential instruction stream.

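The measurement the abstract describes can be illustrated with a toy model. The sketch below uses a hypothetical trace format (not the paper's actual tooling): each traced operation is scheduled one cycle after its last input becomes ready, and with every register and memory location renamed, only true (read-after-write) dependencies constrain the schedule. Average parallelism is then the operation count divided by the schedule depth.

```python
# Toy dynamic-dependency analysis over a serial trace. Each entry is
# (destination, [sources]); names are illustrative. With full renaming,
# output and anti dependencies vanish, so an operation can issue as soon
# as all of its inputs have been produced.

def oracle_schedule(trace):
    ready = {}           # location -> cycle at which its value is available
    depth = 0
    for dest, srcs in trace:
        issue = max((ready.get(s, 0) for s in srcs), default=0)
        finish = issue + 1              # unit latency for every operation
        ready[dest] = finish            # renaming: overwrite freely
        depth = max(depth, finish)
    return depth

# Four operations; r3 depends on r1 and r2, and r4 depends on r3:
trace = [("r1", []), ("r2", []), ("r3", ["r1", "r2"]), ("r4", ["r3"])]
depth = oracle_schedule(trace)
print(len(trace) / depth)   # average parallelism: 4 ops over 3 cycles
```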
1 Introduction

Two things generally affect the advance of computer architectures: a better understanding of program execution, and new or better implementation technologies. It is therefore very important to understand the dynamics of program execution when considering the design of future-generation architectures. To date, most processors have either executed instructions sequentially, or have overlapped the execution of a few instructions from a sequential instruction stream (via pipelining). For such processors, the relevant…

You May Also Find These Documents Helpful

  • Satisfactory Essays

    The second category of fault changes individual instructions in the text segment. These faults are intended to approximate the assembly-level manifestation of real C-level programming…

    • 285 Words
    • 2 Pages
  • Satisfactory Essays

    Nt1310 Unit 1 Study Guide

    • 378 Words
    • 2 Pages

    Multiple threads can interfere with each other when sharing hardware resources such as caches or translation lookaside buffers (TLBs). As a result, execution times of a single thread are not improved but can be degraded, even when only one thread is executing, due to lower frequencies or additional pipeline stages that are necessary to accommodate thread-switching hardware.…

  • Powerful Essays

    En1320 Unit 1 Research Paper 1

    • 27742 Words
    • 111 Pages

    is extensively discussed in other works and is not the focus of this guide. The second…

  • Satisfactory Essays

    Faith Integration

    • 613 Words
    • 3 Pages

    The processor could keep track of what locations are associated with each process and limit access to locations that are outside of a program's extent. By using base and limits registers and by performing a check for every memory access, information regarding the extent of a program's memory could be maintained…

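The base-and-limit scheme described in the excerpt above can be sketched in a few lines. This is an illustrative model only; the function and fault names are hypothetical.

```python
# Sketch of base/limit memory protection: every access is checked against
# the issuing process's base and limit registers, so a process cannot touch
# locations outside its own extent.

class ProtectionFault(Exception):
    pass

def translate(base, limit, logical_addr):
    """Map a logical address to a physical one, trapping on violation."""
    if not 0 <= logical_addr < limit:
        raise ProtectionFault(f"address {logical_addr} outside extent")
    return base + logical_addr

print(translate(base=4096, limit=1024, logical_addr=100))  # -> 4196
```

The check runs on every access, which is why real hardware performs it in the address-translation path rather than in software.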
  • Better Essays

    Schneider, G.M., & Gersting, J.L. (2013). Invitation to Computer Science (6th ed.). Boston, MA: press…

    • 2002 Words
    • 9 Pages
  • Satisfactory Essays

    Chapter 2: Data Manipulation, from Computer Science: An Overview, Eleventh Edition, by J. Glenn Brookshear (Copyright © 2012 Pearson Education, Inc.). Chapter outline: 2.1 Computer Architecture; 2.2 Machine Language; 2.3 Program Execution; 2.4 Arithmetic/Logic Instructions; 2.5 Communicating with Other Devices; 2.6 Other Architectures. Computer architecture: the Central Processing Unit (CPU), or processor, contains an arithmetic/logic unit and a control unit, along with registers (general purpose and special purpose); the CPU, bus, and motherboard tie the system together. Figure 2.1: CPU and main memory connected via a bus…

    • 783 Words
    • 4 Pages
  • Powerful Essays

    Due March 2, 2007. Submitted by: SUDEEPTHI MOGALLA, DEPARTMENT OF COMPUTER SCIENCE, NORTH CAROLINA STATE UNIVERSITY. Email: smogall@ncsu.edu…

    • 4024 Words
    • 17 Pages
  • Good Essays

    There are a million different types of data in the world. Some types we have learned through years of education, and others have yet to be discovered. One question about data types that is asked frequently is "Why do we care?" Though there is no textbook answer, I believe we care because without these various types of data we would not be able to answer even the simplest of questions.…

    • 816 Words
    • 4 Pages
  • Powerful Essays

    Technical University of Denmark Department of Computer Science DK-2800 Lyngby Copenhagen Denmark dat JN@NEUVMl . bitnet…

    • 1773 Words
    • 6 Pages
  • Satisfactory Essays

    Instruction set types

    • 431 Words
    • 2 Pages

    In this form of architecture, instructions are highly encoded in order to improve code density. Because of the way the instructions are packed together, program sizes are smaller, but memory access is slow.…

  • Satisfactory Essays

    The war

    • 1240 Words
    • 5 Pages

    B.A. in Mathematics, Reed College, 1971. M.Sc. 1974, Ph.D. 1979, in Computer Science, Stanford University. Fulbright Senior Scholar Award (1997); Fellow of the Association for Computing Machinery, 2001.…

  • Good Essays

    Branch Delay

    • 1747 Words
    • 7 Pages

    The delayed branch is a difficult topic to grasp. In the DLX 5-stage pipeline it is easy to misunderstand the purpose of filling the branch delay slot with a single necessary instruction. Our focus is to remove the mystery of delayed branches with examples and explanations that clarify the topic. We will consider the case in which machines with delayed branches have a single-instruction delay, as the Hennessy and Patterson book explains in great detail.…

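The single-instruction delay slot discussed above can be made concrete with a toy simulator. The mini-ISA here is hypothetical (not DLX itself); it shows only the defining behavior, namely that the instruction placed right after a branch executes whether or not the branch is taken.

```python
# Toy machine with a one-instruction branch delay slot. Instructions are
# tuples; only "li" (load immediate), "beqz" (branch if zero), and "halt"
# are modeled, which is enough to show the delay-slot effect.

def run(program, regs):
    pc = 0
    while pc < len(program):
        kind, *args = program[pc]
        if kind == "li":                      # load immediate
            reg, val = args
            regs[reg] = val
            pc += 1
        elif kind == "beqz":                  # branch, with one delay slot
            reg, target = args
            taken = regs[reg] == 0
            skind, sreg, sval = program[pc + 1]
            assert skind == "li"              # keep the sketch simple
            regs[sreg] = sval                 # delay slot ALWAYS executes
            pc = target if taken else pc + 2
        elif kind == "halt":
            break
    return regs

prog = [("li", "r1", 0),        # 0
        ("beqz", "r1", 4),      # 1: taken
        ("li", "r2", 7),        # 2: delay slot, executes anyway
        ("li", "r3", 9),        # 3: skipped by the branch
        ("halt",)]              # 4
print(run(prog, {}))  # r2 is set even though the branch was taken; r3 is not
```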
  • Better Essays

    The MMX™ Technology extension to the Intel Architecture is designed to accelerate multimedia and communications software running on Intel Architecture processors (Peleg and Weiser). The technology introduces new data types and instructions that implement a SIMD architecture model, and it is defined in a way that maintains full compatibility with all existing Intel Architecture processors, operating systems, and applications. MMX technology delivers, on average, 1.5 to 2 times the performance for multimedia and communications applications in comparison to running on the same processor without MMX technology. This extension is the most significant addition to the Intel Architecture since the Intel i386; it will be implemented on proliferations of the Pentium processor family and will also appear on future Intel Architecture processors.…

    • 821 Words
    • 4 Pages
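The SIMD model the MMX excerpt describes can be sketched by modeling one packed instruction: a single operation applied to several sub-elements of a 64-bit register at once. The function below imitates a packed unsigned 16-bit saturating add, in the spirit of the PADDUSW instruction; it is a pure-Python illustration, not Intel's implementation.

```python
# Model of a packed unsigned 16-bit saturating add: four independent lanes
# are extracted from each 64-bit value, added, clamped to 0xFFFF, and
# repacked. One "instruction" thus performs four additions.

def paddusw(a, b):
    out = []
    for lane in range(4):
        x = (a >> (16 * lane)) & 0xFFFF
        y = (b >> (16 * lane)) & 0xFFFF
        out.append(min(x + y, 0xFFFF))        # saturate instead of wrapping
    return sum(v << (16 * lane) for lane, v in enumerate(out))

# The low lane 0xFFFF + 0x0001 saturates to 0xFFFF instead of wrapping to 0:
print(hex(paddusw(0x0001_0002_0003_FFFF, 0x0001_0001_0001_0001)))
```

Saturation is the interesting design choice for media data: clamping a too-bright pixel to white is visually far better than wrapping it around to black.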
  • Powerful Essays

    Advance Computer Architecture

    • 66876 Words
    • 268 Pages

    Computer technology has made incredible progress in the roughly 55 years since the first general-purpose electronic computer was created. Today, less than a thousand dollars will purchase a personal computer that has more performance, more main memory, and more disk storage than a computer bought in 1980 for $1 million. This rapid rate of improvement has come both from advances in the technology used to build computers and from innovation in computer design. Although technological improvements have been fairly steady, progress arising from better computer architectures has been much less consistent. During the first 25 years of electronic computers, both forces made a major contribution; but beginning in about 1970, computer designers became largely dependent upon integrated circuit technology. During the 1970s, performance continued to improve at about 25% to 30% per year for the mainframes and minicomputers that dominated the industry. The late 1970s saw the emergence of the microprocessor. The ability of the microprocessor to ride the improvements in integrated circuit technology more closely than the less integrated mainframes and minicomputers led to a higher rate of improvement—roughly 35% growth per year in performance.…

  • Good Essays

    MCSE-011 (MCA 5) IGNOU

    • 1946 Words
    • 8 Pages

    A. J. Bernstein elaborated the study of data dependency and derived conditions for deciding whether instructions or processes can execute in parallel. Bernstein's conditions are based on the following two sets of variables: i) the Read set or input set R1, consisting of the memory locations read by statement S1; ii) the Write set or output set W1, consisting of the memory locations written by statement S1. The sets R1 and W1 need not be disjoint, since the same locations may be both read and written by S1. The following Bernstein parallelism conditions determine whether two statements S1 and S2 can execute in parallel: 1) The locations R1 from which S1 reads and the locations W2 onto which S2 writes must be mutually exclusive; that is, S1 does not read from any memory location onto which S2 writes: R1 ∩ W2 = ∅. 2) Similarly, the locations R2 from which S2 reads and the locations W1 onto which S1 writes must be mutually exclusive: R2 ∩ W1 = ∅. 3) The locations W1 and W2 onto which S1 and S2 write must themselves be mutually exclusive: W1 ∩ W2 = ∅. To show the operation of Bernstein's conditions, consider the following instructions of a sequential program:…

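The three Bernstein conditions above reduce to simple set intersections, which makes them easy to check mechanically. A minimal sketch:

```python
# Bernstein's conditions: statements S1 and S2 may run in parallel iff
#   R1 ∩ W2 = ∅,  R2 ∩ W1 = ∅,  and  W1 ∩ W2 = ∅.

def bernstein_parallel(r1, w1, r2, w2):
    """True when the read/write sets satisfy all three Bernstein conditions."""
    return not (r1 & w2) and not (r2 & w1) and not (w1 & w2)

# S1: a = b + c   reads {b, c}, writes {a}
# S2: d = e * f   reads {e, f}, writes {d}   -> fully independent
print(bernstein_parallel({"b", "c"}, {"a"}, {"e", "f"}, {"d"}))   # True
# S3: e = a + 1   reads {a}, writes {e}      -> flow-depends on S1
print(bernstein_parallel({"b", "c"}, {"a"}, {"a"}, {"e"}))        # False
```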