CZ3001 Advanced Computer Architecture

Description

Undergraduate Computer Science flashcards on CZ3001 Advanced Computer Architecture, created by Deka Auliya on 23/04/2015.

Resource Summary

Question Answer
INTRODUCTION ALU Perform integer arithmetic operations and logical operations
INTRODUCTION CONTROL UNIT Generate control signals for data movement and data storage operations, and direct the ALU to perform the operation specified in the instruction
INTRODUCTION CPU Central Processing Unit: ALU + control unit (CU) + set of registers
INTRODUCTION INSTRUCTION FETCH Fetch instruction from memory and get ready to fetch the next instruction
INTRODUCTION INSTRUCTION DECODE Decode the instruction and fetch its operands from the register file
INTRODUCTION EXECUTE Execute the ALU operation specified by the opcode of the instruction
INTRODUCTION MEMORY ACCESS Perform read/write for load/store operations
INTRODUCTION WRITE BACK Write the result back to register file
PERFORMANCE METRICS AND ENHANCEMENT Execution time = (number of instructions) x (number of cycles per instruction) x (clock cycle time), i.e. Execution time = IC x CPI x Tc, where IC: instruction count, CPI: cycles per instruction, Tc: clock period
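The formula on this card can be checked with a few lines of Python. This is a minimal sketch; the function name and the workload numbers are hypothetical and chosen only for illustration.

```python
def execution_time(instruction_count, cpi, clock_period_s):
    """Execution time = IC x CPI x Tc, returned in seconds."""
    return instruction_count * cpi * clock_period_s


# Hypothetical workload: 2 million instructions, CPI of 2,
# 1 GHz clock (clock period Tc = 1 ns).
t = execution_time(2_000_000, 2, 1e-9)

# Performance is the reciprocal of execution time.
performance = 1 / t

print(f"execution time = {t * 1e3:.1f} ms")
```

Halving any one of the three factors halves the execution time, which is why the deck treats IC, CPI and Tc as separate targets for enhancement.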
PERFORMANCE METRICS AND ENHANCEMENT PERFORMANCE INDICATOR Execution time is the indicator: less execution time means better performance. Performance = 1 / execution time
INTRODUCTION REGISTERS Store the input to be used by ALU and control unit, and sometimes the computed results.
PERFORMANCE METRICS AND ENHANCEMENT AVERAGE CPI Average CPI = (total number of cycles) / (total number of instructions in the program, IC). Total number of cycles = SUM(IC_k x CPI_k) over the instruction categories k. Execution time = IC x average CPI x Tc
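As an illustration of the average CPI calculation, here is a small sketch. The instruction categories and their counts are made-up example values, not data from the course.

```python
def average_cpi(mix):
    """mix: list of (instruction_count_k, cpi_k) pairs, one per category k."""
    total_cycles = sum(ic * cpi for ic, cpi in mix)
    total_instructions = sum(ic for ic, _ in mix)
    return total_cycles / total_instructions


# Hypothetical instruction mix: (count, CPI) for ALU, load/store, branch.
mix = [(50, 1), (30, 2), (20, 3)]

# Total cycles = 50*1 + 30*2 + 20*3 = 170 over 100 instructions.
avg = average_cpi(mix)
print(f"average CPI = {avg:.2f}")
```

Multiplying the result by IC and Tc gives the execution time, as stated on the card.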
PERFORMANCE METRICS AND ENHANCEMENT FACTORS AFFECTING EXECUTION TIME
  1. Clock period: datapath design and implementation, semiconductor technology
  2. Instruction count: ISA, application/program
  3. Cycles per instruction: ISA, datapath, pipelined and parallel hardware
PERFORMANCE METRICS AND ENHANCEMENT CPI AND DATAPATH Multi-cycle datapath (CPI > 1) < instruction pipelining (ideal CPI = 1) < instruction-level parallel processing < VLIW/superscalar (CPI < 1)
PERFORMANCE METRICS AND ENHANCEMENT SPEEDUP Speedup of computer B over computer A (or of B as the enhanced version of A): Speedup = execution time of A / execution time of B
PERFORMANCE METRICS AND ENHANCEMENT ENHANCED VS UNENHANCED
  1. A fraction E of the execution time T is enhanced by a factor of S; U = (1 - E) is the fraction that is not enhanced.
     Time for unenhanced fraction = T x U
     Time for enhanced fraction = T x E / S
     Total execution time for enhanced machine: T' = T x (1 - E) + T x E / S
  2. Speedup = T / T'
  3. Maximum speedup = T / (T x U) = 1 / U, reached when S is very large, so that T' approaches T x U
PERFORMANCE METRICS AND ENHANCEMENT AMDAHL'S LAW Speedup via parallelism is limited by the component of the application that cannot be enhanced ⇒ speedup is limited to a factor of 1/U
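The speedup formula and its 1/U limit can be sketched as follows. The function names and the example fraction are hypothetical; dividing T / T' by T gives the form used in the code.

```python
def amdahl_speedup(enhanced_fraction, factor):
    """Speedup = T / T' = 1 / ((1 - E) + E / S)."""
    unenhanced_fraction = 1 - enhanced_fraction
    return 1 / (unenhanced_fraction + enhanced_fraction / factor)


def max_speedup(enhanced_fraction):
    """Upper bound 1 / U, approached as S grows very large."""
    return 1 / (1 - enhanced_fraction)


# Hypothetical case: 80% of the execution time is sped up 4x.
s = amdahl_speedup(0.8, 4)      # about 2.5
limit = max_speedup(0.8)        # about 5

# Even an enormous enhancement factor cannot beat the 1/U bound.
s_huge = amdahl_speedup(0.8, 1e9)
print(f"speedup = {s:.2f}, bound = {limit:.2f}, S=1e9 gives {s_huge:.4f}")
```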
PERFORMANCE METRICS AND ENHANCEMENT THROUGHPUT RATE Number of outputs produced per clock cycle. Increased by parallel processing and/or pipelined processing
PERFORMANCE METRICS AND ENHANCEMENT RESPONSE TIME/LATENCY Interval between the time the system receives its first input and the time it gives its first output (measured in clock cycles), i.e. the time interval between stimulus and response. Important for hard real-time systems
PERFORMANCE METRICS AND ENHANCEMENT MIPS (MILLION INSTRUCTIONS PER SECOND)
  1. Native MIPS = IC / (execution time x 10^6)
  2. Peak MIPS: the MIPS obtained by choosing the instruction sequence that gives the maximum MIPS
  3. Relative MIPS = (execution time of reference machine x MIPS of reference machine) / (execution time of machine being evaluated)
  4. MIPS varies with the ISA (complexity) and with the choice of instruction mix
  5. A higher MIPS rating does not guarantee better performance
  6. Relative MIPS helps rate evolving designs of the same computer
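The native and relative MIPS definitions can be written out as a short sketch. The function names and the sample figures are hypothetical, used only to show the arithmetic.

```python
def native_mips(instruction_count, execution_time_s):
    """Native MIPS = IC / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


def relative_mips(time_ref_s, mips_ref, time_eval_s):
    """Relative MIPS = (T_ref x MIPS_ref) / T_evaluated."""
    return time_ref_s * mips_ref / time_eval_s


# Hypothetical program: 200 million instructions executed in 2 seconds.
native = native_mips(200_000_000, 2.0)

# Hypothetical reference machine rated at 1 MIPS takes 10 s for the same
# program; the machine being evaluated takes 2 s.
relative = relative_mips(10.0, 1.0, 2.0)

print(f"native MIPS = {native:.1f}, relative MIPS = {relative:.1f}")
```

Because relative MIPS depends only on measured execution times against a fixed reference, it sidesteps the instruction-mix dependence noted in item 4.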
PERFORMANCE METRICS AND ENHANCEMENT FLOPS (FLOATING-POINT OPERATIONS PER SECOND) FLOPS = number of floating-point operations / execution time. In millions: MFLOPS = number of floating-point operations / (execution time x 10^6)
PERFORMANCE METRICS AND ENHANCEMENT PERFORMANCE AVERAGING Compare speedup and execution time across programs
  1. Arithmetic mean (AM), for execution times: average execution time = (1/n) x SUM(T_k x W_k); n: number of programs, k: 1, 2, .., n, W_k: weight of program k (the number of times it is run)
  2. Geometric mean (GM), when ratios are given: average speedup = n-th root of PRODUCT(s_k); s_k: speedup, k: 1, 2, .., n
  3. Harmonic mean (HM), when MIPS is given: average MIPS = n / SUM(1 / m_k); n: number of programs, k: 1, 2, .., n, m_k: MIPS of program k
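The three averages can be compared side by side in a short sketch. The function names and sample values are hypothetical; the weighted arithmetic mean follows the card's formula literally, so with all weights equal to 1 it reduces to the plain mean.

```python
import math


def weighted_arithmetic_mean(times, weights):
    """(1/n) x SUM(T_k x W_k), as written on the card."""
    n = len(times)
    return sum(t * w for t, w in zip(times, weights)) / n


def geometric_mean(speedups):
    """n-th root of the product of the speedups."""
    return math.prod(speedups) ** (1 / len(speedups))


def harmonic_mean(mips_ratings):
    """n / SUM(1 / m_k)."""
    return len(mips_ratings) / sum(1 / m for m in mips_ratings)


# Hypothetical measurements for two programs.
am = weighted_arithmetic_mean([2.0, 4.0], [1, 1])   # execution times in s
gm = geometric_mean([2.0, 8.0])                     # speedup ratios
hm = harmonic_mean([100.0, 300.0])                  # MIPS ratings

print(f"AM = {am:.2f} s, GM = {gm:.2f}, HM = {hm:.2f} MIPS")
```

The harmonic mean of the two MIPS ratings comes out below their arithmetic mean of 200, because the slower program dominates total running time.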
CONCEPT OF PARALLEL PROCESSING AND PIPELINING Total time = total computation time + total communication time
  1. Time for distribution = d x k
  2. Time for computation of local sums
  3. Time for collection of partial results = d x k
  4. Time to add partial results = t x k
  Total time = (1) + (2) + (3) + (4)
  5. Total computation time = time for computation of local sums + time to add partial results
  6. Total communication time = time for distribution + time for collection
  k: number of elements
CONCURRENT COMPUTING: PARALLEL PROCESSING AND PIPELINING PIPELINING (CONCURRENCY) Very fast, but only useful for a large number of similar tasks; latency increases with the number of pipeline stages
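Why pipelining pays off only for many similar tasks can be seen with the standard ideal-pipeline cycle count. This sketch assumes every stage takes exactly one clock cycle and there are no stalls, which the card does not state; the function names are hypothetical.

```python
def unpipelined_cycles(num_tasks, num_stages):
    """Each task runs through all stages before the next one starts."""
    return num_tasks * num_stages


def pipelined_cycles(num_tasks, num_stages):
    """The first task needs num_stages cycles to fill the pipeline;
    every later task then completes one cycle after the previous one."""
    return num_stages + (num_tasks - 1)


def pipeline_speedup(num_tasks, num_stages):
    return unpipelined_cycles(num_tasks, num_stages) / pipelined_cycles(
        num_tasks, num_stages
    )


# With a single task there is no gain; with many tasks the speedup
# approaches the number of stages (5 here).
single = pipeline_speedup(1, 5)
many = pipeline_speedup(1000, 5)
print(f"1 task: {single:.2f}x, 1000 tasks: {many:.2f}x")
```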
PARALLEL PROCESSING AND PIPELINING PARALLELISM Instruction-level parallelism: multiple instructions are executed concurrently. Data parallelism: each processor performs the same task on different data. Task parallelism: each processor performs a different, independent task
PARALLEL PROCESSING AND PIPELINING CLOCK Clock signal: a periodic pulse. The working of processors is governed (synchronized) by the clock. Data movement is governed by the clock
PARALLEL PROCESSING AND PIPELINING CRITICAL PATH The maximum computation time (propagation time) required between any pair of adjacent registers (input to output). The clock period must be greater than or equal to the critical path delay
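The link between critical path and clock period can be sketched as follows. The path delays are invented example values and the function names are hypothetical.

```python
def critical_path_ns(path_delays_ns):
    """Longest register-to-register propagation delay, in nanoseconds."""
    return max(path_delays_ns)


def max_clock_frequency_mhz(path_delays_ns):
    """Highest clock frequency whose period still covers the critical path.
    A period of p ns corresponds to a frequency of 1000 / p MHz."""
    return 1e3 / critical_path_ns(path_delays_ns)


# Hypothetical delays between adjacent register pairs in a datapath.
delays = [2.0, 3.5, 4.0, 1.2]

min_period = critical_path_ns(delays)        # 4.0 ns
f_max = max_clock_frequency_mhz(delays)      # 250 MHz
print(f"minimum clock period = {min_period} ns, f_max = {f_max} MHz")
```

Shortening any path other than the longest one leaves the minimum clock period unchanged.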
POWER DISSIPATION DYNAMIC POWER Mainly due to the charging and discharging of load capacitances when a logic state changes from 0 → 1 or from 1 → 0. Also includes short-circuit power (due to the current from Vdd to GND when both transistors are ON for a short time)
POWER DISSIPATION STATIC POWER Consumed whenever the processor is powered on. Mainly due to leakage current
POWER DISSIPATION DYNAMIC POWER BEHAVIOUR Pdyn = A x C x V² x f, where C: total load capacitance in the circuit (increases with the amount of logic circuitry used), f: clock frequency, V: operating voltage (also called Vdd), A: switching activity factor, the fraction of transistors that switch during a clock cycle (on average)
POWER DISSIPATION STATIC POWER BEHAVIOUR Static power = V x Ileak, where Ileak is the leakage current and V is the operating voltage
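The two power formulas can be tried out numerically. This is a sketch with hypothetical function names and made-up circuit parameters; it mainly shows that dynamic power scales with the square of the operating voltage.

```python
def dynamic_power_w(activity, capacitance_f, voltage_v, frequency_hz):
    """Pdyn = A x C x V^2 x f, in watts."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


def static_power_w(voltage_v, leakage_current_a):
    """Static power = V x Ileak, in watts."""
    return voltage_v * leakage_current_a


# Hypothetical chip: A = 0.1, C = 1 nF, V = 1.0 V, f = 1 GHz, Ileak = 10 mA.
p_dyn = dynamic_power_w(0.1, 1e-9, 1.0, 1e9)
p_static = static_power_w(1.0, 0.01)

# Halving the voltage cuts dynamic power to a quarter (V^2 term),
# but static power only to a half (linear in V).
p_dyn_half_v = dynamic_power_w(0.1, 1e-9, 0.5, 1e9)
p_static_half_v = static_power_w(0.5, 0.01)

print(f"Pdyn = {p_dyn:.3f} W, at half voltage = {p_dyn_half_v:.3f} W")
```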
ISA MEMORY ORGANISATION Byte addressable: each memory address refers to one byte; the memory address is the index into the memory array
ISA WORD Fixed-size group of bits handled as one unit by the processor. How many words can be addressed? 2^(word length in bits) / (number of bytes per word), assuming an address is one word wide. The address of each instruction is a multiple of the number of bytes per word. The address of the next instruction is the current address + the number of bytes per word
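A small sketch of word addressing on a byte-addressable memory follows. It assumes a hypothetical machine with 32-bit addresses and 4-byte words; the function names and the sample address are illustrative only.

```python
def addressable_words(address_bits, bytes_per_word):
    """Number of whole words in a byte-addressable memory of 2^address_bits bytes."""
    return 2 ** address_bits // bytes_per_word


def is_word_aligned(address, bytes_per_word):
    """Instruction addresses are multiples of the word size in bytes."""
    return address % bytes_per_word == 0


def next_instruction_address(current_address, bytes_per_word):
    """Sequential execution: advance by one word."""
    return current_address + bytes_per_word


# Assumed machine: 32-bit addresses, 4 bytes per word.
words = addressable_words(32, 4)                  # 2^30 words
next_addr = next_instruction_address(0x1000, 4)   # 0x1004

print(f"{words} words, next instruction at {hex(next_addr)}")
```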