CZ3001 Advanced Computer Architecture

Question	Answer
INTRODUCTION ALU	Perform integer arithmetic operations and logical operations
INTRODUCTION CONTROL UNIT	Generate control signals for data movement and data storage operations, let ALU perform operations specified in the instruction
INTRODUCTION CPU	Central Processing Unit: ALU + CU + set of registers
INTRODUCTION INSTRUCTION FETCH	Fetch instruction from memory and get ready to fetch the next instruction
INTRODUCTION INSTRUCTION DECODE	Decode instruction, fetch operands from register file
INTRODUCTION EXECUTE	Execute ALU operation specified in the opcode of instruction
INTRODUCTION MEMORY ACCESS	Perform read/write for load/store operations
INTRODUCTION WRITE BACK	Write the result back to register file
PERFORMANCE METRICS AND ENHANCEMENT	Execution time = (number of instructions) x (number of cycles per instruction) x (clock cycle time). Execution time = IC x CPI x Tc. IC: instruction count, CPI: cycles per instruction, Tc: clock period
PERFORMANCE METRICS AND ENHANCEMENT PERFORMANCE INDICATOR	Execution time is the indicator. Less execution time means better performance: performance = 1 / execution time
INTRODUCTION REGISTERS	Store the input to be used by ALU and control unit, and sometimes the computed results.
PERFORMANCE METRICS AND ENHANCEMENT AVERAGE CPI	Average CPI = (total no. of cycles) / (total no. of instructions in program, IC). No. of cycles = SUM(IC k x CPI k) over the instruction categories k. Execution time = IC x average CPI x Tc
PERFORMANCE METRICS AND ENHANCEMENT FACTORS AFFECTING EXECUTION TIME	1. Clock period: datapath design and implementation, semiconductor technology 2. Instruction count: ISA, application/program 3. Cycles per instruction: ISA, datapath, pipelined and parallel HW
PERFORMANCE METRICS AND ENHANCEMENT CPI AND DATAPATH	In order of increasing performance: multi-cycle datapath (CPI > 1) < instruction pipelining (ideal CPI = 1) < instruction-level parallel processing < VLIW/superscalar (CPI < 1)
PERFORMANCE METRICS AND ENHANCEMENT SPEEDUP	Speedup of computer B over computer A (for example, when B is the enhanced version of A): speedup = execution time for A / execution time for B
PERFORMANCE METRICS AND ENHANCEMENT ENHANCED VS UNENHANCED	1. Fraction E is enhanced by a factor of S; U = (1 - E) is the unenhanced fraction. Time for unenhanced fraction = T x U. Time for enhanced fraction = T x E / S. Total execution time for enhanced machine T' = T x (1 - E) + T x E / S. 2. Speedup = T / T'. 3. Maximum speedup = T / (T x U) = 1 / U: when S is very large, T' = T x U
PERFORMANCE METRICS AND ENHANCEMENT AMDAHL'S LAW	Speedup via parallelism is limited by the component of the application that cannot be enhanced ⇒ speedup is limited to a factor of 1/U
PERFORMANCE METRICS AND ENHANCEMENT THROUGHPUT RATE	Number of outputs produced per clock cycle. Increased by parallel processing and/or pipelined processing
PERFORMANCE METRICS AND ENHANCEMENT RESPONSE TIME/LATENCY	Interval between the time the system receives the first input and the time it gives the first output (measured in clock cycles); the time interval between stimulation and response. Important for hard real-time systems
PERFORMANCE METRICS AND ENHANCEMENT MIPS (MILLION INSTRUCTIONS PER SECOND)	1. Native MIPS = IC / (execution time x 10^6) 2. Peak MIPS = MIPS for the instruction sequence that gives the maximum MIPS 3. Relative MIPS = (execution time ref / execution time of machine being evaluated) x MIPS ref 4. Varies with ISA (complexity) and choice of instruction mix 5. A higher MIPS does not guarantee better performance 6. Relative MIPS helps rate evolving designs of the same computer
PERFORMANCE METRICS AND ENHANCEMENT FLOPS (FLOATING-POINT OPERATIONS PER SECOND)	FLOPS = number of floating-point operations / execution time; MFLOPS = number of floating-point operations / (execution time x 10^6)
PERFORMANCE METRICS AND ENHANCEMENT PERFORMANCE AVERAGING	Compare the speedup and execution time. 1. Use of AM (arithmetic mean): average execution time = (1/n) x sum(T k x W k); n: number of programs, k: 1, 2, .., n, W k: weight of program k (number of times it is run) 2. Use of GM (geometric mean), when ratios are given: average speedup = n-th root of product(s k); s: speedup, k: 1, 2, .., n 3. Use of HM (harmonic mean), when MIPS is given: average MIPS = n / sum(1 / m k); n: number of programs, k: 1, 2, .., n, m: MIPS
CONCEPT OF PARALLEL PROCESSING AND PIPELINING	Total time = total computation time + total communication time. 1. Time for distribution = d x k 2. Time for computation of local sum 3. Time for collection of partial results = d x k 4. Time to add partial results = t x k. Total time = sum of items 1 to 4. 5. Total computation time = time for computation of local sum + time to add partial results 6. Total communication time = time for distribution + time for collection. k: number of elements
CONCURRENT COMPUTING: PARALLEL PROCESSING AND PIPELINING PIPELINING (CONCURRENCY)	Very fast. Only useful for a large number of similar tasks. Latency increases with the number of pipeline stages
PARALLEL PROCESSING AND PIPELINING PARALLELISM	INSTRUCTION-LEVEL PARALLELISM: multiple instructions are executed concurrently. DATA PARALLELISM: each processor performs the same task on different data. TASK PARALLELISM: each processor performs a different, independent task
PARALLEL PROCESSING AND PIPELINING CLOCK	Clock signal: a periodic pulse. The operation of the processor is governed (synchronized) by the clock. Data movement is governed by the clock
PARALLEL PROCESSING AND PIPELINING CRITICAL PATH	The maximum computation time (propagation time) required between any pair of adjacent registers (input to output). The clock period must be greater than or equal to the critical path delay
POWER DISSIPATION DYNAMIC POWER	Mainly due to charging and discharging of pseudo capacitors when a logic state changes from 0 → 1 or from 1 → 0. Also includes short-circuit power (due to the current from Vdd to GND when both transistors are ON for a short time)
POWER DISSIPATION STATIC POWER	Consumed any time the processor is powered on. Mainly due to leakage current
POWER DISSIPATION DYNAMIC POWER BEHAVIOUR	Pdyn = A x C x V² x f. C: total load capacitance in the circuit, which increases with the amount of logic circuitry used. f: clock frequency. V: operating voltage (also called Vdd). A: switching activity factor, the average fraction of transistors that switch during a clock cycle
POWER DISSIPATION STATIC POWER BEHAVIOUR	Static power = V x Ileak, where Ileak is the leakage current and V is the operating voltage
ISA MEMORY ORGANISATION	Byte addressable: each memory address refers to one byte; the memory address is the index into the memory array
ISA WORD	Fixed-size group of bits handled as one unit by the processor. How many words can be addressed? 2^(word-length) / #bytes in group. The address of each instruction is a multiple of #bytes in group. The address of the next instruction is current address + #bytes in group
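
The performance and power formulas on the cards above (execution time, average CPI, the enhanced vs. unenhanced speedup, the geometric and harmonic means, and dynamic power) can be checked numerically. The following is a minimal Python sketch; the function names and all sample numbers are illustrative assumptions and do not come from the original deck.

```python
import math

def execution_time(ic, cpi, tc):
    """Execution time = IC x CPI x Tc."""
    return ic * cpi * tc

def average_cpi(mix):
    """Average CPI from a list of (IC_k, CPI_k) pairs, one per category k."""
    total_cycles = sum(ic_k * cpi_k for ic_k, cpi_k in mix)
    total_ic = sum(ic_k for ic_k, _ in mix)
    return total_cycles / total_ic

def enhanced_speedup(e, s):
    """Speedup = T / T' with T' = T x (1 - E) + T x E / S (T cancels out)."""
    return 1.0 / ((1.0 - e) + e / s)

def geometric_mean(speedups):
    """n-th root of the product of n speedup ratios."""
    return math.prod(speedups) ** (1.0 / len(speedups))

def harmonic_mean(mips):
    """n / sum(1 / m_k), used when averaging rates such as MIPS."""
    return len(mips) / sum(1.0 / m for m in mips)

def dynamic_power(a, c, v, f):
    """Pdyn = A x C x V^2 x f."""
    return a * c * v ** 2 * f

if __name__ == "__main__":
    # Hypothetical instruction mix: (instruction count, CPI) per category.
    mix = [(50, 1), (30, 2), (20, 3)]
    cpi = average_cpi(mix)                        # 170 cycles / 100 instructions
    print("average CPI:", cpi)
    print("execution time (s):", execution_time(100, cpi, 1e-9))
    # 80% of the run time is enhanced by a factor of 4; the limit is 1/U = 5.
    print("speedup:", enhanced_speedup(0.8, 4))
    print("GM of speedups 2 and 8:", geometric_mean([2, 8]))
    print("HM of 100 and 300 MIPS:", harmonic_mean([100, 300]))
    # Halving V at the same A, C and f cuts dynamic power to one quarter.
    print("power ratio:", dynamic_power(0.1, 1e-9, 0.5, 1e9)
          / dynamic_power(0.1, 1e-9, 1.0, 1e9))
```

With these sample inputs the average CPI is 1.7 and the speedup for E = 0.8, S = 4 is about 2.5. However large S becomes, the speedup stays below 1/U = 5, which is the limit stated on the Amdahl's law card.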


Created by Deka Auliya more than 9 years ago