CPSC313 Midterm Sample

Question 1 of 31

1

1a) The classic RISC pipeline consists of 5 stages. What are they?

Select one of the following:

Fetch, Decode, Execute, Memory, Write back
Fetch, Register, Execute, Memory, Write back
Load, Register, Execute, Memory, Write-read
Fetch, Register, Execute, Memory, Load-use
Load, Register, Execute, Memory, Control

Explanation

Question 2 of 31

1

1b) Fetch is responsible for which actions? Check all that apply.

Select one or more of the following:

Use pc to read next instruction from memory into instruction register
Determine instruction length and extract pieces of instruction
Advance the pc to the address of next instruction in sequence
Read values from register file
Update the pc for taken jump instructions

Explanation

Question 3 of 31

1

1c) Which of the following is a decode action?

Select one of the following:

read values from register
determine instruction length and extract pieces of instruction
access main memory
decode the difficulty of this course into a binary representation of tears and not tears

Explanation

Question 4 of 31

1

1d) Which of the following is an execute stage action?

Select one of the following:

perform ALU operations and determine whether jumps are taken
set the pc for taken jumps
access main memory
cry profusely into a bowl of alphabits hoping they'll arrange themselves into the proper values for the next stage

Explanation

Question 5 of 31

1

1e) Which of the following is a memory stage action?

Select one of the following:

access main memory, duhhhhhhhhhh
access main memory and write the values back to the register file, duhhhhhhhhhhh
when in doubt choose c
advance the pc to the next memory address in the sequence

Explanation

Question 6 of 31

1

1f) Which of the following is a write back stage action?

Select one of the following:

write values back to register file
write values back to main memory
write values back to the pc
forge your computer science degree when failing this course leads you into months of crippling depression

Explanation

Question 7 of 31

1

2a) Why is the memory stage after the execute stage? Include a Y86 instruction that could not be implemented if you reversed the order.

Select one of the following:

Execute needs to perform address calculations. mrmovl could not use base plus displacement addressing
Execute needs to perform math on register values. None of the op instructions would work (i.e. addl)
Memory needs to write a value from memory into a register. mrmovl would not work if the order was reversed.
Memory needs to read from memory in order for execute to work properly. ret would not work if the order was reversed.

Explanation

Question 8 of 31

2

2b) Why is the execute stage after the decode stage? Include a Y86 instruction that could not be implemented if you reversed the order.

Select one of the following:

Execute needs to be able to perform math on register values. If the order were reversed, none of the op instructions would work (i.e. addl)
Execute needs to perform address calculations. If the order were reversed, mrmovl would not work.
Decode needs to write a value from memory into a register. mrmovl would no longer work if the order was reversed.
Decode needs to read the values before execute stage to avoid stalling. popl would no longer work if the order was reversed.

Explanation

Question 9 of 31

2

2c) Why is the write-back stage after the memory stage? Include a Y86 instruction that could not be implemented if you reversed the order.

Select one of the following:

Memory needs to perform math on memory values. In the reverse order, none of the op instructions would work (i.e. addl)
Memory needs to perform address calculations. In the reverse order, mrmovl and rmmovl would not work.
Write back needs to be able to write a value from memory into a register. In the reverse order, mrmovl would not work.

Explanation

Question 10 of 31

1

3a) RISC instruction sets do not allow ALU operations to read from memory. Explain how the structure of the pipeline leads to this restriction.

Select one of the following:

Since M comes after E, by the time we read the value from memory, it's too late.
Since M comes before E, by the time we read the value from memory, it's too late.
Since E comes after M, by the time we read the value from memory, it's too late.

Explanation

Question 11 of 31

2

3b) Describe how the pipeline could be modified to lift this restriction. You may not change the number
of pipeline stages or their basic function. Your revised pipeline does not need to be able to implement
every Y86 instruction (in fact it would not be able to) but it must be able to execute instructions like:

addl (%eax), %ebx # r[ebx] = r[ebx] + m[r[eax]]

Select one of the following:

Swap the order of E and M
Allow for M to forward data back to E
Stall E until M has the correct data

Explanation

Question 12 of 31

2

3c) This revised pipeline must place some restrictions on the revised ISA that it implements. One of
the impacts is on mrmovl and rmmovl. Describe the problem and modify the instructions so that
they will work with the new pipeline. Note that the revised versions need not be as powerful as the
original ones, but they must still load and store between memory and a register.

Select one of the following:

We wouldn't be able to handle base plus displacement addressing because E would come after M. To solve this, we simply eliminate base plus displacement addressing.
Memory access is rendered useless. To solve this, we simply eliminate the memory stage.
The write back stage would no longer be able to determine the correct register to move data to. To solve this, we would simply add another stage input to account for that lost addressing.

Explanation

Question 13 of 31

2

Consider the following five-stage pipeline with stage delays (including overheads) of 34 ps,
42 ps, 75 ps, 50 ps and 18 ps.

What is the maximum clock rate (i.e., fastest) acceptable for this pipeline?
❌

Drag and drop to complete the text.

1 cycle / 75ps = 1000/75 GHz

1 cycle / 34ps = 1000/34 GHz

1 cycle / 42ps = 1000/42 GHz

1 cycle / 50ps = 1000/50 GHz

1 cycle / 18ps = 1000/18 GHz

Explanation

Question 14 of 31

2

Consider the following five-stage pipeline with stage delays (including overheads) of 34 ps,
42 ps, 75 ps, 50 ps and 18 ps.

What is the maximum throughput of this pipeline?
❌

Drag and drop to complete the text.

1/75 instructions per picosecond

1/34 instructions per picosecond

1/42 instructions per picosecond

1/50 instructions per picosecond

1/18 instructions per picosecond

Explanation

Question 15 of 31

2

4c) Consider the following five-stage pipeline with stage delays (including overheads) of 34 ps,
42 ps, 75 ps, 50 ps and 18 ps.

What, if anything, might cause the actual throughput of programs to be lower than this maximum?

Select one of the following:

pipeline bubbles
nothing would cause that, the maximum is actually also the minimum
jumps and returns
stack depletion

Explanation

Question 16 of 31

2

Consider the following five-stage pipeline with stage delays (including overheads) of 34 ps,
42 ps, 75 ps, 50 ps and 18 ps.

What is the minimum instruction latency of this pipeline?
❌

Drag and drop to complete the text.

75ps + 75ps + 75ps + 75ps + 75ps

34ps + 42ps + 50ps + 75ps + 18ps

18ps + 18ps + 18ps + 18ps + 18ps

Explanation
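The three timing questions above all follow from the slowest stage. The Python sketch below works through that arithmetic; it assumes every stage is given one full clock period (so the period is set by the slowest stage), and the function name `pipeline_timing` is made up for illustration.

```python
from fractions import Fraction

def pipeline_timing(stage_delays_ps):
    """Return (period_ps, clock_ghz, throughput_per_ps, latency_ps).

    Assumption: one clock period per stage, and the period must be
    long enough for the slowest stage to finish.
    """
    period_ps = max(stage_delays_ps)
    # 1 cycle per picosecond is 1000 GHz, so 1/period_ps cycles/ps
    # is 1000/period_ps GHz.
    clock_ghz = Fraction(1000, period_ps)
    # At best, one instruction completes every clock period.
    throughput_per_ps = Fraction(1, period_ps)
    # An instruction spends one full period in each stage.
    latency_ps = period_ps * len(stage_delays_ps)
    return period_ps, clock_ghz, throughput_per_ps, latency_ps

period, ghz, tput, latency = pipeline_timing([34, 42, 75, 50, 18])
print(period, float(ghz), tput, latency)
```

Exact fractions are used so the results can be compared directly with the answer choices, which are written as ratios.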

Question 17 of 31

3

5a) Consider the following piece of Y86 assembly code:
[0] addl %eax, %ebx
[1] irmovl $1, %eax
[2] mrmovl (%ebx), %ebx
[3] addl %ebx, %eax

List all of the output dependencies present in this code.

Select one or more of the following:

0 and 2 on %ebx
0 and 1 on %eax
0 and 3 on %ebx
1 and 2 on %ebx
1 and 3 on %eax
2 and 3 on %ebx

Explanation

Question 18 of 31

1

5a) Consider the following piece of Y86 assembly code:
[0] addl %eax, %ebx
[1] irmovl $1, %eax
[2] mrmovl (%ebx), %ebx
[3] addl %ebx, %eax

List all of the anti dependencies present in this code.

Select one or more of the following:

0 and 1 on %eax
0 and 2 on %ebx
0 and 3 on %ebx
1 and 2 on %ebx
1 and 3 on %eax
2 and 3 on %ebx

Explanation

Question 19 of 31

1

5a) Consider the following piece of Y86 assembly code:
[0] addl %eax, %ebx
[1] irmovl $1, %eax
[2] mrmovl (%ebx), %ebx
[3] addl %ebx, %eax

List all of the causal dependencies present in this code.

Select one or more of the following:

0 and 1 on %eax
0 and 2 on %ebx
0 and 3 on %eax
1 and 2 on %ebx
1 and 3 on %eax
2 and 3 on %ebx

Explanation
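The dependency questions above can be checked mechanically once each instruction is reduced to the registers it reads and writes. The sketch below is one possible convention, not necessarily the one used in the course: a dependency is reported only against the most recent earlier write to a register, and an instruction that both reads and writes a register counts as a reader for anti dependencies. The function name and the read/write sets are written by hand for illustration.

```python
def find_dependencies(instrs):
    """instrs: list of (reads, writes) register-name sets in program order.

    Returns (causal, anti, output) lists of (earlier, later, register).
    """
    causal, anti, output = [], [], []
    for j, (reads_j, writes_j) in enumerate(instrs):
        for reg in sorted(reads_j | writes_j):
            # Walk backwards until the most recent write to reg.
            for i in range(j - 1, -1, -1):
                reads_i, writes_i = instrs[i]
                if reg in writes_j and reg in reads_i:
                    anti.append((i, j, reg))        # read, then write
                if reg in writes_i:
                    if reg in reads_j:
                        causal.append((i, j, reg))  # write, then read
                    if reg in writes_j:
                        output.append((i, j, reg))  # write, then write
                    break  # older accesses are hidden by this write
    return causal, anti, output

program = [
    ({"eax", "ebx"}, {"ebx"}),  # [0] addl %eax, %ebx
    (set(), {"eax"}),           # [1] irmovl $1, %eax
    ({"ebx"}, {"ebx"}),         # [2] mrmovl (%ebx), %ebx
    ({"eax", "ebx"}, {"eax"}),  # [3] addl %ebx, %eax
]
causal, anti, output = find_dependencies(program)
print("causal:", causal)
print("anti:  ", anti)
print("output:", output)
```

A different convention (for example, reporting every earlier writer rather than only the most recent one) would list additional pairs, so the output should be read alongside the definition used in lectures.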

Question 20 of 31

3

5b) Consider the following piece of Y86 assembly code:
[0] addl %eax, %ebx
[1] irmovl $1, %eax
[2] mrmovl (%ebx), %ebx
[3] addl %ebx, %eax

List all of the data hazards present in this code for the Y86 five-stage pipeline.

Select one or more of the following:

All the dependencies
All the causal dependencies
All the output dependencies
All the anti dependencies
Just the causal and output dependencies
Just the causal and anti dependencies
Just the anti and output dependencies

Explanation

Question 21 of 31

2

5c) Consider the following piece of Y86 assembly code:
[0] addl %eax, %ebx
[1] irmovl $1, %eax
[2] mrmovl (%ebx), %ebx
[3] addl %ebx, %eax

For each hazard, indicate the total number of bubbles added by the Pipe-Minus implementation

Select one or more of the following:

2 for dependency between 0 and 2 on %ebx
1 for dependency between 0 and 2 on %ebx
3 for dependency between 0 and 2 on %ebx
0 for dependency between 1 and 3 on %eax
1 for dependency between 1 and 3 on %eax
2 for dependency between 1 and 3 on %eax
0 for dependency between 0 and 2 on %ebx
3 for dependency between 1 and 3 on %eax
3 for dependency between 2 and 3 on %ebx
2 for dependency between 2 and 3 on %ebx

Explanation

Question 22 of 31

2

5d) Consider the following piece of Y86 assembly code:
[0] addl %eax, %ebx
[1] irmovl $1, %eax
[2] mrmovl (%ebx), %ebx
[3] addl %ebx, %eax

For each hazard, indicate the total number of bubbles added by the Pipe implementation

Select one or more of the following:

1 for dependency between 2 and 3 on %ebx
2 for dependency between 2 and 3 on %ebx
3 for dependency between 2 and 3 on %ebx
0 for dependency between 2 and 3 on %ebx

Explanation

Question 23 of 31

10

Describe the implementation of the following new instruction for Y86 Seq as you did in
Homework 2. This instruction is similar to mrmovl except that it adds 4 to rB and does not have a static-displacement operand. It would be useful, for example, for iterating over an array of integers.

Syntax:
mrmovincl (rB), rA
Semantics:
r[rA] <= m[r[rB]]
r[rB] <= r[rB] + 4
Memory Layout:
| 5 | F | rA | rB |
Describe each stage using a relaxed syntax similar to the one shown below. The Fetch and PC Update stages are complete. List only the code that would be added for this instruction.

Fetch:
f.iCd = m[F.pc] >> 4
f.iFn = m[F.pc] & 0xf
f.rA = m[F.pc+1] >> 4
f.rB = m[F.pc+1] & 0xf
f.valP = F.pc + 2
Decode:
d.srcA = ❌
d.srcB = ❌
d.dstE = D.rB
d.dstM = ❌
d.valB = ❌
Execute:
e.aluA = 4
e.aluB = ❌
e.aluFun = ❌
e.valE = ❌
Memory:
m.valM = ❌
Write Back:
r[W.dstE] = ❌
r[W.dstM] = ❌
PC Update (pseudo stage):
w.pc = W.valP

Drag and drop to complete the text.

R_NONE

D.rB

D.rA

r[d.srcB]

E.valB

A_ADD

A_SUB

A_INC

4 + E.valB

m4[M.valB]

m4[M.srcB]

W.valE

W.valM

R_ADD

R_SUB

Explanation
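To make the stated semantics of mrmovincl concrete, here is a small Python model of its architectural effect only (not of the pipeline stages). Registers and memory are modelled as dictionaries, which is an assumption made for brevity; the case rA == rB is not specified by the question, and this sketch arbitrarily lets the loaded value win.

```python
def mrmovincl(regs, mem, rA, rB):
    """Model of: r[rA] <= m[r[rB]]; r[rB] <= r[rB] + 4."""
    addr = regs[rB]                      # address is the OLD value of rB
    loaded = mem[addr]                   # 4-byte word read from memory
    regs[rB] = (addr + 4) & 0xFFFFFFFF   # advance pointer by one integer
    regs[rA] = loaded                    # written last (arbitrary if rA == rB)

# Hypothetical usage: walk a 3-element array and accumulate into eax.
regs = {"eax": 0, "ebx": 0x100, "edx": 0}
mem = {0x100: 10, 0x104: 20, 0x108: 30}
for _ in range(3):
    mrmovincl(regs, mem, "edx", "ebx")
    regs["eax"] += regs["edx"]
print(regs["eax"], hex(regs["ebx"]))
```

The loop shows why the instruction is convenient for arrays: the load and the pointer increment happen in a single instruction.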

Question 24 of 31

6

Write Y86 assembly code that computes the sum of an array of integers where the address of
the array is stored in %ebx and the length of the array is in %ecx. Place the sum in %eax.

irmovl $0, ❌ #SET SUM TO 0
irmovl $❌, %edi #HOW MUCH WILL THE LENGTH DECREMENT BY AS YOU ITERATE?
irmovl $❌, %esi #AN ARRAY OF INTEGERS HAS ELEMENTS WITH A SIZE OF?
❌ %ecx, %ecx #IS THE LENGTH OF THE ARRAY 0?
jle L1
L0: ❌ ❌, %edx #STORE ADDRESS OF ARRAY INTO %edx
addl ❌, %eax #ADD VALUE OF ELEMENT IN THE ARRAY TO SUM
addl ❌, %ebx #INCREMENT THE ARRAY POINTER
subl ❌, %ecx #DECREMENT THE LENGTH OF THE ARRAY
❌ L0 #CHECK IF LENGTH IS STILL GREATER THAN ZERO
L1:

Drag and drop to complete the text.

%eax

1

2

3

4

5

andl

mrmovl

rmmovl

rrmovl

(%ebx)

%ebx

%edx

%esi

%edi

jg

jge

jle

jl

Explanation
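As a reference for the control flow being asked for, the following Python sketch mirrors the shape of the assembly template: a zero-length check before the loop, then a loop that tests the remaining length at the bottom. Memory is modelled as a dictionary from addresses to integers, which is an assumption for illustration, and 32-bit overflow is ignored.

```python
def sum_array(mem, base, length):
    """Sum `length` 4-byte integers starting at address `base`."""
    total = 0            # running sum
    ptr = base           # current element address
    remaining = length   # elements left to visit
    if remaining <= 0:   # skip the loop entirely for an empty array
        return total
    while True:
        total += mem[ptr]   # add the current element
        ptr += 4            # step to the next integer
        remaining -= 1      # one fewer element to go
        if remaining <= 0:  # loop test sits at the bottom
            break
    return total

data = {0x200: 1, 0x204: 2, 0x208: 3}
print(sum_array(data, 0x200, 3))
print(sum_array(data, 0x200, 0))
```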

Question 25 of 31

2

8a) What is spatial locality and why is it important?

Select one or more of the following:

Spatial locality exists when memory accesses are clustered together to nearby memory addresses
Caches exploit spatial locality by storing data in multi-byte blocks.
Spatial locality exists when the same memory location is accessed repeatedly.
Caches exploit spatial locality by retaining recently accessed blocks in the cache.
Spatial locality exists between a pair of instructions when there are no dependencies between them and thus their execution order does not matter.
Pipelined (and super-scalar) processor architectures exploit spatial locality.
Spatial locality exists when programs explicitly indicate that threads of execution can be executed in parallel by either directly or indirectly creating multiple threads that can execute concurrently.
Multi-core processors (and hyper-threading) exploit spatial locality.

Explanation

Question 26 of 31

2

8b) What is temporal locality and why is it important?

Select one or more of the following:

Temporal locality exists when memory accesses are clustered together to nearby memory addresses.
Caches exploit temporal locality by storing data in multi-byte blocks.
Temporal locality exists when the same memory location is accessed repeatedly.
Caches exploit temporal locality by retaining recently accessed blocks in the cache.
Temporal locality exists between a pair of instructions when there are no dependencies between them and thus their execution order does not matter.
Pipelined (and super-scalar) processor architectures exploit temporal locality.
Temporal locality exists when programs explicitly indicate that threads of execution can be executed in parallel by either directly or indirectly creating multiple threads that can execute concurrently.
Multi-core processors (and hyper-threading) exploit temporal locality.

Explanation
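The effect of the two kinds of locality on a cache can be seen with a toy simulation. The sketch below models a small direct-mapped cache; the block size, line count, and access patterns are all made-up parameters chosen for illustration, not values from the course.

```python
def hit_rate(addresses, block_size=16, num_lines=8):
    """Fraction of accesses that hit in a toy direct-mapped cache."""
    lines = [None] * num_lines        # block number stored in each line
    hits = 0
    for addr in addresses:
        block = addr // block_size    # which memory block holds addr
        index = block % num_lines     # which cache line it maps to
        if lines[index] == block:
            hits += 1
        else:
            lines[index] = block      # miss: load the whole block
    return hits / len(addresses)

sequential = [4 * i for i in range(64)]   # neighbouring words (spatial)
repeated = [0x40] * 64                    # same word again (temporal)
strided = [128 * i for i in range(64)]    # far apart, never reused

print(hit_rate(sequential), hit_rate(repeated), hit_rate(strided))
```

With these parameters the sequential walk misses only on the first word of each block, the repeated access misses only once, and the strided pattern never hits.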

Question 27 of 31

2

8c) What is instruction-level parallelism and why is it important?

Select one or more of the following:

Instruction-level parallelism exists between a pair of instructions when there are no dependencies between them and thus their execution order does not matter.
Pipelined (and super-scalar) processor architectures exploit instruction-level parallelism.
Instruction-level parallelism exists when programs explicitly indicate that threads of execution can be executed in parallel by either directly or indirectly creating multiple threads that can execute concurrently.
Multi-core processors (and hyper-threading) exploit instruction-level parallelism.
Instruction-level parallelism exists when the same memory location is accessed repeatedly.
Caches exploit instruction-level parallelism by storing data in multi-byte blocks.
Instruction-level parallelism exists when memory accesses are clustered together to nearby memory addresses.
Caches exploit instruction-level parallelism by retaining recently accessed blocks in the cache.

Explanation

Question 28 of 31

2

8d) What is thread-level parallelism and why is it important?

Select one or more of the following:

Thread-level parallelism exists when programs explicitly indicate that threads of execution can be executed in parallel by either directly or indirectly creating multiple threads that can execute concurrently.
Multi-core processors (and hyper-threading) exploit thread-level parallelism.
Thread-level parallelism exists between a pair of instructions when there are no dependencies between them and thus their execution order does not matter.
Pipelined (and super-scalar) processor architectures exploit thread-level parallelism.
Thread-level parallelism exists when the same memory location is accessed repeatedly.
Caches exploit thread-level parallelism by retaining recently accessed blocks in the cache.
Caches exploit thread-level parallelism by storing data in multi-byte blocks.
Thread-level parallelism exists when memory accesses are clustered together to nearby memory addresses.

Explanation

Question 29 of 31

4

Polymorphic dispatch common to object-oriented languages like Java is implemented using a
double-indirect call instruction that reads an address from memory and then jumps to it. Explain why it is
challenging to implement such an instruction without stalling in a pipelined processor.

Select one of the following:

The address of the next instruction to execute after the call is not known to the pipeline until the call instruction exits the memory stage and so predicting which instructions should be F, D, and E at this point is impossible without retaining execution history.
If the predPC is overwritten with the address of a call at a stage earlier than memory, the instructions already in the pipeline will bubble.
Couldn't we all just down a bottle of bleach instead of do this midterm?

Explanation

Question 30 of 31

4

Consider the following instruction-execution frequencies for a program running on the
standard Y86 Pipe processor. The table shows, for example, that 7% of all instructions executed read the value of a register immediately after the preceding instruction modified that register by writing into it a value that came from memory.
7% read register immediately after an instruction writes into that register a value it reads from memory
6% read register immediately after an instruction writes into that register a value computed in Execute
12% conditional jump that is taken
8% conditional jump that is not taken
5% call
5% ret
57% the remaining introduce no bubbles

What is the average cycles per instruction for this execution?

Select one of the following:

CPI = 1 + 0.07 + 0.08 * 2 + 0.05 * 3 = 1.38
Because load-use causes 1 bubble, not taken penalties cause 2 bubbles, and ret instructions cause 3 bubbles
CPI = 1 + 0.07 * 2 + 0.08 * 3 + 0.05 = 1.43
Because load-use causes 2 bubbles, not taken penalties cause 3 bubbles, and ret instructions cause 1 bubble
CPI = 1 + 0.06 * 2 + 0.12 * 3 + 0.05 = 1.41
Because forwarding causes 2 bubbles, taken penalties cause 3 bubbles, and call instructions cause 1 bubble
CPI = 1 + 0.06 + 0.12 * 2 + 0.05 * 3 = 1.45
Because forwarding causes 1 bubble, taken penalties cause 2 bubbles, and call instructions cause 3 bubbles

Explanation

Question 31 of 31

4

Consider the following instruction-execution frequencies for a program running on the
standard Y86 Pipe processor. The table shows, for example, that 7% of all instructions executed read the value of a register immediately after the preceding instruction modified that register by writing into it a value that came from memory.
7% read register immediately after an instruction writes into that register a value it reads from memory
6% read register immediately after an instruction writes into that register a value computed in Execute
12% conditional jump that is taken
8% conditional jump that is not taken
5% call
5% ret
57% the remaining introduce no bubbles

What is the throughput of this execution on a 3-GHz processor (i.e., 3*10^9 cycles per second)? (You may go back one question to see what you chose for the answer.)

Select one of the following:

(3*10^9 / 1.38) instructions per second
(3*10^9 / 1.41) instructions per second
(3*10^9 / 1.43) instructions per second
(3*10^9 / 1.45) instructions per second
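The CPI and throughput calculations in the last two questions share one formula: start from one cycle per instruction and add, for each bubble-causing event, its frequency times its bubble count. The sketch below assumes the textbook Y86 PIPE penalties (1 bubble for load-use, 2 for a mispredicted jump under predict-taken, 3 for ret); the quiz itself does not state these, so treat them as an assumption and substitute the values from your course notes if they differ.

```python
def average_cpi(bubble_events):
    """bubble_events: iterable of (frequency, bubbles_per_occurrence)."""
    return 1.0 + sum(freq * bubbles for freq, bubbles in bubble_events)

# Frequencies come from the table above; penalties are the assumed
# textbook PIPE values, not something the quiz states.
events = [
    (0.07, 1),  # load-use hazard
    (0.08, 2),  # conditional jump not taken (mispredicted)
    (0.05, 3),  # ret
]
cpi = average_cpi(events)
instructions_per_second = 3e9 / cpi  # 3 GHz clock
print(round(cpi, 2), instructions_per_second)
```

Events that forwarding resolves without stalling, correctly predicted jumps, and call contribute zero bubbles under this assumption, so they are left out of the list.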

	Created by Zim Brightwood about 8 years ago

For when life gives you all the lemons
