无代写-CS356
时间:2021-05-05
CS356: Introduction to Computer Systems
Final (Thu Dec 12, 2019, 11-1pm / 2-4pm)
Full Name: Section: ä Morning ä Afternoon
USC Email: USC Student ID:
Question: 1 2 3 4 5 6 7 8 9 10 Total
Points: 24 10 6 6 8 6 5 9 12 14 100
Score:
1
CSCI 356 Final (Dec 12, 2019) Page 2 of 11
1. General Knowledge (24 points, assume x86-64 System V ABI)
(a) 1 point For any int x and int y=x*x, “y >= 0” is always true. ä True ■ False
(b) 2 points The instruction popq ■ reads ä writes 8 bytes at %rsp and
then ■ increases ä decreases the value of %rsp by 8 .
(c) 2 points A non-executable stack region protects from (select all that apply)
■ code injection attacks ä return-oriented programming attacks
(d) 1 point Each process has a distinct set of TLB entries. ■ True ä False
(e) 1 point A direct-mapped cache has associativity equal to 1.
(f) 1 point Multi-level page tables allow ä faster lookup ■ reduced memory usage
(g) 1 point It is possible for a page to have an invalid entry in the page table and data
in the CPU cache. ä True ■ False
(h) 1 point Linking will fail if two C files contain the same global variable definition
“int x;” ä True ■ False
(i) 3 points Dynamic linking allows programs to (select all options that apply):
■ save memory ä run faster ■ avoid recompilation after library updates
(j) 1 point With static linking, only the address of referenced library functions is in-
cluded in the final executable, not their binary code. ä True ■ False
(k) 1 point Memory blocks returned by calloc are initialized to 0. ■ True ä False
(l) 1 point During a call, malloc and free can rearrange already allocated blocks to
reduce heap fragmentation. ä True ■ False
(m) 1 point Best-fit is ä faster ■ slower than next-fit at finding a free block.
(n) 3 points (select all that apply) ä RAW ■ WAR ■ WAW pipeline hazards
can be resolved through register renaming.
(o) 4 points Spectre and Meltdown attacks exploit (select all that apply):
■ out-of-order execution ■ CPU caches ä VM vulnerabilities ä buffer overflows
CSCI 356 Final (Dec 12, 2019) Page 3 of 11
2. Number Systems (10 points)
(a) 2 points Assume an 8-bit unsigned integer encoding (e.g., unsigned char in C).
Then, the sequence 01111111 encodes the decimal number 127 .
(b) 2 points Assume an 8-bit signed integer encoding (e.g., signed char in C).
Then, the sequence 10000001 encodes the decimal number −127 .
(c) 2 points Assume an 8-bit unsigned integer encoding, e.g., unsigned char in C.
Provide (as decimal, e.g., “42”) two integer numbers X and Y such that:
(1) X and Y can be encoded as unsigned char; (2) X +Y causes unsigned
overflow but X −Y does not. X = 255 Y = 1
(d) 3 points Assume a 12-bit floating-point representation (1 sign bit, 5 exponent bits
using excess-15, and 6 fraction bits). Then, the sequence 1 10010 001100
(a 12-bit FP) encodes the decimal number −9.5 .
(e) 1 point In C, a cast from int (32-bit two’s complement) to float (IEEE 754 32-bit
floating point) followed by a cast to int, e.g., (int)((float)x), always
produces the initial integer value. ä True ■ False
3. Assembly Operations (6 points)
--------- REGISTERS --------
%rax = 0x1122 3344 4455 7788
%rbx = 0xFFFF FFFF FFFF FFFF
%rcx = 0x0000 0000 0000 200A
%rdx = 0xFFFF FFFF FF00 0022
-- MEMORY --- ADDR
FEFE D0D0 @ M[200C]
DADA 0000 @ M[2008]
AABB CCDD @ M[2004]
0102 0304 @ M[2000]
Given the values above for registers/memory, what is the effect of the following instruc-
tions on %rdx? (Consider each instruction individually from the same initial state.)
(a) 2 points movsbl -5(%rcx),%edx results in %rdx = 0000 0000 FFFF FFCC
(b) 2 points leaq 2(%rbx,%rbx,4),%rdx results in %rdx = FFFF FFFF FFFF FFFD
(c) 2 points andq %rax,%rdx results in %rdx = 1122 3344 4400 0000
CSCI 356 Final (Dec 12, 2019) Page 4 of 11
4. Addressing, Caches and Virtual Memory (6 points)
Consider a processor with:
• 20-bit physical address space;
• 32-bit virtual address space;
• single-level page table of 2 MB, using 4 bytes per entry;
• a k-way set-associative TLB with 32 entries and 4 sets;
• a 2-way cache of 64 kB with blocks of 128 bytes each.
(In your answers, you can use powers of 2 or unit prefixes, like “230 bytes” or “1G bytes.”)
(a) 2 points The size of each virtual page is 8k bytes.
(b) 2 points The TLB has k= 8 ways and uses 17 bits for the tag.
(c) 2 points The cache uses the most significant 5 bits of the address as tag.
5. Caches: Data Access (8 points)
Consider a processor with 16-bit physical address space using a 3-way cache with 128-
byte lines and 2 sets. Assume the following values for the cache metadata:
[WAY-0] [WAY-1] [WAY-2]
SET V TAG V TAG V TAG
=== ======== ======== ========
0 1 0xAC 0 0xAC 1 0xD5
1 1 0xD4 1 0xF1 1 0xD5
(a) 2 points An access to 0xF17F will result in a
ä hit ■ miss and no eviction ä miss and eviction
(b) 2 points An access to 0xAC80 will result in a
ä hit ä miss and no eviction ■ miss and eviction
(c) 2 points Address 0xD5FF points to the last byte in largest memory
range that (1) starts at 0xD480 and (2) is fully contained in the cache.
(d) 2 points If the hit rate of the cache is 0.9, the hit time is 10 ns, and the miss
penalty is 200 ns, the average access time is 30 ns .
CSCI 356 Final (Dec 12, 2019) Page 5 of 11
6. Memory Allocation (6 points)
Assume an implicit free list implementation with the current state of the heap shown
below, where “x, y” means that the header/footer size field is set to x and the allocated
bit is set to y (payload alignment is 8 bytes, initial padding is 4 bytes).
Address: 0 4 8 12 16 20 24 28 32 36 40 44 48
Memory: | 0,0| 8,1| 8,1|24,0| | | | |24,0|24,1| | | |
------------------------------------------------------------------
Address: 52 56 60 64 68 72 76 80 84 88 92 96 100
Memory: | |24,1|16,0| | |16,0|24,1| | | | |24,1| 0,1|
(Consider each question individually from the same initial state of the heap.)
(a) 2 points malloc(24) can be satisfied. ä True ■ False
(b) 2 points Assuming best-fit placement, malloc(8) would return address 64 .
(c) 2 points Assuming best-fit placement and immediate coalescing, after free(80),
malloc(8) would return address 16 .
7. Out-of-Order Dynamic Scheduling (5 points)
In the following sequence of instructions, assume that the first ld stalls due to a cache
miss and that the miss latency is longer than the execution time of the entire trace.
Assuming an out-of-order, dynamically-scheduled processor (that performs automatic
register renaming), which instructions in the sequence would be allowed to execute
(i.e., are independent) and which instructions would need to stall due to the ld miss?
ld 0(%rdi),%rax
addq %rbx,%rcx
addq %rcx,%rdx
addq %rax,%rdx
movq %rcx,%rbx
st %rdx,(%rsi)
Cache Miss
(a) 1 point ä Stall ■ Execute
(b) 1 point ä Stall ■ Execute
(c) 1 point ■ Stall ä Execute
(d) 1 point ä Stall ■ Execute
(e) 1 point ■ Stall ä Execute
CSCI 356 Final (Dec 12, 2019) Page 6 of 11
8. Loop Unrolling and Register Renaming (9 points)
(a) 9 points Unroll the loop body on the left for 2 additional iterations (3 in total); to
resolve hazards, use registers %r8d, %r9d, . . . , %r13d for renaming.
add2_sub1:
movl $0,%eax
L1: jge %rdi,%rsi,L2
ld 0(%rdi),%ebx
ld 4(%rdi),%ecx
ld 8(%rdi),%edx
addl %ebx,%ecx
subl %edx,%ecx
addl %ecx,%eax
addq $12,%rdi
jmp L1
L2: st %eax,0(%rsi)
add2_sub1:
movl $0,%eax
L1: jge %rdi,%rsi,L2
ld 0(%rdi),%ebx
ld 4(%rdi),%ecx
ld 8(%rdi),%edx
addl %ebx,%ecx
subl %edx,%ecx
addl %ecx,%eax
ld 12(%rdi),%r8d
ld 16(%rdi),%r9d
ld 20(%rdi),%r10d
addl %r8d,%r9d
subl %r10d,%r9d
addl %r9d,%eax
ld 24(%rdi),%r11d
ld 28(%rdi),%r12d
ld 32(%rdi),%r13d
addl %r11d,%r12d
subl %r13d,%r12d
addl %r12d,%eax
addq $36 ,%rdi
jmp L1
L2: st %eax,0(%rsi)
CSCI 356 Final (Dec 12, 2019) Page 7 of 11
9. Static Scheduling (12 points)
(a) 12 points Schedule the given code for the 2-way VLIW (static multiple issue) CPU
presented in class (1 ALU/branch slot and 1 load/store slot) to achieve the
best IPC when: (1) increments of %rdi, %rsi are in the first 2 cycles; (2)
other instructions can be reordered; (3) 1-cycle latency is needed after ld.
L1: ld 0(%rdi),%eax
ld 4(%rdi),%ebx
ld 0(%rsi),%ecx
addl %eax,%ebx
andl %ecx,%ebx
st %ebx,4(%rsi)
ld 8(%rdi),%r8d
ld 12(%rdi),%r9d
ld 8(%rsi),%r10d
addl %r8d,%r9d
andl %r10d,%r9d
st %r9d,12(%rsi)
addq $16,%rdi
addq $16,%rsi
jne %rdi,%rdx,L1
======== ALU / Branch ========
addq $16,%rdi
addq $16,%rsi
addl %eax,%ebx
andl %ecx,%ebx
addl %r8d,%r9d
andl %r10d,%r9d
jne %rdi,%rdx,L1
======== Load / Store ========
ld 0(%rdi),%eax
ld -12(%rdi),%ebx
ld -16(%rsi),%ecx
ld -8(%rdi),%r8d
ld -4(%rdi),%r9d
ld -8(%rsi),%r10d
st %ebx,-12(%rsi)
st %r9d,-4(%rsi)
CSCI 356 Final (Dec 12, 2019) Page 8 of 11
10. Assembly and Linking (14 points)
(a) 11 points Fill in the blanks of the C source files lib1.c and test1.c.
========= lib1.s ========== ========= lib1.c =========
match: addl $1, count(%rip)
testl %edi, %edi
js .L6
movl %edi, %eax
andl $15, %eax
cmpl $3, %eax
jne .L4
cmpl $42, %edi
jg .L5
movl $0, %eax
ret
.L6: movl err(%rip), %eax
leal 1(%rax), %edx
movl %edx, err(%rip)
ret
.L4: movl $0, %eax
ret
.L5: movl $10, %eax
ret
static int count = 0;
int err = 0;
int match(int c) {
count = count + 1 ;
if (c < 0 ) {
return err++ ;
} else {
if ((c & 0xF ) == 3 && c > 42 )
return 10;
else
return 0;
}
}
========= test1.s ========= ========= test1.c =========
search: jmp .L3
.L9: movl %ecx, %eax
subl %edx, %eax
sarl $1, %eax
addl %edx, %eax
movslq %eax, %r8
leaq (%rsi,%r8,8),%r8
movsbl (%r8), %r9d
cmpl %edi, %r9d
jle .L8
movl %eax, %ecx
.L3: cmpl %edx, %ecx
jg .L9
movl $-1, %eax
ret
.L8: jge .L4
leal 1(%rax), %edx
jmp .L3
.L4: movl 4(%r8), %edi
jmp match@PLT
extern int err;
int match(int c);
struct item { char x; int y; };
int search(int key, struct item *a,
int lo, int hi) {
if ( hi <= lo )
return -1;
int i = lo + (hi-lo)/2 ;
if (key < a[i].x )
return search(key, a, lo, i );
else if (key > a[i].x )
return search(key, a, i+1 , hi);
else
return match( a[i].y );
}
CSCI 356 Final (Dec 12, 2019) Page 9 of 11
(b) 3 points Complete the following. (Not all the blanks need to be filled in.)
• The strong symbols defined in lib1.c are: match, err .
• The weak symbols defined in lib1.c are: .
• The external symbols defined in lib1.c are: .
• The local symbols defined in lib1.c are: count .
• The strong symbols defined in test1.c are: search .
• The weak symbols defined in test1.c are: .
• The external symbols defined in test1.c are: err, match .
• The local symbols defined in test1.c are: .
CSCI 356 Final (Dec 12, 2019) Page 10 of 11
(Page intentionally blank for scratch work. Please turn it in with your exam.)
CSCI 356 Final (Dec 12, 2019) Page 11 of 11
(Page intentionally blank for scratch work. Please turn it in with your exam.)



































































































































































































































































































学霸联盟


essay、essay代写