CS4203/EE4363-无代写
时间:2022-12-01
Arithmetic
Antonia Zhai
Department Computer Science and Engineering
University of Minnesota
http://www.cs.umn.edu/~zhai
CS4203/EE4363
Arithmetic for Computers
• Operations on integers
• Addition and subtraction
• Multiplication and division
• Dealing with overflow
• Floating-point real numbers
• Representation and operations
Addition/Subtraction in Binary with Negative Number
Decimal: 0 - 1 = -1
Hexadecimal: 0 - 1 = -1
Binary: 0000 + 0001 =1111
Decimal: -1 + 1 = 0
Hexadecimal: -1 + 1 = 0
Binary: 1111 + 0001 =0000
Decimal: -1 - 1 = -2
Hexadecimal: -1 - 1 = -2
Binary: 1111 - 0001 =1110
2’s-Complement Integers
• Negation rule:
• Invert bits and add 1
• Some properties:
• Only one representation for 0
• Exactly as many positive numbers as negative numbers
• Slight asymmetry – there is one negative number with no
positive counterpart
2’s Complement Addition
Integer Addition
Example: 7 + 6
• Overflow if result out of range
• Adding +ve and –ve operands, no overflow
• Adding two +ve operands
• Overflow if result sign is 1
• Adding two –ve operands
• Overflow if result sign is 0
Integer Subtraction
• Add negation of second operand
• Example: 7 – 6 = 7 + (–6)
+7: 0000 0000 … 0000 0111
–6: 1111 1111 … 1111 1010
+1: 0000 0000 … 0000 0001
• Overflow if result out of range
• Subtracting two +ve or two –ve operands, no overflow
• Subtracting +ve from –ve operand
• Overflow if result sign is 0
• Subtracting –ve from +ve operand
• Overflow if result sign is 1
Problems with Finite Math
• Finite size of representation:
• Digital circuit cannot be arbitrarily large.
• Overflow detection – easy to determine when the
number becomes too large.
• Represent negative numbers:
• Unique representation of 0.
-4 0 4 8 12 16 20
0000 0100 1000 1100 10000 10100
Infinite
universe
of integers
∞-∞
4-bit numbers
Overflow and Finite Universe
. . .1111 0000 0001 0010 0011 0100 0101 . . .
Decrease Increase
Infinite
universe
of integers
No overflow
∞-∞
0000
Forbidden fence
1000
00011111
1001
Finite
Universe
of 4-bit
binary
integers
0010
0011
0100
0101
0110
0111
1010
1011
1100
1101
1110
In
cr
ea
se
D
ec
re
as
e
Dealing with Overflow
• Some languages (e.g., C) ignore overflow
• Use MIPS addu, addui, subu instructions
• Other languages (e.g., Ada, Fortran) require raising an
exception
• Use MIPS add, addi, sub instructions
• On overflow, invoke exception handler
• Save PC in exception program counter (EPC) register
• Jump to predefined handler address
• mfc0 (move from coprocessor reg) instruction can
retrieve EPC value, to return after corrective action
Arithmetic for Multimedia
• Graphics and media processing operates on vectors of 8-
bit and 16-bit data
• Use 64-bit adder, with partitioned carry chain
• Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors
• SIMD (single-instruction, multiple-data)
• Saturating operations
• On overflow, result is largest representable value
• c.f. 2s-complement modulo arithmetic
• E.g., clipping in audio, saturation in video
32-bit Ripple-Carry Adder
FA0
FA1
FA2
FA31
a0
b0
c0 = 0
a1
b1
a2
b2
a31
b31
s0
s1
s2
c32
(discard)
s31c31
c2
c1
c32 c31 . . . c2 c1 0
a31 . . . a2 a1 a0
+ b31 . . . b2 b1 b0
s31 . . . s2 s1 s0
Speeding Up the Adder
16-bit
ripple
carry
adder
a0-a15
b0-b15
c0 = 0
s0-s15
16-bit
ripple
carry
adder
a16-a31
b16-b31
0
16-bit
ripple
carry
adder
a16-a31
b16-b31
1
M
ul
tip
le
xe
r
s16-s31
0
1
This is a carry-select adder
N-bit Adder Design Options
Type of adder Time complexity
(delay)
Space complexity
(size)
Ripple-carry O(N) O(N)
Carry-lookahead O(log2N) O(N log2N)
Carry-skip O(√N) O(N)
Carry-select O(√N) O(N)
Multiplication
Multiplication
1000
× 1001
1000
0000
0000
1000
1001000
Length of product is the sum of operand lengths
multiplicand
multiplier
product
Combinational Multiplier
A0
B0
A0 B0
A1
B1
A1 B0
A0 B1
A2
B2
A2 B0
A1 B1
A0 B2
A3
B3
A2 B0
A2 B1
A1 B2
A0 B3
A3 B1
A2 B2
A1 B3
A3 B2
A2 B3A3 B3
S6 S5 S4 S3 S2 S1 S0S7
Combinational Multiplier
A3 B0
SC
A2 B0
SC
A1 B0
SC
A0 B0
SC
A3 B1
S
C
A2 B1
S
C
A1 B1
S
C
A0 B1
S
C
A3 B2
S
C
A2 B2
S
C
A1 B2
S
C
A0 B2
S
C
A3 B3
SC
A2 B3
S
A1 B3
S
A0 B3
S
B0
B1
B2
B3
P7 P6 P5 P4 P3 P2 P1 P0
A3 A2 A1 A0
Building block: full adder + and
4 x 4 array of building blocks
F A
X
Y
A B
S
CI CO
Cin Sum In
Sum Out Cout
LSB
of multiplier
?
Multiplication Flowchart
Initialize product register to 0 Partial product number, n = 1
Left shift multiplicand register 1 bit
Right shift multiplier register 1 bit
n = ? n = n + 1Done
Start
Add multiplicand to
product and place result
in product register
1 0
n < 32n = 32
N = 32
Serial Multiplication
64-bit product register, initially 0
64
64
64
64-bit ALU Test LSBN = 32 times
shift right
32-bit multiplier
shift left
write 3 operations per bit:
• shift right
• shift left
• add
64-bit ALU
Multiplicand (expanded 64-bits)
LSB = 0
LSB
= 1
add
Shift l/r
LS
B
af
te
r a
dd
N = 32
af
te
r a
dd
Optimized Multiplier
Perform steps in parallel: add/shift
Multiplicand
64-bit product register
32
32
32
32-bit ALU Test LSB32 timesLSB
(3) shift right
00000 . . . 00000 32-bit multiplier Initialized product register
(2)
w
rite
2 operations per bit:
• shift right
• add
32-bit ALU
1
1(1) add
1 or 0
N = 32
One cycle per partial-product addition
Example: 00102× 00112
Iteration Step Multiplicand Product
0 Initial values 0010 0000 0011
1 LSB =1 → Prod = Prod + Mcand 0010 0010 0011
Right shift product 0010 0001 0001
2 LSB =1 → Prod = Prod + Mcand 0010 0011 0001
Right shift product 0010 0001 1000
3 LSB = 0 → no operation 0010 0001 1000
Right shift product 0010 0000 1100
4 LSB = 0 → no operation 0010 0000 1100
Right shift product 0010 0000 0110
00102 × 00112 = 01102
Faster Multiplier
n Can be pipelined
n Several multiplication performed in parallel
Division
Division
• Check for 0 divisor
• Long division approach
• If divisor ≤ dividend bits
• 1 bit in quotient, subtract
• Otherwise
• 0 bit in quotient, bring down
next dividend bit
• Restoring division
• Do the subtract, and if remainder
goes < 0, add divisor back
• Signed division
• Divide using absolute values
• Adjust sign of quotient and
remainder as required
1001
1000 1001010
-1000
10
101
1010
-1000
10
n-bit operands yield n-bit
quotient and remainder
quotient
dividend
remainder
divisor
4-bit Binary Division (Unsigned)
Dividend: 6 = 0110
Divisor: 4 = 0100
6
─ = 1, remainder 2
4
0 0 0 1
0 0 0 0 1 1 0
1 1 0 0
1 1 0 0 negative → quotient bit 0
0 1 0 0 → restore remainder
0 0 0 0 1 1 0
1 1 0 0
1 1 0 1 negative → quotient bit 0
0 1 0 0 → restore remainder
0 0 0 1 1 0
1 1 0 0
1 1 1 1 negative → quotient bit 0
0 1 0 0 → restore remainder
0 0 1 1 0
1 1 0 0
0 0 1 0 positive → quotient bit 1Ite
ra
tio
n
4
I
te
ra
tio
n
3

I
te
ra
tio
n
2
I
te
ra
tio
n
1
Division Hardware
Initially dividend
Initially divisor
in left half
32-bit Binary Division Flowchart
$R = 0, $M = Divisor, $Q = Dividend, count = n
Shift 1-bit left $R, $Q
$R ← $R – $M
$R < 0?$Q0 = 1 $Q0=0$R ← $R + $M
count = count – 1
count = 0?
Done
$Q = Quotient
$R = Remainder
Start
Yes
Yes
No
No
$R and $M have
one extra sign bit
beyond 32 bits.
Restore $R
(remainder)
$R (33 b) | $Q (32 b)
Optimized Divider
• One cycle per partial-remainder subtraction
• Looks a lot like a multiplier!
• Same hardware can be used for both
4-bit Example: 6/4 = 1, Remainder 2
Actions n $R, $Q $M = Divisor
Initialize 4 0 0 0 0 0 | 0 1 1 0 0 0 1 0 0
Shift left $R, $Q 4 0 0 0 0 0 | 1 1 0 0 0 0 1 0 0
Add – $M (11100) to $R 4 1 1 1 0 0 | 1 1 0 0 0 0 1 0 0
Restore, add $M (00100) to $R 3 0 0 0 0 0 | 1 1 0 0 0 0 1 0 0
Shift left $R, $Q 3 0 0 0 0 1 | 1 0 0 0 0 0 1 0 0
Add – $M (11100) to $R 3 1 1 1 0 1 | 1 0 0 0 0 0 1 0 0
Restore, add $M (00100) to $R 2 0 0 0 0 1 | 1 0 0 0 0 0 1 0 0
Shift left $R, $Q 2 0 0 0 1 1 | 0 0 0 0 0 0 1 0 0
Add – $M (11100) to $R 2 1 1 1 1 1 | 0 0 0 0 0 0 1 0 0
Restore, add $M (00100) to $R 1 0 0 0 1 1 | 0 0 0 0 0 0 1 0 0
Shift left $R, $Q 1 0 0 1 1 0 | 0 0 0 0 0 0 1 0 0
Add – $M (11100) to $R 1 0 0 0 1 0 | 0 0 0 0 0 0 1 0 0
Set LSB of $Q = 1 0 0 0 0 1 0 | 0 0 0 1 0 0 1 0 0
Remainder | Quotient
co
un
t
4
3
2
1
0
Faster Division
• Can’t use parallel hardware as in multiplier
• Subtraction is conditional on sign of remainder
• Faster dividers (e.g. SRT devision) generate multiple
quotient bits per step
• Still require multiple steps
Signed Division
• Remember the signs and divide magnitudes.
• Negate the quotient if the signs of divisor
and dividend disagree.
• There is no other direct division method for
signed division.
MIPS Division
• Use HI/LO registers for result
• HI: 32-bit remainder
• LO: 32-bit quotient
• Instructions
• div rs, rt / divu rs, rt
• No overflow or divide-by-0 checking
• Software must perform checks if required
• Use mfhi, mflo to access result
Floating Point Operations
Floating Point
• Representation for non-integral numbers
• Including very small and very large numbers
• Like scientific notation
• –2.34 × 1056
• +0.002 × 10–4
• +987.02 × 109
• In binary
• ±1.xxxxxxx2 × 2yyyy
• Types float and double in C
normalized
not normalized
Floating Point Standard
• Defined by IEEE Std 754-1985
• Developed in response to divergence of representations
• Portability issues for scientific code
• Now almost universally adopted
• Two representations
• Single precision (32-bit)
• Double precision (64-bit)
• Numerical Form:
(–1)s M 2E
• Sign bit s determines whether number is negative or
positive
• Significand M normally a fractional value in range
[1.0,2.0).
• Exponent E weights value by power of two
• Encoding
• MSB s is sign bit s
• exp field encodes E (but is not equal to E)
• frac field encodes M (but is not equal to M)
Floating Point Representation
s exp frac
Precisions
• Single precision: 32 bits
• Double precision: 64 bits
s exp frac
1 8-bits 23-bits
s exp frac
1 11-bits 52-bits
Normalized Values
Condition: exp ≠ 000…0 and exp ≠ 111…1
• Exponent coded as biased value: E = Exp – Bias
• Exp: unsigned value exp
• Bias = 2k-1 - 1, where k is number of exponent bits
• Single precision: 127 (Exp: 1…254, E: -126…127)
• Double precision: 1023 (Exp: 1…2046, E: -
1022…1023)
• Significant coded with implied leading 1: M = 1.xxx…x2
• xxx…x: bits of frac
• Minimum when 000…0 (M = 1.0)
• Maximum when 111…1 (M = 2.0 – ε)
• Get extra leading bit for “free”
Normalized Encoding Example
Value: Float F = 15213.0;
1521310 = 111011011011012
= 1.11011011011012 x 213
Significand
M = 1.11011011011012
frac = 110110110110100000000002
Exponent
E = 13
Bias = 127
Exp = 140 = 100011002
Result:
0 10001100 11011011011010000000000
s exp frac
Denormalized Values
Condition: exp = 000…0
• Exponent value: E = –Bias + 1 (instead of E = 0 – Bias)
• Significand coded with implied leading 0: M = 0.xxx…x2
• xxx…x: bits of frac
• Cases
• exp = 000…0, frac = 000…0
• Represents zero value
• Note distinct values: +0 and –0 (why?)
• exp = 000…0, frac ≠ 000…0
• Numbers very close to 0.0
• Lose precision as get smaller
• Equal spaced
Special Values
Condition: exp = 111…1
• Case: exp = 111…1, frac = 000…0
• Represents value ¥ (infinity)
• Operation that overflows
• Both positive and negative
• E.g., 1.0/0.0 = −1.0/−0.0 = +¥, 1.0/−0.0 = −¥
• Case: exp = 111…1, frac ≠ 000…0
• Not-a-Number (NaN)
• Represents case when no numeric value can be
determined
• E.g., sqrt(–1), ¥ − ¥, ¥ ´ 0
Visualization: Floating Point Encodings
+¥−¥
-0
+Denorm +Normalized−Denorm−Normalized
+0NaN
NaN
FP Adder Hardware
Step 1
Step 2
Step 3
Step 4
FP Arithmetic Hardware
• FP multiplier is of similar complexity to FP adder
• But uses a multiplier for significands instead of
an adder
• FP arithmetic hardware usually does
• Addition, subtraction, multiplication, division,
reciprocal, square-root
• FP « integer conversion
• Operations usually takes several cycles
• Can be pipelined
FP Instructions in MIPS
• FP hardware is coprocessor
• Adjunct processor that extends the ISA
• Separate FP registers
• 32 single-precision: $f0, $f1, … $f31
• Paired for double-precision: $f0/$f1, $f2/$f3, …
• Release 2 of MIPs ISA supports 32 × 64-bit FP reg’s
• FP instructions operate only on FP registers
• Programs generally don’t do integer ops on FP data, or
vice versa
• More registers with minimal code-size impact
• FP load and store instructions
• lwc1, ldc1, swc1, sdc1
• e.g., ldc1 $f8, 32($sp)
essay、essay代写