Saturday, 2 August 2014

What's Up With Signed Integers?!

In an earlier post we talked about how binary numbers worked (in fact we talked about how all bases work) for positive numbers with both integer and fractional parts. In this post we're going to talk about how negative integers are represented in binary.

Under status quo, let's say we have a computer which, due to a lack of foresight into memory allocation requirements, only holds 4 bits of memory. If we were to consider those bits to represent an unsigned number, here's what we would interpret the states as:

0000 = 0    0001 = 1    0010 = 2    0011 = 3
0100 = 4    0101 = 5    0110 = 6    0111 = 7
1000 = 8    1001 = 9    1010 = 10   1011 = 11
1100 = 12   1101 = 13   1110 = 14   1111 = 15

0,1,2,3,4,5,6,7,8,9,10,11,12,12,14,15

so we have 16 numbers, 0-15, so no surprises so far.

If we wanted to have as many negative numbers represented by our 4 bits as positive numbers, then we're going to end up with half as many positive numbers that we can represent. One possible way of going about this (which was my first impression) would be to reserve one bit for the sign of the number. Let's say we take the left-most (most significant) bit and say that if it's 0, we're dealing with a positive number and if it's a 1 we're dealing with a negative number. How will our representations change?

0000 =  0   0001 =  1   0010 =  2   0011 =  3
0100 =  4   0101 =  5   0110 =  6   0111 =  7
1000 = -0   1001 = -1   1010 = -2   1011 = -3
1100 = -4   1101 = -5   1110 = -6   1111 = -7

0,1,2,3,4,5,6,7,-0,-1,-2,-3,-4,-5,-6,-7

Not bad but a few problems arise:

Firstly we have two ways of representing 0. Not only is this a waste of a state (namely the binary number 1000) but computers very very often need to compare numbers to 0 so if there are two ways to represent 0 in binary then every time a computer wants to check whether a number is 0, it will need to check if it's +0 or -0 and the extra comparison is costly when we're dealing with processors that can do these kinds of things millions of times in a second.

Secondly, in terms of arithmetic, the computer needs to read the sign bit of both numbers if it wants to work out whether it needs to perform addition or subtraction on them.

Thirdly, it would be nice to have the numbers be in order like -7,-6,-5...0,1,2,3...6,7 so if you're incrementing or decrementing the number you don't need to worry about shifting gears when you get to 0.

Before I introduce the next method for representing signed integers, it's a good time to mention that these quarrels I have with the above method (called the sign-and-magnitude method) are not necessarily going to be solved by some super-method, and the method that computers actually use today (which will be later on in the post) was and still is subject to a lot of heated debate in light of the other methods.

The next method which is intuitive to me is the 'Excess K' method where you take your regular unsigned numbers and just give them an offset of K. For example we could choose K to be 8 meaning that each of our numbers will be translated 8 units to the left:

0000 = -8    0001 = -7    0010 = -6    0011 = -5
0100 = -4    0101 = -3    0110 = -2    0111 = -1
1000 =  0    1001 =  1    1010 =  2    1011 =  3
1100 =  4    1101 =  5    1110 =  6    1111 =  7

-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7

Note that you could have chosen your K to be 7 and you'd have a range of -7-8 which is just as valid.

Well we fixed quarrel #3 with the sign-and-magnitude method, an increase/decrease of 1 in the binary representation corresponds to an increase/decrease of 1 in the signed number itself. We also now only have a single 0. The only problem here, and it is a big problem, is arithmetic. If you want to add 1 and -5 then you're dealing with 1001 and 0011. This means you're dealing with (1+8) + (-5 + 8) = 12 and then you need to cancel out the excess 8's by subtracting 2*8 which gives you -4 (at least it gives the correct answer). Well actually if you used that process you'd then need to add another 8 afterwards to give -4 its binary representation so we may as well only ever subtract off one 8 instead of 2*8 after we do additions.

The other methods are the complement methods. What is a complement? Its easier to think about in terms of decimals:

For base 10 we can look at the 10's complement or the 9's complement of any number. If I have the number 531, the 10's complement of that is simply 1000 - 531 = 469. Why would we ever want to use this? Let's say I want to subtract 531 from 812. I suck at subtraction so it would be easier to just add 812 to -531 which by the above equation is (469 - 1000) i.e. add 469 and 812, and then subtract off 1000 when I'm done to get the answer to the question of what 812 - 531 is. 469 + 812 = 1281 meaning our answer is 1281-1000 = 281. To generalise, the 10's complement of a number n with d digits is 10^d - n (but if n=0 the complement is 0 which is reasonable when you consider that the number of digits is irrelevant to representing 0). 10's complements are good because removing the 10^d from the result of the addition at the end is easy, although actually calculating the complement in the first place isn't as easy as it could be. What you could do is instead of subtract n from 1000 you could subtract it from 999 (meaning you can do the subtraction digit by digit) and then add 1 at the end i.e. you're subtracting the rightmost digit from 10 and every other digit from 9 (which is just another way of subtracting n from 1000 because 1000 = 999 + 1).

So our 10's complement is 10^d - n and the way we implemented it actually utilised the 9's complement i.e. (10^d - 1 - n). So if I wanted the 9's complement of 531 that's just 999 - 531 = 468 (the 10's complement is 469 because 999 and 1000 differ by 1).

So there's 9's and 10's complements, and you may be wondering why we can't have an 8's complement or a 7's complement when dealing with decimal numbers. The answer is that if you did, and you wanted to subtract your number from say 8888, then if one of the digits is a 9 you'll need to take the subtraction beyond the current digit which defeats the purpose of having a complement.

Anyway now that we know what a complement is we can apply the concept to binary numbers. We shall start off with the 1's complement.

If we have a number 0010 (2) then the one's complement of it is 1111 - 0010 = 1101. The bits just get inverted. The formula for the 1's complement is 2^d - 1 - n. The 2's complement (the only other possible complement) is 2^d - n. so the 2's complement of 0010 is 10000 - 0010 = 1110. Just like how you could consider the 10's complement as the 9's complement plus 1, the 2's complement is the 1's complement plus 1. So the 2's complement of 0010 could be calculated as 1111 - 0010 + 1 = 1101 + 1 = 1110. Notice what happens when you add the 1 at the end: if the leftmost digit is a 0 (meaning it was a 1 before the inversion happened) then it becomes a 1 (i.e. the inverted digit is inverted again meaning it goes back to how it was in n originally). If the leftmost digit is a 1 then it becomes a 0 and the 1 gets carried all the way until there's a 0 to hold it. For example if you say n is 0101011000 then when calculating the 2's complement you first invert the digits so we get 1010100111 then you add one so we get 101010[1000]. The digits in the brackets are exactly what they were when we started so now we know that the easiest way to get the 2's complement is to just start from right to left and we ignore all the zeros and the first 1 that we see and then we start inverting the digits to the end.

Before I get into how this helps represent negative numbers in binary, notice that the complement of a complement is the original number. For example if we start with n and take the 2's complement we get (2^d - n) and if we take the 2's complement of that we get 2^d - (2^d - n) = 2^d - 2^d + n = n.

So how does this translate into representing negative numbers? Say I only allowed myself to represent decimal numbers with 3 digits. Clearly I need to halve my range of positive values in order to have equal representation of positive and negative numbers. We can say that 000 to 499 are positive and anything past that is negative and you just take the 10's complement to find out it's magnitude. e.g. 500 becomes -(1000 - 500) = -500, and 999 becomes -(1000 - 999) = -1. So we get a range of 0 to 499 in positive numbers and -1 to -500 in negative numbers.

In 9's complement we again say 000 to 499 just remain unchanged and represent positive numbers. Then everything else represents the negative of the 9's complement so 500 becomes -(999 - 500) = -499, and 999 becomes -(999 - 999) = -0. So we can represent 0 to 499 in positive numbers and -0 to -499 in negative numbers.

In 2's complement let's say again we can only store 4 bits, so 0000 (0) to 0111 (7) remain unchanged and 1000 (8) to 1111 (15) represent the negative of their 2's complements so with the table we used above:

0000 =  0   0001 =  1   0010 =  2   0011 =  3
0100 =  4   0101 =  5   0110 =  6   0111 =  7
1000 = -8   1001 = -7   1010 = -6   1011 = -5
1100 = -4   1101 = -3   1110 = -2   1111 = -1

Notice that you can easily tell when a number is negative because the leftmost digit is 1.

In 2's complement the 0000 (0) to 0111 (7) remain unchanged but the 1000 (8) to 1111 (15) represent the negative of their 1's complement:

0000 =  0   0001 =  1   0010 =  2   0011 =  3
0100 =  4   0101 =  5   0110 =  6   0111 =  7
1000 = -7   1001 = -6   1010 = -5   1011 = -4
1100 = -3   1101 = -2   1110 = -1   1111 = -0

Considering subtraction can be seen as adding a number and another number's complement i.e. A - B = A + (-B). Here's how a decimal computer would go about subtracting 531 from 712 using 10's complement:

531 --> 1000 - 531 = 469

712 + 469 = 1181

because we end up with an extra digit, remove the rightmost 1 because that represents the 1000 we added on at the beginning

1181 --> 181 and that's our answer.

If we ended up with a three-digit number then we can't just take off the rightmost 1 to subtract off 1000 so instead we take the complement of the number and store that e.g. 280 - 435:

435 --> 1000 - 435 = 575

280 + 575 = 855

855 is the 10's complement representation of the actual answer -145 because we are in excess of 1000 so if we wanted to print the answer we'd just go 855 - 1000 = -(1000 - 855) = -(10's complement of 855) = -145 i.e. take the 10's complement of 855 and stick a minus sign infront of it.

With 9's complement the process is exactly the same except if you get an extra digit holding a 1 (an end-carry) you need to do an end-around-carry. For example let's 712 - 531 using 9's complement:

531 --> 999 - 531 = 468

712 + 468 = 1180

you've got an end-carry but you can't just remove it because that would be removing 1000 when you want to remove 999 to get the true answer, so instead remove the rightmost digit and then add 1 i.e. 1000 = 999 + 1:

1180 --> 180 --> 181. Which is the same as we got with 2's complement.

If you're doing 9's complement subtraction with 280 - 435:

435 --> 999 - 435 = 564

280 + 564 = 844 which is correct in terms of 9's complement representations of negative numbers but if you want to print the actual result you'd go 999 - 844 is 155 so the answer is -155 (you've just taken the 9's complement again).

A question that just popped up in my head is 'what happens if you do the subtraction and end up with 999, because you won't have the extra digit telling you to subtract off 999 to get 0', but then I remembered that 999 = -0 so that a self-solving problem.

As for 2's complement subtraction it's very similar: if we wanted to subtract 10010 (26) from 11010 (18) then we follow the same process:

11010 --> 100000 - 11010 = 00110

10010 + 00110 = 11000 and considering there's no 6th digit we're dealing with a negative number so we can store it like that and if we want to print out what it we can take the negative of the 2's complement and say it's -(100000 - 11000) = -01000 = -8.

If we wanted to subtract 18 from 26 we'd end up with an extra 1 digit in which case we could just throw that away (i.e. take the negative of the complement again) to get our answer.

For 1's complement subtraction let's see what subtracting 26 from 18 is like:

11010 --> 11111 - 11010 = 00101

10010 + 00101 = 10111 which is negative and to find out the value we take the negative of the complement to give us -01000 = -8. If we had done 26 - 18 we'd have an end-carry so we'd need to discard that digit and add 1 because 100000 = 11111 + 1 so subtracting 100000 and then adding 1 is the same as subtracting 11111 i.e. taking the negative of the complement.

So the general process is to calculate A - B we take B's complement and then calculate A + (B's complement) and then take the negative of the complement of the result to get the true answer which, if A > B (i.e. the answer is positive) involves either discarding the extra digit (10's and 2's complement) or discarding the extra digit and adding 1 (9's and 1's complement) and if A < B (i.e. the answer is negative) we can leave the value as it is if we want to store it or if we want to print it we just take the negative of the complement.

So which is better for binary? 1's complement or 2's complement?

Why to go with 1's and not 2's:

  • getting the complement is easier with 1's complement because you just invert each bit
  • 1's is symmetric i.e. 4 bits goes from -15 to 15 but 2's is assymetric


Why to go with 2's and not 1's:

  • 2's can be implemented with two rules: get the 1's complement and add 1 or search from right to left until you find a 1 and after that you start inverting
  • when subtracting with 2's complement you can just discard the end-carry whereas with 1's you need to discard it and add 1
  • 1's complement has two zeros which is very bad considering how often computers compare values to zero
  • whether the number is positive or negative, if a number in 2's complement is even, then the rightmost big is 0 which is good for testing whether an integer is even (in 1's complement that bit is 0 for negative odd numbers)

Seems pretty compelling to me to choose 2's complement, which is why it's the most commonly used system for representing integers in binary.

No comments:

Post a Comment