5.1 Chapter Overview
Manipulating bits in memory is, perhaps, the thing that assembly language is most famous for. Indeed, one of the reasons people claim that the “C” programming language is a “medium-level” language rather than a high level language is because of the vast array of bit manipulation operators that it provides. Even with this wide array of bit manipulation operations, the C programming language doesn’t provide as complete a set of bit manipulation operations as assembly language. This chapter will discuss how to manipulate strings of bits in memory and registers using 80x86 assembly language. This chapter begins with a review of the bit manipulation instructions covered thus far and it also introduces a few new instructions. This chapter reviews information on packing and unpacking bit strings in memory since this is the basis for many bit manipulation operations. Finally, this chapter discusses several bit-centric algorithms and their implementation in assembly language.
What is Bit Data, Anyway?
Before describing how to manipulate bits, it might not be a bad idea to deﬁne exactly what this text means by “bit data.” Most readers probably assume that “bit manipulation programs” twiddle individual bits in memory. While programs that do this are deﬁnitely “bit manipulation programs,” we’re not going to limit this title to just those programs. For our purposes, bit manipulation refers to working with data types that consist of strings of bits that are non-contiguous or are not an even multiple of eight bits long. Generally, such bit objects will not represent numeric integers, although we will not place this restriction on our bit strings. A bit string is some contiguous sequence of one or more bits (this term even applies if the bit string’s length is an even multiple of eight bits). Note that a bit string does not have to start or end at any special point. For example, a bit string could start in bit seven of one byte in memory and continue through to bit six of the next byte in memory. Likewise, a bit string could begin in bit 30 of EAX, consume the upper two bits of EAX, and then continue from bit zero through bit 17 of EBX. In memory, the bits must be physically contiguous (i.e., the bit numbers are always increasing except when crossing a byte boundary, and at byte boundaries the byte number increases by one). In registers, if a bit string crosses a register boundary, the application deﬁnes the continuation register but the bit string always continues in bit zero of that second register. A bit set is a collection of bits, not necessarily contiguous (though it may be), within some larger data structure. For example, bits 0..3, 7, 12, 24, and 31 from some double word object forms a set of bits. Usually, we will limit bit sets to some reasonably sized container object (that is, the data structure that encapsulates the bit set), but the deﬁnition doesn’t speciﬁcally limit the size. Normally, we will deal with bit sets that are part of an object no more than about 32 or 64 bits in size. Note that bit strings are special cases of bit sets. A bit run is a sequence of bits with all the same value. A run of zeros is a bit string containing all zeros, a run of ones is a bit string containing all ones. The ﬁrst set bit in a bit string is the bit position of the ﬁrst bit containing a one in a bit string, i.e., the ﬁrst ‘1’ bit following a possible run of zeros. A similar deﬁnition exists for the ﬁrst clear bit. The last set bit is the last bit position in a bit string containing that contains ‘1’; afterwards, the remainder of the string forms an uninterrupted run of zeros. A similar deﬁnition exists for the last clear bit. A bit offset is the number of bits from some boundary position (usually a byte boundary) to the speciﬁed bit. As noted in Volume One, we number the bits starting from zero at the boundary location. If the offset is less than 32,...
Please join StudyMode to read the full document