Can a bitmask be typed?

In most programming languages, a bitmask is usually built from an integer type and bitwise operations (and, or, xor, not, shift left, shift right, ...). Nothing prevents the programmer from assigning a specific value (say, 6, which is 110 in binary) to the mask, but the usual good practice is to define a constant for each bit and to insist on using those constants instead of "magic values", which also avoids compatibility problems in the future. This is not usually enforced, however.

Would there be any harm in creating an abstract "bitmask" type, whose subtypes are particular applications of this technique, and having the compiler enforce its use? For example, some languages that support enumerations (enums), such as Java, let you declare methods whose parameters must be of the enum type, so the programmer has no choice but to use its constants, even when each of them has one or more associated [unique] values. An enumeration may or may not be used to implement bitmasks, but it also has other purposes [1].

My question is specifically: is there any use case for bitmasks in which the freedom to use integers instead of the defined constants brings a significant advantage, so that losing it would compromise the expressiveness of the code? I think only those with experience working with bitmasks can answer this, but an external reference dealing with the subject would also be very useful. In my limited experience, the main uses of a bitmask are:

  • Set multiple bits or only one particular bit (or clear a particular bit);
  • Check whether a particular bit (or set of bits) is set or not;
  • Serialize / deserialize (i.e. save the data structure containing the bit mask to a file or other binary / textual format).

I cannot think of any others.
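As a rough illustration of the three uses listed above with a typed mask, here is a minimal sketch using Python's standard `enum.Flag`. The `Permission` type and its members are hypothetical names chosen only for this example; they are not part of the question.

```python
from enum import Flag, auto


class Permission(Flag):  # hypothetical flag type, for illustration only
    READ = auto()     # value 1
    WRITE = auto()    # value 2
    EXECUTE = auto()  # value 4


# 1. Set several bits, then clear one particular bit.
mask = Permission.READ | Permission.WRITE
mask &= ~Permission.WRITE

# 2. Check whether a particular bit is set.
can_read = Permission.READ in mask
can_write = Permission.WRITE in mask

# 3. Serialize to a plain integer and restore the typed value.
stored = mask.value
restored = Permission(stored)
```

With a plain `Flag` (as opposed to `IntFlag`), combining a member with a bare integer, as in `Permission.READ | 4`, raises `TypeError`, which is roughly the kind of enforcement the question asks about.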

[1]: By the way, contrary to the premise of a related question, I have good reason to want to change the bitmask as the products evolve, both the individual values and the set of elements, but always with versioning, so as not to break old code. This restricts my particular case but does not invalidate the question (I remain interested in knowing what is lost by using a bitmask type rather than a generic integer).

asked by anonymous 26.08.2015 / 00:50

1 answer

How they work:

The individual values of a bitmask are always powers of 2: 1, 2, 4, 8, ... up to the maximum number of bits. Imagine, for example, that you had to choose your favorite color:

VERMELHO = 1;
VERDE = 2;
AZUL = 4;
TODAS = 7;

Arbitrary values:

You cannot assign an arbitrary value such as 6 (110 in binary) to an individual flag, because that value occupies more than one bit. Values like that appear only as combinations: in the example, TODAS is actually the combination of all the previous values.

This is the normal use of bitmasks, and Windows does a lot of this in its window-creation APIs, where WS_POPUPWINDOW is actually the combination of WS_POPUP, WS_BORDER and WS_SYSMENU.
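A small sketch of the same idea in Python, reusing the color constants from the example. The helper `is_single_bit` is a hypothetical name introduced here only to show why 6 cannot be an individual flag.

```python
# Each individual flag occupies exactly one bit.
VERMELHO = 1 << 0  # 0b001
VERDE = 1 << 1     # 0b010
AZUL = 1 << 2      # 0b100

# A combination is built by OR-ing the individual flags.
TODAS = VERMELHO | VERDE | AZUL  # 0b111, i.e. 7


def is_single_bit(value: int) -> bool:
    """Return True when exactly one bit is set (value is a power of two)."""
    return value > 0 and (value & (value - 1)) == 0
```

Under this check, 6 is not a valid individual flag, but it is a valid combination: it equals `VERDE | AZUL`.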

Non-enumeration languages:

In languages that do not support enums, I usually create a class or a set of functions that validate the received value. For example, if I use only 4 bits, any value above 15 is invalid.
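A minimal sketch of such a validation function, assuming a mask that defines only the low 4 bits. The names `VALID_BITS` and `validate_mask` are made up for this example.

```python
VALID_BITS = 0b1111  # assumption: only the low 4 bits are defined (max 15)


def validate_mask(value: int) -> int:
    """Return value unchanged if it uses only defined bits, else raise."""
    # Reject negative numbers and any bit outside the 4 defined ones.
    if value < 0 or (value & ~VALID_BITS) != 0:
        raise ValueError(f"invalid bitmask value: {value}")
    return value
```

Calling `validate_mask(15)` returns 15, while `validate_mask(16)` raises `ValueError`.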

Changing values during development:

This can certainly happen, but keep the consequence in mind: you may have to recompile every program that depends on your constants.

I say "may" because there will be cases where a program depends only on one or two constants that you have not changed.

Should it be typed or not?

It's up to you. Suppose you want to add constants without editing the old ones:

Free constants (as in the Windows example): just declare more constants; this will not break old code.

Subclasses (languages that support classes but not enums): you do not need to recompile old code, but you will have to create a whole new subclass every time you add a constant if you want to avoid recompiling everything.

Enums: in some languages, like C#, you would have to recompile everything that depends on the enum.

Note: of course, in languages like PHP you would not have to recompile, but you would still have to update every server that depends on the constants, so the headache would be similar.

My recommendation is: avoid bitmasks wherever possible. It is easy to introduce bugs with them, and if you use a methodology like Test Driven Development you will find that it is easy to forget to write tests for all of their combinations.

Another problem you may run into is mixing them with a database: they make queries quite difficult and the maintenance is horrible. I learned this the hard way.

Use constants or direct numbers:

Some languages convert implicitly between numbers and the constants used. The problem with using numbers directly is code maintenance: the code is much harder for another programmer to understand, because it looks as if you are using invented values. If you do use them, at least comment your code to communicate the intent.

What do I lose if I cannot use numbers directly?

Being able to convert an integer directly into a group of constants is especially useful when you want to specify a setting using only numbers and keep it short.

An example of this is changing file permissions on Linux:

chmod   000 ---------
chmod   400 r--------
chmod   444 r--r--r--
chmod   600 rw-------
chmod   620 rw--w----
chmod   640 rw-r-----
chmod   644 rw-r--r--
chmod   645 rw-r--r-x
chmod   646 rw-r--rw-
chmod   650 rw-r-x---
chmod   660 rw-rw----
chmod   661 rw-rw---x
chmod   662 rw-rw--w-
chmod   663 rw-rw--wx
chmod   664 rw-rw-r--
chmod   666 rw-rw-rw-
chmod   700 rwx------
chmod   750 rwxr-x---
chmod   755 rwxr-xr-x
chmod   777 rwxrwxrwx
etc...

It treats each octal digit as a 3-bit bitmask (read = 4, write = 2, execute = 1). If you had to combine named constants instead, the command would be considerably longer, besides the fact that remembering 3 digits is even easier.
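To make the mapping concrete, here is a small sketch that decodes a numeric mode into the rwx notation shown in the listing, reading each octal digit as three permission bits (read = 4, write = 2, execute = 1). The function name `mode_to_string` is hypothetical, and the sketch covers only the three-digit form used above.

```python
def mode_to_string(mode: str) -> str:
    """Turn an octal mode such as '644' into 'rw-r--r--'."""
    parts = []
    for digit in mode:
        bits = int(digit, 8)  # raises ValueError for digits 8 and 9
        parts.append("r" if bits & 4 else "-")
        parts.append("w" if bits & 2 else "-")
        parts.append("x" if bits & 1 else "-")
    return "".join(parts)
```

For instance, `mode_to_string("755")` gives `rwxr-xr-x`: 7 is 4 + 2 + 1 for the owner, and 5 is 4 + 1 for group and others.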

    
26.08.2015 / 02:29