The logic behind these bitwise operations


I have been getting started in the emulator world for a while now. I decided to stop trying to write an emulator for a complex system and start with a very basic one, a CHIP-8 emulator, which is what many people on emulation forums recommend. Let's break it down:

The first operation whose logic I do not see:

std::uint16_t opcode = m_Memory[reg.PC] << 8 | m_Memory[reg.PC + 1];

Basically, one CHIP-8 opcode is 2 bytes, but the ROM is stored as 8-bit values. First I access the array of std::uint8_t that I call m_Memory, which I use to store the ROM and the font set, at the position of the program counter. The program counter starts at 0x200, which is where most CHIP-8 programs/games begin. Then 8 zeros are appended, which is easy to understand: 1 byte = 8 bits, so 2 bytes are 16 bits. But then the confusion starts: if you already have the opcode, why mask a 16-bit value with an 8-bit one? And why use the ROM itself again, only advancing the position of the PC?

Here we go to the second part of my problem:

switch (opcode & 0xF000) {

In a discussion I started on a Reddit forum about emulators, people told me they mask the opcode with 0xF000 to get the actual opcode, but what I did not understand is how they came to the conclusion that they should mask it, and why with that value.

The final part:

I am following the documentation that I and many others use as a guide. First, let's look at opcode 0x6000, i.e. 6xkk or LD Vx, byte:

//LD Vx, byte
case 0x6000:
    reg.Vx[(opcode & 0x0F00) >> 8] = (opcode & 0x00FF);
    reg.PC += 2;
    std::cout << "OPCODE LD Vx, byte executed." << std::endl;

The CHIP-8 has 16 8-bit registers, which I called Vx. Let's look at:

reg.Vx[(opcode & 0x0F00) >> 8]

First I converted opcode 0x6000 to binary and performed the AND:

0110 0000 0000 0000    //0x6000
0000 1111 0000 0000    //0x0F00
0000 0000 0000 0000    //0x0

Then >> 8 shifts the result 8 bits to the right, which gives 0000 0000, i.e. index 0 of Vx. Then comes = (opcode & 0x00FF), that is:

0110 0000 0000 0000    //0x6000
0000 0000 1111 1111    //0x00FF
0000 0000 0000 0000    //0x0

So why not just reg.Vx[0] = 0; ?

Keep in mind that I have never had to do bitwise operations before in any project; I only know what the books told me about the AND, OR, XOR, NOT operations, etc.

I would like to understand the logic people use here so that I can apply it in future projects.

asked by anonymous 18.06.2018 / 14:16

1 answer


Some of the things you do not understand seem to come from not realizing that some values are a "family" of opcodes, or rather parameters for a single opcode, all encoded in the 16-bit value, and not just a fixed value. Take the last example, opcode 0x6000: you did the whole simulation as if the value were always exactly 0x6000. However, see the documentation:


6xkk - LD Vx, byte Set Vx = kk.


The interpreter puts the value kk into register Vx.

That is, the first "nibble" (the 4 most significant bits) of the opcode contains the hexadecimal digit "6". The remaining 3 hexadecimal digits are the opcode's arguments. So, yes, "0x6000" will always be "set V0 = 0x00", but the opcode 0x62FF means "set V2 = 0xFF". The role of your interpreter/emulator is to detect that opcode 6 means putting a value in a register, extract those values, and execute the operation.

See how this already answers your second question: when you do the switch-case on the opcode masked with 0xF000, only the value "0x6000" remains for comparison in the case label, but inside the case body you still need the full opcode, because its other digits are the parameters.

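std::uint16_t opcode = 0x62ff;
switch (opcode & 0xf000) {
    case 0x6000: {
        // 6xkk: x is the second hex digit, kk is the low byte
        std::uint8_t register_number = (opcode & 0x0f00) >> 8;  // 0x2
        std::uint8_t value = opcode & 0x00ff;                   // 0xff
        registers[register_number] = value;
        break;
    }
}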

Note in the documentation that not all opcodes are fully determined by the first hexadecimal digit. For some of them, for example those starting with "0x0", there is a whole subfamily of opcodes. In those cases you will need another switch/case inside the first one (or a call to a separate function) to test the other digits.

And lastly, regarding:

opcode = m_Memory[reg.PC] << 8 | m_Memory[reg.PC + 1]; 

It reads almost as clearly as plain language: the m_Memory (*) array contains 8-bit values. You need to read two bytes and compose a single 16-bit value (and see the documentation: the most significant byte comes first, that is, "big endian").


All instructions are 2 bytes long and are stored most-significant-byte first.

So: you take the first byte and multiply it by 2^8 using the shift operator, << 8, which inserts 8 zeros to the right of that byte. Then you set those 8 lower binary digits to the value of the next byte in memory, using the binary "or", | (since all the corresponding bits are 0, the value of the second byte lands unchanged in the lower bits of the opcode). In other words: you read a byte and put it in bits 15 through 8 of your opcode, then read the byte at the next memory position and put it in bits 7 through 0.

(*) Separate note: you really gain very little by complicating variable names, even if this is the style used in other examples you are reading. "m_Memory" instead of "memory" only means 4 more keystrokes, and three pieces of "visual junk" your brain has to discard when reading the variable. You are not at much risk of having another variable called "memory" in this code, are you?

18.06.2018 / 15:56