"OMG! A heisenbug!" - Explaining to a layman what a heisenbug is

32

A heisenbug is a bug that changes its behavior when you study it [1]. Its name derives from the principle, attributed to Heisenberg, that even the simple "passive observation"* of quantum processes changes the final result.

Typical heisenbugs happen with race conditions, since any kind of measurement you take (such as trace debugging or breakpoints) ends up synchronizing the concurrent processes in one way or another. André LFS Bacci has pointed out that using floating point for monetary values can also cause heisenbugs [2].

*: In quantum physics there are no purely passive observations, but that's a detail
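To make the race-condition case concrete, here is a minimal sketch in Python. It is only a deterministic simulation, not real threads: the names `deposit`, `simulate`, and `schedule` are hypothetical, and each `yield` stands in for a point where a context switch could happen. The idea is that the pause a breakpoint introduces changes the order in which the steps run, and with it the result.

```python
def deposit(account, amount):
    """A non-atomic read-modify-write, split into two steps by `yield`."""
    balance = account["balance"]            # step 1: read the balance
    yield                                   # a context switch may happen here
    account["balance"] = balance + amount   # step 2: write the new balance


def simulate(schedule):
    """Run two deposits of 10, advancing tasks in the order given by `schedule`."""
    account = {"balance": 0}
    tasks = [deposit(account, 10), deposit(account, 10)]
    for task_index in schedule:
        # next(..., None) advances one step and ignores a finished task.
        next(tasks[task_index], None)
    return account["balance"]


# Normal run: both tasks read before either one writes, so a deposit is lost.
racy_balance = simulate([0, 1, 0, 1])

# "Observed" run: the pause from a breakpoint lets task 0 finish
# before task 1 even reads, so the bug does not show up.
observed_balance = simulate([0, 0, 1, 1])

print("without observation:", racy_balance)
print("with observation:   ", observed_balance)
```

The code is identical in both runs; only the interleaving differs, which is why looking at the bug can make it go away.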

So, imagine that I am in a complicated situation, trying to solve a problem in the system. I can even reproduce the bug, but when I try to learn more about it by placing a strategic breakpoint, the bug stops happening. At that moment, I find myself in this situation:

  

"OMG! A heisenbug!"

The PHB asks me what's going on. Support has the customer on the line. I need to report on how the investigation is going, and I need to ask for more time to try to fix it, because it is not a simple bug, but a heisenbug!

How can I explain a heisenbug to my boss and to support?* They are not deeply versed in programming; they think that just writing if (stuff_will_bug()) { dont_do_stuff(); } else { do_stuff(); } magically solves the problem.

*: Preferably, explanations that do not get me fired

    
asked by anonymous 22.03.2018 / 03:25

1 answer

23

As I understand it, you need an analogy that makes it easy to explain something complex to a layman in programming (your boss).

Here is how I would explain it to my boss, in my own words (it can always be improved!):

  

In my opinion, Analogy 2 and Analogy 3 are the clearest, but the rest is worth reading too.

Analogy 1 - Broken bus

Fact

I own a public transport company.

Problem

I have a bus whose shock absorber keeps breaking.

Debug

The bus driver always travels exactly the same route.

So I decide to ride along with him and check what is going on.

But "incredibly", every time I rode along, the bus did not break down! Why?

Solution

After some suspicion, the driver was monitored in a non-invasive way, by planting a "spy" disguised as a passenger, and bingo!

There is a huge pothole in one of the streets on the route, and the driver does not even swerve around it, unless he is being observed.

Summarizing

Merely being observed temporarily fixed the problem.

Analogy 2 - Where is my steak?

Fact

I work at company XYZ and I always bring a packed lunch, which I leave in the shared refrigerator at work.

Problem

Since I take the second lunch shift, every day a steak "disappears" from my lunchbox.

Debug

Because of the problem, I started stopping by the cafeteria during the first lunch shift. And with that, my steak stopped "disappearing". Why?

Solution

There was an employee who, when he went to get his own lunchbox, opened the other lunchboxes and took the steaks (behavior outside his correct "instructions"). This was discovered by observing and analyzing all of the "resources" (employees), until it became clear that the cause was not visibly tied to the problem itself, and that an intervening factor (me) was changing the final result.

In this case there are 3 solutions:

  • Dismiss the employee (delete the feature / process)
  • Install cameras (create a companion feature that catches the failure)
  • Issue a warning (a fix, though possibly only a temporary one)

Summarizing

Same as Analogy 1, with 3 more concrete examples of possible solutions.

Analogy 3 - Getting out of and back into the Beetle (true story)

Fact

I have a 1947 Beetle!

Problem

Sometimes when I'm driving the Beetle, it suddenly stops working. I pull over and try to start it several times, but it does not work at all.

Debug

In the old days, people used to say that when the Beetle had this problem, you should get out of the vehicle for 2 minutes, get back in, turn the key, and it would start.

Indeed, tried and proven!

Solution

After a better understanding of the mechanical (structural) part, it turned out that the "coil" was heating up and causing the electrical part to stop.

Getting out of the vehicle and waiting 2 minutes let it cool down enough to work again.

Summarizing

Another example of a third key factor that was neither imaginable nor noticeable. When we tried a solution that defied logic but worked as a stopgap, it "biased" our reasoning toward an incorrect "starting point".

    
27.03.2018 / 12:45