In Python, is there any way besides numpy and float('nan') to get the special constant nan?

4

I've been reading the Underhanded C Contest site, where the goal is to write subtly malicious code that looks normal at first glance. One of the common techniques mentioned is the use of not a number, or nan, which has some special properties; notably, any ordering or equality comparison with nan results in False (only != evaluates to True).

Thinking of a proof of concept in Python, I came up with the following:

def maior_que_10():
    entrada = input('Digite um número: ')
    try:
        entrada_float = float(entrada)
    except ValueError:
        print('Erro!')
        return
    if entrada_float > 10:
        print('Maior que 10!')
        return
    elif entrada_float <= 10:
        print('Não é maior que 10!')
        return

    print('Inesperado!')

while True:
    maior_que_10()

The function correctly handles invalid numeric input by printing an error and, to an inattentive reader, never seems to reach print('Inesperado!') because it checks both > 10 and <= 10. However, entering "nan" executes the last line:

Digite um número: 11
Maior que 10!
Digite um número: 9
Não é maior que 10!
Digite um número: 10
Não é maior que 10!
Digite um número: foobar
Erro!
Digite um número: nan
Inesperado!

In theory, in less trivial code, malicious code could be hidden after the two if branches. This, however, depends on user input being passed to float.

Is there any operation between variables that produces a nan some other way?

I thought of division by zero or the square root of a negative number, but they raise exceptions instead of returning nan:

>>> import math
>>> math.sqrt(-1)
ValueError: math domain error
>>> 1/0
ZeroDivisionError: division by zero
    
asked by anonymous 26.11.2018 / 19:32

2 answers

4
  

Is there any operation between variables that produces a nan some other way?

At the end of the Python math module documentation you will find:

  

NaN will not be returned from any of the above functions unless one or more of the input arguments were a NaN;

In other words, usually not. But it is possible to arrive at a NaN, for example by subtracting float('inf') from float('inf'); in that case, though, the problem just becomes how to generate the infinity in the first place:

In [100]: a = float("inf")                                                                               

In [101]: a - a                                                                                          
Out[101]: nan
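
As a minimal sketch of how an infinity, and from it a NaN, can arise from plain arithmetic without ever calling float('inf'): in CPython, a float multiplication that overflows returns inf instead of raising, and indeterminate forms such as inf - inf or inf * 0 give nan. The variable names below are only illustrative.

```python
import math

# Multiplying past the largest representable double overflows to inf.
# (The ** operator would raise OverflowError instead, so * is used here.)
big = 1e308
inf = big * 10

# Indeterminate forms involving infinity produce NaN rather than an exception.
nan_from_sub = inf - inf
nan_from_mul = inf * 0

print(inf, nan_from_sub, nan_from_mul)
print(math.isinf(inf), math.isnan(nan_from_sub), math.isnan(nan_from_mul))
```

So a NaN can appear with no user input and no "nan" or "inf" string anywhere in the code, as long as some value is allowed to grow large enough.
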
Another way is to write down the binary form of a "nan" as an object of type bytes, or as an integer, and use the struct module to convert those bytes back to a floating-point number, which will then be a NaN:

import struct
struct.unpack("d", struct.pack("Q", ((2 ** 12 - 1) << 52) + 1))[0]

Or using ctypes:

In [199]: import ctypes                                                                                  

In [200]: class A(ctypes.Union): 
     ...:     _fields_ = [('i', ctypes.c_uint64), ('f', ctypes.c_double)] 
     ...:                                                                                                

In [202]: A(i=((2 ** 12 - 1) << 52) + 1).f                                                               
Out[202]: nan
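
The constant used in both snippets above encodes a double whose exponent bits are all ones and whose fraction is non-zero, which is what makes it a NaN. A small sketch of that rule, assuming the platform uses 64-bit IEEE 754 doubles (true on mainstream systems); the helper name bits_to_float is made up for this illustration:

```python
import math
import struct

def bits_to_float(bits):
    """Reinterpret a 64-bit unsigned integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

EXPONENT_ALL_ONES = 0x7FF << 52  # 11 exponent bits set, sign bit clear

# Exponent all ones with a zero fraction is infinity...
print(bits_to_float(EXPONENT_ALL_ONES))
# ...and with any non-zero fraction it is a NaN.
print(bits_to_float(EXPONENT_ALL_ONES | 1))
# The pattern from the answer also sets the sign bit; it is still a NaN.
print(bits_to_float(((2 ** 12 - 1) << 52) + 1))
```
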
    
26.11.2018 / 20:56
5

(* Rereading the whole question, I see that I wrote an extensive answer on how to validate decimal input, which does not really answer your specific question. Sorry. I'll keep the answer because it may help beginners who land here because of the question's title.)

In recent versions of Python (3.5 and later) you can do from math import nan; this puts a variable nan, holding a NaN value, in your namespace.

In older versions (older than Python 3.5), the recommended thing to do is put in your code:

nan = float('nan')  

(or simply use the expression float('nan') directly).

In addition, when dealing with NaNs it is important to keep in mind that a NaN value is never equal to another when compared with == (nor is it equal to itself). The best way to tell whether a value is a NaN is to use the isnan function from the math module:

from math import nan, isnan

isnan(nan)

evaluates to True.
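
A short sketch of these comparison rules, using only float('nan') and math.isnan so it also runs on versions older than 3.5:

```python
import math

nan = float("nan")

# Every ordering and equality comparison involving NaN is False...
print(nan == nan, nan < 10, nan > 10, nan <= 10, nan >= 10)
# ...while != is the one comparison that is True.
print(nan != nan)

# So "x == nan" can never detect a NaN; math.isnan can.
print(math.isnan(nan))
```

This is exactly why both the > 10 and the <= 10 branches in the question are skipped.
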

With that said about NaNs, there is more to consider when calling float directly on a string typed by the user. In particular, infinite values can be expressed with float('inf') (and negative infinity with "-inf"), and scientific notation is also accepted, where a power-of-ten exponent can be appended to the number after the letter "e":

In [95]: float("1e3")                                                                                    
Out[95]: 1000.0
So, if you really want to limit input to plain positive or negative numbers with an optional decimal point, it is better to parse it more carefully than by simply calling float(entrada).
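
To get a feel for how permissive float is, here is a rough sketch that feeds it a few strings that are not plain decimals. The sample list is my own choice, and the underscore form assumes Python 3.6 or later:

```python
import math

# Strings that float() accepts even though they are not plain decimals.
# Matching is case-insensitive and surrounding whitespace is ignored.
samples = ["nan", "NaN", "inf", "-Infinity", "1e3", "  12.5  ", "1_000"]

for text in samples:
    value = float(text)
    kind = "finite" if math.isfinite(value) else "not finite"
    print(repr(text), "->", value, f"({kind})")
```
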

In general, when we talk about parsing, many people think first of regular expressions. I find regular expressions difficult to read and maintain, and people tend to write simplistic expressions that do not cover all the possible inputs.

Checking input with regular expressions

Python is a good language for regular expressions because, fortunately, nobody decided to mix them into the language's own syntax: you call normal functions and pass a string containing the regular expression you want to match against your text. The re module has several functions, for example to find all occurrences (re.findall) or to replace (re.sub). In this case, we simply want to check whether an expression matches the user input.

In a hurry, someone might think "I want to check that the user typed one or more digits, followed by an optional dot, followed by one or more digits". This can be written as "[0-9]+\.?[0-9]+", but a quick look shows it is not good enough: what if the user types a "-" sign? What if there is only one digit? (The second part expects at least one more digit after the dot, even though the dot itself is optional.) As a result, while this expression matches "11", "23.2" and "0.1", it does not match "1", "-1", ".23", etc.
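
The shortcomings of that hasty expression are easy to confirm. In this sketch re.fullmatch is used so that the whole string has to match; the variable name naive is only for illustration:

```python
import re

naive = r"[0-9]+\.?[0-9]+"

# The first three inputs match; the last three are valid numbers that do not.
for text in ["11", "23.2", "0.1", "1", "-1", ".23"]:
    ok = re.fullmatch(naive, text) is not None
    print(f"{text!r}: {'match' if ok else 'no match'}")
```
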

To cut a long story short, the regular expression that checks for a decimal number with an optional sign, with at least one digit (or no leading digit if there is a decimal point), and with at least one digit after the decimal point if there is one, is:

c = r"-?(?:[0-9]+|(?=\.))(?:\.[0-9]+)?$" 

(See the documentation for Python's re module.)

And you could use it in your code like this:

import re

def maior_que_10():
    entrada = input('Digite um número: ')
    if not re.match(r"-?(?:[0-9]+|(?=\.))(?:\.[0-9]+)?$", entrada):
        print('Erro!')
        return
    entrada_float = float(entrada)
    ...
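
As a quick sanity check of that pattern, the sketch below runs it against a handful of inputs. The accepted and rejected lists are my own test cases, not part of the original answer:

```python
import re

PATTERN = re.compile(r"-?(?:[0-9]+|(?=\.))(?:\.[0-9]+)?$")

accepted = ["1", "-1", "11", "23.2", ".23", "-.5"]
rejected = ["", "-", ".", "1.", "1e3", "nan", "inf", "foobar", "1-2"]

# re.match anchors at the start of the string; the trailing $ anchors the end.
for text in accepted + rejected:
    print(f"{text!r}: {'ok' if PATTERN.match(text) else 'rejected'}")
```

Note that "nan", "inf" and "1e3" are all rejected, so the unexpected branch from the question can no longer be reached through user input.
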

Checking input with Python code

Alternatively, in the name of readability and of knowing exactly what is being done, it may be worth using Python's string methods (split, find, count, isdigit) to write a function that checks whether a string is a well-formed decimal before attempting to convert it to float.

You can do something like this:

def verifica_decimal(text):
    if not text:  # empty string
        return False
    filtered = text.replace('-', '').replace('.', '')
    if not filtered.isdigit():  # there are characters that are neither digits, '-' nor '.'
        return False
    if '-' in text[1:]:  # stray sign in the middle of the number
        return False
    if text.count('.') > 1 or text[-1] == '.':  # more than one '.', or '.' in the last position
        return False
    return True

def maior_que_10():
    entrada = input('Digite um número: ')
    if not verifica_decimal(entrada):
        print('Erro!')
        return
    entrada_float = float(entrada)
    ...
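
If scientific notation is acceptable and the only goal is to keep nan and the infinities out, a lighter alternative to both approaches above is to convert first and then test the result with math.isfinite. This is my own sketch, not something from the answer, and the function name parse_finite is hypothetical:

```python
import math

def parse_finite(text):
    """Return text as a float, or None if it is not a finite number."""
    try:
        value = float(text)
    except ValueError:
        return None
    # Rejects nan, inf and -inf, but still allows forms such as "1e3".
    return value if math.isfinite(value) else None

for text in ["11", "1e3", "nan", "inf", "foobar"]:
    print(repr(text), "->", parse_finite(text))
```
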
    
26.11.2018 / 20:03