The b' prefix in the representation of your object shows that what you have at that point in your program is a bytes object, not a text string.
In Python 3 the two are different types: ever since multi-byte text encodings were invented, it can no longer be said that one byte is one character.
The normal workflow in any Python application is:
get your input data;
if the library that delivered your data did not deliver it as text, that is, if you got bytes, decode them (decode) to turn them into text;
process your data;
encode it again (encode) and write it to the output (if this is not done automatically, as it is with text files, for example).
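The steps above can be sketched roughly as follows. This is only an illustration: the function name shout and the "processing" step (upper-casing) are made up for the example, and utf-8 is assumed as the encoding on both ends.

```python
def shout(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: decode at the edge, work on text, encode on the way out."""
    text = raw.decode(encoding)        # bytes -> str, as early as possible
    processed = text.upper()           # all processing happens on str
    return processed.encode(encoding)  # str -> bytes, only when writing out

result = shout(b'n\xc3\xa3o')
print(result)                   # the bytes that would go to a file or socket
print(result.decode("utf-8"))   # the same data viewed as text
```

Keeping the decode and encode calls at the boundaries means the rest of the program only ever deals with str.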
So in your case, assuming that the object you have is in the variable a, just decode those bytes to text (an object of type str in Python 3) and continue your program:
a = b'N\xc3\xa3o n\xc3\xa3o n\xc3\xa3o, n\xc3\xb3s iremos sim!'
b = a.decode("utf-8")
print(b)
In this case, I know that the encoding is utf-8 from looking at the bytes themselves: two bytes for each accented character, the first of them being "\xc3", is a good hint that the bytes represent text encoded in utf-8.
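To see where that hint comes from, here is a small sketch (the variable names are just for illustration) that encodes two accented characters and then decodes the same bytes with the right and with a wrong encoding:

```python
# Each of these accented characters becomes two bytes in utf-8,
# and the first byte is 0xc3 in both cases.
encoded = {ch: ch.encode("utf-8") for ch in "ãó"}
print(encoded)

sample = b'N\xc3\xa3o'
as_utf8 = sample.decode("utf-8")      # correct guess
as_latin1 = sample.decode("latin-1")  # wrong guess: no error, but garbled text
print(as_utf8, as_latin1)

# The reverse mistake usually fails loudly: latin-1 bytes are rarely valid utf-8.
try:
    "Não".encode("latin-1").decode("utf-8")
    failed = False
except UnicodeDecodeError:
    failed = True
print(failed)
```

It is still only a hint: bytes do not carry their encoding with them, so when the source of the data states its encoding, use that instead of guessing.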
An essential thing to understand is the difference between text (str in Python 3), which is composed of Unicode characters, and bytes, which are sequences of numbers between 0 and 255 and are what is actually stored in files or transmitted over the network. To understand this, be sure to read:
link
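As a quick illustration of that difference (a minimal sketch, assuming utf-8 as the encoding), compare what length and indexing mean for each type:

```python
text = "não"                 # str: a sequence of Unicode characters
data = text.encode("utf-8")  # bytes: a sequence of integers from 0 to 255

print(len(text), len(data))  # the character count and the byte count differ
print(text[1])               # indexing a str gives a one-character str
print(data[1])               # indexing bytes gives an int
print(list(data))            # the raw numbers that actually get stored
```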