Error writing binary file in Python

1

I have a Python program that receives a binary file as a parameter and writes it to disk. However, when it writes the file, some characters are replaced by a series of numbers. This is the original content I receive as a parameter:

ÐT_Ö/¤Ð樮kMµûÀz”Ô(Î,“+œd¼Es¥

But this is what the program writes:

ÐT_Ö/¤Ð樮kMµûÀz&#148;Ô(Î,&#147;+&#156;d¼Es¥

You can see that the character between z and Ô was replaced by &#148;, and the characters on either side of the + after Î were replaced by &#147; and &#156;.

Below is the Python code that receives the binary file and writes it:

import subprocess
from subprocess import Popen, PIPE, STDOUT
def chamaProg(arquivo): 
   var_file = open("C:\Nitgen\arquivo.rec","wb")
   conteudo_texto = var_file.write(arquivo)
   var_file.close(

Why is this happening?

What should I do to read and write all characters correctly?

Please, I need to solve this problem urgently.

Thank you.

    
asked by anonymous 31.05.2015 / 22:00

3 answers

0

A file opened in binary mode expects bytes, but the content you are trying to write contains Unicode characters, so you have to convert it first:

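def chamaProg(arquivo):
  # Raw string, so the backslashes in the Windows path are not read as escapes (e.g. \a)
  var_file = open(r"C:\Nitgen\arquivo.rec","wb")
  # Encode the Unicode text to bytes before writing in binary mode
  conteudo_texto = var_file.write(arquivo.encode("utf-8"))
  var_file.close()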
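
To make the conversion concrete, here is a minimal Python 3 sketch (the code in the question looks like Python 2, so the version is an assumption) of what encode produces. The sample string is only illustrative, not the real file content. Note that UTF-8 uses more than one byte for each non-ASCII character, so the written file only matches the original byte for byte if the original was itself UTF-8 text.

```python
# Python 3 sketch: text (str) has to become bytes before a binary write.
texto = "Ðz”"  # illustrative sample, not the real file content

como_utf8 = texto.encode("utf-8")     # non-ASCII characters take 2 or 3 bytes each
como_cp1252 = texto.encode("cp1252")  # exactly one byte per character

print(len(texto), len(como_utf8), len(como_cp1252))
```

If the goal is to reproduce the original bytes exactly, the choice of encoding matters as much as the call to encode itself.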
    
31.05.2015 / 23:42
1

Apart from the missing ) after close( (which I assume was a copy-and-paste error), your code is correct. Where does the arquivo variable come from? Since you mentioned error 500 in another answer, I imagine this is part of a web application? I would investigate (even if only by scattering print statements through the code) where this variable comes from; it is being prepared for display on the web, not as a binary string.

If you cannot avoid this transformation (because you do not control the code that calls your function), you can try the "dirty" solution of interpreting the input as an HTML snippet:

import HTMLParser
html_parser = HTMLParser.HTMLParser()
arquivo = html_parser.unescape(arquivo)

(but note that you should only use this to put out a production fire; you still have to find out why arquivo is arriving with these replacements)
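
In Python 3 the equivalent function is html.unescape. The following is a minimal sketch, assuming the replacements are numeric character references like the ones in the question and that the original bytes were decoded as Windows-1252 somewhere upstream. Both points are assumptions, and restaurar_bytes is a hypothetical helper name.

```python
import html

def restaurar_bytes(texto):
    """Undo HTML character references, then map each character back to one byte."""
    sem_entidades = html.unescape(texto)   # "&#148;" becomes the character "”"
    return sem_entidades.encode("cp1252")  # assumption: upstream used Windows-1252

dados = restaurar_bytes("z&#148;Ô(Î,&#147;+&#156;d")
print(dados)
```

The returned bytes can then be written with a file opened in "wb" mode. If the upstream code used a different encoding, the second step would need to change accordingly.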

    
03.06.2015 / 16:20
0

You're having trouble with the encoding of these characters, so I recommend opening the file as UTF-8. You can do this with the built-in codecs module. Here's an example:

# -*- coding: utf-8 -*-
import codecs


def save_file(content):
    with codecs.open('file.rec', 'wb', 'utf8') as f:
        f.write(content)

if __name__ == '__main__':
    save_file(u'ÐT_Ö/¤Ð樮kMµûÀz”Ô(Î,“+œd¼Es¥')

Another tip: whenever you're working with files, use a context manager (the with statement, as in this example), because it closes the file for you when you leave the context.
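
As a small illustration of that tip, this Python 3 sketch writes bytes with the built-in open in binary mode and checks that the file is closed once the with block ends. The function name and the temporary path are hypothetical, chosen only for the example.

```python
import os
import tempfile

def gravar_binario(caminho, dados):
    """Write bytes to caminho; the with block closes the file even on errors."""
    with open(caminho, "wb") as f:
        f.write(dados)
    return f.closed  # the context manager has already closed the file here

# Hypothetical path inside a fresh temporary directory
caminho = os.path.join(tempfile.mkdtemp(), "exemplo.rec")
fechado = gravar_binario(caminho, b"\xd0T_\xd6")
print(fechado)
```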

    
31.05.2015 / 22:30