There are three things you need to understand when dealing with binary files like this:
- Reading X bytes with read advances the read position by X bytes. This happens every time you call read.
- seek moves the read position to the offset you pass. So seek(0) sends the read position back to the beginning of the file.
- A binary file contains nothing but bytes, laid out one after another.
In your case, for example, the first four bytes are an integer that says how many (int, float) pairs the file contains, followed by four bytes for the first int, four bytes for the first float, and so on.
Let's suppose that our file has 2 pairs. The binary will look something like this:
>0010 iiii ffff iiii ffff
Here 0010 stands for the 4-byte integer 2, iiii represents a 4-byte integer, and ffff a 4-byte float. The > arrow marks the file's read position. Right after we open the file, it is at position 0.
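To make the layout and the moving read position concrete, here is a minimal sketch that builds a two-pair file in memory and records the position after every read. The sample values and the use of io.BytesIO in place of a real file are assumptions made for illustration, and 'i' and 'f' are assumed to take 4 bytes each, as in your file.

```python
import io
import struct

# Hypothetical sample data: a count of 2, then two (int, float) pairs.
pairs = [(2, 2.5), (12, 12.5)]
data = struct.pack('i', len(pairs))
for inteiro, real in pairs:
    data += struct.pack('i', inteiro) + struct.pack('f', real)

arq = io.BytesIO(data)       # behaves like a file opened in binary mode
positions = [arq.tell()]     # position right after opening
n = struct.unpack('i', arq.read(4))[0]
positions.append(arq.tell())
for _ in range(n):
    arq.read(4)              # the int
    positions.append(arq.tell())
    arq.read(4)              # the float
    positions.append(arq.tell())

print(n)
print(positions)
```

Each read(4) moves the position forward by exactly 4 bytes, so the recorded positions grow in steps of 4 from 0 up to the end of the data.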
Let's take a look at your code:
with open('valores.bin', 'r+b') as arq:
    n = struct.unpack('i', arq.read(4))[0]
    arq.seek(0)
    for i in range(n):
        arq.seek(0)
        if isinstance(struct.unpack('i', arq.read(4)), int) and struct.unpack('i', arq.read(4)) < 10:
            arq.write(struct.pack('i', 0))
        elif isinstance(struct.unpack('f', arq.read(4)), float) and struct.unpack('f', arq.read(4)) > 9.0:
            arq.write(struct.pack('f', 1000,0))
The first problem is that, just before entering the loop, you send the read position back to the beginning of the file, which is not what we want. Once we have read the first integer and know how many pairs the file holds, there is no need to read those first 4 bytes again, so the seek(0) is unnecessary.
That is, after n = struct.unpack('i', arq.read(4))[0], since we called read, the read position is here:
0010 >iiii ffff iiii ffff
We are already in position to start reading the values. If we call seek(0), we go back to the first position:
>0010 iiii ffff iiii ffff
And we are no longer interested in reading 0010, because we already know that the file has 2 pairs of values.
Then you can see some more problems inside the loop:
- We call seek(0) at the beginning of each iteration. So not only do we go back to the first value, which does not interest us, but we also never move forward in the following iterations. Even if the file did not start with 0010, we would always read the first pair of values.
- We call arq.read(4) several times without saving the value. Remember that every read(x) advances the read position by x, so we can only call read once before it moves on to the next item. Save the result of arq.read(4) in a variable to avoid having to read the same value twice.
- We check whether the result is an int after we have already interpreted it as an int. When we call struct.unpack with the argument 'i', we are asking it to interpret those bytes as an integer, and it will return an integer no matter what. The problem is that if we interpret a float's bytes as an integer, the resulting int has nothing to do with the float.
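The last point can be checked with a short sketch: pack a float and unpack the same 4 bytes both ways. The value 2.5 is an arbitrary example, and the explicit '<' (little-endian) prefix is an assumption added here only so the result is the same on every machine; your code uses the native byte order.

```python
import struct

raw = struct.pack('<f', 2.5)            # 4 bytes holding the float 2.5
as_float = struct.unpack('<f', raw)[0]  # the same bytes read back as a float
as_int = struct.unpack('<i', raw)[0]    # the same bytes read as an int

print(as_float)                 # 2.5
print(as_int)                   # 1075838976
print(isinstance(as_int, int))  # True
```

The isinstance check passes even though the bytes were written as a float, which is why it cannot be used to tell the two kinds of field apart. Only the position in the file tells you what a field is.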
What I recommend is to get the basics working first: let's read the file and make sure the positions are correct:
import struct

with open('valores.bin', 'r+b') as arq:
    # The first 4 bytes hold the number of (int, float) pairs.
    n = struct.unpack('i', arq.read(4))[0]
    print(n)
    for i in range(n):
        # Each read(4) advances the position, so the values come in order.
        meu_inteiro = struct.unpack('i', arq.read(4))
        print(meu_inteiro)
        meu_float = struct.unpack('f', arq.read(4))
        print(meu_float)
# Output: 3 (2,) (2.5,) (12,) (12.5,) (1337,) (314.70001220703125,)
In my case, those were the values stored in the file, so everything checks out. Note that we have not used seek yet, because it is not needed for purely sequential reading. We only need it to overwrite values. That is:
We read the first iiii value and store it in the variable meu_inteiro.
0010 >iiii ffff iiii ffff
->
0010 iiii >ffff iiii ffff
We compare meu_inteiro (without doing another read) with some value. If it is less than 10, we move back as many positions as needed to overwrite it with the new value:
0010 iiii >ffff iiii ffff
-> (seek back to the start of the int)
0010 >iiii ffff iiii ffff
-> (write the new int)
0010 iiii >ffff iiii ffff
(we carry on with reading the float)
seek has 3 modes of operation, selected by the second argument. The first and default mode sets the absolute read/write position of the file. That is, seek(4) puts the read position at byte 4. If we pass 1 as the second argument, the offset is relative to the current position. That is, seek(4, 1) pushes the position 4 bytes ahead of the current one; if we are at position 4, it goes to 8. The third mode, passing 2, is relative to the end of the file, but that does not matter to us.
Since we want to go back 4 bytes whenever we need to write, we should use seek(-4, 1).
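The three modes can be tried out with a small sketch. It uses io.BytesIO with 20 zero bytes as a stand-in for the file, which is an assumption for illustration; a file opened with open(..., 'r+b') takes the same seek arguments.

```python
import io

arq = io.BytesIO(bytes(20))   # 20 zero bytes standing in for the file

arq.seek(4)                   # absolute: go to byte 4
pos_absolute = arq.tell()
arq.seek(4, 1)                # relative: 4 bytes forward from the current position
pos_forward = arq.tell()
arq.seek(-4, 1)               # relative: 4 bytes back from the current position
pos_back = arq.tell()
arq.seek(0, 2)                # relative to the end of the file
pos_end = arq.tell()

print(pos_absolute, pos_forward, pos_back, pos_end)   # 4 8 4 20
```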
Then your code would look like this:
import struct

try:
    with open('valores.bin', 'r+b') as arq:
        n = struct.unpack('i', arq.read(4))[0]
        print(n)
        for i in range(n):
            meu_inteiro = struct.unpack('i', arq.read(4))[0]
            print(meu_inteiro)
            if meu_inteiro < 10:
                arq.seek(-4, 1)  # Go back to the start of the iiii that must be overwritten
                arq.write(struct.pack('i', 0))
            meu_float = struct.unpack('f', arq.read(4))[0]
            print(meu_float)
            if meu_float > 9.0:
                arq.seek(-4, 1)  # Go back to the start of the ffff that must be overwritten
                arq.write(struct.pack('f', 1000.0))
except IOError:
    print('Error opening or handling the file.')
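If you want to check the result without touching your real file, one option is a round trip on a temporary file. The sketch below is only an illustration: the helper names escrever_exemplo, corrigir and ler_pares are hypothetical, the sample values are the ones from my test output above, and corrigir repeats the same read, seek back, overwrite loop without the prints.

```python
import os
import struct
import tempfile

def escrever_exemplo(caminho, pares):
    """Write the pair count followed by each (int, float) pair."""
    with open(caminho, 'wb') as arq:
        arq.write(struct.pack('i', len(pares)))
        for inteiro, real in pares:
            arq.write(struct.pack('i', inteiro))
            arq.write(struct.pack('f', real))

def corrigir(caminho):
    """Overwrite ints below 10 with 0 and floats above 9.0 with 1000.0."""
    with open(caminho, 'r+b') as arq:
        n = struct.unpack('i', arq.read(4))[0]
        for _ in range(n):
            if struct.unpack('i', arq.read(4))[0] < 10:
                arq.seek(-4, 1)   # back to the start of the int just read
                arq.write(struct.pack('i', 0))
            if struct.unpack('f', arq.read(4))[0] > 9.0:
                arq.seek(-4, 1)   # back to the start of the float just read
                arq.write(struct.pack('f', 1000.0))

def ler_pares(caminho):
    """Read the file back as a list of (int, float) tuples."""
    with open(caminho, 'rb') as arq:
        n = struct.unpack('i', arq.read(4))[0]
        return [struct.unpack('if', arq.read(8)) for _ in range(n)]

with tempfile.TemporaryDirectory() as pasta:
    caminho = os.path.join(pasta, 'valores.bin')
    escrever_exemplo(caminho, [(2, 2.5), (12, 12.5), (1337, 314.7)])
    corrigir(caminho)
    resultado = ler_pares(caminho)

print(resultado)   # [(0, 2.5), (12, 1000.0), (1337, 1000.0)]
```

Only the values that match a condition change: the int 2 becomes 0, and the floats 12.5 and 314.7 become 1000.0, while everything else is left as it was.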