I suggest you take a look at the zip function and, if you have not already mastered the concept, at list comprehensions. Both will help you a lot in structuring your code.
If you already have code that handles a single file:
from xlrd import open_workbook
workbook_r = open_workbook('arquivo.xls', formatting_info=True)
worksheet_r = workbook_r.sheet_by_index(0)
...
The "natural" way to extend it to other files is to wrap it in a function:
def ler_arquivo(nome):
    workbook_r = open_workbook(nome, formatting_info=True)
    worksheet_r = workbook_r.sheet_by_index(0)
    ...
    return resultados  # Whatever you want to extract from the file

ler_arquivo('arquivo.xls')
If you have N files, you call this function N times, collecting the individual results from each one (if the files have different structures, you will of course need a different function for each "type" of file):
arquivos = ['arquivo1.xls', 'arquivo2.xls', 'arquivo3.xls']
resultados = [ler_arquivo(nome) for nome in arquivos]
Okay, so far I have probably not said anything you did not already know. The problem now is how to combine the results of processing each file. This is where the zip function comes in. Let's say some of your results are the values of a column in the file:
def ler_arquivo(nome):
    ...
    coluna_que_me_interessa = []
    ...  # Extract the values of the column you care about and append them to the list
    return {
        "minha_coluna": coluna_que_me_interessa,
        # ... other data
    }
For example, let's say that Excel file A contains weights and file B contains heights, and you want to use both in a calculation:
resultados1 = ler_arquivo('pessoas_pesos.xls')
resultados2 = ler_arquivo('pessoas_alturas.xls')
pesos = resultados1['minha_coluna']
alturas = resultados2['minha_coluna']
for i in range(len(pesos)):
    imc = pesos[i] / (alturas[i] * alturas[i])
Instead of writing this "strange" loop, you can use the zip function to pair up the elements of the pesos and alturas lists:
pesos = [50,70,60]
alturas = [160,180,165]
list(zip(pesos, alturas))  # [(50, 160), (70, 180), (60, 165)]; list() is needed on Python 3, where zip is lazy
You can then use each pair in a list comprehension:
def calcular_imc(peso, altura):  # Example; in practice, your functions will be more complex
    ...

imcs = [calcular_imc(x, y) for (x, y) in zip(pesos, alturas)]
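To make the pairing concrete, here is a minimal runnable sketch of the same idea. The body of calcular_imc is my own assumption (weights in kg, heights in cm, converted to metres before squaring); your real function will depend on what your files actually contain.

```python
def calcular_imc(peso_kg, altura_cm):
    """Body mass index: weight in kg divided by the square of the height in metres.

    The units (kg and cm) are an assumption made for this example.
    """
    altura_m = altura_cm / 100.0
    return peso_kg / (altura_m * altura_m)


pesos = [50, 70, 60]
alturas = [160, 180, 165]

# zip pairs the i-th weight with the i-th height, so no index variable is needed
imcs = [calcular_imc(peso, altura) for peso, altura in zip(pesos, alturas)]

# zip accepts any number of sequences, here three
for peso, altura, imc in zip(pesos, alturas, imcs):
    print("{} kg, {} cm -> IMC {:.1f}".format(peso, altura, imc))
```

Unpacking the pair directly in the for clause (peso, altura) tends to read better than indexing into a tuple inside the function call.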
Finally, to answer your question: what if you have an arbitrary number of files? In that case (and assuming the files are homogeneous, i.e. have the same structure), you can pass a list of arguments to zip dynamically by unpacking it with *. Example:
arquivos = ['arquivo1.xls', 'arquivo2.xls', 'arquivo3.xls']
resultados = [ler_arquivo(nome) for nome in arquivos]
minha_coluna = [r['minha_coluna'] for r in resultados]
calculo = [meu_calculo(x) for x in zip(*minha_coluna)]
This makes meu_calculo receive a tuple containing the value from each file's column for row 1, then a tuple for row 2, and so on.
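A small self-contained sketch of what zip(*...) does with the per-file columns. The data and the body of meu_calculo (a plain sum) are placeholders I made up for illustration. One thing worth knowing: zip stops at the shortest column, so if one file has fewer rows the extra rows of the others are silently dropped; itertools.zip_longest is one way around that, if padding with a default value makes sense for your calculation.

```python
from itertools import zip_longest

# Hypothetical columns, one list per file; the third file is one row short
colunas = [
    [1, 2, 3],
    [10, 20, 30],
    [100, 200],
]


def meu_calculo(valores):
    # Placeholder calculation: sum the values found on the same row of every file
    return sum(valores)


# The * unpacks the list, so this is the same as zip(colunas[0], colunas[1], colunas[2])
linhas = list(zip(*colunas))
calculo = [meu_calculo(linha) for linha in linhas]
print(linhas)   # only two rows: zip stopped at the shortest column
print(calculo)

# zip_longest keeps every row and fills the gaps with fillvalue
linhas_completas = list(zip_longest(*colunas, fillvalue=0))
calculo_completo = [meu_calculo(linha) for linha in linhas_completas]
print(calculo_completo)
```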
Update: adapting the technique described above to your specific case (here, zip turned out to be unnecessary):
import xlrd
import xlwt

def ler_arquivo(nome):
    inputWorkbook = xlrd.open_workbook(nome, formatting_info=True)
    inputSheet = inputWorkbook.sheet_by_name('idade')
    return {
        "ncols": inputSheet.ncols,
        # The two rows of interest
        "linha1": [inputSheet.cell_value(1, coluna) for coluna in range(inputSheet.ncols)],
        "linha2": [inputSheet.cell_value(2, coluna) for coluna in range(inputSheet.ncols)],
    }
arquivos = ["entrada{}.xls".format(i+1) for i in range(10)]
resultados = [ler_arquivo(nome) for nome in arquivos]
outputWorkbook = xlwt.Workbook(encoding="utf-8")
outputSheet = outputWorkbook.add_sheet('idade')
headerStyle = xlwt.easyxf('font: height 200, bold 1, color white; pattern: pattern solid, fore_color black;')
# Walk through each file's results, advancing the output row by 2 per file
linha = 0
for resultado in resultados:
    for coluna in range(resultado["ncols"]):
        outputSheet.write(linha+0, coluna, resultado["linha1"][coluna], headerStyle)
        outputSheet.write(linha+1, coluna, resultado["linha2"][coluna])
    linha = linha + 2
outputWorkbook.save("resultado.xls")
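If you want to check the row layout without touching any Excel file, the loop above can be sketched as a generator that only computes where each value would go. The name planejar_celulas and the sample data are hypothetical, made up for this example; it assumes each result has the "linha1" and "linha2" keys returned by ler_arquivo. I used enumerate and zip here, but the plain counter from the code above works just as well.

```python
def planejar_celulas(resultados):
    """Yield (linha, coluna, valor) for every cell, using two output rows per file."""
    for indice, resultado in enumerate(resultados):
        linha = indice * 2  # each file occupies rows linha and linha + 1
        pares = zip(resultado["linha1"], resultado["linha2"])
        for coluna, (cabecalho, valor) in enumerate(pares):
            yield (linha, coluna, cabecalho)
            yield (linha + 1, coluna, valor)


# Made-up data standing in for what ler_arquivo would return
resultados = [
    {"linha1": ["a", "b"], "linha2": [1, 2]},
    {"linha1": ["c"], "linha2": [3]},
]

celulas = list(planejar_celulas(resultados))
for celula in celulas:
    print(celula)
```

Each tuple could then be passed to outputSheet.write, with the header style applied to the even-numbered rows.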