When creating spreadsheets with random values in Python, I use a script built on the openpyxl library (see below). I want to speed up the creation of these random spreadsheets, because the current method is very slow. How can I get the same result by generating CSV files first, as a separate step?
The problem is both the time spent creating the files and the RAM needed to build each worksheet. As currently configured, each spreadsheet comes to approximately 91.2 MB (95,674 KB) and takes between 400 and 700 seconds to generate (depending on other processes on the machine), while RAM usage oscillates around 8 GB in each cycle.
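For scale, here is a rough count of how many cells each workbook holds, taken from the loop bounds in the script below. It is only a back-of-the-envelope sketch; the variable names are mine, and it ignores the single intro cell in B2.

```python
# Loop bounds copied from the script: 6 sheets, range(1, 5000) rows, 600 values per row.
sheets = 6
rows_per_sheet = len(range(1, 5000))  # range(1, 5000) yields 4999 rows
cols_per_row = 600

total_cells = sheets * rows_per_sheet * cols_per_row
print(total_cells)  # 17996400
```

So each call builds roughly 18 million cells, and in openpyxl's default mode all of them stay in memory until the workbook is saved.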
Question: Are there any suggestions (preferably in code) for using CSV to speed up the generation of spreadsheets containing random numbers? Here is the code currently used, as a starting point:
from openpyxl import Workbook
import random
import time

# The built-in range is enough on Python 3, so the openpyxl.compat import is dropped.

def gera_plan():
    print('start script')
    t = time.time()
    numFile = random.randint(1,1001)
    dest_filename = 'file_workbook_' + str(numFile)+'.xlsx'
    wb = Workbook()
    intro = wb.active
    intro.title = 'data'
    intro['B2'] = 'Just Confidencial Data'
    # Six placeholder names; they only control how many sheets are created.
    ws = {'ws1','ws2','ws3','ws4','ws5','ws6'}
    #for work in range(1,10):
    for work in ws:
        t_name = 'data'+str(random.randint(1,1001))
        work = wb.create_sheet(title=t_name)
        #work = wb.active
        for row in range(1, 5000):
            #ws1.append(range(random.randint(100,300)))
            # Each row holds 600 distinct random integers from 1 to 9999.
            work.append(random.sample(range(1, 10000), 600))
    wb.save(filename = dest_filename)
    print('end script')
    elapsed = time.time() - t
    print('time %f seg' % elapsed)

# Run four cycles, each producing one workbook.
for ger in range(1,5):
    print('cicle:'+ str(ger))
    gera_plan()
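To make the CSV idea concrete, this is a minimal sketch of what I have in mind, assuming the standard-library csv module and one CSV file per sheet. The function name write_random_csv, its parameters, and the file naming are hypothetical, and this covers only the CSV-generation half; it does not convert the CSVs into an .xlsx file.

```python
import csv
import random


def write_random_csv(path, rows=4999, cols=600, low=1, high=10000):
    """Write `rows` lines of `cols` distinct random integers to `path`."""
    # newline='' lets the csv module control line endings itself.
    with open(path, 'w', newline='') as handle:
        writer = csv.writer(handle)
        for _ in range(rows):
            # Same draw as the openpyxl script: distinct values in [low, high).
            writer.writerow(random.sample(range(low, high), cols))
```

Calling write_random_csv('data_1.csv') through write_random_csv('data_6.csv') would mirror the six sheets of one workbook. Rows are written one at a time, so only a single row is held in memory at any moment.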