How to improve file write speed for a ClientDataSet?

8

I am doing the integration of a system with a banking file, and am having a problem in the process. I get a plain text file with approximately 1300Kb and about 5,500 lines from the credit card company.

I'm reading this file and storing it in a ClientDataSet , in memory only; I never insert it into the database. However, I find the reading process very slow, since I can write at a rate of only 13 lines per second. I do not have much experience with ClientDataSet , so I do not know whether this rate is acceptable.

To read the file I load the text file into a StringList , then loop over the lines of the file, importing each record type into its own ClientDataSet (the file can contain 9 different record types, and each one feeds a different ClientDataSet ).

Loop in file:

  EnableDisableControls(False);
  { Walk through every line of the file, checking the type and running a routine for each record type. }
  for I := 0 to (Extrato.Count - 1) do
  begin
    CurrentLine := I;
    case StrToInt(Copy(Extrato.Strings[I], 01, 01)) of
      0: LerHeader;
      1: LerRegistroDetalheRO;     // Operation summary (Resumo de Operação)
      2: LerRegistroDetalheCV;     // Sale receipt (Comprovante de Venda)
      3: LerRegistroDetalheIDROSA; // RO detail of the open balance
      4: LerRegistroDetalheIBSA;   // Open balance broken down by card brand
      5: LerRegistroDetalheIOAR;   // Receivables advance operation
      6: LerRegistroDetalheIRODA;  // RO information for the advanced date
      7: LerRegistroDetalheIDRODA; // RO debit information for the advanced date
      9: LerTrailer;
    end;
  end;

  { Re-enable all controls after the insertion. }
  EnableDisableControls(True);

If the record is of type 2, for example, I call the procedure that writes it to the ClientDataSet :

procedure ThreadProcessarExtrato.LerRegistroDetalheCV;
begin
  with FrmExtratoEletronicoCielo, Extrato, DSDetalheCV.DataSet do
  begin
    Insert;
    FieldByName('TIPO_REGISTRO').AsString := Copy(Strings[CurrentLine], 001, 1);
    FieldByName('ESTAB_SUBMISSOR').AsString := Copy(Strings[CurrentLine], 002, 10);
    FieldByName('NUMERO_RO').AsString := Copy(Strings[CurrentLine], 012, 7);
    FieldByName('NUMERO_CARTAO').AsString := Copy(Strings[CurrentLine], 019, 19);
    // ... (many other fields are assigned here; omitted for brevity)
    Post;
    Inc(TotalRegistrosCV);
  end;
end;

So, any idea how to streamline this process? Or a better way to accomplish this?

    
asked by anonymous 02.04.2014 / 22:27

3 answers

13

Hello, I've had to do something similar; here are some suggestions that greatly improved performance in my case.

1 - Set the LogChanges property of the ClientDataSet to False (this greatly reduces the time of bulk inserts)

2 - If you are not displaying the ClientDataSet data in a visual control, call ClientDataSet.DisableControls before starting the inserts (and EnableControls when done)

3 - If the ClientDataSet is not indexed, prefer Append over Insert , since Insert has to position the new record at the current cursor, while Append simply adds it at the end of the dataset

4 - If the dataset has many fields, eliminating FieldByName calls can also help a lot. If the fields are created in the dataset at design time, you can access each one directly through its TField object (for example: ClienteDataset1NOME.AsString )

5 - The ClientDataSet is not good at inserting many records; if it is indexed, the cost curve of a long run of inserts is exponential, which can be demonstrated with the ASMProfiler tool. One option that helps performance is to drop the indexes before starting these batch operations and restore them at the end, so the cost becomes closer to linear.
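Putting suggestions 1, 2 and 4 together, a bulk load could be sketched like this ( CdsDetalheCV and the persistent field CdsDetalheCVTIPO_REGISTRO are hypothetical names standing in for the asker's dataset and its design-time fields):

```delphi
// Sketch only: assumes CdsDetalheCV is a TClientDataSet with
// persistent fields created at design time.
CdsDetalheCV.LogChanges := False;   // 1 - do not keep a change log during the load
CdsDetalheCV.DisableControls;       // 2 - detach visual controls
try
  for I := 0 to Extrato.Count - 1 do
  begin
    CdsDetalheCV.Append;            // 3 - Append adds at the end of the dataset
    // 4 - persistent TField objects instead of FieldByName lookups
    CdsDetalheCVTIPO_REGISTRO.AsString := Copy(Extrato[I], 1, 1);
    CdsDetalheCV.Post;
  end;
finally
  CdsDetalheCV.EnableControls;
end;
```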

    
03.04.2014 / 01:26
6

I do not know the details of your system, but 780 lines of a text file per minute seems low to me, unless you're pulling the file from somewhere on the network.

From what I've seen, all comments so far have focused on the ClientDataSet . I think it's worth shifting the focus a little.

For "small" files, using a StringList to read the txt file is perfectly valid. It's usually faster, easier, and looks perfect.

But the fact is: with a StringList , you bring the whole text file into memory. From what I understand, the file you're using is huge, and in cases like this I found that by using the file structure directly, without bringing the whole txt into memory, things flow better.

I tested it on a file with 500,000 lines (Input.txt).

I first tested with the StringList (needless to say, it did not go well, right?), and soon after with the approach I propose below.

Here is the example for testing. You would usually read the file this way:

procedure TForm1.Button2Click(Sender: TObject);
var
  arquivo : TStringList;
  I: Integer;
  linha : string;
begin
  arquivo := TStringList.Create;
  try
    arquivo.LoadFromFile('c:\Entrada.txt');
    for I := 0 to arquivo.Count - 1 do
    begin
      linha := arquivo.Strings[I];
      // ... (process the line)
    end;
    ShowMessage('pronto');
  finally
    arquivo.Free;
  end;
end;

Test by doing this:

var
   arquivo : TextFile;
   linha   : string;
begin
  AssignFile(arquivo, 'c:\Entrada.txt');
  Reset(arquivo);
  while not Eof(arquivo) do
  begin
    Readln(arquivo, linha);
    // ... (process the line)
  end;
  ShowMessage('pronto');
  CloseFile(arquivo);
end;

I do not know whether this will solve your slowness, but consuming less memory may help a little.
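As a side note: if you are on Delphi 2009 or later, TStreamReader from System.Classes gives the same line-by-line behavior with buffered I/O, also without loading the whole file at once (a sketch; the file path and encoding are assumptions):

```delphi
uses
  System.Classes, System.SysUtils;

procedure LerComStreamReader;
var
  Reader: TStreamReader;
  linha: string;
begin
  // Buffered, line-by-line reading; the whole file is never held in memory.
  Reader := TStreamReader.Create('c:\Entrada.txt', TEncoding.ANSI);
  try
    while not Reader.EndOfStream do
    begin
      linha := Reader.ReadLine;
      // ... (process the line)
    end;
  finally
    Reader.Free;
  end;
end;
```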

    
03.04.2014 / 16:11
1

To open a text file I prefer a direct approach, so the file is released immediately after it is read:

  function StreamToString(Arquivo: String): String;
  var
    oMemoryStream: TMemoryStream;
  begin
    Result := '';
    if FileExists(Arquivo) then
    begin
      oMemoryStream := TMemoryStream.Create;
      try
        oMemoryStream.LoadFromFile(Arquivo);
        SetString(Result, PChar(oMemoryStream.memory), oMemoryStream.Size);
      finally
        FreeAndNil(oMemoryStream);
      end;
    end;
  end;

As for the DataSet, is it really necessary? In my view it is used here only for aesthetic effect, and it can be replaced by byte-level control rather than line-level control. By transferring the contents of the file to a variable, and from that variable directly to the database, your process should gain considerably. Working with 200 MB files of credit card data, I did the parsing and ran a COMMIT every 10,000 records on an Oracle database, and recorded around 60 records per second. Example:

procedure TForm1.BitBtn1Click(Sender: TObject);
type
  TDados = record
    Nome  : String;
    Obs   : String;
  end;

var
  rDados    : TDados;
  sTexto    : String;
  sAuxiliar : String;
  nCount,
  nIndex    : Integer;
  nCommit   : Integer;

  function Grava(Dados: TDados): Boolean;
  begin
    Result := True;
    // record-writing process goes here
  end;

  function Commit: Boolean;
  begin
    Result := True;
    // COMMIT process goes here
  end;

  function StreamToString(Arquivo: String): String;
  var
    oMemoryStream: TMemoryStream;
  begin
    Result := '';
    if FileExists(Arquivo) then
    begin
      oMemoryStream := TMemoryStream.Create;
      try
        oMemoryStream.LoadFromFile(Arquivo);
        SetString(Result, PChar(oMemoryStream.memory), oMemoryStream.Size);
      finally
        FreeAndNil(oMemoryStream);
      end;
    end;
  end;

begin

  // Initialize the counters and the field accumulator.
  nCommit   := 0;
  nIndex    := 0;
  sAuxiliar := '';

  sTexto := StreamToString('C:\Arquivo.csv');
  if Trim(sTexto) = EmptyStr then
  begin
    raise Exception.Create('Arquivo vazio!');
  end
  else
  begin
    for nCount := 1 to Length(sTexto) do
    begin
      if (sTexto[nCount] in [#10, #13]) and (sAuxiliar <> '') then
      begin
        rDados.Obs := sAuxiliar; // last field of the record
        sAuxiliar  := '';
        nIndex     := 0;         // restart the field counter for the next line
        Grava(rDados);
        if nCommit < 10000 then // depends on the database
          nCommit := nCommit + 1
        else
        begin
          Commit;
          nCommit := 0;
        end;
      end
      else if (sTexto[nCount] = ';') then
      begin
        case nIndex of
          0 : rDados.Nome := sAuxiliar;
          // remaining fields, except the last one;
        end;
        nIndex := nIndex + 1;
        sAuxiliar := '';
      end
      else if not (sTexto[nCount] in [#10, #13]) then
        sAuxiliar := sAuxiliar + sTexto[nCount];
    end;
  end;
end;
    
23.02.2015 / 18:06