Bypassing the primary key duplication error


Hello,

I have the following .sql script, which creates a table and a trigger on it.

SET client_encoding TO 'LATIN1';

CREATE OR REPLACE FUNCTION before_insert() RETURNS trigger AS '
  DECLARE
   n integer;
  BEGIN
   IF tg_op = ''INSERT'' THEN
     select count(*) into n
      from files
      where date=new.date and url=new.url;
      IF n > 0 THEN
        RETURN NULL;
      ELSE
        RETURN new;
      END IF;
   END IF;
  END
' LANGUAGE plpgsql;

create table files
    (date TIMESTAMP, 
     url VARCHAR(4000),
     type VARCHAR(100),
     status INTEGER,
     size INTEGER NOT NULL,
     arcname VARCHAR(100) NOT NULL,
     PRIMARY KEY (url,date));

CREATE TRIGGER before_insert_trigger BEFORE INSERT ON files
FOR EACH ROW EXECUTE PROCEDURE before_insert();

From what I understand, on each insert the trigger checks whether an equal record already exists; if it does, the row is not inserted. It seems to me the trigger was created so that the insert does not return the duplicate primary key error. Since I have to insert a volume of 90 million records and the trigger is making each insertion take a long time, is there any other way to get around the problem?

Thank you.

    
asked by anonymous 19.01.2018 / 18:41

1 answer


If the idea is the fast, massive insertion of data into the files table, I suggest temporarily disabling all triggers related to the files table, as follows:

ALTER TABLE files DISABLE TRIGGER ALL;

To enable them again:

ALTER TABLE files ENABLE TRIGGER ALL;
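
For reference, a minimal sketch of that sequence around a bulk load could look like the following (the CSV path '/tmp/files.csv' is a hypothetical example, and COPY ... FROM a server-side file needs the appropriate privileges). Note that with the primary key still in place, a duplicate in the file will still abort the load; disabling the trigger only removes the per-row trigger overhead.

-- Minimal sketch: bulk load with the trigger disabled.
-- '/tmp/files.csv' is a hypothetical file with columns in this order.
BEGIN;
ALTER TABLE files DISABLE TRIGGER ALL;

COPY files (date, url, type, status, size, arcname)
    FROM '/tmp/files.csv' WITH (FORMAT csv);

ALTER TABLE files ENABLE TRIGGER ALL;
COMMIT;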

If you are sure that the data you want to insert into the files table does not violate the integrity of your primary key, you can speed up the insertion process further by removing this key:

ALTER TABLE files DROP CONSTRAINT files_pkey;

After the bulk operations on files you can recreate it as follows:

ALTER TABLE files ADD CONSTRAINT files_pkey PRIMARY KEY(url, date);
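
Putting both suggestions together, one possible end-to-end flow is sketched below. The staging table files_staging, the file path and the DISTINCT ON de-duplication step are assumptions for illustration, not part of the original answer; they are only there to guarantee that recreating the primary key afterwards does not fail.

-- Sketch: load everything into an unconstrained staging table,
-- then move a single row per (url, date) into files.
ALTER TABLE files DISABLE TRIGGER ALL;
ALTER TABLE files DROP CONSTRAINT files_pkey;

CREATE TABLE files_staging (LIKE files);

COPY files_staging (date, url, type, status, size, arcname)
    FROM '/tmp/files.csv' WITH (FORMAT csv);

-- Keep one arbitrary row per (url, date) that is not already in files.
INSERT INTO files
SELECT DISTINCT ON (url, date) *
FROM files_staging s
WHERE NOT EXISTS (
    SELECT 1 FROM files f
    WHERE f.url = s.url AND f.date = s.date
);

DROP TABLE files_staging;

ALTER TABLE files ADD CONSTRAINT files_pkey PRIMARY KEY (url, date);
ALTER TABLE files ENABLE TRIGGER ALL;

With 90 million rows, building the primary key once at the end is typically much cheaper than checking for an existing row inside a trigger on every single insert.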
    
19.01.2018 / 20:20