Physical deletion vs. logical deletion

10

Is it a common practice? Is it safe?

  

Derived from: Physical Vs. logical / soft delete of database record?

    
asked by anonymous 24.03.2014 / 13:46

5 answers

8

Generally, use physical deletion only when you know for sure that you will no longer need the record in the table.

Use logical exclusion when you might:

  • need to restore the record at some point (undelete)

  • need to get information from the deleted records (often it is the client who asks for this, not the user).

The decision depends on each individual case. In most cases, though, logical exclusion is recommended whenever there is even a minimal likelihood that deleted records will have to be restored. A filter such as WHERE tabela.Excluido = 0 does not usually degrade performance, so the main concerns are the volume of data to be stored and, of course, whether restoration will be needed.
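As a minimal sketch of the flag approach, assuming a hypothetical `cliente` table that carries the `Excluido` column from the filter above (the table name, the other columns, and the use of SQLite through Python's `sqlite3` module are illustrative assumptions, not part of the original answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cliente ("
    " id INTEGER PRIMARY KEY,"
    " nome TEXT NOT NULL,"
    " Excluido INTEGER NOT NULL DEFAULT 0)"  # 0 = active, 1 = logically deleted
)
conn.executemany("INSERT INTO cliente (nome) VALUES (?)", [("Ana",), ("Bruno",)])


def soft_delete(conn, cliente_id):
    """Mark the row as deleted instead of removing it."""
    conn.execute("UPDATE cliente SET Excluido = 1 WHERE id = ?", (cliente_id,))


def undelete(conn, cliente_id):
    """Restore a logically deleted row."""
    conn.execute("UPDATE cliente SET Excluido = 0 WHERE id = ?", (cliente_id,))


def active_names(conn):
    """Every normal read has to filter on the flag."""
    rows = conn.execute(
        "SELECT nome FROM cliente WHERE Excluido = 0 ORDER BY id"
    )
    return [nome for (nome,) in rows]


soft_delete(conn, 1)
after_delete = active_names(conn)   # Ana is hidden, but still stored
undelete(conn, 1)
after_restore = active_names(conn)  # Ana is visible again
```

The row never leaves the table, which is what makes the undelete a single UPDATE.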

An alternative, also used for tables that can grow very large with a high proportion of deleted rows, is to move the deleted records to a separate table dedicated to them, so that they do not affect the performance of the main table.

In this case, however, care must be taken, because the records may not be easy to restore if the table has an auto-increment PK. This approach is usually used for historical queries.
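The move-to-another-table alternative might look like the sketch below. It assumes hypothetical `pedido` and `pedido_excluido` tables and SQLite through Python's `sqlite3` module. Storing the original id explicitly in the archive table is one way to deal with the auto-increment concern, not necessarily what the answer had in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pedido (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated ids are not reused
        descricao TEXT NOT NULL
    );
    CREATE TABLE pedido_excluido (
        id INTEGER PRIMARY KEY,                -- original id, copied, not generated
        descricao TEXT NOT NULL,
        excluido_em TEXT NOT NULL
    );
    INSERT INTO pedido (descricao) VALUES ('first'), ('second'), ('third');
    """
)


def archive(conn, pedido_id):
    """Move one row from the main table to the archive table."""
    with conn:  # one transaction: the copy and the delete succeed or fail together
        conn.execute(
            "INSERT INTO pedido_excluido (id, descricao, excluido_em)"
            " SELECT id, descricao, datetime('now') FROM pedido WHERE id = ?",
            (pedido_id,),
        )
        conn.execute("DELETE FROM pedido WHERE id = ?", (pedido_id,))


def restore(conn, pedido_id):
    """Move the row back, reusing its original primary key."""
    with conn:
        conn.execute(
            "INSERT INTO pedido (id, descricao)"
            " SELECT id, descricao FROM pedido_excluido WHERE id = ?",
            (pedido_id,),
        )
        conn.execute("DELETE FROM pedido_excluido WHERE id = ?", (pedido_id,))


archive(conn, 2)
ids_after_archive = [r[0] for r in conn.execute("SELECT id FROM pedido ORDER BY id")]
restore(conn, 2)
ids_after_restore = [r[0] for r in conn.execute("SELECT id FROM pedido ORDER BY id")]
```

Restoring with the original id only works while no new row has taken that id, which is why the sketch relies on ids that are never handed out twice.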

This approach can also serve legal requirements, such as keeping data about every login users make to a site for X years, as opposed to showing the user / client only the information about their last login.

    
24.03.2014 / 14:22
7

Advantages of using a logical deletion:

  • Audits: if you use a date-time field to record the deletion, instead of a simple boolean / string, you can tell when the record was deleted. You can still use indexes and search without problems, because your filter will be something like WHERE dtExclusao IS NULL.

  • Easier deletion: you do not have to worry about preserving referential integrity at the time of deletion, since no cascade deletion takes place.

  • In the future, you will be able to implement some intelligence / BI solution, since there will be a history of all the data that can be used.
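The audit point can be sketched as follows, assuming a hypothetical `produto` table with the `dtExclusao` column mentioned above (SQLite through Python's `sqlite3` module is used only for illustration). Note that in SQL the test for a missing value has to be written with IS NULL; a comparison with `= NULL` never matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE produto ("
    " id INTEGER PRIMARY KEY,"
    " nome TEXT NOT NULL,"
    " dtExclusao TEXT)"  # NULL = active; otherwise the deletion timestamp
)
conn.execute("CREATE INDEX idx_produto_dtExclusao ON produto (dtExclusao)")
conn.executemany("INSERT INTO produto (nome) VALUES (?)", [("pen",), ("ink",)])

# Logical deletion records *when* the row was deleted.
conn.execute("UPDATE produto SET dtExclusao = datetime('now') WHERE id = ?", (2,))

# Correct filter: NULL has to be tested with IS NULL.
active = [nome for (nome,) in conn.execute(
    "SELECT nome FROM produto WHERE dtExclusao IS NULL")]

# '= NULL' never evaluates to true in SQL, so this matches no rows.
wrong = conn.execute(
    "SELECT nome FROM produto WHERE dtExclusao = NULL").fetchall()

# The audit question "when was it deleted?" can now be answered.
deleted_at = conn.execute(
    "SELECT dtExclusao FROM produto WHERE id = 2").fetchone()[0]
```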

Disadvantages:

  • You will have to handle the deletion in your code, always remembering to include a WHERE that filters out the deleted rows.

  • If the table is too large and fragmented (with a few non-deleted rows scattered among many deleted ones), performance may drop, because tables are stored in "pages" on disk and only some of them are loaded into memory. If the non-deleted data sits on a few "pages", fine. But if your index says there is one valid row on each page, your database will waste a huge amount of time reading each page into memory, extracting only the row you need, and loading the next page. To get around this, it would be worth keeping a table of active data and moving the deleted rows to another table.

  • Data that a cascade deletion would have removed all remains in the same tables, i.e. a record is marked as deleted in the parent table, but nothing in the child tables indicates whether their data is still in use. These tables can grow large without any way to move their data to a special table for deleted records.

  • backup time / cost: since full backups will have a lot of data that has actually been deleted

  • Slower writes: if indexes are large (since tables can grow large by keeping all deleted records), each insert of new data may take a long time to update the indexes, so each write may perform poorly.
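The child-table disadvantage can be illustrated with a small sketch. The `pedido` and `item_pedido` tables are hypothetical, and SQLite through Python's `sqlite3` module is only an assumption for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pedido (
        id INTEGER PRIMARY KEY,
        excluido INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE item_pedido (
        id INTEGER PRIMARY KEY,
        pedido_id INTEGER NOT NULL REFERENCES pedido (id),
        quantidade INTEGER NOT NULL
    );
    INSERT INTO pedido (id) VALUES (1), (2);
    INSERT INTO item_pedido (pedido_id, quantidade) VALUES (1, 5), (1, 3), (2, 4);
    """
)

# Logically delete order 1; its items are not touched.
conn.execute("UPDATE pedido SET excluido = 1 WHERE id = 1")

# Reading the child table alone still counts items of the deleted order.
naive_total = conn.execute(
    "SELECT SUM(quantidade) FROM item_pedido").fetchone()[0]

# The parent's flag has to be checked through a join on every such query.
active_total = conn.execute(
    "SELECT SUM(i.quantidade) FROM item_pedido AS i"
    " JOIN pedido AS p ON p.id = i.pedido_id"
    " WHERE p.excluido = 0").fetchone()[0]
```

Here `naive_total` includes the items of the deleted order, while `active_total` does not, because only the join sees the parent's flag.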

24.03.2014 / 15:03
7

Well, the advantage or disadvantage of the two types of exclusion depends very much on the needs of your system.

Before deciding on the best practice for your case, I recommend that you pay attention to the following:

  • Setting a flag to indicate whether the record is active or not implies having to perform checks (WHERE clauses in every query from that point forward);
  • Performance may occasionally become a major problem because of logical exclusion, so you have to take into account the complexity of your database and the relationships between the entities to see whether this would really be a hindrance or not;
  • It is essential to analyze how often previously deleted data is requested / searched. From that, you will know whether the best option is to use a bit field, create an additional table that holds certain information, or simply add fields that record the date of change and / or deletion for finer control.

If you need to keep inactive records, go ahead: use backups to protect your data, implement triggers, and save resources where needed. In short, the best alternative is always to grant delete permission only to the users who really should have it, discard whatever prevents the optimization of your system, and promote good user-machine interaction.

    
24.03.2014 / 14:59
5

There are two main purposes:

The first is that you do not lose information permanently: it stays somewhere, for audit purposes for example (there are better solutions for this, such as keeping dedicated historical records).

The second is maintaining relationships. For example, take the typical case of disabling users of a system. While active, a user performed operations, registrations, etc. Each of these entities is linked to the responsible user through fields such as "user_id", "created_by", etc., all pointing to that row of the users table. If you removed this user with a DELETE, all those records would be broken, pointing to a non-existent user. In this case it is better to just deactivate the user, so that he can no longer enter the system, while you can still consult the records he left behind, which keep pointing to the correct user.
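A rough sketch of the user case, assuming hypothetical `users` and `registrations` tables and an `active` flag (only `created_by` comes from the text above; SQLite through Python's `sqlite3` module is an assumption). With foreign-key enforcement switched on, the database refuses the DELETE outright; without enforcement, the registration would simply be left pointing at a missing user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        active INTEGER NOT NULL DEFAULT 1
    );
    CREATE TABLE registrations (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        created_by INTEGER NOT NULL REFERENCES users (id)
    );
    INSERT INTO users (name) VALUES ('maria');
    INSERT INTO registrations (title, created_by) VALUES ('invoice 1', 1);
    """
)

# A physical DELETE is rejected: a registration still points at this user.
try:
    conn.execute("DELETE FROM users WHERE id = 1")
    delete_rejected = False
except sqlite3.IntegrityError:
    conn.rollback()
    delete_rejected = True

# Deactivating keeps the row, so the history still resolves to a real user.
conn.execute("UPDATE users SET active = 0 WHERE id = 1")

history = conn.execute(
    "SELECT r.title, u.name, u.active FROM registrations AS r"
    " JOIN users AS u ON u.id = r.created_by").fetchall()
```

A login check would then only need to test `active`, while reports keep joining on `created_by` as before.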

As for whether it is a common practice:

In some organizations this is mandatory: it is simply forbidden to run DELETE on records in the database, even when the data would never be needed again. So the ideal is to evaluate each situation and see which solution is best. In many cases logical exclusion is really useful, and in others you will have to use it even though it is not :-)

    
24.03.2014 / 14:22
2

Avoid logical exclusion so you do not get useless and irrelevant data in your database.

Use logical exclusion only when needed! It is a common and safe practice, as long as the application code handles it correctly (the Laravel framework, for example, makes logical exclusion extremely easy for the developer).

The question that remains is: when would it be necessary to use logical exclusion?

Through logical exclusion, you can recover the "deleted" data, that is, you preserve history, and maintain integrity in the database.

One example: a product in an online store. As long as there are orders containing this product, it is better to exclude the product logically. That way you can still identify the items in past orders, even though the product has been "deleted". In other words, with the logical exclusion of the product you preserve the data of those orders.

There are many other cases where logical exclusion is useful, important, and valuable. But if it is not useful or necessary, it is best to do a physical exclusion, so as not to leave the database polluted.

    
24.03.2014 / 18:31