What is the advantage of using one database for reading and another for writing?

12

What is the advantage of, or the difference made by, using separate databases, one for reading and one for writing?

To my mind this separation cannot really exist. Sooner or later the write database has to be read in order to replicate its data to the read database, which in turn has to accept writes in order to receive the replicated data.

I understand that this replication could run at a time when the database / system is lightly used, such as the early hours of the morning, but I cannot afford that in my case: synchronization / replication between the databases has to occur within 5 minutes at most.

I have a database today that receives thousands of inserts per minute, so if I run a synchronization process between the databases every 5 minutes, I believe it will involve millions of rows, and at that point my read database would lose performance because it would be busy absorbing the replication writes.

I am assuming this replication would be done "by hand", through a service I would develop myself that works like a queue. Or would it be more advantageous / performant to use the replication features built into the DBMS?

asked by anonymous 31.10.2017 / 17:46

2 answers

4

There are some points that make a lot of difference:

  • Read performance: With more machines available (read replicas), reads get faster, because the read load is balanced across the replicas.

  • High availability: With only one database instance (read and write), if that instance stops working for any reason, every application that depends on it stops working too. With replicas of the database (even read-only ones), the application does not suffer a full outage: it keeps working, even if only with basic functionality (reading from the database). High availability is a very important point for your business.

  • Data durability: With N read replicas, I can "lose" the data of one instance and recover it quickly (without needing an up-to-the-minute backup), because N replicas holding the data are available.
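
As a rough illustration of the read-performance and availability points above, here is a minimal Python sketch of a router that sends writes to a single primary and spreads reads over the replicas in round-robin order. The `Node` and `Router` classes are hypothetical in-memory stand-ins, not the API of any real driver or cluster manager; a real setup would normally rely on the load balancing provided by the database tooling.

```python
import itertools


class Node:
    """Hypothetical stand-in for a connection to one database server."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.queries = []

    def execute(self, sql):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.queries.append(sql)
        return self.name  # report which server handled the query


class Router:
    """Send writes to the primary and spread reads over the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self._next = itertools.cycle(replicas)

    def write(self, sql):
        return self.primary.execute(sql)

    def read(self, sql):
        # Try each replica at most once, continuing the round-robin order.
        for _ in range(len(self.replicas)):
            node = next(self._next)
            if node.up:
                return node.execute(sql)
        # No replica is available: fall back to the primary.
        return self.primary.execute(sql)


primary = Node("primary")
replicas = [Node("replica-1"), Node("replica-2")]
router = Router(primary, replicas)

served_by = [router.read("SELECT 1") for _ in range(4)]
primary.up = False                      # simulate an outage of the primary
still_served = router.read("SELECT 1")  # reads keep working without it
```

In this sketch the reads alternate between the two replicas, and they keep being served after the primary goes down; only writes fail until the primary comes back.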

You wrote:

"I ask this because, to my mind, this separation cannot really exist. Sooner or later the write database has to be read in order to replicate its data to the read database, which in turn has to accept writes in order to receive the replicated data."

"I understand that this replication could run at a time when the database / system is lightly used, such as the early hours of the morning, but I cannot afford that in my case: synchronization / replication between the databases has to occur within 5 minutes at most."

"I have a database today that receives thousands of inserts per minute, so if I run a synchronization process between the databases every 5 minutes, I believe it will involve millions of rows, and at that point my read database would lose performance because it would be busy absorbing the replication writes."

I do not think that is the case. Cluster managers use a "smarter" copy process that does not push your database to its limit. It is worth researching this in more depth to understand how it works.

  

"One final comment: I am assuming this replication would be done 'by hand', through a service I would develop myself that works like a queue. Or would it be more advantageous / performant to use the replication features built into the DBMS?"

I believe you do not want to write a replication tool by hand. Leave this work to the cluster management tools, for example Galera Cluster (for MySQL) or the Read Replicas of AWS RDS.

    
31.10.2017 / 19:16
5

Advantages

  • Scalability (not performance)
  • Improved infrastructure reliability

Details

In general, this is not how it is done. There is one server that receives the writes and may occasionally serve reads as well. There are other servers that receive only reads (the gain starts to become interesting when there are several of them).

Usually you want, and can get, faster reads from this separation. In most scenarios it is hard to get large gains on writes, because writes are difficult to split up. Of course, having a server that gives priority to writes does help.

Obviously the read servers also get written to, but those writes are lighter: the processing needed to perform the write correctly has already been done on the master server, so it does not weigh as much.
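
To make that idea concrete, here is a toy Python sketch, assuming a simplified model in which the master validates each insert once and appends it to an ordered change log, and a read server only applies the log entries it has not seen yet. The `Primary` and `Replica` classes and their checks are hypothetical; real DBMS replication is far more involved.

```python
class Primary:
    """Toy master: validates each write, then records it in a change log."""

    def __init__(self):
        self.rows = {}
        self.log = []  # ordered change log shipped to the read servers

    def insert(self, key, value):
        # The expensive part happens only here: constraint and type checks.
        if key in self.rows:
            raise KeyError(f"duplicate key {key!r}")
        if not isinstance(value, int) or value < 0:
            raise ValueError("value must be a non-negative integer")
        self.rows[key] = value
        self.log.append((key, value))


class Replica:
    """Toy read server: applies already-validated changes in log order."""

    def __init__(self):
        self.rows = {}
        self.applied = 0  # how many log entries have been applied so far

    def catch_up(self, log):
        # Only the entries after the last applied position are processed,
        # and none of the master's checks are repeated.
        for key, value in log[self.applied:]:
            self.rows[key] = value
        self.applied = len(log)


primary = Primary()
replica = Replica()
primary.insert("a", 1)
primary.insert("b", 2)
replica.catch_up(primary.log)
primary.insert("c", 3)
replica.catch_up(primary.log)  # only the one new entry is applied
```

Because the read server tracks its position in the log, each catch-up touches only the new changes rather than re-copying the whole data set.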

Of course, each case is different, which is why this is not a magic solution. In some situations it brings gains that make up for the added complexity.

In general the replication is done in near real time (less than 1 ms of difference; in some cases synchronously, where the data is only confirmed once all slaves are updated). In most scenarios this is important.
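
The difference between the two modes can be sketched in a few lines of Python. This is only an illustrative model under simplifying assumptions: the `Primary` and `Replica` classes, the `synchronous` flag and the `flush` method are hypothetical names, and a real DBMS ships changes over the network and handles failures that are ignored here.

```python
class Replica:
    def __init__(self):
        self.rows = {}

    def apply(self, key, value):
        self.rows[key] = value
        return True  # acknowledgement sent back to the primary


class Primary:
    def __init__(self, replicas, synchronous):
        self.rows = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.pending = []  # changes not yet shipped (asynchronous mode only)

    def write(self, key, value):
        self.rows[key] = value
        if self.synchronous:
            # Confirm the write only after every replica has acknowledged it.
            return all(r.apply(key, value) for r in self.replicas)
        # Asynchronous: confirm immediately and ship the change later.
        self.pending.append((key, value))
        return True

    def flush(self):
        # Ship the queued changes to every replica, in order.
        for key, value in self.pending:
            for r in self.replicas:
                r.apply(key, value)
        self.pending.clear()


sync_replica = Replica()
Primary([sync_replica], synchronous=True).write("a", 1)
sync_read = sync_replica.rows.get("a")     # visible as soon as write returns

async_replica = Replica()
async_primary = Primary([async_replica], synchronous=False)
async_primary.write("a", 1)
stale_read = async_replica.rows.get("a")   # not replicated yet
async_primary.flush()
fresh_read = async_replica.rows.get("a")   # visible after the changes ship
```

The trade-off the sketch shows is that synchronous mode never exposes stale reads but makes every write wait for the replicas, while asynchronous mode confirms writes quickly at the cost of a short replication lag.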

Great care is needed with this type of setup. A lot can go wrong.

The separation does not give more performance, it even costs a little of it; it only lets you scale further. You will not get 3X the performance just because one server handles writes and another handles reads; something like 1.5X is more likely with this configuration.

The higher reliability comes from having more than one server: a slave can take over if the master stops, or the master can keep handling everything if a slave stops.

Also, if one of them stops, the other has the same data, so nothing is lost.

    
31.10.2017 / 18:51