About Aurora mechanism

Question:

Our cluster has one writer instance and one reader instance.

Depending on the endpoint we specify, the former is used for reading and writing and the latter for reading only.
If the reader instance stops because of some kind of failure, will the load on the writer increase because the reader can no longer be accessed?
Does the statement that two replicas are automatically created in one AZ mean that if the leader instance dies, another leader instance will appear in the console?

Answer from AWS:

Dear AWS Customer,

A very warm greeting from AWS! I hope that you are doing well!

Thank you for contacting AWS Premium Support. My name is Haritik from the RDS team and I am happy to assist you on this case.

DESCRIPTION

From the case notes, I understand that you need general guidance on how Aurora works. Specifically, you need assistance with the following points:

If a read replica in a cluster fails or stops, does the workload of the primary instance increase because the read workload shifts back to the primary?
What exactly does the statement "two replicas are automatically created in one AZ" mean?

Please correct me if this is not in line with your expectations.

====================Response====================

Firstly, I appreciate the effort you put into providing a clear and concise case description.

Moving ahead, to address your concern: your understanding of the concept is correct. In an Aurora cluster, one instance (the former in your description) is the leader node, also termed the primary instance (read and write), and the other (the latter) is the read replica (read-only).
To clarify your further concerns about this concept, please allow me to explain as best I can:

To begin with, I would like to explain the Aurora reader instance.

Aurora reader: Aurora Replicas are independent endpoints in an Aurora DB cluster, best used for scaling read operations and increasing availability. Up to 15 Aurora Replicas can be distributed across the Availability Zones that a DB cluster spans within an AWS Region. All Aurora Replicas return the same data for query results with minimal replica lag, usually much less than 100 milliseconds after the primary instance has written an update. Replica lag varies depending on the rate of database change. That is, during periods when a large number of write operations occur on the database, you might see an increase in replica lag.

[+] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Overview.html
[+] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Replication.html#Aurora.Replication.Replicas

Query 1:

"If the reader instance is broken due to some kind of failure and is stopped, will the burden on the writer increase because the reader cannot be accessed?"

=> If you have one reader instance in the cluster and the load on it is high, that does not impact the writer instance. However, when the reader instance restarts because of low freeable memory or (rarely) a hardware failure, read traffic is momentarily routed to the writer instance during that time. The document below explains the architecture of an Aurora cluster.

[+] Amazon Aurora DB clusters - https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Overview.html

In simpler words, suppose your application has retry logic: if a connection times out or drops, the application waits for a certain threshold and then tries to establish the connection again.
Suppose there are N connections performing read operations and the reader instance suddenly becomes unavailable. At that point, all ongoing transactions are either rolled back or killed, depending on the mysqld process.

If your application has retry logic and the connections are reinitiated, they will be redirected to the writer instance.

If the application does not have retry logic, the connections will be dropped.
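The retry behavior described above could be sketched as follows. This is a minimal illustration, not AWS's or any driver's implementation: `connect` stands in for whatever connection function your database driver provides, and the function name, attempt count, and delay values are assumptions.

```python
import time


def connect_with_retry(connect, endpoint, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call connect(endpoint), retrying with exponential backoff on ConnectionError.

    `connect` is a hypothetical placeholder for a driver's connect function.
    `sleep` is injectable so the backoff can be exercised without waiting.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect(endpoint)
        except ConnectionError as exc:
            last_error = exc
            # Wait before the next attempt; no wait after the final one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise ConnectionError(
        f"could not connect to {endpoint} after {attempts} attempts"
    ) from last_error
```

Without a wrapper like this, a dropped connection simply surfaces as an error to the caller, which corresponds to the "no retry logic" case above.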

Therefore, please note: if you connect to the reader endpoint of your cluster and the reader instance fails, the endpoint waits for the DB instance to be repaired or replaced before it can resume its activity.
Furthermore, depending on the type of failure on the reader node, recovery normally requires a reboot of the instance or a replacement of the underlying host, which largely determines how long it takes for the instance to recover.
=> All read traffic from your application that points to the reader instance will not be able to reach it, and your existing connections will go down.
=> If you want higher availability for the reader node, you can add a second reader [+]. If one of them fails, the other remains available and your activity on the reader is not interrupted.

[+] Adding Aurora Replicas to a DB cluster – https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-replicas-adding.html
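Adding a second reader can also be scripted. The sketch below is an illustration under stated assumptions, not part of the original reply: it only builds the keyword arguments for boto3's `create_db_instance` call (an instance created inside an existing Aurora cluster that already has a writer joins it as a reader). The helper name `reader_instance_params` and every identifier value are hypothetical.

```python
def reader_instance_params(cluster_id, instance_id, instance_class,
                           engine="aurora-mysql", az=None, promotion_tier=None):
    """Build kwargs for the RDS CreateDBInstance API to add an Aurora Replica.

    All argument values are supplied by the caller; nothing here is read
    from a real cluster.
    """
    params = {
        "DBClusterIdentifier": cluster_id,    # existing Aurora cluster
        "DBInstanceIdentifier": instance_id,  # name of the new reader
        "DBInstanceClass": instance_class,
        "Engine": engine,                     # must match the cluster's engine
    }
    if az is not None:
        # Placing the reader in a different AZ improves availability.
        params["AvailabilityZone"] = az
    if promotion_tier is not None:
        params["PromotionTier"] = promotion_tier  # 0..15, lower is promoted first
    return params


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   boto3.client("rds").create_db_instance(**reader_instance_params(
#       "my-cluster", "my-cluster-reader-2", "db.r6g.large", az="us-east-1b"))
```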

Query 2:

"Does the fact that two replicas are automatically created in one AZ mean that if the leader instance dies, another leader instance will appear on the console?"

=> I apologize for the confusion, but I am unable to fully understand your query. As per my understanding, two reader instances and one primary instance in a single AZ are not possible. Therefore, to help you with the query, I request that you elaborate on the context you are referring to. It would also be very helpful if you could share the endpoint of the cluster in question or the relevant documentation.

On a best-effort basis, I am providing some general information about failover and Multi-AZ Aurora read replicas. Please feel free to reach out if the concern persists.

If the DB cluster has one or more Aurora Replicas, then an Aurora Replica is promoted to the primary instance during a failure event. To increase the availability of your DB cluster, we recommend that you create one or more Aurora Replicas in two or more different Availability Zones.

Note: Within each AWS Region, Availability Zones (AZs) represent locations that are distinct from each other to provide isolation in case of outages. We recommend that you distribute the primary instance and reader instances in your DB cluster over multiple Availability Zones to improve the availability of your DB cluster. That way, an issue that affects an entire Availability Zone doesn’t cause an outage for your cluster.

=> In Aurora MySQL 2.10 and higher, you can improve availability during a failover by having more than one reader DB instance in a cluster. In Aurora MySQL 2.10 and higher, Aurora restarts only the writer DB instance and the failover target during a failover. Other reader DB instances in the cluster remain available to continue processing queries through connections to the reader endpoint.

You can customize the order in which your Aurora Replicas are promoted to the primary instance after a failure by assigning each replica a priority. Priorities range from 0 for the first priority to 15 for the last priority. If the primary instance fails, Amazon RDS promotes the Aurora Replica with the better priority to the new primary instance. You can modify the priority of an Aurora Replica at any time. Modifying the priority doesn’t trigger a failover.
More than one Aurora Replica can share the same priority, resulting in promotion tiers. If two or more Aurora Replicas share the same priority, then Amazon RDS promotes the replica that is the largest in size. If two or more Aurora Replicas share the same priority and size, then Amazon RDS promotes an arbitrary replica in the same promotion tier.
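The promotion rules above can be modeled in a few lines. This is only a sketch of the selection order as described in this reply, not Amazon RDS's actual implementation. The `Replica` class and the `size_rank` field (a larger number meaning a larger instance) are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    promotion_tier: int  # 0 (promoted first) through 15 (promoted last)
    size_rank: int       # hypothetical: larger value = larger instance size

    def __post_init__(self):
        if not 0 <= self.promotion_tier <= 15:
            raise ValueError("promotion_tier must be between 0 and 15")


def pick_failover_target(replicas):
    """Return the replica that would be promoted, or None if there are none.

    Order: lowest promotion tier first, then largest size. Remaining ties
    are arbitrary; here min() simply keeps the first candidate it sees.
    """
    if not replicas:
        return None
    return min(replicas, key=lambda r: (r.promotion_tier, -r.size_rank))
```

For example, given replicas in tiers 1, 0, and 0, the function picks the larger of the two tier-0 replicas.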

If the DB cluster doesn’t contain any Aurora Replicas, then the primary instance is recreated in the same AZ during a failure event. A failure event results in an interruption during which read and write operations fail with an exception. Service is restored when the new primary instance is created, which typically takes less than 10 minutes. Promoting an Aurora Replica to the primary instance is much faster than creating a new primary instance.

For more information, kindly refer to:
[+] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Concepts.AuroraHighAvailability.html

I sincerely hope the above information is helpful. However, if it is not in line with your requirements or if you have any further questions on this topic, please feel free to reach out to me. I will be glad to help you.

Thanks a lot for your kind understanding on this. Be safe & have a great day!

Best regards,
Haritik S.
Amazon Web Services

=======================

Note: standard RDS has a standby instance, but Aurora uses the reader instance itself, which also increases the performance of the RDS cluster. For details, see documents [1] and [2].

[1] (Section "Multi-AZ Deployments with Aurora Replicas") https://aws.amazon.com/rds/aurora/features/

[2] aws.amazon.com/rds/features/multi-az

Q: When connecting to the reader endpoint, is the load distributed between the writer instance and the reader instance?

A: When you connect to the reader endpoint, load balancing is performed across reader instances only.

Therefore, in a configuration with one writer instance and one reader instance, all connections go to the single reader instance, so no load balancing takes place. [1]

As an alternative, you can balance the load between the writer instance and the reader instance by using a custom endpoint. [2]
Please refer to our blog [3] for how to create a custom endpoint.
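As a rough illustration of the custom endpoint approach, the sketch below builds the keyword arguments for boto3's `create_db_cluster_endpoint` call. It is an assumption-based example, not taken from the references: the helper name `custom_endpoint_params` and all identifier values are hypothetical, and the sketch accepts only the `READER` and `ANY` endpoint types, with `ANY` being the one that lets the writer be a member alongside readers.

```python
def custom_endpoint_params(cluster_id, endpoint_id, members, endpoint_type="ANY"):
    """Build kwargs for the RDS CreateDBClusterEndpoint API.

    `members` lists the DB instance identifiers the endpoint should route to.
    With endpoint_type="ANY", the writer instance may be included as well.
    """
    if endpoint_type not in {"READER", "ANY"}:
        raise ValueError("this sketch supports only READER or ANY")
    return {
        "DBClusterIdentifier": cluster_id,
        "DBClusterEndpointIdentifier": endpoint_id,
        "EndpointType": endpoint_type,
        "StaticMembers": list(members),
    }


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   boto3.client("rds").create_db_cluster_endpoint(**custom_endpoint_params(
#       "my-cluster", "read-any", ["my-writer", "my-reader"]))
```

Note that connections through such an endpoint may land on the writer, so this only makes sense for read traffic the writer can afford to serve.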

Reference documents
[1] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Overview.Endpoints.html#Aurora.Overview.Endpoints.Types
[2] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Overview.Endpoints.html#Aurora.Endpoints.Custom
[3] https://dev.classmethod.jp/articles/amazon-aurora-custom-endpoints/