Log_reuse_wait_desc showing as Availability_Replica
Recently in one my Always On Availability replica
environment I got an issue where one of the availability replica database log
file has grown completely, and we got a request to shrink the issue. When we
run the below command we found that “log_reuse_wait_desc” showing as “Availability_Replica
What this means is this
indicates that the logged changes in the availability database at the primary
replica that have not arrived and that these changes were applied to the
availability database at one of the secondary replicas. Until logged changes
arrive and are applied, the changes cannot be truncated from the availability
database log at the primary replica.
Troubleshooting:
"Log
Send Queue" and "Redo Queue" are measurable data points during
availability database synchronization. You can monitor these data points in
order to determine whether an availability database log cannot be truncated
because of the log uses type AVAILABILITY_REPLICA.
· Log Send Queue
As soon as it is hardened to the secondary database log file, a dedicated redo thread applies the log records. If the redo operation cannot keep up with the transaction log that is generated, log growth may occur. If the secondary replica redo operation is behind in applying those changes to a corresponding secondary database, the primary will truncate the transaction log.
· Redo Queue
As soon as it is hardened to the secondary database log file, a
dedicated redo thread applies the log records. If the redo operation cannot
keep up with the transaction log that is generated, log growth may occur. If
the secondary replica redo operation is behind in applying those changes to a
corresponding secondary database, the primary will truncate the transaction log
Identifying the secondary database that is delaying log truncation:
If there is more than one secondary, there may be a particular secondary that is responsible for the log truncation issue. To identify the secondary database that is delaying log truncation, log on to the primary replica, query the truncation_lsn column of the sys.dm_hadr_database_replica_statesdynamic management view (DMV), and then review the log_send_queue and redo_queue data points to help diagnose the issue.
The AlwaysOn dashboard and the sys.dm_hadr_database_replica_states DMV help monitor the redo progress. The following table lists some key fields:
Field | Description |
---|---|
log_send_queue_size | Size of log records that did not arrive at the secondary replica. |
log_send_rate | Rate at which log records are being sent to the secondary databases. |
redo_queue_size | Size of log records in the log files of the secondary replica that has not yet been redone, in kilobytes (KB). |
redo_rate | Rate at which the log records are being redone on a given secondary database, in KB per second. |
last_redone_lsn | Actual log sequence number of the last log record that was redone on the secondary database. The last_redone_lsn value is always less than the last_hardened_lsn value. |
last_received_lsn | Log block ID that identifies the point up to which all log blocks are received by the secondary replica that hosts this secondary database. This field reflects a log-block ID that is padded with zeros. It is not an actual log sequence number. |
select ar.replica_server_name,
drs.truncation_lsn,
drs.log_send_queue_size,
drs.redo_queue_size
from sys.dm_hadr_database_replica_states
drs join sys.availability_replicas ar
on drs.replica_id=ar.replica_id
where drs.is_local=0
Workaround:
After you identify the secondary database that makes this occur, try one or more of the following methods to work around this issue temporarily:
- Take the database out of the availability group for the offending secondary
- Note This method will result in the loss of the High Availability/Disaster Recovery scenario for the secondary. You may have to set up the Availability Group again in the future.
- Add more log space or log files.
Comments