The syncer rate configuration parameter
        should be configured with care as the synchronization rate can
        have a significant effect on the performance of the DRBD setup
        in the event of a node or disk failure where the information is
        being synchronized from the Primary to the Secondary node.
      
In DRBD, data is transferred between peer nodes in two distinct ways:
Replication refers to the transfer of modified blocks from the primary to the secondary node. This happens automatically when a block is modified on the primary node, and the replication process uses whatever bandwidth is available over the replication link. The replication process cannot be throttled, because you want the transfer of the block information to happen as quickly as possible during normal operation.
            Synchronization refers to the process
            of bringing peers back in sync after some sort of outage,
            due to manual intervention, node failure, disk swap, or the
            initial setup. Synchronization is limited to the
            syncer rate configured for the DRBD
            device.
          
Both replication and synchronization can take place at the same time. For example, the block devices can be synchronized while they are actively being used by the primary node. Any I/O that updates a block on the primary node automatically triggers replication of the modified block. In the event of a failure within an HA environment, it is highly likely that synchronization and replication will take place at the same time.
Unfortunately, if the synchronization rate is set too high, the synchronization process uses up all the available network bandwidth between the primary and secondary nodes. The bandwidth available for replication of changed blocks then drops to zero, which means that replication stalls and I/O blocks, and ultimately the application fails or degrades.
        To prevent synchronization from consuming all the available
        network bandwidth and blocking the replication of changed
        blocks, set the syncer rate to less than the maximum network
        bandwidth.
      
        You should avoid setting the sync rate to more than 30% of the
        maximum bandwidth available to your disk device and network.
        For example, if your network is based on Gigabit Ethernet, you
        should be able to achieve about 110MB/s. Assuming your disk
        interface is capable of handling data at 110MB/s or more, the
        sync rate should be configured as 33M (33MB/s). If your disk
        system works at a rate lower than your network interface, use
        30% of your disk interface speed.
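
The 30% rule above can be expressed as a short calculation. The following is a minimal Python sketch, not part of DRBD itself: the function names recommended_syncer_rate and syncer_stanza are hypothetical, and the generated stanza assumes the syncer { rate ...; } syntax used by older DRBD 8.x releases, so check it against the drbd.conf documentation for your version before using it.

```python
def recommended_syncer_rate(network_mb_s: int, disk_mb_s: int, percent: int = 30) -> int:
    """Return a sync rate in whole MB/s.

    Takes `percent` of the slower of the network and disk throughput,
    following the 30% guideline in the text. Integer arithmetic is used
    so the result is rounded down and never exceeds the guideline.
    """
    if network_mb_s <= 0 or disk_mb_s <= 0:
        raise ValueError("throughput values must be positive")
    bottleneck = min(network_mb_s, disk_mb_s)
    return bottleneck * percent // 100


def syncer_stanza(rate_mb_s: int) -> str:
    """Render a syncer section (assumed older DRBD 8.x drbd.conf syntax)."""
    return f"syncer {{\n    rate {rate_mb_s}M;\n}}"


if __name__ == "__main__":
    # Gigabit Ethernet example from the text: 110MB/s network, 110MB/s disk.
    print(syncer_stanza(recommended_syncer_rate(110, 110)))
```

For the Gigabit Ethernet example, recommended_syncer_rate(110, 110) returns 33, which corresponds to a rate of 33M. With a hypothetical disk that sustains only 80MB/s, the same call with disk_mb_s=80 returns 24.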
      
Depending on the application, you may wish to limit the synchronization rate. For example, on a busy server you may wish to configure a significantly slower synchronization rate to ensure the replication rate is not affected.
        The al-extents parameter controls the number
        of 4MB blocks of the underlying disk that can be written to at
        the same time. Increasing this parameter lowers the frequency of
        the meta data transactions required to log the changes to the
        DRBD device, which in turn lowers the number of interruptions in
        your I/O stream when synchronizing changes. This can lower the
        latency of changes to the DRBD device. However, if a crash
        occurs on your primary, then all of the blocks in the activity
        log (that is, the number of al-extents
        blocks) will need to be completely resynchronized before
        replication can continue.
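
The trade-off described above can be estimated with simple arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, the value of 257 extents is an example rather than a recommendation, and the result is a rough upper bound that assumes every extent in the activity log is resynchronized in full at the configured sync rate.

```python
AL_EXTENT_MB = 4  # each activity log extent covers 4MB, as described in the text


def crash_resync_mb(al_extents: int) -> int:
    """Amount of data (in MB) covered by the activity log."""
    return al_extents * AL_EXTENT_MB


def crash_resync_seconds(al_extents: int, sync_rate_mb_s: int) -> float:
    """Approximate time to resynchronize the activity log after a crash."""
    if sync_rate_mb_s <= 0:
        raise ValueError("sync rate must be positive")
    return crash_resync_mb(al_extents) / sync_rate_mb_s


if __name__ == "__main__":
    extents = 257  # example value only
    print(crash_resync_mb(extents), "MB to resynchronize")
    print(round(crash_resync_seconds(extents, 33), 1), "seconds at 33MB/s")
```

With these example figures, 257 extents cover 1028MB, which takes a little over 31 seconds to resynchronize at 33MB/s. Doubling al-extents doubles this estimate, which is the cost to weigh against the lower write latency.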
      

