This section discusses the manner in which MySQL Cluster divides and duplicates data for storage.
Central to an understanding of this topic are the following concepts, listed here with brief definitions:
(Data) Node. An ndbd process, which stores a replica, that is, a copy of the partition (see below) assigned to the node group of which the node is a member.
Each data node should be located on a separate computer. While it is also possible to host multiple ndbd processes on a single computer, such a configuration is not supported.
It is common for the terms “node” and “data node” to be used interchangeably when referring to an ndbd process; where mentioned, management (MGM) nodes (ndb_mgmd processes) and SQL nodes (mysqld processes) are specified as such in this discussion.
Node Group. A node group consists of one or more nodes, and stores partitions, or sets of replicas (see next item).
          The number of node groups in a MySQL Cluster is not directly
          configurable; it is a function of the number of data nodes and
          of the number of replicas (the NumberOfReplicas
          configuration parameter), as shown here:
        
[number_of_node_groups] = number_of_data_nodes / NumberOfReplicas
          Thus, a MySQL Cluster with 4 data nodes has 4 node groups if
          NumberOfReplicas is set to 1 in the
          config.ini file, 2 node groups if
          NumberOfReplicas is set to 2, and 1 node
          group if NumberOfReplicas is set to 4.
          Replicas are discussed later in this section; for more
          information about NumberOfReplicas, see
          Section 17.3.2.6, “Defining MySQL Cluster Data Nodes”.
        
All node groups in a MySQL Cluster must have the same number of data nodes.
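The relationship between data nodes, replicas, and node groups can be checked with a few lines of Python. This is only an illustrative sketch: the function name number_of_node_groups() is hypothetical and is not part of any MySQL Cluster program or API. The divisibility check reflects the requirement that all node groups have the same number of data nodes.

```python
def number_of_node_groups(number_of_data_nodes: int, number_of_replicas: int) -> int:
    """Return the node group count implied by the formula in this section.

    Hypothetical helper for illustration only; not part of MySQL Cluster.
    """
    if number_of_data_nodes < 1 or number_of_replicas < 1:
        raise ValueError("both values must be positive integers")
    # Every node group must hold the same number of data nodes, so the
    # number of data nodes must be an exact multiple of NumberOfReplicas.
    if number_of_data_nodes % number_of_replicas != 0:
        raise ValueError(
            "number of data nodes must be a multiple of NumberOfReplicas"
        )
    return number_of_data_nodes // number_of_replicas


# The four-data-node example from the text, for NumberOfReplicas = 1, 2, 4.
for replicas in (1, 2, 4):
    print(replicas, number_of_node_groups(4, replicas))
```

For a cluster with 4 data nodes, the loop reports 4, 2, and 1 node groups, matching the values given previously.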
Prior to MySQL Cluster NDB 7.0, it was not possible to add new data nodes to a MySQL Cluster without shutting down the cluster completely and reloading all of its data. In MySQL Cluster NDB 7.0 (beginning with MySQL Cluster version NDB 6.4.0), you can add new node groups (and thus new data nodes) to a running MySQL Cluster — see Section 17.5.11, “Adding MySQL Cluster Data Nodes Online”, for information about how this can be done.
Partition. This is a portion of the data stored by the cluster. There are as many cluster partitions as nodes participating in the cluster. Each node is responsible for keeping at least one copy of any partitions assigned to it (that is, at least one replica) available to the cluster.
A replica belongs entirely to a single node; a node can (and usually does) store several replicas.
          MySQL Cluster normally partitions
          NDBCLUSTER tables automatically.
          However, in MySQL 5.1 and later MySQL Cluster releases, it is
          possible to employ user-defined partitioning with
          NDBCLUSTER tables. This is
          subject to the following limitations:
        
              Only KEY and LINEAR
              KEY partitioning schemes can be used with
              NDBCLUSTER tables.
            
              The maximum number of partitions that may be defined
              explicitly for any NDBCLUSTER
              table is 8 per node group. (The number of node groups in a
              MySQL Cluster is determined as discussed previously in
              this section.)
            
For more information relating to MySQL Cluster and user-defined partitioning, see Section 17.1.5, “Known Limitations of MySQL Cluster”, and Section 18.5.2, “Partitioning Limitations Relating to Storage Engines”.
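Because the limit on explicitly defined partitions is stated per node group, the ceiling for a whole cluster follows from the number of node groups. The following Python sketch works through that arithmetic; the function and constant names are hypothetical and chosen for illustration only.

```python
# Limit on explicitly defined partitions per node group, as stated in this section.
MAX_PARTITIONS_PER_NODE_GROUP = 8


def max_explicit_partitions(number_of_data_nodes: int, number_of_replicas: int) -> int:
    """Return the largest explicit partition count for one NDBCLUSTER table.

    Hypothetical helper for illustration only; not part of MySQL Cluster.
    """
    if number_of_data_nodes < 1 or number_of_replicas < 1:
        raise ValueError("both values must be positive integers")
    node_groups, remainder = divmod(number_of_data_nodes, number_of_replicas)
    if remainder:
        # All node groups must have the same number of data nodes.
        raise ValueError(
            "number of data nodes must be a multiple of NumberOfReplicas"
        )
    return MAX_PARTITIONS_PER_NODE_GROUP * node_groups


# 4 data nodes with NumberOfReplicas = 2 form 2 node groups.
print(max_explicit_partitions(4, 2))
```

With 4 data nodes and NumberOfReplicas set to 2, there are 2 node groups, so a table may define at most 16 partitions explicitly.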
Replica. This is a copy of a cluster partition, sometimes also known as a partition replica. Each node in a node group stores a replica. The number of replicas is equal to the number of nodes per node group.
The following diagram illustrates a MySQL Cluster with four data nodes, arranged in two node groups of two nodes each; nodes 1 and 2 belong to node group 0, and nodes 3 and 4 belong to node group 1. Note that only data (ndbd) nodes are shown here; although a working cluster requires an ndb_mgmd process for cluster management and at least one SQL node to access the data stored by the cluster, these have been omitted in the figure for clarity.

The data stored by the cluster is divided into four partitions, numbered 0, 1, 2, and 3. Each partition is stored — in multiple copies — on the same node group. Partitions are stored on alternate node groups:
Partition 0 is stored on node group 0; a primary replica (primary copy) is stored on node 1, and a backup replica (backup copy of the partition) is stored on node 2.
Partition 1 is stored on the other node group (node group 1); this partition's primary replica is on node 3, and its backup replica is on node 4.
Partition 2 is stored on node group 0. However, the placing of its two replicas is reversed from that of Partition 0; for Partition 2, the primary replica is stored on node 2, and the backup on node 1.
Partition 3 is stored on node group 1, and the placement of its two replicas is reversed from that of Partition 1. That is, its primary replica is located on node 4, with the backup on node 3.
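The layout just described can be reproduced with a short Python sketch. It assumes a simple rule inferred from this four-partition example: partitions are assigned to node groups in turn, and the node holding the primary replica rotates each time a node group receives another partition. That rule, and the name partition_layout(), are assumptions made for illustration; this is not a description of how NDB itself distributes partitions.

```python
def partition_layout(node_groups, number_of_partitions):
    """Map each partition to its node group, primary node, and backup nodes.

    node_groups is a list of lists of node IDs, indexed by node group number.
    Hypothetical helper; the rotation rule is inferred from the example only.
    """
    layout = {}
    for partition in range(number_of_partitions):
        # Alternate between node groups: 0, 1, 0, 1, ...
        group_index = partition % len(node_groups)
        nodes = node_groups[group_index]
        # Each time a node group gets another partition, move the primary
        # replica to the next node in that group.
        turn = partition // len(node_groups)
        primary = nodes[turn % len(nodes)]
        backups = [node for node in nodes if node != primary]
        layout[partition] = {
            "node_group": group_index,
            "primary": primary,
            "backups": backups,
        }
    return layout


# Node group 0 holds nodes 1 and 2; node group 1 holds nodes 3 and 4.
for partition, placement in partition_layout([[1, 2], [3, 4]], 4).items():
    print(partition, placement)
```

For the cluster in the example, this places the primary replicas of partitions 0, 1, 2, and 3 on nodes 1, 3, 2, and 4 respectively, with each backup replica on the other node of the same node group.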
For the continued operation of a MySQL Cluster, this means that so long as each node group participating in the cluster has at least one node operating, the cluster has a complete copy of all data and remains viable. This is illustrated in the next diagram.

In this example, where the cluster consists of two node groups of two nodes each, any combination of at least one node in node group 0 and at least one node in node group 1 is sufficient to keep the cluster “alive” (indicated by arrows in the diagram). However, if both nodes from either node group fail, the remaining two nodes are not sufficient (shown by the arrows marked out with an X); in either case, the cluster has lost an entire partition and so can no longer provide access to a complete set of all cluster data.
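The survival condition can also be expressed as a small Python check. This sketch models only the rule stated in this section (at least one operating node in every node group); it ignores every other factor that affects a running cluster, and the name cluster_is_viable() is hypothetical.

```python
def cluster_is_viable(node_groups, live_nodes):
    """Return True if every node group still has at least one live node.

    node_groups is a list of lists of node IDs; live_nodes is a set of the
    node IDs currently operating. Hypothetical helper for illustration only.
    """
    return all(
        any(node in live_nodes for node in group)
        for group in node_groups
    )


groups = [[1, 2], [3, 4]]  # node group 0 and node group 1 from the example

# One surviving node per node group: all partitions remain available.
print(cluster_is_viable(groups, {1, 4}))
# Both nodes of node group 1 have failed: partitions 1 and 3 are lost.
print(cluster_is_viable(groups, {1, 2}))
```

The first call reports that the cluster remains viable, because nodes 1 and 4 together hold a replica of every partition; the second reports that it does not, because no node in node group 1 is operating.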

