Zookeeper working principle and basic concepts

Zookeeper is an open source coordination service for distributed applications. It provides consistency services to those applications, including data publish/subscribe, load balancing, configuration maintenance, naming service, distributed synchronization, and group services. We choose Zookeeper because it has the following characteristics:

Feature: Description
Eventual consistency: Clients see the same view of the data regardless of which server they connect to. This is one of Zookeeper's most important properties.
Reliability: Once an update has been accepted by one server, it will be accepted by all servers.
Timeliness: Zookeeper does not guarantee that two clients see newly updated data at the same moment. A client that needs the latest data should call sync() before reading.
Independence: Clients do not interfere with each other.
Atomicity: An update either succeeds or fails; there is no intermediate state.
Ordering: Updates are applied in the same order on every server.
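
The timeliness row can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Leader` and `Follower` classes are hypothetical and are not the Zookeeper client API. They only show why a read served by a lagging server may be stale until that server catches up.

```python
from typing import Dict, List, Optional, Tuple


class Leader:
    """Toy leader (hypothetical): holds the ordered list of committed updates."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, str]] = []

    def write(self, path: str, value: str) -> None:
        self.log.append((path, value))


class Follower:
    """Toy follower (hypothetical): serves reads from its own local copy."""

    def __init__(self, leader: Leader) -> None:
        self.leader = leader
        self.applied = 0              # number of log entries applied locally
        self.data: Dict[str, str] = {}

    def sync(self) -> None:
        # Apply, in order, every update committed since the last sync.
        for path, value in self.leader.log[self.applied:]:
            self.data[path] = value
        self.applied = len(self.leader.log)

    def read(self, path: str) -> Optional[str]:
        return self.data.get(path)


leader = Leader()
follower = Follower(leader)
leader.write("/config", "v1")
follower.sync()
leader.write("/config", "v2")
stale = follower.read("/config")   # the follower has not applied "v2" yet
follower.sync()
fresh = follower.read("/config")   # after catching up, the read is current
print(stale, fresh)
```

In this model the first read returns the old value and the read after `sync()` returns the new one, which is the behaviour the table describes.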

1. Zookeeper design goals

Zookeeper aims to provide a distributed coordination service with high performance, high availability, and strict ordering guarantees (mainly strict ordering of write operations). It has the following design goals:

  1. Simple data model. Zookeeper lets distributed programs coordinate with each other through a shared, tree-structured namespace. The data model in the Zookeeper server's memory consists of a series of data nodes called znodes. Zookeeper keeps the full data set in memory to increase server throughput and reduce latency.
  2. Clusters can be built. A Zookeeper cluster usually consists of a group of machines. Each machine maintains the current server state in memory, and the machines communicate with each other.
  3. Sequential access. For each update request from a client, Zookeeper assigns a globally unique, increasing number that reflects the order of all transaction operations.
  4. High performance. Zookeeper stores the full data set in memory and serves all non-transactional requests from clients directly, so it is especially suitable for read-heavy workloads.

2. Zookeeper architecture diagram

Zookeeper roles:

Role: Description
leader: Initiates and resolves votes and updates the system state.
learner: Covers followers and observers. A follower accepts client requests, returns results to the client, and votes during leader election. An observer accepts client connections and forwards write requests to the leader, but it does not take part in voting; it only synchronizes state from the leader. Observers exist to scale the system and improve read throughput.
client: Originates requests.

Each Server has three states during its work:

  1. LOOKING: The current server does not know who the leader is and is searching
  2. LEADING: The current Server is the elected leader
  3. FOLLOWING: The leader has been elected, and the current server is synchronized with it

How to elect a server leader?
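A leader needs more than half of the votes, so an odd number of servers is recommended.
- With 3 machines and 1 down, 2 remain: 2 > 3/2, so a leader can still be elected.
- With 4 machines and 2 down, 2 remain: 2 is not greater than 4/2, so no leader can be elected.
A 4-machine cluster therefore tolerates only one failure, the same as a 3-machine cluster.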
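
The majority arithmetic above can be checked with a few lines of code. This is a minimal sketch; the function names are made up for illustration and do not come from Zookeeper.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size: int, alive: int) -> bool:
    """True if the surviving servers are more than half of the ensemble."""
    return alive > ensemble_size / 2


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may fail while a majority still survives."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5, 6):
    # columns: ensemble size, quorum size, tolerated failures
    print(n, quorum_size(n), tolerated_failures(n))
```

An even-sized ensemble tolerates no more failures than the odd-sized ensemble one server smaller, which is why odd sizes are the usual choice.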
Specific election process:
  1. After each server starts, it asks the other servers which server they are voting for.
  2. Each server answers such inquiries, based on its own state, with the id of the leader it recommends and the zxid of its last transaction. (At system startup, every server recommends itself.)
  3. After receiving all replies, the server works out which server has the largest zxid and records that server as the one to vote for in the next round.
  4. The server with the most votes in this round is the winner. If the winner has more than half of the votes, it is selected as the leader. Otherwise, the process repeats until a leader is elected.
  5. The leader starts waiting for the other servers to connect.
  6. Each follower connects to the leader and sends it its largest zxid.
  7. The leader determines the synchronization point from the follower's zxid.
  8. When synchronization completes, the leader notifies the follower that it is now up to date.
  9. After receiving the uptodate message, the follower can accept client requests again.
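
The voting part of this process can be sketched as follows. This is a simplified illustration, not Zookeeper's actual election code: the function `elect` and its inputs are hypothetical, votes are compared only by zxid with the server id as an assumed tie-breaker, and networking and epochs are left out entirely.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

Vote = Tuple[int, int]  # (zxid, server_id) of the server being recommended


def elect(last_zxid: Dict[int, int], ensemble_size: int) -> Optional[int]:
    """Run a simplified election among the reachable servers.

    last_zxid maps each reachable server id to the zxid of its last
    transaction. Returns the elected leader id, or None when the winner
    does not hold more than half of the votes of the whole ensemble.
    """
    # At startup every server recommends itself.
    votes: Dict[int, Vote] = {sid: (zxid, sid) for sid, zxid in last_zxid.items()}

    # Servers exchange votes and adopt the best one until nobody changes.
    changed = True
    while changed:
        changed = False
        for sid in votes:
            # Largest zxid wins; ties go to the larger server id (assumption).
            best = max(votes.values())
            if best != votes[sid]:
                votes[sid] = best
                changed = True

    tally = Counter(leader for _, leader in votes.values())
    leader, count = tally.most_common(1)[0]
    return leader if count > ensemble_size / 2 else None


print(elect({1: 100, 2: 105, 3: 105}, ensemble_size=3))
print(elect({1: 100}, ensemble_size=3))
```

In the first call all three servers converge on the server with the largest zxid. In the second call only one server is reachable, so no candidate can collect a majority.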

3. How zookeeper works

The core of Zookeeper is atomic broadcast, the mechanism that keeps the servers synchronized. The protocol that implements it is called Zab. Zab has two modes: recovery mode (leader election) and broadcast mode (synchronization). When the service starts, or after the leader crashes, Zab enters recovery mode. Recovery mode ends once a leader has been elected and a majority of servers have synchronized their state with it. State synchronization ensures that the leader and the other servers have the same system state.
Once the leader has synchronized state with a majority of followers, it can start broadcasting messages, that is, it enters broadcast mode. A server that joins the Zookeeper service at this point starts in recovery mode, discovers the leader, and synchronizes its state with the leader. When synchronization finishes, it also takes part in message broadcast. The Zookeeper service stays in broadcast mode until the leader crashes or loses the support of a majority of followers.
Broadcast mode must ensure that proposals are processed in order, so Zookeeper uses an increasing transaction id (zxid) for this. Every proposal is stamped with a zxid when it is made. In the implementation, the zxid is a 64-bit number. Its high 32 bits hold the epoch, which identifies whether leadership has changed: every time a new leader is elected, it gets a new epoch. The low 32 bits are an incrementing counter.
When the leader crashes or loses a majority of its followers, Zookeeper enters recovery mode, which elects a new leader and restores all servers to a correct state.
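
The zxid layout described above (epoch in the high 32 bits, counter in the low 32 bits) can be sketched with plain bit operations. The helper names `make_zxid` and `split_zxid` are hypothetical and chosen only for this illustration.

```python
from typing import Tuple

EPOCH_SHIFT = 32
COUNTER_MASK = (1 << 32) - 1  # low 32 bits


def make_zxid(epoch: int, counter: int) -> int:
    """Pack an epoch and a per-epoch counter into one 64-bit zxid."""
    return (epoch << EPOCH_SHIFT) | (counter & COUNTER_MASK)


def split_zxid(zxid: int) -> Tuple[int, int]:
    """Unpack a zxid into (epoch, counter)."""
    return zxid >> EPOCH_SHIFT, zxid & COUNTER_MASK


zxid = make_zxid(epoch=2, counter=5)
print(hex(zxid), split_zxid(zxid))
```

Because the epoch sits in the high bits, any zxid issued under a newer leader compares greater than every zxid from an older epoch, which keeps the total order intact across leader changes.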

4. Zookeeper's data model

The data model is a hierarchical directory structure whose naming follows conventional file system rules
Each node is called a znode and is identified by a unique path
A znode can contain data and child nodes, but EPHEMERAL znodes cannot have child nodes
The data in a znode is versioned: the version number increases with every update, and a client can pass the version it expects so that an update or delete only succeeds if the version still matches
Client applications can set watches on a znode
A znode does not support partial reads or writes; its data is read and written in full in a single operation
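
The points above can be tied together in a small in-memory model. This is a sketch under stated assumptions, not Zookeeper's implementation: the `ZNode` and `Tree` classes are hypothetical, the version is assumed to be a counter that increases on every write, and an expected version of -1 is taken to mean "any version".

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ZNode:
    data: bytes = b""
    version: int = 0
    ephemeral: bool = False
    children: Dict[str, "ZNode"] = field(default_factory=dict)


class Tree:
    """Toy znode tree addressed by slash-separated paths (hypothetical)."""

    def __init__(self) -> None:
        self.root = ZNode()

    def _find(self, path: str) -> ZNode:
        node = self.root
        for part in filter(None, path.split("/")):
            node = node.children[part]  # KeyError if the path does not exist
        return node

    def create(self, path: str, data: bytes = b"", ephemeral: bool = False) -> None:
        parent_path, _, name = path.rpartition("/")
        parent = self._find(parent_path)
        if parent.ephemeral:
            raise ValueError("ephemeral znodes cannot have children")
        if name in parent.children:
            raise ValueError("node already exists")
        parent.children[name] = ZNode(data, ephemeral=ephemeral)

    def get(self, path: str) -> Tuple[bytes, int]:
        node = self._find(path)
        return node.data, node.version

    def set(self, path: str, data: bytes, expected_version: int = -1) -> int:
        node = self._find(path)
        if expected_version not in (-1, node.version):
            raise ValueError("version mismatch")
        node.data = data        # the whole value is replaced, never a part of it
        node.version += 1
        return node.version


tree = Tree()
tree.create("/app", b"v1")
data, version = tree.get("/app")
new_version = tree.set("/app", b"v2", expected_version=version)
print(tree.get("/app"), new_version)
```

Passing the version read earlier turns `set` into a compare-and-set: if another client updated the node in between, the write is rejected instead of silently overwriting the newer data.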

5. Zookeeper node

A znode is either ephemeral or persistent

The type of a znode is determined when it is created and cannot be changed afterwards

When the client session that created an ephemeral znode ends, Zookeeper deletes that znode. An ephemeral znode cannot have child nodes

A persistent znode does not depend on the client session; it is deleted only when a client explicitly deletes it

Combined with the optional sequential flag, this gives four types of directory nodes: PERSISTENT, PERSISTENT_SEQUENTIAL, EPHEMERAL, and EPHEMERAL_SEQUENTIAL
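
The four node types, taken here to be PERSISTENT, PERSISTENT_SEQUENTIAL, EPHEMERAL, and EPHEMERAL_SEQUENTIAL, can be sketched with a toy store. Everything in this block is an assumption made for illustration: `FlatStore` is hypothetical, it keeps one global sequence counter for simplicity (Zookeeper scopes the counter to the parent node), and sequential names get a 10-digit zero-padded suffix.

```python
from enum import Enum
from typing import Dict, List, Optional


class CreateMode(Enum):
    # value = (ephemeral, sequential)
    PERSISTENT = (False, False)
    PERSISTENT_SEQUENTIAL = (False, True)
    EPHEMERAL = (True, False)
    EPHEMERAL_SEQUENTIAL = (True, True)

    @property
    def ephemeral(self) -> bool:
        return self.value[0]

    @property
    def sequential(self) -> bool:
        return self.value[1]


class FlatStore:
    """Toy store (hypothetical): path -> owning session id, None if persistent."""

    def __init__(self) -> None:
        self.owner: Dict[str, Optional[int]] = {}
        self.next_seq = 0  # simplification: one counter for the whole store

    def create(self, path: str, mode: CreateMode, session_id: int) -> str:
        if mode.sequential:
            path = f"{path}{self.next_seq:010d}"
            self.next_seq += 1
        self.owner[path] = session_id if mode.ephemeral else None
        return path

    def close_session(self, session_id: int) -> List[str]:
        """Delete every ephemeral node owned by the session that ended."""
        removed = [p for p, s in self.owner.items() if s == session_id]
        for p in removed:
            del self.owner[p]
        return removed


store = FlatStore()
store.create("/config", CreateMode.PERSISTENT, session_id=1)
first = store.create("/locks/lock-", CreateMode.EPHEMERAL_SEQUENTIAL, session_id=1)
second = store.create("/locks/lock-", CreateMode.EPHEMERAL_SEQUENTIAL, session_id=2)
print(first, second, store.close_session(1))
```

Ephemeral sequential nodes are the usual building block for locks and leader election recipes: the ordering comes from the suffix, and a crashed client's node disappears with its session.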


6. Guarantee of Zookeeper

Sequential updates: update requests from the same client are executed in the order in which they were sent
Atomicity: a data update either succeeds or fails
Single system view: no matter which server a client connects to, it sees a consistent view of the data
Timeliness: within a certain time bound, the client can read the latest data


Zookeeper uses ACLs (Access Control Lists) for permission control and defines the following five permissions:

CREATE: Permission to create child nodes.

READ: Permission to read node data and list child nodes.

WRITE: Permission to update node data.

DELETE: Permission to delete child nodes.

ADMIN: Permission to set the node's ACL.
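
The five permissions combine naturally as bit flags. The sketch below is an illustration in Python, not Zookeeper code: the `Perm` class and `allowed` helper are hypothetical, and the bit values are intended to mirror the constants in Zookeeper's Java `ZooDefs.Perms`, which should be checked against the version in use.

```python
from enum import IntFlag


class Perm(IntFlag):
    READ = 1
    WRITE = 2
    CREATE = 4
    DELETE = 8
    ADMIN = 16
    ALL = 31  # every permission bit set


def allowed(granted: Perm, needed: Perm) -> bool:
    """True if every bit in `needed` is present in `granted`."""
    return (granted & needed) == needed


read_write = Perm.READ | Perm.WRITE
print(allowed(read_write, Perm.READ), allowed(read_write, Perm.DELETE))
```

Note that CREATE and DELETE apply to child nodes, so a client with only READ and WRITE on a node can change its data but cannot add or remove children beneath it.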
