Linux Virtual Server (LVS) introduction
LVS is a Linux-based load balancing technology that provides high
performance. It originated as a Chinese open source project in 1998 and is
now widely regarded as one of the most powerful load balancing solutions
available. For example, www.taobao.com, one of the world's largest e-commerce
sites, has used LVS to handle over 1 million concurrent HTTP connections. It is
no surprise, then, that the founder of LVS, Dr. Zhang, is the CTO of Taobao.
LVS architecture
A traditional LVS system has three layers:
- Load balancer (Director Server): accepts requests from clients and dispatches them to the backend server array.
- Server cluster (Real Servers): the servers that process the requests and provide the responses to the client.
- Shared storage (optional): ensures that every real server serves the same, consistent content to clients. In some simple systems this layer does not exist.
The basic idea of LVS is that when a request arrives at the Director Server's
VIP (virtual IP address), the Director Server picks a real server and forwards
the request to it. How the real server is selected (the load balancing
algorithm) and how requests and responses are dispatched (the load balancing
mode) are the two major topics in LVS.
Load balance mode:
VS/NAT (Virtual Server via Network Address Translation): When a request
packet destined for the virtual IP address (the external IP address of the
load balancer) arrives at the load balancer, the load balancer chooses a real
server using a scheduling algorithm and adds the connection to the hash table
that records established connections. The destination address (IP) and,
optionally, the port of the packet are then rewritten to those of the chosen
server, and the packet is forwarded to that server. When a later incoming
packet belongs to this connection, the chosen server is found in the hash
table, and the packet is rewritten and forwarded to the same server. When
reply packets come back, the load balancer rewrites their source address (IP)
and, optionally, port to those of the virtual service. After the connection
terminates or times out, the connection record is removed from the hash table.
It is called NAT because the IP address (and port) is rewritten both when the
request is dispatched and when the response comes back.
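The NAT steps described above can be sketched in a few lines of Python. This is only an illustrative model, not how LVS is implemented in the kernel: the `Packet` and `NatDirector` names, the example addresses, and the use of plain round-robin as the scheduler are all assumptions made for the sketch.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified packet: only the addressing fields."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class NatDirector:
    """Toy model of a VS/NAT load balancer (names are illustrative)."""

    def __init__(self, vip, vport, real_servers):
        self.vip = vip
        self.vport = vport
        # Stand-in scheduler: plain round-robin over (ip, port) pairs.
        self._scheduler = cycle(real_servers)
        # Connection table: (client ip, client port) -> (real ip, real port).
        self.connections = {}

    def handle_request(self, pkt):
        key = (pkt.src_ip, pkt.src_port)
        if key not in self.connections:
            # New connection: pick a real server and record the choice.
            self.connections[key] = next(self._scheduler)
        real_ip, real_port = self.connections[key]
        # Rewrite the destination to the chosen real server.
        return replace(pkt, dst_ip=real_ip, dst_port=real_port)

    def handle_reply(self, pkt):
        # Rewrite the source back to the virtual service.
        return replace(pkt, src_ip=self.vip, src_port=self.vport)

    def close(self, client_ip, client_port):
        # Connection terminated or timed out: drop the record.
        self.connections.pop((client_ip, client_port), None)


director = NatDirector("203.0.113.10", 80,
                       [("10.0.0.11", 8080), ("10.0.0.12", 8080)])
request = Packet("198.51.100.7", 40000, "203.0.113.10", 80)
forwarded = director.handle_request(request)
# A second packet of the same connection goes to the same real server.
forwarded_again = director.handle_request(request)
# The real server answers the client; the director rewrites the source.
raw_reply = Packet(forwarded.dst_ip, forwarded.dst_port,
                   request.src_ip, request.src_port)
reply = director.handle_reply(raw_reply)
print(forwarded)
print(reply)
```

Because both directions pass through `NatDirector`, the model also hints at why the load balancer can become the bottleneck in this mode.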
VS/TUN (Virtual Server via IP Tunneling): When a request packet destined for
the virtual IP address (the external IP address of the load balancer) arrives
at the load balancer, the load balancer chooses a real server using a
scheduling algorithm and adds the connection to the hash table that records
established connections. The load balancer then encapsulates the packet within
an IP datagram and forwards it to the chosen server. When a later incoming
packet belongs to this connection, the chosen server is found in the hash
table, and the packet is again encapsulated and forwarded to that server. When
the real server receives the encapsulated packet, it decapsulates the packet,
processes the request, and finally returns the result directly to the user
according to its own routing table. After the connection terminates or times
out, the connection record is removed from the hash table.
The key technique is encapsulating the request packets in an IP tunnel.
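To make the encapsulation idea concrete, here is a minimal Python sketch. It models a packet as a small record and a tunnelled packet as a record whose payload is another packet. The `IpPacket` type, the helper functions, and the addresses are hypothetical; real IP-in-IP tunneling works on raw headers in the kernel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IpPacket:
    """Hypothetical, simplified IP packet."""
    src_ip: str
    dst_ip: str
    payload: object  # application data, or an inner IpPacket when tunnelled


def encapsulate(inner, director_ip, real_server_ip):
    """Director side: wrap the untouched client packet in an outer packet."""
    return IpPacket(src_ip=director_ip, dst_ip=real_server_ip, payload=inner)


def decapsulate(outer):
    """Real server side: strip the outer header to recover the client packet."""
    return outer.payload


def build_reply(inner_request, body):
    """Real server replies straight to the client, using the VIP as source."""
    return IpPacket(src_ip=inner_request.dst_ip,
                    dst_ip=inner_request.src_ip,
                    payload=body)


VIP = "203.0.113.10"
client_packet = IpPacket("198.51.100.7", VIP, "GET /")
tunnelled = encapsulate(client_packet, "203.0.113.1", "192.0.2.21")
recovered = decapsulate(tunnelled)
reply = build_reply(recovered, "200 OK")
print(tunnelled)
print(reply)
```

Note that the inner packet still carries the VIP as its destination, so in this model the real server has to accept traffic for the VIP itself, and the reply never passes back through the director.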
VS/DR (Virtual Server via Direct Routing): When a request packet destined for
the virtual IP address (the external IP address of the load balancer) arrives
at the load balancer, the load balancer chooses a real server using a
scheduling algorithm and adds the connection to the hash table that records
established connections. The load balancer then forwards the packet directly
to the chosen server (layer 2 forwarding). When a later incoming packet
belongs to this connection, the chosen server is found in the hash table, and
the packet is again routed directly to that server. When the server receives
the forwarded packet, it finds that the packet is addressed to its alias
interface or to a local socket, so it processes the request and returns the
result directly to the user. After the connection terminates or times out, the
connection record is removed from the hash table.
Note: the essential points are that the load balancer and the real servers
must be on the same LAN, and that the response goes back to the user directly.
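The layer 2 forwarding used by direct routing can be pictured with the following Python sketch. The director changes only the link-layer (MAC) addresses of the frame and leaves the IP header alone, which is why both machines need to sit on the same LAN. The `Frame` type, the `direct_route` helper, and all addresses are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    """Hypothetical, simplified Ethernet frame carrying an IP packet."""
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    payload: str


def direct_route(frame, director_mac, real_server_mac):
    """Director side: rewrite only the MAC addresses, never the IP header."""
    return replace(frame, src_mac=director_mac, dst_mac=real_server_mac)


VIP = "203.0.113.10"
DIRECTOR_MAC = "02:00:00:00:00:01"
REAL_SERVER_MAC = "02:00:00:00:00:11"

incoming = Frame(src_mac="02:00:00:00:00:fe",  # e.g. the upstream router
                 dst_mac=DIRECTOR_MAC,
                 src_ip="198.51.100.7",
                 dst_ip=VIP,
                 payload="GET /")
forwarded = direct_route(incoming, DIRECTOR_MAC, REAL_SERVER_MAC)

# The real server owns the VIP on an alias interface, so it accepts the
# packet and answers the client directly with the VIP as source address.
reply_src_ip, reply_dst_ip = forwarded.dst_ip, forwarded.src_ip
print(forwarded)
print(reply_src_ip, "->", reply_dst_ip)
```

Since nothing is rewritten above layer 2 and replies bypass the director, the per-packet work on the load balancer is very small in this mode.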
Here is a short comparison of the three modes:
- NAT is supported by almost all operating systems, and the real servers can use private IP addresses. However, every packet (including the response packets) has to pass through the load balancer, so performance is a major issue.
- Tunneling offers good performance and capacity, but it requires the operating system of the real servers to support the "IP Tunneling" (IP encapsulation) protocol.
- DR is supported by most operating systems and its performance is very high. However, it requires the real servers and the director to be on the same subnet, which is a major limitation.
Load balance algorithm:
The algorithm decides how the real server is selected.
- Round-Robin Scheduling: The round-robin scheduling algorithm sends each incoming request to the next server in its list, so that requests are spread equally across the servers.
- Weighted Round-Robin Scheduling: Each server can be assigned a weight, an integer value that indicates its processing capacity. Servers with a higher weight receive proportionally more requests in each round.
- Least-Connection Scheduling: The least-connection scheduling algorithm directs network connections to the server with the fewest established connections.
- Weighted Least-Connection Scheduling: Weighted least-connection scheduling is a superset of least-connection scheduling in which you can assign a performance weight to each real server. Servers with a higher weight value receive a larger percentage of live connections at any one time.
LVS also provides other scheduling algorithms, such as locality-based least-connection, destination hashing, and source hashing.
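The four schedulers listed above can be sketched in Python as follows. This is a simplified illustration rather than the LVS implementation: the `RealServer` type and function names are assumptions, and the weighted round-robin here naively repeats each server `weight` times per cycle, whereas a production scheduler would typically interleave the picks.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class RealServer:
    """Hypothetical real server record used only for this sketch."""
    name: str
    weight: int = 1   # relative processing capacity
    active: int = 0   # currently established connections


def round_robin(servers):
    """Yield servers one after another, forever."""
    return cycle(servers)


def weighted_round_robin(servers):
    """Naive variant: each server appears `weight` times per cycle."""
    expanded = [s for s in servers for _ in range(s.weight)]
    return cycle(expanded)


def least_connection(servers):
    """Pick the server with the fewest established connections."""
    return min(servers, key=lambda s: s.active)


def weighted_least_connection(servers):
    """Pick the server with the lowest connections-to-weight ratio."""
    candidates = [s for s in servers if s.weight > 0]
    return min(candidates, key=lambda s: s.active / s.weight)


servers = [RealServer("rs1", weight=3, active=6),
           RealServer("rs2", weight=1, active=3)]

rr = round_robin(servers)
rr_order = [next(rr).name for _ in range(4)]
wrr = weighted_round_robin(servers)
wrr_order = [next(wrr).name for _ in range(4)]
lc_choice = least_connection(servers).name
wlc_choice = weighted_least_connection(servers).name

print(rr_order)
print(wrr_order)
print(lc_choice, wlc_choice)
```

With these example values, least-connection picks `rs2` because it has fewer connections, while weighted least-connection picks `rs1` because its load relative to its weight (6/3) is lower than that of `rs2` (3/1).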