The advent of the Internet has given rise to a whole set of new applications and services: Web, DNS, FTP, SMTP, and so on. Fortunately, dividing the work of processing Internet traffic is relatively easy. Because the Internet consists of many clients, each identified by an IP address and requesting a particular service, the load can readily be distributed across multiple servers that provide the same service or run the same application.
This chapter introduces the basic concepts of server load balancing and covers several fundamental concepts that are key to understanding how load balancers work. While load balancers can be used with several different applications, they are most often deployed to manage Web servers. Although we will use Web servers as an example to discuss and understand load balancing, all of these concepts apply to many other applications as well.
First, let’s examine certain basics about Layer 2/3 switching, TCP, and Web servers as they form the foundation for load-balancing concepts. Then we will look at the requests and replies involved in retrieving a Web page from a Web server, before leading into load balancing.
Here is a brief overview of how Layer 2 and Layer 3 switching work, as background for understanding load-balancing concepts; a detailed discussion of these topics is beyond the scope of this book. A Media Access Control (MAC) address uniquely identifies a network hardware entity in an Ethernet network, and an Internet Protocol (IP) address uniquely identifies a host on the Internet. The port on which a switch receives a packet is called the ingress port, and the port on which the switch sends the packet out is called the egress port. Switching essentially involves receiving a packet on the ingress port, determining the egress port for the packet, and sending the packet out on the chosen egress port. Switches differ in the information they use to determine the egress port, and they may also modify certain information in the packet before forwarding it.
When a Layer 2 switch receives a packet, it determines the packet's destination based on Layer 2 header information, such as the MAC address, and forwards the packet. In contrast, Layer 3 switching is performed based on Layer 3 header information, such as the IP addresses in the packet. A Layer 3 switch changes the destination MAC address to that of the next hop, or of the destination itself, based on the destination IP address before forwarding the packet. Layer 3 switches are also called routers, and Layer 3 switching is generally referred to as routing. Load balancers examine information at Layer 4, and sometimes at Layers 5 through 7, to make switching decisions, and hence are called Layer 4-7 switches. Since load balancers also perform Layer 2/3 switching as part of the load-balancing functionality, they may also be called Layer 2-7 switches.
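To make the Layer 2 "learn and forward" behavior concrete, here is a minimal sketch of a switch's MAC table in Python. The class and method names are purely illustrative (not any real switch API), and the "packets" are reduced to just their MAC addresses and ports.

```python
# Minimal sketch of Layer 2 switching: learn the source MAC on the
# ingress port, then forward based on the destination MAC.
# All names here are illustrative, not a real switch's API.

class L2Switch:
    def __init__(self, num_ports):
        self.mac_table = {}          # learned MAC address -> egress port
        self.num_ports = num_ports

    def receive(self, ingress_port, src_mac, dst_mac):
        # Learn: the source MAC is reachable via the ingress port.
        self.mac_table[src_mac] = ingress_port
        # Forward: if the destination MAC is known, send to its port;
        # otherwise flood to every port except the ingress port.
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        return [p for p in range(self.num_ports) if p != ingress_port]

sw = L2Switch(num_ports=4)
print(sw.receive(0, "aa:aa", "bb:bb"))  # unknown dst -> flood: [1, 2, 3]
print(sw.receive(1, "bb:bb", "aa:aa"))  # "aa:aa" learned -> [0]
print(sw.receive(0, "aa:aa", "bb:bb"))  # "bb:bb" now known -> [1]
```

A Layer 3 switch would instead index a routing table by destination IP prefix and rewrite the destination MAC to the next hop's, but the receive/decide/forward structure is the same.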
To make networks easier to manage, they are broken down into smaller subnets, or subnetworks. A subnet typically represents all the computers connected together on a floor or in a building, or a group of servers connected together in a data center. All communication within a subnet can occur by switching at Layer 2. A key protocol used in Layer 2 switching is the Address Resolution Protocol (ARP), defined in RFC 826. Ethernet devices use ARP to learn the association between a MAC address and an IP address. A network device can broadcast its MAC address and IP address using ARP to let the other devices in its subnet know of its existence. These broadcast messages reach every device in the subnet, which is why a subnet is also called a broadcast domain. Using ARP, all devices in the subnet can learn about all the other devices present in it. For communication between subnets, a Layer 3 switch or router must act as a gateway. Every computer must be connected to at least one subnet and be configured with a default gateway to allow communication with all other subnets.
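The subnet boundary described above can be illustrated with Python's standard ipaddress module; the addresses below are example values only. Hosts that fall within the same subnet can reach each other directly at Layer 2 (using ARP to resolve MAC addresses), while traffic to any address outside the subnet is sent to the default gateway.

```python
# Checking subnet membership: peers in the same subnet are reachable
# at Layer 2; anything else goes through the default gateway.
# The addresses below are example values only.
import ipaddress

subnet = ipaddress.ip_network("192.168.1.0/24")
host_a = ipaddress.ip_address("192.168.1.10")
host_b = ipaddress.ip_address("192.168.1.20")
outside = ipaddress.ip_address("10.0.0.5")

print(host_a in subnet)   # True  -> same broadcast domain, ARP works
print(host_b in subnet)   # True
print(outside in subnet)  # False -> forward to the default gateway
```

This membership test (comparing the network portion of the addresses under the subnet mask) is exactly what a host performs when deciding whether to ARP for a destination directly or for its default gateway instead.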
The Transmission Control Protocol (TCP), documented in RFC 793, is a widely used protocol employed by many applications for reliable exchange of data between two hosts. TCP is a stateful protocol: one must set up a TCP connection, exchange data over it, and then terminate the connection. TCP guarantees orderly delivery of data and includes checks to guarantee the integrity of the data received, relieving higher-level applications of this burden. TCP is a Layer 4 protocol, as shown in the OSI model in Figure 1.1.
Figure 2.1 shows how TCP operates. Establishing a TCP connection involves a three-way handshake. In this example, a client wants to exchange data with a server. The client sends a SYN packet to the server. Important information in the SYN packet includes the source IP address, source port, destination IP address, and destination port. The source IP address is that of the client, and the source port is a value chosen by the client. The destination IP address is the IP address of the server, and the destination port is the port on which the desired application runs on the server. Standard applications such as Web and File Transfer Protocol (FTP) use the well-known ports 80 and 21, respectively. Other applications may use other ports, but a client must know the port number of an application in order to access it. The SYN packet also includes a starting sequence number that the client chooses to use for this connection; the sequence number is incremented for each new packet the client sends to the server.

When the server receives the SYN packet, it responds with a SYN ACK that includes the server's own starting sequence number. The client then responds with an ACK that concludes the connection establishment, after which the client and server may exchange data over the connection.

Each TCP connection is uniquely identified by four values: source IP address, source port, destination IP address, and destination port. Every packet exchanged in a given TCP connection carries the same values for these four fields. It's important to note that the source IP address and port number in a packet from client to server become the destination IP address and port number in a packet from server to client; the source always refers to the host that sends the packet. Once the client and server finish exchanging data, the client sends a FIN packet and the server responds with a FIN ACK, terminating the TCP connection.
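A quick way to observe the four identifying values on a live connection is to open a TCP connection over loopback with Python's standard socket module. This is just an illustration: the three-way handshake happens inside connect(), and the operating system chooses the ephemeral source port.

```python
# Observing the four values that identify a TCP connection, using a
# loopback client/server pair. The OS picks the ephemeral source port.
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))       # port 0 -> OS assigns a free port
server.listen(1)
server_port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", server_port))  # three-way handshake here
conn, _ = server.accept()

src_ip, src_port = client.getsockname()     # client side of the tuple
dst_ip, dst_port = client.getpeername()     # server side of the tuple
print((src_ip, src_port, dst_ip, dst_port))

# From the server's point of view, the same four values appear with
# source and destination swapped.
server_view = conn.getpeername()
print(server_view == (src_ip, src_port))    # True

client.close()   # sends FIN; the server's close() completes teardown
conn.close()
server.close()
```

Note that the four-tuple printed here is exactly what a load balancer inspects at Layer 4 when deciding which packets belong to which connection.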
While the connection is in progress, either the client or the server may send a TCP RESET to the other, aborting the TCP connection. In that case, the connection must be established again in order to exchange data.