Load Balancing Servers, Firewalls and Caches - Kopparapu C.

Kopparapu C. Load Balancing Servers, Firewalls and Caches - Wiley Computer Publishing, 2002. - 123 p.
ISBN 0-471-41550-2

[Figure 3.7 steps: (1) Client initiates the control connection to TCP port 21. (2) Server sends an arbitrary port number (> 1,023) for the data connection. (3) Client initiates the data connection to the specified port.]
Figure 3.7: How passive FTP works.
When we load balance passive FTP traffic, we must use an appropriate persistence method to ensure that the data connection goes to the same server as the control connection. The session-persistence method based on source IP and VIP will work for this, because it ensures that all connections from a given source IP to a given VIP are sent to the same server. But that’s overkill if all we need is to ensure that the control and data connections for a passive FTP session go to the same server while other application traffic is load balanced normally. The concurrent connections method is a better fit. In this method, the load balancer checks whether there is already an active connection from a given source IP to a given VIP. If there is one, a subsequent connection from the same source IP to the VIP is sent to the same server.
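The concurrent connections check described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer implements it: the class name, the method names, the round-robin choice for new clients, and the IP addresses are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import cycle


class ConcurrentConnectionPersistence:
    """Hypothetical sketch: pin a (source IP, VIP) pair to one server
    only while that pair has at least one active connection."""

    def __init__(self, servers):
        # Round robin stands in for whatever load-balancing method is in use.
        self._next_server = cycle(servers)
        self._active = defaultdict(int)   # (src_ip, vip) -> open connection count
        self._pinned = {}                 # (src_ip, vip) -> chosen server

    def open_connection(self, src_ip, vip):
        key = (src_ip, vip)
        if self._active[key] == 0:
            # No active connection from this source to this VIP: balance freely.
            self._pinned[key] = next(self._next_server)
        self._active[key] += 1
        return self._pinned[key]

    def close_connection(self, src_ip, vip):
        key = (src_ip, vip)
        if key not in self._pinned:
            return  # nothing is open for this pair
        self._active[key] -= 1
        if self._active[key] == 0:
            # Last connection closed: forget the binding.
            del self._active[key]
            del self._pinned[key]


lb = ConcurrentConnectionPersistence(["rs1", "rs2", "rs3"])
control = lb.open_connection("10.0.0.5", "192.0.2.10")  # FTP control connection
data = lb.open_connection("10.0.0.5", "192.0.2.10")     # passive-mode data connection
print(control == data)  # True
```

Because the binding disappears once the last connection closes, a later, unrelated session from the same source IP is load balanced afresh, which is what makes this lighter than persistence on source IP and VIP alone.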
On the other hand, active FTP will not need any session persistence. But it will need appropriate NAT, if the real servers are assigned private IP addresses or if they are behind a load balancer. This is discussed in Chapter 2, section Reverse NAT.
The Megaproxy Problem
So far, we have discussed various session-persistence methods that use the source IP address to uniquely identify a user. However, there are certain situations where the source IP is not a reliable way to identify a user. This is known as the megaproxy problem, and it has two flavors: a session-persistence problem and a load-balancing problem.
Most ISPs and enterprises have proxy servers deployed in their network. When an ISP or enterprise user accesses the Internet, all the requests go through a proxy server. The proxy server terminates the connection, finds out the content the user is requesting, and makes the request on the user’s behalf. Once the reply is received, the proxy server sends the reply to the user. There are two sets of connections here. For every connection between the user’s browser and the proxy server, there is a connection between the proxy server and the destination Web site. The term megaproxy essentially refers to powerful proxy servers that serve thousands or even hundreds of thousands of end users in a large enterprise or ISP network. Figure 3.8 shows how a megaproxy works.
[Figure 3.8 diagram: clients in the internal network of an enterprise or ISP with multiple proxy servers reach the load balancer through the proxies. (1) The first connection comes from proxy1's IP address. (2) The second connection comes from proxy2's IP address, causing the load balancer to assign the connection to a server based on load rather than on session persistence.]
Figure 3.8: Session persistence problem with megaproxy.
When a user opens multiple connections and these connections are distributed across multiple proxy servers, the proxy server that makes the request to the destination Web site may be different for each connection. Since the load balancer at the destination Web site sees the IP address of the proxy server as the source IP address, the source IP address will be different for each connection, even though the same user is initiating the connections behind the proxy servers. If the load balancer continues to perform session persistence based on the source IP address, the connections from the same user may be sent to different servers, causing the application transaction to break. Therefore, the load balancer cannot rely on the source IP address to identify the user in this situation.
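To make the failure concrete, here is a small Python sketch of a persistence table keyed on source IP. The class name, server names, and proxy addresses are hypothetical, and round robin is only an assumed stand-in for the load-balancing method. One user's two connections arrive through two different proxies and land on two different servers.

```python
from itertools import cycle


class SourceIpPersistence:
    """Hypothetical sketch of persistence keyed only on source IP."""

    def __init__(self, servers):
        self._next_server = cycle(servers)  # assumed round robin for new sources
        self._table = {}                    # src_ip -> server

    def assign(self, src_ip):
        # A source IP seen for the first time is load balanced;
        # afterwards it always maps to the same server.
        if src_ip not in self._table:
            self._table[src_ip] = next(self._next_server)
        return self._table[src_ip]


lb = SourceIpPersistence(["rs1", "rs2"])

# The same user opens two connections. The load balancer never sees the
# user's own address, only the address of whichever proxy relayed the request.
first = lb.assign("198.51.100.1")   # relayed by proxy1
second = lb.assign("198.51.100.2")  # relayed by proxy2

print(first, second)  # rs1 rs2
```

The table behaves exactly as designed, yet the user's session is split across rs1 and rs2, because the key it persists on no longer identifies the user.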
Another aspect of the megaproxy problem is that, even if all connections from a given user are sent to the same proxy server, we may still have a load-balancing problem, as shown in Figure 3.9. Let’s take the case of an ISP that has two giant, powerful proxy servers, each of which can handle 100,000 users. Session persistence will work fine because the source IP remains the same for a given user, but the load balancer directs all connections from a given proxy server to the same application server to ensure that persistence. This breaks the load balancing, as one server may get requests from 100,000 users at the same time while the others remain idle. By identifying each individual user coming through the proxy server, the load balancer can perform better load distribution while maintaining session persistence.
Whether a megaproxy causes a load-balancing problem really depends on how much traffic we get from the megaproxy relative to the total traffic to our server farm. Some of the largest megaproxy servers in the industry are located at big dial-up ISPs such as America Online (AOL), Microsoft Network, and EarthLink, because they have millions of dial-up users who all access the Internet through the ISP’s proxy servers. But if a Web site has 10 Web servers, and the traffic from AOL users to this site is about 2 percent of the total traffic, we don’t really have to worry about a load-balancing problem. Even if all of the AOL users are sent to a single server, it should not cause a load-balancing problem overall. But if the traffic from AOL users to the Web site is about 50 percent of the total traffic, then we definitely have a load-balancing problem. These are simplified examples, because ISPs such as AOL have many proxy servers. Nevertheless, we can expect each of their proxy servers to serve thousands of users, and that can cause a load-balancing problem.
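The 2 percent and 50 percent cases can be checked with a rough back-of-the-envelope model in Python. The model is an assumption made for illustration, not something the text specifies: all megaproxy traffic is pinned to one server, the load balancer is free to steer the remaining traffic toward the other servers, and every connection costs the same. The function names are hypothetical.

```python
def busiest_server_share(num_servers, megaproxy_fraction):
    """Share of total traffic on the server that receives the megaproxy.

    Assumed model: that server carries at least the megaproxy's traffic,
    and the balancer can even things out only if that traffic does not
    exceed one server's fair share (1 / num_servers).
    """
    fair_share = 1.0 / num_servers
    return max(fair_share, megaproxy_fraction)


def overload_factor(num_servers, megaproxy_fraction):
    """Load on the busiest server relative to a perfectly even split."""
    return busiest_server_share(num_servers, megaproxy_fraction) * num_servers


# The two cases from the text: 10 Web servers, 2 percent vs. 50 percent.
print(f"{overload_factor(10, 0.02):.1f}")  # 1.0
print(f"{overload_factor(10, 0.50):.1f}")  # 5.0
```

Under these assumptions the 2 percent case leaves the farm balanced, while the 50 percent case puts five times the fair share on a single server. A rule of thumb that falls out of the model is that trouble starts once one proxy's traffic exceeds roughly 1/N of the total for a farm of N servers.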