When we defined the megaproxy problem earlier, we discussed virtual source as one way to address session persistence. But virtual source does not solve the problem in all situations. Further, we still did not identify any way to solve the megaproxy load-balancing problem. That's because we were limited by the information in the TCP SYN packet to identify the end user.
By performing delayed binding, we now can look at the application request packet. For HTTP applications, the load balancer can now look at the HTTP GET request, which contains a wealth of information. RFC 2616 provides the complete specification for HTTP version 1.1 and RFC 1945 provides the specification for HTTP version 1.0.
In subsequent sections, we will focus on HTTP-based Web applications and examine the application information, such as cookies and URLs, for use in load balancing. When performing delayed binding to get the cookie or URL, the first packet in the HTTP request may not contain the entire URL or the required cookie. RFC 1738 defines the syntax and semantics of URLs, and a URL may span multiple packets, so the load balancer may have to wait for subsequent packets to assemble the entire URL. Waiting for subsequent HTTP-request packets puts significant pressure on the memory available on the load balancer, because it may have to copy and hold the packets it has already received. Once enough packets have arrived to give the load balancer the cookie or the URL it needs, the load balancer must send all these packets to the server and keep them in memory until the server sends an ACK to confirm receipt.
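The buffering described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not how any particular load-balancing product works: the class name `PendingRequest` and its methods are hypothetical, segments are assumed to arrive in order, and the request headers are treated as complete once the blank line that ends them has been seen.

```python
from typing import Dict, List, Optional, Tuple

HEADER_END = b"\r\n\r\n"  # blank line that terminates the HTTP request headers


class PendingRequest:
    """Hypothetical holder for client segments during delayed binding."""

    def __init__(self) -> None:
        # Copies of every segment are kept: they must still be forwarded to
        # the chosen server and retained until that server acknowledges them.
        self.segments: List[bytes] = []

    def add_segment(self, payload: bytes) -> Optional[Tuple[str, Dict[str, str]]]:
        """Store one segment; return (url, headers) once the headers are complete."""
        self.segments.append(payload)
        data = b"".join(self.segments)
        end = data.find(HEADER_END)
        if end == -1:
            return None  # URL or cookie may still be incomplete; keep waiting
        head = data[:end].decode("iso-8859-1")
        request_line, *header_lines = head.split("\r\n")
        _method, url, _version = request_line.split(" ", 2)
        headers: Dict[str, str] = {}
        for line in header_lines:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        return url, headers

    def buffered_bytes(self) -> int:
        """Memory currently held for this connection, in bytes."""
        return sum(len(segment) for segment in self.segments)
```

For example, if the URL is split across two segments, the first call to `add_segment` returns `None` and only the second call yields the full URL and the `cookie` header. The `buffered_bytes` figure grows with every held segment, which is the memory cost the text refers to.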
A cookie is an object that is controlled by the Web server. When the user makes a request, the Web server may set a cookie as part of the reply. The browser stores the cookie on the user's computer and sends the cookie in all subsequent requests to the Web server. A cookie is defined as a name=value pair: a name identifies the cookie, and the cookie is given a value. For example, a cookie can be user=1, where the cookie name is user and its value is 1. Figure 3.12 shows the request-and-reply flow in which cookies get stored and retrieved. On the client side, cookie management is handled by the browser and is transparent to the end user.
[Figure: the client sends an HTTP GET request; the server's HTTP reply sets the cookie user=1; the browser stores the cookie user=1 on the local computer; the next HTTP GET request from the client carries the cookie user=1.]
Figure 3.12: How cookies work.
For details on cookie attributes and formats, please refer to a book titled Cookies, by Simon St. Laurent, published by McGraw-Hill.
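The user=1 exchange can be reproduced with Python's standard `http.cookies` module. The sketch below is only meant to make the name=value idea concrete: the "browser" is reduced to a second `SimpleCookie` object, and cookie attributes such as path or expiry are left out.

```python
from http.cookies import SimpleCookie

# Server side: the Web server sets a cookie as part of its reply.
reply_cookie = SimpleCookie()
reply_cookie["user"] = "1"
set_cookie_header = reply_cookie.output()  # header line for the HTTP reply

# Browser side (simplified): store the name=value pair from the reply...
stored = SimpleCookie()
stored.load(set_cookie_header.split(":", 1)[1].strip())

# ...and send it back in the Cookie header of every subsequent request.
cookie_header = "Cookie: " + "; ".join(
    f"{name}={morsel.value}" for name, morsel in stored.items()
)

print(set_cookie_header)
print(cookie_header)
```

The user never sees either header; the browser adds the `Cookie` line to each later request on its own.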
There are at least three distinct ways to perform cookie switching: cookie-read, cookie-insert, and cookie-rewrite. Each has a different impact on the load-balancer performance and server-side application design.
Figure 3.13 shows how cookie-read works at a high level, without showing the TCP protocol semantics. We are using the same scenario as in the megaproxy problem so we can see how cookie-read helps in this situation. The first time the client makes a request, it goes through proxy server 1 and carries no cookie, since this is the first time the user is visiting this Web site. The request is load balanced to RS1. Keep in mind that the load balancer has performed delayed binding to see whether there was a cookie. RS1 sees that there is no cookie called server, so it creates and sets a cookie called server with the value of 1. When the client browser receives the reply, it sees the cookie and stores it on the local hard disk of the client's computer. The TCP connection may now be terminated, depending on how the browser behaves and which HTTP version is used between the client and server. When the user requests the next Web page, a new connection may be established. After the connection is established, the browser transparently sends the cookie server=1 as part of the HTTP request. Since the load balancer is configured for cookie-read mode, it performs delayed binding and looks for the cookie in the HTTP request. The load balancer finds the cookie server=1 and binds the connection to RS1. The fact that the new connection went through a different proxy server does not matter, because the load balancer is no longer looking at the source IP address for session persistence. Further, this also solves the megaproxy load-balancing problem, because the load balancer recognizes each individual user based on the cookie.
Figure 3.13: Load balancer with cookie-read.
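The decision the load balancer makes in cookie-read mode can be sketched as follows. This is a simplified illustration, not a vendor implementation: the `REAL_SERVERS` table, the function names, and the use of round robin for requests without a usable cookie are all assumptions made for the example.

```python
from itertools import cycle
from typing import Dict, Optional

# Hypothetical table: server ID carried in the cookie -> real server.
REAL_SERVERS: Dict[str, str] = {"1": "RS1", "2": "RS2", "3": "RS3"}
_round_robin = cycle(REAL_SERVERS.values())


def parse_cookie_header(value: str) -> Dict[str, str]:
    """Split a Cookie header value such as 'lang=en; server=1' into pairs."""
    cookies: Dict[str, str] = {}
    for pair in value.split(";"):
        name, sep, val = pair.strip().partition("=")
        if sep:
            cookies[name] = val
    return cookies


def choose_server(cookie_header: Optional[str]) -> str:
    """Pick the real server for a request after delayed binding."""
    if cookie_header:
        server_id = parse_cookie_header(cookie_header).get("server")
        if server_id in REAL_SERVERS:
            # Known user: bind to the server named in the cookie,
            # regardless of which proxy the connection came through.
            return REAL_SERVERS[server_id]
    # No cookie (first visit) or unknown ID: load balance normally.
    return next(_round_robin)
```

Note that the source IP address never appears in `choose_server`. A request carrying server=1 is bound to RS1 whether it arrives through proxy server 1 or any other proxy, which is why this approach handles the megaproxy scenario.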
Now, let's look at what's needed to make the cookie-read method work. Obviously, the load balancer must support this method of cookie switching. Most importantly, the Web application on the server must set a cookie called server with a value equal to the server ID, as defined on the load balancer. The server must know its own identifier, and the load balancer must associate this server ID uniquely with this server. The identifier may be just a number (such as 1, 2, or 3 for three servers) or a name (such as RS1, RS2, and RS3). This depends on how the feature is supported by the specific load-balancing product used.
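The two requirements above, that each server sets a cookie carrying its own ID and that the load balancer maps each ID to exactly one server, might look like this in outline. The constant `SERVER_ID`, both function names, and the header list format are hypothetical; a real application would use whatever cookie API its Web framework provides, and a real load balancer would be configured through its own interface.

```python
from typing import Dict, List, Tuple

# Hypothetical: this server's ID, which must match the ID that the
# load balancer has configured for this server.
SERVER_ID = "1"


def reply_headers(request_cookies: Dict[str, str]) -> List[Tuple[str, str]]:
    """Server side: set the 'server' cookie when the request has none."""
    headers = [("Content-Type", "text/html")]
    if "server" not in request_cookies:
        headers.append(("Set-Cookie", f"server={SERVER_ID}"))
    return headers


def validate_server_ids(id_to_server: Dict[str, str]) -> None:
    """Load-balancer side: each ID must identify a distinct real server."""
    servers = list(id_to_server.values())
    if len(set(servers)) != len(servers):
        raise ValueError("each server ID must map to a distinct real server")
```

Whether the IDs are numbers or names makes no difference to this logic, as long as the value each server writes into the cookie is the one the load balancer expects for that server.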