Caveman's Blog

My commitment to learning.

Archive for the ‘Web Applications’ Category

How to: Web application architecture – Part 1

This is a multi-part blog post that details the process of architecting a web application. This first part is a list of terms that one should be familiar with when designing a web application. I have tried to structure this glossary around the design needs of a web application in general, and I will update the list as I remember more.

Project Scope

Prototype

Source Control tool (Subversion, TFS, Visual SourceSafe, IBM ClearCase)

Bug/Issue tracker (HP Quality Center, IBM ClearQuest, JIRA, etc.)

Infrastructure Solution, Code Solution, DB Solution

Data security and compliance (SSL, encryption, web service authentication, etc.)

Core Framework

Application Scalability

Smoke testing (Selenium, etc.)

Stress test (VSTS, Silk, Microsoft Web Application Stress tool, etc.)

Continuous Integration (Hudson, CruiseControl, etc.)

Content Management, Search engine optimization, Analytics

Infrastructure: Web Servers, DB Servers, Back office servers, CDN, etc.

Production/Mock deployment environment: DNS, Firewall, Load Balancer, Sticky Session audit, failover, mail.

Deployment Process

Hudson Production Job

Rollback on failed deployment

SQL Merge / change script

Why is my web application slow?

Slowness of a web application in the different environments of a software factory line is something that we all must have come across at some point. Here is my perspective on “where to begin” when addressing this issue, in a general sense, for a Microsoft-centric web application. Let us assume that our application was developed using ASP.NET and a SQL Server database.

Web Server

1. Go through the Event Viewer logs for errors, warnings and informational messages. Watch out for messages that were logged by your application and by any other applications on your web server.

2. Check the IIS logs for an unusual rate of error responses (HTTP 500, 404, etc.).

3. The application pool in IIS can be a source of slowness, for example when it recycles frequently or its settings do not suit the application.

4. The web server could have run out of disk space (for example, because error logs are not rolled over and backed up).

5. IIS may crash or hang because of a memory leak, thread locking, etc. See “Troubleshooting an IIS Crash/Hang”.

6. JavaScript interaction with embedded UI objects (Flash, applets, Silverlight) can slow down a page.

7. Make sure that the web server is up-to-date with all the latest “software patches” 😉 (oops !!!  service packs)

8. Make sure that the database connection pool settings are correct.

9. Consider serving the website content through a content delivery network (CDN) service provider like Akamai, Amazon CloudFront, Microsoft Azure, AT&T, etc.

10. If none of the above seems to be the cause of the issue, then read on.
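
As a starting point for step 2, here is a minimal Python sketch that tallies status codes and lists slow requests from an IIS log. It assumes the log is in the W3C extended format with a #Fields: header, and that sc-status and time-taken (milliseconds) are among the logged fields. The function name, the 1000 ms threshold and the sample lines are made up for illustration.

```python
from collections import Counter


def summarize_iis_log(lines, slow_ms=1000):
    """Count responses per status code and collect slow requests.

    Assumes W3C extended log format: a '#Fields:' directive names the
    space-separated columns of the entries that follow it.
    """
    fields = []
    status_counts = Counter()
    slow_requests = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#") or not fields:
            continue  # other directives, or entries before any #Fields line
        row = dict(zip(fields, line.split()))
        status = row.get("sc-status")
        if status is None:
            continue
        status_counts[status] += 1
        time_taken = row.get("time-taken", "")
        if time_taken.isdigit() and int(time_taken) >= slow_ms:
            slow_requests.append((row.get("cs-uri-stem", "?"), int(time_taken)))
    return status_counts, slow_requests


# Made-up sample entries, for illustration only.
sample_log = [
    "#Fields: date time cs-method cs-uri-stem sc-status time-taken",
    "2008-03-31 22:02:11 GET /default.aspx 200 120",
    "2008-03-31 22:02:12 GET /reports.aspx 500 4312",
    "2008-03-31 22:02:13 GET /missing.aspx 404 15",
    "2008-03-31 22:02:14 GET /default.aspx 200 1890",
]

status_counts, slow_requests = summarize_iis_log(sample_log)
print(dict(status_counts))
print(slow_requests)
```

A sudden jump in the 500 count, or the same page showing up repeatedly in the slow list, is usually a good hint about where to look next.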

Application Code

1. Analyzing the web requests and responses across multiple pages of the application using tools like Charles, Fiddler, Firebug, etc. can provide a lot of information that you would not know otherwise.

2. Narrow down the scope of the slowness in the page execution to a specific page, event handler or call.

3. Not disposing of objects after use can eat up a lot of resources on the web server, causing the slowness.

4. Make sure that the response times of all the Ajax and web service calls are in line with expectations.

5. Run extended load tests to determine whether there is a cause of failure that was not noticed during regular load tests.

6. Always employ best practices for implementing web site acceleration [1].

7. Consider fetching multiple result sets in one database call as opposed to one result set per database call. This will reduce the number of round trips to the database.

8. If none of the above seems to be the cause of the issue, then read on.

Database

1. Low disk space on the SQL Server/Cluster.

2. Not following SQL Server best practices.

3. Go through the execution plans of the suspect SQL scripts/statements to isolate the issue. A table scan can be a very costly operation compared to an index seek.

4. Run a SQL Trace for a few hours in an environment with a lot of traffic and feed the trace file to the SQL Server Performance Tuning wizard (the Index Tuning Wizard in SQL Server 2000, or the Database Engine Tuning Advisor in SQL Server 2005 and later).

5. Apply the recommendations of the SQL Server Performance Tuning wizard to the database to see if that helps.

6. Verify and make sure that the background SQL, SSIS and SSRS tasks are scheduled to run during off-peak hours.

7. Consider breaking down a huge database into smaller ones. As an example, an e-commerce website could access data from separate Catalog, Marketing, Sales and Audit databases instead of one big database.

8. Consider regular archiving and cleanup of historical data from all databases.

9. Index defragmenting [2] can improve the SQL execution times too.

10. Sorting of huge record sets might be best left to be done at the application level rather than at the database level. This could be controversial depending on who you speak to. With this kind of solution, chances are that a modern, beefed-up web server in a web farm (load-balanced environment) will be able to handle expensive data operations. This would free up SQL Server processing time to handle more requests.

11. I hope your issue might have been resolved by now.
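
To illustrate step 3, here is a small sketch of how an index changes an execution plan from a full scan to a targeted lookup. It uses Python's built-in sqlite3 module purely as a stand-in, because it runs anywhere; it is not SQL Server, the plan text differs from what SQL Server shows, and the table and index names are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a statement as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

sql = "SELECT total FROM orders WHERE customer_id = ?"

# Without an index on customer_id, every row has to be examined.
plan_without_index = query_plan(conn, sql, (42,))

# With an index, the engine can jump straight to the matching rows.
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")
plan_with_index = query_plan(conn, sql, (42,))

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

The same before-and-after comparison is what you would look for in a SQL Server execution plan: the scan operator on the suspect table being replaced by a seek on a suitable index.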

I will update this blog, as and when I can think of other ways to help the cause.

Good luck and happy programming!

References:

1. Cost-effective website acceleration

2. Microsoft SQL Server 2000 Index Defragmentation Best Practices

Load balancing a web application

Load balancing (LB) is the technique of distributing a given load evenly between the load bearers of a system. The goal of LB is to let the system scale with increasing load, thus improving the performance of the system as a whole. The technique is commonly used in areas such as telecommunications, web servers, database servers, avionics, the shipping industry and power grids, to name a few.

Introduction

When a user requests a web page in a browser, the browser first makes a call to a DNS server to determine the IP address of the web server, and then uses that IP address to establish a connection to the web server and request the information. As the web server receives many such requests, the load/traffic at the web server increases. Boosting the server’s capabilities by adding more RAM and more computational power is usually the first step taken to improve server performance (response time). This kind of scaling is limited, so alternate methods of improving performance came into existence.

DNS [2] round robin was one of the early strategies employed for load balancing web servers. The mechanism is based on the fact that several IP addresses can be assigned to one host name, meaning that the web server traffic can be distributed between multiple IP addresses (computers). Caching of IP addresses by DNS servers limits how well DNS round robin distributes traffic. For example, when the computer behind a cached IP address goes down, this load distribution solution becomes ineffective, because the DNS server does not know that it should route requests elsewhere. This shortcoming led to the evolution of another effective and scalable solution: Server Load Balancing (SLB). Server load balancing is especially important because of the unpredictable nature of web traffic (number of requests).
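
The caching limitation can be sketched with a toy model in Python. This is not a real DNS implementation: the class names and IP addresses are hypothetical, and the "cache" simply remembers the first answer forever, which exaggerates what a real resolver with a TTL would do.

```python
from itertools import cycle


class RoundRobinDns:
    """Toy DNS server that hands out a host's IP addresses in rotation."""

    def __init__(self, addresses):
        self._rotation = cycle(addresses)

    def resolve(self):
        return next(self._rotation)


class CachingResolver:
    """Toy resolver that caches the first answer it receives."""

    def __init__(self, dns):
        self._dns = dns
        self._cached_ip = None

    def lookup(self):
        if self._cached_ip is None:
            self._cached_ip = self._dns.resolve()
        return self._cached_ip


addresses = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# Uncached lookups rotate through all the servers.
dns = RoundRobinDns(addresses)
fresh_answers = [dns.resolve() for _ in range(4)]

# A caching resolver keeps returning the same address...
resolver = CachingResolver(RoundRobinDns(addresses))
cached_answers = [resolver.lookup() for _ in range(3)]

# ...even after that server has gone down.
servers_up = {"10.0.0.2", "10.0.0.3"}
reachable = [ip in servers_up for ip in cached_answers]

print(fresh_answers)
print(cached_answers, reachable)
```

Every client behind that resolver keeps being sent to the dead server, even though two healthy servers are available.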

Server Load Balancing (SLB)

High availability and scalability are the most important criteria to keep in mind when designing an enterprise (web) application solution. Fortunately, SLB is able to provide both, catering to the needs of an ever-increasing server load. Typically, multiple web servers are employed to host a website so that the load can be redistributed when one server gets swamped. A web farm environment is like one large virtual computer in which the load balancer acts as a controller: it knows which processing unit (web server) a pending client web request should be delegated to, and promptly sends the web response back to that client. This is a multi-server scenario where we may have, for example, a server in each state of the US. When the load on one server exceeds its configured capacity, the other servers step in to bear the brunt.

A load balancer distributes the load based on one of the models listed below:
1. Round robin (all servers share the load equally)
2. NLB, Network Load Balancing (economical)
3. HLB, Hardware Load Balancing (expensive, but can scale up to 8192 servers)
4. Hybrid (of 2 and 3)
5. CLB (Component Load Balancing)
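
The simplest of these models, round robin, can be sketched in a few lines of Python. This is only an illustration of the idea, not how NLB or a hardware load balancer is implemented. The class and server names are hypothetical, and the health tracking is an assumption added to show how a load balancer can avoid the dead-server problem of DNS round robin.

```python
class RoundRobinBalancer:
    """Hands out servers in rotation, skipping those marked as down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._next = 0

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        # Try each server at most once per call, starting where we left off.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
before_failure = [balancer.pick() for _ in range(3)]

balancer.mark_down("web2")  # e.g. a health check failed
after_failure = [balancer.pick() for _ in range(4)]

print(before_failure)
print(after_failure)
```

In a real setup the mark_down and mark_up calls would be driven by periodic health checks against each web server.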

State Management

One of the shortcomings of HTTP is that it is a stateless protocol. It works in a disconnected fashion, meaning that once the server processes a request and sends a response to a client, the web server does not retain the identity of the client. Hence the necessity for a mechanism that can keep track of the client’s identity and the client-specific data, also called state. Implementing the most suitable state management mechanism is one of the most challenging parts of setting up a web farm. In this environment, state can be stored in one or more of the following three places:

1. State Server – single point of failure
2. Database – additional overhead in processing the web request
3. Web Server – unreliable, because the load balancer may route the next request to a different server

I personally like a hybrid approach where the session is stored both in the database and on the web server. Here is why this might be a prudent approach: when a request is processed, the web server first checks its own cache for the session state information; if it is not found, it hits the database to re-establish the state for that request. This way you get the best of both worlds: the session is saved very reliably and can be accessed in a fast and efficient manner. Any changes to the state during the processing of a request are persisted to the database. Of course, storing the encrypted session id in a cookie is necessary to identify the state between the client’s round trips to the web server.
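
Here is a minimal Python sketch of that hybrid lookup. It makes some assumptions: an in-memory SQLite database stands in for the shared session database, a plain dictionary stands in for each web server’s cache, and the class, table and session id are all hypothetical.

```python
import json
import sqlite3


class HybridSessionStore:
    """Per-server session cache backed by a shared database."""

    def __init__(self, db):
        self._db = db
        self._cache = {}  # local to this web server
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, data TEXT NOT NULL)"
        )

    def load(self, session_id):
        # 1. Check this server's own cache first.
        if session_id in self._cache:
            return self._cache[session_id]
        # 2. Fall back to the database to re-establish the state.
        row = self._db.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        state = json.loads(row[0]) if row else {}
        self._cache[session_id] = state
        return state

    def save(self, session_id, state):
        # Changes are kept locally and persisted to the database.
        self._cache[session_id] = state
        self._db.execute(
            "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
            (session_id, json.dumps(state)),
        )
        self._db.commit()


shared_db = sqlite3.connect(":memory:")
web1 = HybridSessionStore(shared_db)
web2 = HybridSessionStore(shared_db)

# The first request lands on web1; the next one is routed to web2.
web1.save("abc123", {"cart": ["book"]})
state_on_web2 = web2.load("abc123")
print(state_on_web2)
```

One thing this sketch glosses over is staleness: if another server updates the session, the local copy is out of date, so a real implementation would need some form of expiry or version check on the cached entry.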

Note: A web garden is different from a web farm in that a web garden runs on a single server, typically a multi-processor machine with multiple worker processes serving the same application, rather than on multiple servers as described above.

References:

1. Wikipedia – Load Balancing
2. Domain Name Server – DNS

Host Multiple Sites from One IP Address

IIS allows you to assign any number of sites to a single IP address. Host headers can be used to achieve this. When IIS receives a request for a web page, it looks at the information sent by the browser. The HTTP Host header contains the actual domain name requested, and IIS uses it to determine which site should answer the request. [1]
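
Conceptually, the selection works like a lookup from host name to site. The Python sketch below only illustrates that idea; it is not how IIS is implemented, and the function, site names and domains are hypothetical.

```python
def select_site(bindings, host_header, default_site=None):
    """Pick the site whose host header name matches the request's Host header.

    Simplified: drops an optional ':port' suffix and ignores case.
    IPv6 literals are not handled.
    """
    host = host_header.split(":")[0].strip().lower()
    return bindings.get(host, default_site)


# Two sites sharing one IP address and port, told apart by host header name.
bindings = {
    "www.example.com": "ExampleSite",
    "blog.example.com": "BlogSite",
}

print(select_site(bindings, "www.example.com"))
print(select_site(bindings, "BLOG.example.com:80"))
print(select_site(bindings, "unknown.example.com"))
```

A request whose Host header matches none of the configured names gets no site in this sketch, which is why every host header name also has to be registered with the name resolution system and configured on a site.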

Step-by-step procedure for configuring host header names [1]:

1. Start the Internet Service Manager (Inetmgr.exe).
2. Right-click the Web site to be configured, and then click Properties.
3. On the Web Site tab, select the IP address that the site will use.

Note: if you only have one IP address on the server, select All Unassigned, and set the TCP Port that should be used (usually 80).

4. Click Advanced.
5. In the Multiple identities for this Web Site list, select the identity that you want to use.
6. Click Edit, and then add the desired host header name.

NOTE: If you want this site to respond to more than one host header name, use the Add button to add additional identities to this list. Specify a different host header name for each identity, but be sure to use the same IP address and port.

7. Apply these changes and start the Web site (if it is not already running).
8. Register the host header name with the appropriate name resolution system.

If the computer is on an intranet (a private LAN that uses Internet technology), register it with the intranet’s name resolution system, such as the Windows Internet Name Service (WINS).
If the computer is on the Internet, register the host header name with the Domain Name System (DNS), which is administered by InterNic.

9. After the host header name is registered with the name resolution system, test it from a browser by attempting to browse the host header name. The browser should open the expected Web site.

Reference:
1. Microsoft Help and Support

Written by cavemansblog

March 31, 2008 at 10:02 pm

The Response.Redirect Menace

You know how we all take things for granted and never bother to think of the caveats…. yep you know what I am talking about…. exactly… I ran into one such situation with Response.Redirect.

I have learnt that Response.Redirect internally makes a call to Response.End to force the execution of the current thread to stop, thereby throwing a ThreadAbortException [2].

The way this works is:

A call to HttpResponse.Redirect(string url) actually calls an overload HttpResponse.Redirect(string url, bool endResponse) with endResponse set to true. If endResponse is set to true, HttpResponse.Redirect will make a call to HttpResponse.End(). [1]

Microsoft recommends that we use the overloaded Response.Redirect(String url, bool endResponse) method that passes false so that a call to Thread.Abort() can be avoided by suppressing the call to Response.End. Is there a catch? Yes, the page will execute the code that follows Response.Redirect.

One solution I can think of to minimize the execution of the code that follows Response.Redirect is to code around the Response.Redirect call:

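if (HaveToredirect)
{
  // Pass false so that Response.End (and the resulting ThreadAbortException) is skipped, as described in [1]
  Response.Redirect(url, false);
  // Bypass the remaining pipeline events and go straight to EndRequest
  HttpContext.Current.ApplicationInstance.CompleteRequest();
  return;
}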

In all the page events, check the IsRequestBeingRedirected property of the Response object as the first step and return if it is true. This way we can minimize the execution time of the page.

if (Response.IsRequestBeingRedirected)
  return;

This approach might not be feasible in all scenarios, though.

Also note that Server.Transfer calls Response.End internally.

References:

1. Response.Redirect(url) ThreadAbortException Solution
2. Microsoft Support Article.

Written by cavemansblog

February 13, 2008 at 3:09 pm