Wednesday, July 12, 2006

More about Microsoft web services

There is a client-side limit to how many simultaneous calls you can make
from one AppDomain to one web service (see the
ServicePointManager.DefaultConnectionLimit property).
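
A minimal sketch of raising that default process-wide (the value 10 is illustrative, not a recommendation):

using System.Net;

class ConnectionSetup
{
    static void Main()
    {
        // Applies to ServicePoint objects created after this assignment.
        ServicePointManager.DefaultConnectionLimit = 10;
    }
}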


http://www.webserviceshelp.org/wsh/Discussions/dotnet/postings/Web+Service+To+No+Where+Lbadnc0Wm4I5m1HfRVniQspeakeasynet.htm


After having seen this same problem at more than one client, I thought it important to share this valuable piece of information with others. When I discovered it, I was amazed that, with all the .NET Web Services articles floating around, there is hardly any mention of it.

Let me start by describing the problem. You are using web services to interact with an external system (either your own or an external source). Everything works great in functional testing, but in load testing or (better yet) a production environment, under heavy load, requests take an unusually long time or seem as though they are single threaded.

When you contact the 'owner' of the service you are consuming (which may mean looking in the mirror), they will claim that they are fulfilling requests in compliance with service level agreements, but logging from your client application proves that this cannot be true.

The answer and cause lie in the HTTP 1.1 specification, which is intended to keep a single user from saturating a web server. To that end, RFC 2616 states that clients should limit the number of connections they hold to a given server:

...
Clients that use persistent connections SHOULD limit the number of
simultaneous connections that they maintain to a given server. A
single-user client SHOULD NOT maintain more than 2 connections with
any server or proxy. A proxy SHOULD use up to 2*N connections to
another server or proxy, where N is the number of simultaneously
active users. These guidelines are intended to improve HTTP response
times and avoid congestion.
...

Unfortunately this becomes a problem when your intention is to make multiple requests. In the cases where I saw this problem, we had a wonderfully multi-threaded queue listener that concurrently processed long-running requests and purposely throttled the number of concurrent connections to the host to comply with our agreement. We complied and then some.

To 'fix' this problem, you must change the default setting for the maximum number of connections to the Internet host. This can be done in the machine.config, web.config, or application configuration file, as shown in the sketch below.
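
A minimal sketch of that section (address="*" applies the limit to every host; name a specific host to target just one, and 10 is an illustrative value):

<configuration>
  <system.net>
    <connectionManagement>
      <!-- "*" matches all hosts; use a specific address to target one -->
      <add address="*" maxconnection="10" />
    </connectionManagement>
  </system.net>
</configuration>
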
See MSDN: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpgenref/html/gngrfconnectionmanagementelement.asp for more information.

It can also be changed programmatically:

using System;
using System.Net;

Uri con = new Uri("http://www.parivedasolutions.com/");

// Use the FindServicePoint method to find an existing
// ServicePoint object or to create a new one.
ServicePoint servicePoint = ServicePointManager.FindServicePoint(con);
servicePoint.ConnectionLimit = 10;


In .NET, there are connection management configuration settings which control
the connections allowed to remote servers. The settings live under the
connectionManagement element:

http://msdn.microsoft.com/library/e...nnectionmanagementelement.asp?frame=true

The default connection limit for every remote server is 2. That's why
you'll find your third request to a single server pending until one of the
first two completes. We can override the limit for a given server
in our client app's app.config, for example:

<configuration>
  <system.net>
    <connectionManagement>
      <add address = "remoteserver"
           maxconnection = "4" />
    </connectionManagement>
  </system.net>
</configuration>
The above app.config adjusts the connection limit for "remoteserver" to 4.
Then we can make at most 4 concurrent requests to that server.

In addition, the following KB article also includes this information and some
other common issues you may meet when consuming an ASP.NET web service from a
.NET client app.

PRB: Contention, poor performance, and deadlocks when you make Web service
requests from ASP.NET applications
http://support.microsoft.com/?id=821268

Hope this also helps.

Regards,

Steven Cheng
Microsoft Online Support

See MSDN: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemnetservicepointmanagerclassfindservicepointtopic.asp for more information.

Thursday, July 06, 2006

Scale out SQL Server 2000 with federation

Online transaction processing (OLTP) environments and large Web site databases usually consist of many individual queries requesting or manipulating relatively small amounts of data. As the system grows, and users issue more queries against the database, DBAs usually try to improve response time by scaling up, which means increasing the server's power. You can add CPUs, replace CPUs with faster ones, add more memory, expand the network, or add faster disk drives with smarter controllers. But at some point, you exhaust the available resources for scaling up because you reach your machine's—or your budget's—limit. SQL Server 2000 introduces a solution for the growing need for more processing power—scaling out.

When you scale out, you split huge tables into smaller tables, each of which is a subset, or partition, of the original table. You place each partition on a separate server. You manage each server independently, but together the servers form a federation. To access the data on any of the partitions, you define a view with the same name on each of the servers, making transparent the fact that the data is distributed among several nodes. A user or an application connecting to any of the servers can issue all Data Manipulation Language (DML) statements (i.e., SELECT, INSERT, UPDATE, and DELETE) against the view as if it were the original table. SQL Server 2000 intercepts the statements and routes them to the relevant servers. This configuration distributes the processing load among the federation members. . . .
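
To make the mechanics concrete, here is a hedged T-SQL sketch (all server, database, table, and column names are made up). Each federation member holds one horizontal partition whose CHECK constraint tells SQL Server which rows it owns, and an identical distributed partitioned view is defined on every member:

-- On member server 2: its partition of the original table. The CHECK
-- constraint on the partitioning column lets SQL Server route statements.
CREATE TABLE Orders2006 (
    OrderID    INT NOT NULL,
    OrderYear  INT NOT NULL CHECK (OrderYear = 2006),
    CustomerID INT NOT NULL,
    PRIMARY KEY (OrderID, OrderYear)
)
GO

-- On every member server: the same-named view over all partitions
-- (Server1 and Server2 are linked servers).
CREATE VIEW Orders AS
    SELECT OrderID, OrderYear, CustomerID
    FROM Server1.Sales.dbo.Orders2005
    UNION ALL
    SELECT OrderID, OrderYear, CustomerID
    FROM Server2.Sales.dbo.Orders2006
GO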

Sunday, July 02, 2006

Improve web service performance

General Considerations
As a general rule, capacity planning is very important if the web service is to work as designed, though tuning the system for its best performance is necessary too. When the workload is more than the system can take, it will inevitably lead to poor performance. The resources a web service consumes include connections and threads. Too many threads will cause the client or server to be busy switching between threads.

Design Considerations

  • Design chunky interfaces to reduce round trips (see the sketch after this list).
  • Prefer message-based programming over remote procedure call (RPC) style.
  • Use literal message encoding for parameter formatting.
  • Prefer primitive types for Web service parameters.
  • Avoid maintaining server state between calls.
  • Consider input validation for costly Web methods.
  • Consider your approach to caching.
  • Consider approaches for bulk data transfer and attachments.
  • Avoid calling local Web Services.
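
As promised above, a hedged before/after sketch of a chunky interface (the order-service names are invented):

using System.Web.Services;

public class OrderService : WebService
{
    // Chatty (avoid): one round trip per field, e.g.
    //   SetCustomer(id, customerId); SetAmount(id, amount); Submit(id);

    // Chunky: the whole unit of work travels in a single round trip.
    [WebMethod]
    public void SubmitOrder(int customerId, decimal amount, string[] items)
    {
        // process the complete order here
    }
}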

Connections
  • Configure the maxconnection attribute.
  • Prioritize and allocate connections across discrete Web services.
  • Use a single identity for outbound calls.
  • Consider UnsafeAuthenticatedConnectionSharing with Windows Integrated Authentication.
  • Use PreAuthenticate with Basic authentication.
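
For the last two items, a hedged sketch; it works against any wsdl.exe-generated proxy, since they all derive from HttpWebClientProtocol:

using System.Net;
using System.Web.Services.Protocols;

class ProxySetup
{
    // Windows Integrated Authentication: reuse authenticated connections,
    // scoped to a single identity via a connection group.
    static void ConfigureWindowsAuth(HttpWebClientProtocol proxy)
    {
        proxy.Credentials = CredentialCache.DefaultCredentials;
        proxy.UnsafeAuthenticatedConnectionSharing = true; // .NET 1.1 and later
        proxy.ConnectionGroupName = "SingleServiceIdentity";
    }

    // Basic authentication: send credentials with the first request
    // instead of waiting for a 401 challenge.
    static void ConfigureBasicAuth(HttpWebClientProtocol proxy, string user, string pwd)
    {
        proxy.Credentials = new NetworkCredential(user, pwd);
        proxy.PreAuthenticate = true;
    }
}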

Threading
  • Tune the thread pool using the formula for reducing contention.
  • Consider minIoThreads and minWorkerThreads for intermittent burst load.

Remarks:
An intermittent burst load can be a serious problem for a real-time application or system, especially when the system is approaching its capacity limits. The web service might suddenly become slow once in a while. At that point, fetching the web service description in a browser suddenly becomes slow too, which can be used as an indicator that the web service is experiencing a burst load. A machine.config sketch for these thread pool settings follows.
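
This sketch uses the per-CPU starting values recommended in KB 821268; assume N = 1 CPU here (the minFreeThreads values scale as 88*N and 76*N), minWorkerThreads requires .NET 1.1 SP1, and all numbers are starting points rather than definitive settings:

<!-- machine.config (values for N = 1 CPU, per KB 821268) -->
<processModel maxWorkerThreads="100" maxIoThreads="100"
              minWorkerThreads="50" />
<httpRuntime minFreeThreads="88" minLocalRequestFreeThreads="76" />
<connectionManagement>
  <add address="*" maxconnection="12" />
</connectionManagement>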

One Way (Fire and Forget) Communication
  • Consider using the OneWay attribute if you do not require a response.
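
A minimal sketch (the service and method names are invented); the server acknowledges the request immediately with HTTP 202, and the client never waits for a return value:

using System.Web.Services;
using System.Web.Services.Protocols;

public class NotificationService : WebService
{
    [WebMethod]
    [SoapDocumentMethod(OneWay = true)]
    public void LogEvent(string message)
    {
        // long-running work happens here, after the client has moved on
    }
}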

Asynchronous Web Methods
  • Use asynchronous Web methods for I/O operations.
  • Do not use asynchronous Web methods when you depend on worker threads.

Asynchronous Invocation
  • Consider calling Web services asynchronously when you have additional parallel work.
  • Use asynchronous invocation to call multiple unrelated Web services.
  • Call Web services asynchronously for UI responsiveness.
Remarks:
Asynchronous invocation involves creating extra worker threads. As a result, the client can become very busy switching between threads while also tying up connections to the target servers. Asynchronous invocation might therefore not be suitable for calling a web service on a continuous basis; for that case, limiting the number of calling threads is necessary. However, this doesn't mean you should resort to synchronous invocation and cause responsiveness issues: a queue (buffer), of the kind used to synchronize a slow device with a fast process, can be the solution. A sketch of the asynchronous pattern follows.
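
With a wsdl.exe-generated proxy, the pattern looks roughly like this (StockService and GetQuote are made-up names standing in for your generated proxy and web method):

// Begin/End pairs are generated automatically for each web method.
StockService proxy = new StockService();
IAsyncResult ar = proxy.BeginGetQuote("MSFT", null, null);

// ... do the additional, parallel work here ...

decimal quote = proxy.EndGetQuote(ar); // blocks only if the call isn't done yet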

Timeouts
  • Set your proxy timeout appropriately.
  • Set your ASP.NET timeout greater than your Web service timeout.
  • Abort connections for ASP.NET pages that timeout before a Web services call completes.
  • Consider the responseDeadlockInterval attribute.
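
For the first two items, a hedged sketch (the proxy is the same hypothetical StockService; Timeout is in milliseconds, while ASP.NET's executionTimeout is in seconds, so keep the latter comfortably larger):

// Fail the web service call after 30 seconds.
StockService proxy = new StockService();
proxy.Timeout = 30000;

// In web.config, give the hosting ASP.NET request more time than the proxy:
//   <httpRuntime executionTimeout="60" />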

WebMethods
  • Prefer primitive parameter types.
  • Consider buffering.
  • Consider caching responses.
  • Enable session state only for Web methods that need it.
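
A hedged sketch combining three of the items above (the service and method are invented; BufferResponse and EnableSession are properties of the WebMethod attribute):

using System.Web.Services;

public class LookupService : WebService
{
    // BufferResponse = true (the default) serializes the whole response
    // before sending it; EnableSession stays false because Session is unused.
    [WebMethod(BufferResponse = true, EnableSession = false)]
    public string GetStatus(int id) // primitive parameter type
    {
        return "OK";
    }
}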

Serialization
  • Reduce serialization with XmlIgnore.
  • Reduce round trips.
  • Consider XML compression.
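
For the XmlIgnore item, a minimal sketch (the type and fields are invented):

using System.Xml.Serialization;

public class Order
{
    public int OrderID;
    public decimal Amount;

    // Never serialized, so it adds nothing to the wire payload.
    [XmlIgnore]
    public string DebugInfo;
}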

Caching
  • Consider output caching for less volatile data.
  • Consider providing cache-related information to clients.
  • Consider perimeter caching.
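
For output caching, a minimal sketch (names invented; CacheDuration is in seconds, and responses are cached per distinct parameter set):

using System.Web.Services;

public class CatalogService : WebService
{
    // Identical requests within 60 seconds are served from the output cache.
    [WebMethod(CacheDuration = 60)]
    public string[] GetCategories()
    {
        return new string[] { "Books", "Music" };
    }
}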

State Management
  • Use session state only where it is needed.
  • Avoid server affinity.

Horizontal DB partitioning from Microsoft

http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/rdbmspft.mspx

Golden Rules for DB performance

http://www.b-eye-network.com/blogs/linstedt/archives/2006/03/golden_rules_of.php

I've been applying performance and tuning techniques to systems world-wide for the past 15 years. I grew up as an assembly-level programmer on a CP/M Z-80/8080 Digital VT-180 computer, along with Unix System V and a few other machines. It used to be that many would say performance and tuning is an art more than a science; well, these days the science part of it is what really makes this work. In this entry I'm going to introduce the top golden rules for performance and tuning your systems and architectures - this is just a peek at what I'm going to be teaching at the TDWI conference in San Diego in August. In my assessments I cover everything from hardware, to architecture, to systems, to overloading, and platform sizing.

Have you ever wondered how to reduce ETL/ELT processing times from 72 hours to 23 hours? Or 18 hours to 2 hours? Have you wondered how to gain 400% to 4000% performance improvements from the systems you have? How do you know if your RAM/CPU, disk, IP/network, applications, and RDBMS are balanced and running in peak performance mode? Have you questioned when to buy new hardware, and what platform to move to?

The following golden rules are the top tips of performance across systems architecture - these are all part of a workshop course, and assessment that I offer on-site - which tailors the responses to your organization.

The top rules to performance and tuning any system are as follows:
1. Reduce the amount of data you're dealing with
2. Increase Parallelism

The rest of the rules are assigned to meet different categories and can include:
3. Balance load across the applications
4. Re-Architect the processes
5. Limit the RDBMS engines to their use of hardware
6. Partition, partition, partition
7. Manage restartability, fault-tolerance, fail-over
8. Classify Error categories
9. Do not overload hardware with too much parallelism
10. Tune disk access

There are about 250 such recommendations that go into tuning any sort of system, ranging from midrange to client/server based. Mainframes work slightly differently and (sometimes) require lifting the CU limitation put on the login. But let me talk about the first two rules: 1 and 2.

Decreasing the data set:
There are a multitude of ways in which to decrease the data set:
  1. The first is to identify which data you will actually be using during processing, and then ensure that only that data actually passes through the process. Sometimes this requires re-architecture in order to see the performance gains or to be able to reduce the data set.
  2. The second is to partition the data, and apply the second rule - increase parallelism. Once partitioned, each partition within the parallel set of processes deals with "less data", therefore if the hardware can handle it, performance will increase.
  3. Vertical and horizontal partitioning are two kinds of partitioning available: vertical splits by "columns" (the precision or width of the data set); horizontal is what we are used to with RDBMS table partitioning. The two are NOT mutually exclusive; unfortunately, most RDBMS engines today do NOT do a good job of vertical partitioning.

Increasing Parallelism:
  1. Remember this: DBAs often make the mistake of setting only one switch in the RDBMS engine to engage parallelism; then, if the performance gain isn't seen, they change the switch back. This approach will NOT work. Most RDBMS engines require 10 to 16 switches to be set just to engage parallelism the proper way and to allow the engine to rewrite queries, perform inserts and updates in parallel, and so on.
  2. By simplifying (re-architecting) the processes, most processes can be created to run in parallel. Large complex processes are bound (in most cases) to serialize unless they are constructed to execute block style SQL. There are some RDBMS vendors that don't allow any other kind of processing because they execute everything in parallel for you.
  3. WATCH YOUR PARALLELISM - too much of a good thing can overload your system; again, balance must be achieved. Watch your system resources - there are ways to baseline your system to gain the maximum performance for the minimum amount of changes. I can help you identify the quick changes to be made.
  4. Remember: most RDBMS engines these days have parallel insert, update, and delete - but taking advantage of parallel updates and deletes usually requires a script executed within the RDBMS (as opposed to a direct connect). This is because most RDBMS vendors don't offer PARALLELISM for updates/deletes in their APIs/SDKs for applications to use (this should change in the near future).

I/Os can kill performance; balancing I/Os and caching activity can be a huge performance gain (or loss, if done improperly). One day, when we have nanotech storage devices, the "disk" I/O will disappear. Until then, we must live with it.

I'd love to hear what you've done to tune your environments, if I use your story at TDWI I'll quote you as the source. Please let me know if you'd like to be quoted, feel free to drop me an email in private as well. This entry is just a glimpse into the P&T world.

Increase RDBMS performance with horizontal partitioning

Increase SQL Server performance with horizontal partitions

The primary motive behind horizontal partitioning—the process of creating at least two physical files for a database's tables—is to move seldom-used data into a second file. Here's a good way to accomplish this task.

Horizontal partitioning is the process of creating at least two physical files for a database's tables. The larger a table, the longer it takes to scan. So the general motive behind horizontal partitioning is to move seldom-used data into a second file.

A common way to do this is to assign date ranges to each partition. For example, suppose that in a given application, the data of interest is almost always from the current year. (Other data is occasionally examined, so it must be available, but it doesn't need to be in the main physical file.) You could create just two horizontal partitions—perhaps Current and Archive—or you could create a partition for each year's data. It depends on your requirements.
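
One common way to express the Current/Archive split in T-SQL is sketched below (table names and the cutoff date are invented); the CHECK constraints on the partitioning column are what let the optimizer skip the partition a query doesn't need:

-- Seldom-used history in its own table (ideally in a second file).
CREATE TABLE OrdersArchive (
    OrderID     INT      NOT NULL,
    DateEntered DATETIME NOT NULL CHECK (DateEntered < '20060101'),
    PRIMARY KEY (OrderID, DateEntered)
)
GO
CREATE TABLE OrdersCurrent (
    OrderID     INT      NOT NULL,
    DateEntered DATETIME NOT NULL CHECK (DateEntered >= '20060101'),
    PRIMARY KEY (OrderID, DateEntered)
)
GO
-- Queries go through the view; date predicates touch only one table.
CREATE VIEW Orders AS
    SELECT OrderID, DateEntered FROM OrdersCurrent
    UNION ALL
    SELECT OrderID, DateEntered FROM OrdersArchive
GO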

Books Online's information on creating a horizontal partition is quite good, so I'll just mention the steps here.

* Create a publication with a Publisher (the current database) and a Subscriber (the archive database).
* For each article that you want to horizontally partition, select Provide Support For Horizontal DTS Transformation Partitions.
* Build the DTS package using the Transform Data Wizard. For each table to be partitioned, write an ActiveX script that defines the partition. In general, you have to determine whether any new or changed rows in the Publisher need to be moved to the Subscriber.

Books Online also offers a nice example of the ActiveX script you have to write for each article you want to partition. You can copy and paste the example and, with just a few additions, you'll be ready to go. (Search for "Defining a Horizontal Partition" in Books Online.)

How horizontal partitioning helped me
At one time, I examined one of my databases to see how I might benefit from horizontal partitioning. There were three tables of particular interest. All three had a DateEntered column, whose default was GetDate(). I made two partitions, using the last year as Current and everything before that as Archive.

Sales were down the year before, ironically resulting in even better performance—the data of principal interest was about one-tenth of the total data. The performance gain wasn't quite as good as expected but was still obvious. On the other hand, multiyear queries were noticeably slower but were executed so infrequently that it didn't matter. We gained significantly using horizontal partitions. In our case, they were based on date.