Caveman's Blog

My commitment to learning.


SQL Server: How to copy a very large table?



This post is an extension to my earlier post on how to copy a table. It addresses the situation where you have to copy a huge amount of data from a source table to a destination table as quickly as possible. The time taken to copy and the explosion of the log file are the two points of concern during this activity.

Let us look at a table with a large amount of data. We can use sp_spaceused on a table to find out its physical attributes; running it against our source table shows that it holds close to 7 million rows of data, with a total storage size of approximately 37 GB.

It took more than 10 hours to copy the data from the source table to a destination table in another database on the same SQL Server instance using a standard INSERT-SELECT Transact-SQL statement, and the byproduct of this operation was a large log file. The solution to this issue is one where the copy action is not logged at all, or is logged minimally; this reduces the I/O and speeds up the copy process.

The BULK_LOGGED recovery model (plus SELECT INTO) and the BCP utility are two tools that can be used to achieve faster copy times with no, or a minimal amount of, transaction logging. Following is the usage of technique #1, the SELECT INTO clause in tandem with the BULK_LOGGED recovery model, to copy the same table. The first step in this process is to programmatically or manually set the database recovery model to BULK_LOGGED:

--Programmatically setting the recovery model
USE master;
ALTER DATABASE DemoX SET RECOVERY BULK_LOGGED;

You can also set the recovery model manually through the Database Properties dialog box.


Step 2 is to actually copy the data by executing the SELECT INTO Transact-SQL statement. The last step in this process is to reset the database recovery model to FULL.

--Copy the data; under BULK_LOGGED this SELECT INTO is minimally logged
SELECT * INTO DemoX.dbo.Employee FROM Demo.dbo.Employee

--Reset the recovery model once the copy completes
USE master;
ALTER DATABASE DemoX SET RECOVERY FULL;

Copy time: 2.75 hours. This is pretty decent considering that the traditional copy took more than 10 hours to do the same job. My next goal is to use the BCP utility to copy the same table and compare the timings. Here is the basic idea of using this utility:

Step 1: Export data to a file using the BCP utility.
Step 2: Import data into the destination table using the BCP utility.
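
For instance, a minimal sketch of the two steps using native-format bcp from the command line (the server name and file path below are placeholders; -T uses a trusted connection and -n keeps SQL Server's native data types; note that unlike SELECT INTO, bcp in requires the destination table to already exist):

REM Step 1: export the source table to a native-format data file
bcp Demo.dbo.Employee out C:\Temp\Employee.dat -S MYSERVER -T -n

REM Step 2: bulk import the data file into the destination table
bcp DemoX.dbo.Employee in C:\Temp\Employee.dat -S MYSERVER -T -n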

Will post my results soon.

References:
1. MSDN Online – Bulk Logged Recovery Model
2. MSDN Online – BCP Utility

SQL Server: How to search for text/phrase in a Stored Procedure?



Following are some queries to find the stored procedures that contain a given search text/phrase. (Note: ROUTINE_DEFINITION in INFORMATION_SCHEMA.ROUTINES returns only the first 4000 characters of a definition, and syscomments is deprecated, so the sys.sql_modules query is the safest option on newer versions.)


SELECT ROUTINE_NAME, ROUTINE_DEFINITION
FROM INFORMATION_SCHEMA.ROUTINES
WHERE ROUTINE_DEFINITION LIKE '%employee%' AND ROUTINE_TYPE IN ('PROCEDURE')

--OR

SELECT OBJECT_NAME(id) FROM syscomments
WHERE [text] LIKE '%employee%' AND OBJECTPROPERTY(id, 'IsProcedure') = 1

--OR

SELECT OBJECT_NAME(object_id)
FROM sys.sql_modules
WHERE definition LIKE '%employee%' and OBJECTPROPERTY(object_id, 'IsProcedure') = 1

Following are some more “How to search” posts:

1. SQL Server: How to search for a number in a string?
2. SQL Server: How to search for a table name, a column name, or a given stored procedure name?

SQL Server: GO command



GO is a batch terminator command. It is not a Transact-SQL statement itself; it is recognized by SQL Server utilities such as SSMS and sqlcmd. A “batch” is a set of Transact-SQL statements that is sent to an instance of SQL Server for execution as one unit. Basically, we can use GO to signal the end of a batch. The Transact-SQL statements of a batch are compiled into a single execution plan.

Following is an example of dropping and recreating the Employee table in the Demo database. The GO command in this example creates three batches of SQL statements. The first batch sets the context database, the second drops the existing Employee table, and the third creates the Employee table.


USE [Demo]
GO

/****** Object:  Table [dbo].[Employee]    Script Date: 03/23/2012 00:22:26 ******/
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Employee]') AND type in (N'U'))
DROP TABLE [dbo].[Employee]
GO

CREATE TABLE [dbo].[Employee](
[EmpID] [int] IDENTITY(1,1) NOT NULL,
[EmpName] [varchar](50) NOT NULL
)
GO

Following are some rules around the usage of GO command:

  1. A Transact-SQL statement cannot occupy the same line as a GO command. However, the line can contain comments.
  2. Any execution of a stored procedure after the first statement in a batch must include the EXECUTE keyword.
  3. The scope of local (user-defined) variables is limited to a batch; they cannot be referenced after a GO command, as illustrated below.
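
As a quick sketch of rule 3 (the variable name is arbitrary), the second PRINT below fails with a “Must declare the scalar variable” error because @msg does not survive the GO:

DECLARE @msg varchar(20);
SET @msg = 'Hello';
PRINT @msg;   --succeeds: same batch
GO

PRINT @msg;   --fails: @msg went out of scope at the GO
GO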

The GO command can also accept an optional integer argument, in which case the batch preceding that GO will execute as many times as specified by the argument. The following example generates two random unique identifiers:

--SYNTAX: GO [count]

select newid()
GO 2

OUTPUT: two single-row result sets, each containing a newly generated GUID.

References:
1. MSDN Online

Written by cavemansblog

March 23, 2012 at 12:47 am

SQL Server: Database deployment best practices



Here are some of the SQL Server database deployment tips that have been very useful to me.

1. Store all your database object scripts as individual files under source control. This provides an excellent way to keep the database code organized and will also be very useful when auditing code changes. Another advantage is that these scripts can be used to create an instance of an application database with a minimal amount of data.

2. Maintain a senior resource as a single point of contact to handle all SQL changes in non-DEV environments. This resource could also be responsible for code reviews and for helping with code optimization. This is an excellent way to enforce organizational best practices, and it will also make it simple to keep track of changes in a given database environment.

3. Compare the database being deployed against the last database that was updated in the software factory line. Comparing the schema and/or data will give you a contemporary view of the database state, helping you prepare your deployment scripts diligently. You can use tools like Redgate SQL Compare, dbcomparer, etc., for this purpose.

4. Make sure that the SQL scripts always pass successfully through the development, QA, and stage environments before reaching the production/live environment. On a side note, production data should be replicated (after scrubbing, if necessary) to all the non-production environments as frequently as possible to achieve the best level of code compatibility.

5. Always take a backup of the database and turn off replication agents and all SQL jobs before deploying to the production/staging environment. This might add some overhead, but making this a habit will help with an error-free, faster, and easier-to-restore deployment.
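
For example, the pre-deployment backup can be a one-liner (the database name and backup path below are placeholders):

BACKUP DATABASE [Demo]
TO DISK = N'D:\Backups\Demo_PreDeployment.bak'
WITH INIT, STATS = 10;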

6. This may seem trivial, but it is good practice to end every SQL script with “GO”. SQL Server utilities interpret GO as a signal that they should send the current batch of Transact-SQL statements to an instance of SQL Server. The current batch of statements is composed of all statements entered since the last GO, or since the start of the ad-hoc session or script if this is the first GO [1].

7. Maintain a naming convention for the deployment scripts. This will help in organizing the scripts for execution in a predetermined sequence and will give you a good idea of what each script is intended to do. Naming the scripts with numbers in the following format establishes the order of execution and also gives you the opportunity to add new script(s) in between any two existing scripts.

Convention: Script_[Number]_[Number]_[ObjectName]_[Operation].sql

  • Script_100_001_table_employee_add.sql
  • Script_200_001_spGetAllEmployees_update.sql
  • Script_200_002_viewEmployeeDetails_update.sql

8. It is good practice to sandwich your SQL script between a transaction and a rollback/commit block, as sketched below. One exception to this rule (if you will) would be a script that performs a huge number of data manipulation operations; this can sometimes cause the transaction log to grow very large and the server to run out of space.
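
A minimal sketch of the pattern using TRY/CATCH; the ALTER TABLE is a hypothetical deployment change:

BEGIN TRY
    BEGIN TRANSACTION;

    --hypothetical deployment change
    ALTER TABLE dbo.Employee ADD HireDate date NULL;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    PRINT ERROR_MESSAGE();
END CATCH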

9. Create data validation scripts for the deployment changes, as in the sketch below. This way you do not have to wait for application testing to surface any errors.
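
Continuing the hypothetical change above, a validation script can fail loudly if the deployment did not take:

IF COL_LENGTH('dbo.Employee', 'HireDate') IS NULL
    RAISERROR('Validation failed: dbo.Employee.HireDate was not deployed.', 16, 1);
ELSE
    PRINT 'Validation passed: dbo.Employee.HireDate exists.';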

10. I have used script consolidation for some of the deployments I was involved with. File consolidation helps in saving time when you have to execute a ton of script files. Here is the code for the SQL Merge tool.

11. Prepare a deployment document for establishing and organizing the deployment process with the following information; this will be useful for maintaining compliance with the Sarbanes-Oxley Act [2].

  • Deployment Schedule
  • Activity ownership
  • Pre-deployment activities
  • Deployment activities
  • Post-deployment activities
  • Rollback plan
  • Contact list

References:
1. MSDN Online
2. Sarbanes-Oxley Act

Start/Stop SQL Server Replication Agent using TSQL



On a recent client engagement we ran into an issue with a nightly SQL Server job: the job was unable to generate data for a certain warehouse database. Through a process of elimination I was able to figure out that some SQL distribution agents were interfering with the nightly SQL job, which caused the issue. I stopped the distribution agents, ran the job manually to successfully populate the data, and then restarted the agents. The stopping and starting was done using SQL Server Management Studio.

Now that I knew how to populate the data, I had to automate the process of stopping and starting the distribution agents before and after running the nightly SQL job. Here are the stored procedures that can be used to achieve the same:

--STOP the Distribution Agent:
sp_MSstopdistribution_agent @publisher, @publisher_db, @publication, @subscriber, @subscriber_db

--START the Distribution Agent:
sp_MSstartdistribution_agent @publisher, @publisher_db, @publication, @subscriber, @subscriber_db

  • @publisher is the name of the Server that is publishing
  • @publisher_db is the database of the publisher server
  • @publication is the name of the publication (here, named after the table/object being published)
  • @subscriber is the name of the subscriber server
  • @subscriber_db is the name of the subscriber database

The above commands have to be executed on the publisher server against the ‘distribution’ database. Here is an example:


exec distribution.dbo.sp_MSstartdistribution_agent @publisher  = 'PUBLISHERSERVER',
@publisher_db   = 'PUBLISHERDB',
@publication    = 'TABLE_EMPLOYEE',
@subscriber     = 'SUBSCRIBERSERVER',
@subscriber_db  = 'WAREHOUSEDB'
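
The matching stop call, run before the nightly job kicks off, takes the identical parameter list:

exec distribution.dbo.sp_MSstopdistribution_agent @publisher  = 'PUBLISHERSERVER',
@publisher_db   = 'PUBLISHERDB',
@publication    = 'TABLE_EMPLOYEE',
@subscriber     = 'SUBSCRIBERSERVER',
@subscriber_db  = 'WAREHOUSEDB'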

Written by cavemansblog

March 12, 2012 at 11:21 pm