How to import bulk data in Azure SQL

Background

Its a common scenario for sql server based application to facilitate import/export data from various sources. With sql server import/export wizard it was so easy, that it never been a concern, but when you try to do the same for azure sql environment, that’s a different story.

Problem

When importing bulk data, azure sql has some limitation. It do not facilitate bulk data import yet. So when I was in need of importing 2 million+ records to my Sql DB in azure, I was in trouble.

Few common options were

  1. generating data script from local sql server and run the same in azure
  2. from local sqlserver run a bulk insert query directly to azure sql

both options are miserably failed, as sqlserver started to show out of memory exception due to volume of data. One workaround is to spill data in small chunks and insert into diff temporary table, then generate data script from those table. But I do not like these kind of workaround anyway, because of the nature of effort and re-usability.

Solution

Solution was rather easy, have to use a command-line tool called The bcp utility. The bcp  client is installed when you install Microsoft SQL Server tools.

My requirement was to export 2millon+ data from local sql server, where database name is Enrollment and table name Enrollments

Local Sql Server Azure Sql
Database Enrollment azuredb
Table Name Enrollments Enrollments
Db UserName N/A username
Db pass N/A password

From command prompt and enter the following line of code:

BCP <databasename>.dbo.<tablename> OUT <localfolder>\<filename>.txt -T -c

According to my example:

BCP Enrollment.dbo.Enrollments OUT E:\BulkInsert.txt -T -c
You will see a nice text file generated with all data from the targeted table, now its time for import to azure sql

 BCP <databasename>.dbo.<tablename> IN <localfolder>\<filename>.txt -S <servername>;
 -U <username>@<servername> -P <password> -c

According to my example:

BCP azuredb.dbo.Enrollments IN E:\BulkInsert.txt -S myazure.database.windows.net; -U username@myazure.database.windows.net -P password -c

Entire process take few minutes, with simple bcp utility the import/export became a life saver for large bulk data manipulation. And with a batch file this can be turned to a reusable one too.

TSQL function to replace HTML tags with delimiter

Introduction

Its often needed to remove all html tags for a column(s) from a database table. One way to do is regular expression pattern matching or using a custom function that will hunt down html tags and remove them. In this small post I will demonstrate this second method.

T-Sql Function

Here is a very simple function that replace/remove html tag’s and returns clean texts without using any patterns. Additionally you can pass a delimiter that will replace the html tag’s in-case you do not want to mix everything in one big chunk of text.

CREATE FUNCTION [dbo].[CleanHTMLTags] (@HTMLText VARCHAR(MAX),@ReplaceChar char(1))
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @Start INT
DECLARE @End INT
DECLARE @Length INT

SET @Start = CHARINDEX(‘<‘,@HTMLText) SET @End = CHARINDEX(‘>’,@HTMLText,CHARINDEX(‘<‘,@HTMLText)) SET @Length = (@End – @Start) + 1 WHILE @Start > 0 AND @End > 0 AND @Length > 0
BEGIN
IF (UPPER(SUBSTRING(@HTMLText, @Start, 4)) <> ‘
‘) AND (UPPER(SUBSTRING(@HTMLText, @Start, 5)) <> ”)
begin
SET @HTMLText = RTRIM(LTRIM(STUFF(@HTMLText,@Start,@Length,@ReplaceChar)));
end
ELSE
SET @Length = 0;

SET @Start = CHARINDEX(‘<‘,@HTMLText, @End-@Length) SET @End = CHARINDEX(‘>’,@HTMLText,CHARINDEX(‘<‘,@HTMLText, @Start))

SET @Length = (@End – @Start) + 1
END

RETURN isnull(RTRIM(LTRIM(@HTMLText)) ,”)
END

And thats it. Say you have a table named “PRODUCTS” and that contains a column “ProductDetails” that contains html tags, in our scenario we have to remove all html tags and replace them by comma(,). Utilizing above function we can make the call like this

SELECT [dbo].[CleanHTMLTags](ProductDetails,’,’) FROM PRODUCTS

Resultant data should be a big text respect to each records in “PRODUCTS” table with html tags removed.

Solution To The MSSQL Server “Suspect”

Background

When you see the your database in Suspect mode that a code red situation. Its not something that you face everyday. It can happen, however, most often due to faulty hardware or operational mistakes (like deleting a transaction log file). This post will show how to recover from suspect mode, but still go through SQL-server error logs and find out the root cause of the error.

Reason

At start-up, SQL Server attempts to obtain an exclusive lock on the device file. If the device is being used by another process  or if the file is missing, sql server start displaying error.

In these cases, there is usually nothing wrong with the devices and database. For the database to recover correctly, the device must be made available, and the database status must be reset.

Possible reason for changed to suspect mode in sql server can be

  • The system cannot find the file specified.) during the creation/opening of physical device
  • Failed to open device where data or log file resides
  • SQL server went down/restarted in the middle of a transaction causes transactions log to be corrupted
  • SQL server can not access data or log file while coming online , because of you beloved antivirus

Solution

To resolve this issue run the commands listed below,

EXEC sp_resetstatus ‘DATABASE_NAME’;
ALTER DATABASE DATABASE_NAME SET EMERGENCY
DBCC checkdb(‘DATABASE_NAME’)
ALTER DATABASE DATABASE_NAME SET SINGLE_USER WITH ROLLBACK IMMEDIATE
DBCC CheckDB (‘DATABASE_NAME’, REPAIR_ALLOW_DATA_LOSS)
ALTER DATABASE DATABASE_NAME SET MULTI_USER

What the above command do is perform a series of step to check the logical & physical consistency of the database and tries to repair. Very first step is to turns off the suspect flag on a database, you can achieve his by using sytem stored procedure sp_resetstatus. Using this procedure change the suspect flag to emergency. Once the database is set to EMERGENCY mode it becomes a READ_ONLY copy and only members of sysadmin fixed server roles have privileges to access it. Then perform a consistency check by executing DBCC command on the master database.  Next step is to rollback any transactions if any are present in the database and bring the database  into Single User mode. Run the repair and finally bring database to Multi User mode.

Remember sp_resetstatus can be executed only by the system administrator. Always shut down SQL Server immediately after executing this procedure.

If the database is still marked as suspect after performing these steps, there may be other problems preventing the database from recovering. At this point, only option left is either restore from a good backup or set the database to emergency mode and use the bulk copy program copy the data out.

TSQL Recursive Split Function

Background

In my last post I showed how to split a string in TSQL. Problem with that splitting is it can split a string if resultant split count is <=2, i.e. it can split only string like “a,b” or “first,last” or “java,c”. But while doing the parsing of  string of whom resultant split count is >2, it wont work. We need more effort to do so.

Solution

I am gonna show how to split a string of any number of delimiter and we will recursive function to achieve that. Idea is very simple,  Create a function that will take the input string and get the first index of delimiter and return the string up to first index and insert that first half into a table. Next is find the rest of the string other then first half and call the same function with it. This loop continues until we find there is no delimiter left in string to do further recursive call, so we return the original caller function and all the way to parent function who initiate the recursive call.

Implementation of the above idea is the function illustrated below,

CREATE FUNCTION [GetName] (@StrVal nvarchar(MAX),@delimeter nvarchar(1))
RETURNS @OldNames TABLE (Name varchar(MAX))
AS
BEGIN
DECLARE @SplitVal nvarchar(MAX);
DECLARE @RestOfTheStrVal nvarchar(MAX);
SET @SplitVal=LTRIM(SUBSTRING(@StrVal, 0,CHARINDEX(@delimeter,@StrVal)));
SET @RestOfTheStrVal=LTRIM(SUBSTRING(@StrVal, CHARINDEX(@delimeter,@StrVal)+1,len(@StrVal)));

IF CHARINDEX(@delimeter,@RestOfTheStrVal)<=0
BEGIN
IF @SplitVal IS NOT NULL
INSERT INTO @OldNames(Name) SELECT @SplitVal ;
IF @SplitVal IS NOT NULL
INSERT INTO @OldNames(Name) select @RestOfTheStrVal;
return;
END
INSERT INTO @OldNames(Name)
SELECT @SplitVal
UNION
SELECT Name FROM dbo.GetName(@RestOfTheStrVal,@delimeter)
RETURN
END

And this time  let me remind you this solution is applicable  if you have a string with a single delimiter (means you can use only one delimiter). Though its returns resultant strings as a tabular format, its quite handy for situation like I explained above.

TSQL split a string by delimiter

Background

Say we have a registration table which store successful registration info. At that table we have a column “Name” which stores value as combination of first name & last name delimited by ‘ ‘ [space]. Later at some point we need to extract the first name or last name from that column, what to do?

Solution

Solution is a simple technique that will take given input string (or column in my scenario) and split against the given delimiter. Technique is get the first index of delimiter and return the string up to first index and return the first half. For second half start from first index of delimiter and take till end of input string and you will get second half. Simple isnt it !!

objects-substring

— Split and get the first half, delimiter is ‘ ‘ [space]

LTRIM(SUBSTRING([COLUMN_NAME], CHARINDEX( ‘ ‘,[COLUMN_NAME])+1,len([COLUMN_NAME])))

— Split and get the second half, delimiter is ‘ ‘ [space]

LTRIM(SUBSTRING([COLUMN_NAME], 0,CHARINDEX(‘ ‘,[COLUMN_NAME])+1))

And let me remind you this solution is only applicable if you have a string with a single delimiter and resultant split count is <=2. Though is small and wont work for more then one delimiter but its quite handy for situation like I explained above.

How to solve SSIS error code 0xC020801C/0xC004700C/0xC0047017

Background

SSIS is the one of the best ETL tool available in market and it can load large amount of data from any heterogeneous data source whether structured, unstructured,application, cloud or real-time data. This post discuss the solution of a common problem that usually arise while importing data from excel data source.

Problem

For beginners while doing the extract, face some weird issues, which seems very fuzzy to solve at first. Such a problem is while doing extract from excel data source, after running the application shows error,

at Package, Connection manager “Excel Connection Manager”: SSIS Error Code DTS_E_OLEDB_EXCEL_NOT_SUPPORTED: The Excel Connection Manager is not supported in the 64-bit version of SSIS, as no OLE DB provider is available.Error: 0xC020801C at Data Flow Task — Excel File, Excel Source [1]: SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER.  The AcquireConnection method call to the connection manager “Excel Connection Manager” failed with error code 0xC00F9304.  There may be error messages posted before this with more information on why the AcquireConnection method call failed.Error: 0xC0047017 at Data Flow Task — Excel File, SSIS.Pipeline: component “Excel Source” (1) failed validation and returned error code 0xC020801C.Error: 0xC004700C at Data Flow Task — Excel File, SSIS.Pipeline: One or more component failed validation.

And the extraction failed.

Reason for this run-time error is excel do not have 64-bit driver which is selected by default in SSIS running on 64bit OS. So solution is also hidden behind the reason, change the driver from default and you are good to go. Steps to solve is issue is,

Step 1: Navigate to Project-> [PROJECT_NAME] Properties.

Capture

Step 2: Navigate to “Debugging” option from left panel and from Right panel change Run64BitRuntime value to false.

Capture-2
that’s it !! Keep extracting !!

Fixing database is in use error while restore database from backup

Background

When performing a full restore over an existing database, SQL Backup sometime report that the database cannot be restored because it is currently in use, like shown below

This error shows not only during restore backup, also for drop database or alter database as well.

Workaround

In this situation database administrator needs to disconnect all the connected users from a SQL Server Database to get exclusive access of the Database.In the situation where a full restore is blocked because users are currently connected to the database, the best solution to this problem is to take the database offline prior to running the restore, as this is the easiest way to kill all connections to the database.

Before restore attempt
Database administrator can execute the below mentioned TSQL command to get a Single User access to a database. SINGLE_USER WITH ROLLBACK IMMEDIATE command is used only one user can connect to the database at a time.

use master
alter database DB_NAME set offline with rollback immediate;

After restore
Database administrator can execute the below mentioned TSQL command to give Multiple User access to a database. MULTI_USER command is used any number of users who have rights to connect to the database will be able to connect to the database.

use master
alter database DB_NAME set online with rollback immediate;

So now all authenticated database user can connect to  the database again.