SSDT With Visual Studio 2012 and Using Custom Assemblies

SQL Server Data Tools


SSIS 2012 / SQL Server Data Tools (SSDT) offers considerable benefits over SSIS 2008 / Business Intelligence Development Studio (BIDS). The UX is much quicker, smoother, feature rich (undo and redo, woohoo!), and one of the best new features is that we can finally attach the debugger to script components - which has been a bit of an issue in the past. It's not exactly perfect (opening the VSTA environment seems to take a lot longer than it used to, breakpoints are unpredictable, and there are a few little issues surrounding package encryption that I keep running in to), but the improvements are more than welcome, and certainly enhance development.

If you're familiar with SSIS, but you've avoided script tasks / components before, I have to say you're missing out; they're very powerful tools in the right hands, and basically allow you to do anything you can do in a .Net app within the context of a control flow or data flow in a package. Do remember though, that most of the time SSIS already offers a component that already does what you're looking for. It's very easy to go crazy with SSIS scripting, so pick and chose your components well.

Often you'll find the built in components lacking in some way, and you'll either have to implement your own component, buy one in, or write your own script task to do it. I usually prefer the script task method, mainly due to the fact it gives me more control over what I'm trying to do. Just to point out a few examples, SFTP isn't supported by the FTP component, and if you want to connect to a web service using anything more advanced than anonymous HTTP/S, then you'll have to go custom. I've ran in to both of these problems in the last few years, and solved them using both script tasks and script components.

SQL Server 2012 comes bundled with SSDT (using the VS2010 shell), but there are also options to install SSDT in both VS2010 and VS2012 separately. This page kept popping up when I was searching for the right installer, but none of the links actually installed anything into VS2012, even though they claimed to do so. It was last year, and there have been a few updates to the pages since, so I'm hoping the packages are now valid. In response to criticism received, Microsoft have since released a separate Visual Studio 2012 SSDT installer, though there are still some gotchas when running it.

If you already have an instance of SQL Server 2012 (I use SQL Server 2012 Business Intelligence on my development machines), the installer in the above link gives you the option to upgrade or install a new instance of SQL Server. Obviously I didn't want to install another SQL Server instance, so I tried to upgrade. Once I'd confirmed the components, I was told that my CPU Architecture was different in my current installation, so I resigned to the "New Instance" option and everything installed properly.


This installs a new Start Menu short cut inside your SQL Server 2012 folder:



Which opens either a VS2012 Shell (as per BIDS 2008 and the SSDT VS2010 shell), or (thankfully) your already installed version of VS2012, giving you new project templates to play with:


Custom .Net Assemblies


There are already a number of blog posts out there explaining how to get started with SSDT, so instead I thought I'd post a little on getting custom .Net assemblies working with SSDT 2012. The concept is similar to the BIDS 2008 method, but there a few specifics that might catch you out. Unfortunately there's still no assembly installer / package manager, so all assemblies that you want to reference in your script tasks and components still have to be installed in to the GAC.

On the plus side, you can now use .Net 4.0 (VS2010 shell) and .Net 4.5 (VS2012) assemblies, which is great news to those who want to use any of the latest .Net features. My examples are all going to use VS2012 and .Net 4.5, and I'll be connecting to a local copy of the Microsoft provided AdventureWorks SQL Server 2008 R2 sample database (all SQL Server versions can be downloaded from here). You can use any compatible version of SQL Server to do this, the data source is not important for the purposes of the demo.

Brief


Let's take a list of all Products in the database, and do something arbitrary with the data (for the purposes of the example, I'll be simply sending the SSDT data to a .Net assembly doing a very simple calculation, and returning it to the script component. Usually the assembly would do something more complicated with the row data, like send it to an MSMQ, a WCF service, validate it, or clean it up). We'll create a very simple SSDT package to do this.

The below example assumes you are familiar with SQL Server, Visual Studio, basic windows batch files / commands, and how to use basic SSDT components.

Project Setup


1. Start Visual Studio 2012 and create a new Integration Services Project (with a new solution).


2. Add a new "Class Library" project to the solution.


3. Check if the Class Library is set to target .Net 4.5 (Right Click Project in Solution Explorer -> Properties -> Application Tab -> Target Framework drop down).


4. Create a new Class with the following code and compile the ClassLibrary1 project:

namespace ClassLibrary1 {
public class ProductHelper {
public int GetStockCount(int id) {
// TODO: Perform inventory stock count via 3rd party service
return 4321;
        }
    }
}

5. Open Package.dtsx and create a Data Flow Task:


6. Double Click the Data Flow Task and create an "ADO NET Source", a "Script Component" (select "Transformation"), and an "ADO NET Destination", linking them together as follows:

7. Configure the ADO NET Source to pull all products out of the database:


8. Double click the Script Component and perform the following steps:
  1. Select "ProductID", "Name" and "ProductNumber" as the input columns.
  2. Click "Edit Script" on the "Script" tab to open up the VS2012 VSTA Editor
  3. Right Click the VSTA project, and change the "Target Framework" to ".Net Framework 4.5"
  4. Add a project reference to "ClassLibrary1" by browsing to the "bin\Debug\" directory of the solution.
  5. Go to the Code Editor and find "public override void Input0_ProcessInputRow(Input0Buffer Row)"
  6. Add the following code into the method:
            var x = new ClassLibrary1.ProductHelper();
      var count = x.GetStockCount(Row.ProductID);
    
    
  7. Close the VSTA Editor and click "OK" on the Script Transform Editor dialogue.
Keen observers will notice that I haven't actually done anything with the "count" reference. This tutorial is to demonstrate that you can get Script Components and Script Tasks to work with your own custom assembly, not to demonstrate what to do with the results. Normally, you would update a column or append this value to a row or variable, but for now I just want to show you that it'll compile.

9. Double click the "ADO NET Destination" component, create a new table in the AdventureWorks database, then click "OK":


10. If we weren't trying to get an external assembly working with the Script Component, the SSIS package would now normally be in a compilable state. However, if you debug it (F5), you will receive the following error:


This is due to the fact that the assembly is not available to the DTSDebugHost.exe process. Dependent assemblies are not copied or loaded as they would be with a normal .Net project, so you need to tell the project to use a strongly signed GAC assembly instead.

Global Assembly Cache


1. The first thing we need to do is sign the dependent assemblies. You can do this by right clicking your Class Library project -> Properties -> Signing Tab -> Check "Sign the assembly". You can then create a new strong key file:



In the "Create Strong Name Key" dialogue, I usually name the key file the same as the libraries root namespace (in this instance, "ClassLibrary1"), then uncheck the "Protect my key file with a password" box, and leave the Signature Algorithm as the default. Of course, you can still protect the file with a password if you want, but you will need to adjust a number of the below steps.


Once created, a filed called "ClassLibrary1.snk" will show in the root of your ClassLibrary1 project.

2. Now we need to create a directory that the dependent assemblies can live in once they're installed to the GAC. I usually use a project or deployment path, but for the purposes of this demo, I'll use C:\MyAssemblies\.

3. The next step is to keep your local GAC up to date with the latest assemblies, and inform Visual Studio 2012 where these dependent assemblies are so they can be displayed in the "Add References" dialogue. The following commands expect you to be logged in as an administrator, and should be copied into the ClassLibrary1 project's Post-Build commands:

reg add HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\.NETFramework\v4.0.30319\AssemblyFoldersEx\MyAssemblies\ /f /t REG_SZ /d C:\MyAssemblies\
xcopy "$(TargetPath)" /Y/R/F "C:\MyAssemblies\"
"C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0A\bin\NETFX 4.0 Tools\gacutil.exe" /uf "$(TargetName)"
"C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0A\bin\NETFX 4.0 Tools\gacutil.exe" /i "C:\MyAssemblies\$(TargetFileName)" /f



Here's a summary of each line:

reg add HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\.NETFramework\v4.0.30319\AssemblyFoldersEx\MyAssemblies\ /f /t REG_SZ /d C:\MyAssemblies\

This creates / overwrites a registry key that tells Visual Studio where to find your custom GAC assemblies. It's not strictly necessary in a post-build command (you only need to run it once), but it does no harm, and allows other developers who might start working on the package to get up and running straight away. You can read more about it here.

xcopy "$(TargetPath)" /Y/R/F "C:\MyAssemblies\"

This command copies and overwrites any files in C:\MyAssemblies\ with the deployed project files.

"C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0A\bin\NETFX 4.0 Tools\gacutil.exe" /uf "$(TargetName)"

This command uses the latest version of gacutil to UNINSTALL the target assembly from the GAC. You'll see a warning in the output window the first time it executes on your machine, but it's needed to reinstall the assembly every time you compile your code.

"C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0A\bin\NETFX 4.0 Tools\gacutil.exe" /i "C:\MyAssemblies\$(TargetFileName)" /f

This command uses gacutil to install the assembly into the GAC.

4. You can now build your ClassLibrary1 project again. If it succeeds, you will see something similar to this in the Output window:

1>------ Build started: Project: ClassLibrary1, Configuration: Debug Any CPU ------
1>  ClassLibrary1 -> D:\code_temp\Integration Services Project1\ClassLibrary1\bin\Debug\ClassLibrary1.dll
1>  The operation completed successfully.
1>  
1>  D:\code_temp\Integration Services Project1\ClassLibrary1\bin\Debug\ClassLibrary1.dll -> C:\MyAssemblies\ClassLibrary1.dll
1>  1 File(s) copied
1>  Microsoft (R) .NET Global Assembly Cache Utility.  Version 4.0.30319.17929
1>  Copyright (c) Microsoft Corporation.  All rights reserved.
1>  
1>  No assemblies found matching: ClassLibrary1
1>  Number of assemblies uninstalled = 0 
1>  Number of failures = 0
1>  Microsoft (R) .NET Global Assembly Cache Utility.  Version 4.0.30319.17929
1>  Copyright (c) Microsoft Corporation.  All rights reserved.
1>  
1>  Assembly successfully added to the cache
========== Build: 1 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========

The assembly is now installed, and your project is set up to keep the GAC reference up to date as you develop it.

5. The next step is to update the reference in the Script Component in Package.dtsx to look to the new GAC assembly. In the Data Flow Task, double click the Script Component and click "Edit Script". In the VSTA editor, remove the ClassLibrary1 reference from the project, and add a new assembly reference.

In the Reference Manager dialogue, click "Assemblies" -> "Extensions". You should see two entries for ClassLibrary1. My screenshot only shows one, because I've restarted VS2012 since removing the original reference. Just hover over each of them and add the one that says "C:\MyAssemblies\ClassLlibrary1.dll" in the tooltip.

6. In the VSTE editor, stick a breakpoint on the following line, then save and exit the editor:


7. You can now debug (F5) the project, and after a little while, the breakpoint will be hit (give it time, it takes at least 15 seconds to open on my development machine). If you step through the code, you can see that the assembly is now being hit, and the values from the assembly are being returned:


Each row (all 504 of them) will hit the breakpoint, so delete it and press F5 to close the VSTE editor and finish the package:


Installing on a Server


Although I'm sure you have your own methods when deploying an SSIS package, figuring out how to install a GAC assembly isn't the easiest of tasks, and it's not usually something your casual DBA or even Database Developer would be familiar with. However, knowing how to install to the GAC on your development machine gives you an advantage, and the good news is that you don't need to install an SDK to get it working.

My install package is a simple zip file, which contains the following items:
  • The assemblies you want to install into the GAC.
  • The SSIS packages to deploy.
  • An installation batch file.
  • gacutil.exe and its dependencies.
gacutil.exe, like many other Visual Studio tools, changes with each version of .Net. To install a .Net 4.5 assembly, copy the following files into your install package folder 

C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0A\bin\NETFX 4.0 Tools\1033\gacutlrc.dll
C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0A\bin\NETFX 4.0 Tools\gacutil.exe
C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0A\bin\NETFX 4.0 Tools\gacutil.exe.config

The installation batch file needs to contain a number of commands, similar to the post build command we set earlier, only using the server's physical path instead of your local development path. The first line should be changed to the directory you extract the zip file to:

set installPath=C:\MyProject
set assemblyName=ClassLibrary1
"%installPath%\gacutil.exe" /uf "%assemblyName%"
"%installPath%\gacutil.exe" /i "%installPath%\%assemblyName%.dll" /f

After you've successfully installed the assembly to the GAC, all you need to do is deploy your SSIS package and schedule it to run - if everything is set up correctly, your GAC enabled SSIS package is ready to go!

Benefits

I'm not a fan of using the GAC, if I'm entirely honest, but with any kind of complex SSIS package that requires you to write or reuse .Net code, it's a necessary evil. The main benefits of importing a custom assembly into a script task / component is separation of concerns - you can unit test, build and manage the assembly separately from SSDT, and of course, keep any complex logic out of the fiddly script tasks and script components.
I've used this method to carry out many different tasks, including complex validation, advanced web service integration, custom logging, complex transformations, and data retrieval and generation. It's easy to see how powerful SSIS can be, especially when using the GAC opens up your entire code base for reuse within an ETL process. The opportunities are endless.

Popular posts from this blog

TDD and Unit Testing with Moq

Handling uploads with MVC4, JQuery, Plupload and CKEditor

Generating a self-signed SSL certificate for my QNAP NAS