Installing and configuring an SAS OLEDB-driver for MSSQL

To find the correct or the newest SAS OLEDB driver, it is best to search for it, e.g. on Google.
NB! You need an account at SAS to be able to download it.

And even though this guide for Installing and configuring an SAS OLEDB-driver for MSSQL is old, it is still very useful.

Be aware that it is also possible to write a program, e.g. in .NET, that reads a SAS-dataset. It can be done with SasReader (currently in version 1.0.6).

The C# code below reads a SAS-dataset and writes it to a CSV-file.
Credit goes to my colleague who figured this out.

using System;
using System.IO;
using SasReader;
using System.Text;

namespace SasToCsvConverter
{
    class Program
    {
        static void Main(string[] args)
        {
            // Define paths (replace the placeholder with the name of your SAS-dataset)
            string sasFilePath = @"C:\temp\<YOUR SAS-DATASET>.sas7bdat";
            string csvFilePath = @"C:\temp\output.csv";
            try
            {
                // Initialize SAS file reader
                using (FileStream sasToParseFileInputStream = File.OpenRead(sasFilePath))
                {
                    SasFileReader sasFileReader = new SasFileReaderImpl(sasToParseFileInputStream);

                    // Open the CSV file for writing
                    using (var writer = new StreamWriter(csvFilePath, false, Encoding.UTF8))
                    {
                        // Read and write META DATA
                        var sasMetaColumns = sasFileReader.getColumns();

                        // Write header
                        var headerNames = new StringBuilder();
                        foreach (var column in sasMetaColumns)
                        {
                            // Escape column names too, in case one contains a comma or a quote
                            headerNames.Append(EscapeCsvValue(column.getName())).Append(",");
                        }

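                        // Remove only the single trailing comma
                        // (TrimEnd(',') would also drop empty trailing fields)
                        if (headerNames.Length > 0) headerNames.Length--;
                        writer.WriteLine(headerNames.ToString());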

                        // Write DATA
                        long rowCount = sasFileReader.getSasFileProperties().getRowCount();
                        for (long i = 0; i < rowCount; i++)
                        {
                            var row = sasFileReader.readNext(); // object[]
                            var rowValues = new StringBuilder();
                            foreach (var value in row)
                            {
                                var stringValue = value?.ToString() ?? string.Empty;
                                rowValues.Append(EscapeCsvValue(stringValue)).Append(",");
                            }

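                            // Remove only the single trailing comma
                            // (TrimEnd(',') would also drop empty trailing fields)
                            if (rowValues.Length > 0) rowValues.Length--;
                            writer.WriteLine(rowValues.ToString());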

                            // Optional: Log progress
                            Console.WriteLine($"Processed row {i + 1}/{rowCount}");
                        }
                    }
                }

                Console.WriteLine("Conversion to CSV completed successfully.");
            }

            catch (Exception ex)
            {
                Console.WriteLine($"An error occurred: {ex.Message}");
            }
        }

        private static string EscapeCsvValue(string value)
        {
            // Quote values containing a comma, a double quote or a line break
            if (value.Contains(",") || value.Contains("\"") || value.Contains("\n") || value.Contains("\r"))
            {
                return $"\"{value.Replace("\"", "\"\"")}\"";
            }
            return value;
        }
    }
}

SSIS and IBM DB2 – Don’t do it!

SSIS is a product that should be indifferent to whatever RDBMS you use, and in principle it is. It will work with every RDBMS through drivers such as ODBC, OLE DB and ADO.NET. BUT it works best with Microsoft SQL Server.

This blog post takes a look at using IBM DB2 with SSIS. Below is a test of three drivers: the ODBC-driver for IBM DB2 from IBM, the OLE DB-driver for IBM DB2 from Microsoft and an ADO.NET-driver.

Transfer rate of data

Time taken to transfer 100,000 rows from one IBM DB2-server to another IBM DB2-server:

OLE DB-driver for IBM DB2 from Microsoft: 6 minutes
ADO.NET-driver: 30 seconds
ODBC-driver for IBM DB2 from IBM: 5 seconds
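
To put these timings on a common scale, they can be converted into rows per second. Below is a minimal sketch in C#. The class name TransferRate and the helper RowsPerSecond are illustrative names only, and the only inputs are the figures from the test above.

```csharp
using System;

public static class TransferRate
{
    // Rows per second for a transfer of `rows` rows that took `seconds` seconds.
    public static double RowsPerSecond(long rows, double seconds) => rows / seconds;

    public static void Main()
    {
        const long rows = 100_000;

        // Timings from the test above, converted to seconds.
        var timings = new (string Driver, double Seconds)[]
        {
            ("OLE DB (Microsoft)", 6 * 60),
            ("ADO.NET", 30),
            ("ODBC (IBM)", 5),
        };

        foreach (var (driver, seconds) in timings)
        {
            Console.WriteLine($"{driver}: {RowsPerSecond(rows, seconds):F0} rows/s");
        }
    }
}
```

That is roughly 278 rows per second for OLE DB, 3,333 for ADO.NET and 20,000 for ODBC. In this test the ODBC-driver was about 72 times faster than the OLE DB-driver.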

As seen above the ODBC-driver for IBM DB2 from IBM is the best solution when it comes to transfer rate. The OLE DB-driver for IBM DB2 from Microsoft is a really poor choice. But using the ODBC-driver for DB2 from IBM is not the solution to everything!

Lookup-task in SSIS

As seen in the picture below, it’s only possible to use an OLE DB-connection in the Lookup-task. The Lookup-task is really slow, and it might be better to do the lookup in SQL. Many of the tasks in SSIS are really slow (see the links below).

OLE DB Command

SSIS also contains an ‘OLE DB Command’-task but no ‘ODBC Command’-task.

Fast load/Bulk load

OLE DB
It’s NOT possible to use the fast load option with the OLE DB-driver for IBM DB2. You can select the fast load option, but you will get the error below.

The OLE DB-driver is probably that slow because it transfers one row at a time.
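
Why transferring one row at a time hurts can be illustrated by counting round trips to the server. Below is a rough sketch in C#, assuming each round trip carries a fixed maximum number of rows. The helper RoundTrips and the batch size of 1,000 are hypothetical and say nothing about how the actual drivers batch rows internally.

```csharp
using System;

public static class BatchingSketch
{
    // Number of round trips needed to send `rows` rows when each
    // round trip carries at most `batchSize` rows.
    public static long RoundTrips(long rows, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        return (rows + batchSize - 1) / batchSize; // ceiling division
    }

    public static void Main()
    {
        const long rows = 100_000;

        // One row per round trip, as suspected for the OLE DB-driver.
        Console.WriteLine(RoundTrips(rows, 1));

        // Hypothetical batch size of 1,000 rows per round trip.
        Console.WriteLine(RoundTrips(rows, 1_000));
    }
}
```

With one row per round trip the 100,000 rows need 100,000 round trips. With the assumed batch size of 1,000 they need only 100, so any fixed cost per round trip is paid far less often.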

ODBC
It’s possible to use a batch/fast load option in ODBC.

ADO.NET
It’s also possible to use a bulk/fast load option in ADO.NET.

Links

The links below point to different sources that can help you with SSIS.

SQL Server Integration Services Design Patterns
A book that describes different design patterns for SSIS.

Task Factory for SSIS from Pragmatic Works
A collection of optimized SSIS-tasks.

Cozyroc
A collection of optimized SSIS-tasks.

PragmaticWorks free online training videos
Free online training videos about SSIS (search for SSIS).

Pluralsight
Online training videos in SSIS (requires a paid subscription to Pluralsight).

SSIS and versioning

This post describes the big challenge SSIS poses for versioning when a team works on the same SSIS-solution.

These experiences are based on using Subversion (SVN) and Microsoft Team Foundation Server (TFS) for versioning solutions in SSIS. I have no experience with Git, but I assume that the same problem occurs there.

An example could be that John and Allan are working together on an SSIS-solution. Let’s call it ‘Staging DW’. They each start out with their own identical local working copy of the SSIS-solution.

John creates a new SSIS-package in the SSIS-solution ‘Staging DW’. Let us call the SSIS-package ‘Stage Company’.
Allan creates a new SSIS-package in the SSIS-solution ‘Staging DW’. Let us call the SSIS-package ‘Stage Sales’.

John updates his working copy of the SSIS-solution ‘Staging DW’ and commits his changes to ‘Staging DW’ containing the SSIS-package ‘Stage Company’. Everything goes well.

Now Allan wants to commit his changes to the SSIS-solution ‘Staging DW’. He updates his working copy of the SSIS-solution ‘Staging DW’ – but he gets a conflict in the file ‘Staging DW.dtproj’ (an XML-file containing information about the SSIS-solution ‘Staging DW’ and its SSIS-packages).

The problem is that John’s SSIS-package ‘Stage Company’ and Allan’s SSIS-package ‘Stage Sales’ are both new packages in the SSIS-solution ‘Staging DW’. In the XML-file ‘Staging DW.dtproj’, the information about John’s SSIS-package ‘Stage Company’ occupies the same line as the information about Allan’s new SSIS-package ‘Stage Sales’.

The tools for solving conflicts in SVN and TFS aren’t capable of adding to the file. They can only overwrite.

One solution could be that Allan overwrites his ‘Staging DW.dtproj’ file with John’s version. This means that his SSIS-solution ‘Staging DW’ loses its knowledge of his new SSIS-package ‘Stage Sales’. But he doesn’t lose the SSIS-package ‘Stage Sales’ itself; it is still in the directory of the SSIS-solution ‘Staging DW’. He can now add the SSIS-package ‘Stage Sales’ to the SSIS-solution ‘Staging DW’, which now contains John’s new SSIS-package ‘Stage Company’, and commit the SSIS-solution ‘Staging DW’ containing both John’s new SSIS-package ‘Stage Company’ and Allan’s new SSIS-package ‘Stage Sales’ without any problems.
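
Resolving the conflict by hand like this amounts to taking the union of the two package lists instead of letting one version overwrite the other. Below is a minimal sketch in C# of that idea, assuming a heavily simplified project file. The element and attribute names (Project, Package, Name) and the helper UnionPackages are hypothetical and do not reflect the real .dtproj schema.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ProjectFileMerge
{
    // Returns a copy of `theirs` with every <Package> from `mine` added,
    // unless a package with the same Name is already present.
    // NOTE: <Project>/<Package Name="..."> is a simplified, hypothetical
    // layout, not the real .dtproj schema.
    public static XElement UnionPackages(XElement theirs, XElement mine)
    {
        var merged = new XElement(theirs); // deep copy, inputs stay untouched
        var known = new HashSet<string>(
            merged.Elements("Package").Select(p => (string)p.Attribute("Name")),
            StringComparer.OrdinalIgnoreCase);

        foreach (var package in mine.Elements("Package"))
        {
            // HashSet.Add returns false when the name is already known.
            if (known.Add((string)package.Attribute("Name")))
            {
                merged.Add(new XElement(package));
            }
        }
        return merged;
    }

    public static void Main()
    {
        var johns = XElement.Parse(
            "<Project><Package Name=\"Stage Company\" /></Project>");
        var allans = XElement.Parse(
            "<Project><Package Name=\"Stage Sales\" /></Project>");

        // The merged project lists both packages.
        Console.WriteLine(UnionPackages(johns, allans));
    }
}
```

The point of the sketch is only that both new entries have to survive the merge. In practice it is safer to let Visual Studio add the package to the project, as described above, than to edit the project file directly.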

Another solution is good old communication between team members. John could have told Allan that he was going to create the SSIS-package ‘Stage Company’ and that Allan should wait with creating his SSIS-package ‘Stage Sales’ until John had committed the SSIS-solution ‘Staging DW’ containing his SSIS-package ‘Stage Company’. Then Allan could have updated his local copy to get John’s changes and only then started making his SSIS-package ‘Stage Sales’.

Or they could have created all the SSIS-packages (without content) in the SSIS-solution up front. They could also choose to make SSIS-solutions so specific or small that they did not have to work together on them.

As of mid-2017 there is no real solution to this challenge.