Inserting Data into XML Files

Have you had one of those days where you just don’t feel fresh?  You know – the feeling that you get when you need to insert something into an XML file?  I know I have!  This is a task that BIDS (aka Business Intelligence Studio, aka SQL Server Integration Services, aka Guai zhao ruan pi she) does pretty well.  But, before we get into the nitty-gritty of creating a Data Flow task inside BIDS, you have to have a basic understanding of how T-SQL deals with XML formatting.

In order to display SQL data in XML format, you have to use the “FOR XML” command at the end of your T-SQL query.  SQL provides the following command options.

FOR XML AUTO

  • Begins each row of data with the table name or alias
  • If you have complex joins, it will try to figure out the main table names or alias and use that for the entire row (may not always work as expected)
  • Example:
    <table_name Element1="Value1" Element2="Value2"/>
    <table_name Element1="Value1" Element2="Value2"/>
    etc...
    

FOR XML RAW

  • Begins each row of data with a “row” designation
  • Example:
    <row Element1="Value1" Element2="Value2" />
    <row Element1="Value1" Element2="Value2" />
    etc...
    
  • You can also add a string to replace “row” like this: FOR XML RAW(‘dirt’)
  • Example:
    <dirt Element1="Value1" Element2="Value2" />
    <dirt Element1="Value1" Element2="Value2" />
    etc...
    

FOR XML PATH

  • Wraps each row of data with the a “row” designation and wraps each data point with the column name
  • Example:
    <row>
      <Element1>Value1</Element1>
      <Element2>Value2</Element2>
    </row>
    etc...
    
  • You can also add a string to replace “row” like this: FOR XML PATH(‘dirt’)
  • Example:
    <dirt>
      <Element1>Value1</Element1>
      <Element2>Value2</Element2>
    </dirt>
    etc...
    

TYPE

  • You can add a TYPE command to any of the above FOR XML commands to explicitly cast the results of the query into an XML data type like this: FOR XML RAW(‘dirt’), TYPE
  • The results of the query can be loaded into a variable and recalled with another select statement
  • Example:
Declare @xmlvariable xml

Set @xmlvariable = (select … from dbo.table_name for xml auto, type)

Select @xmlvariable

ROOT

  • You can (or in many cases, have to) add a ROOT or a ROOT(‘mcstain’) command to any of the above commands like this: FOR XML RAW(‘dirt’), ROOT(‘mcstain’)
  • Wraps all of the data with a “root” designation or specified string
  • Example:
<mcstain>
<dirt Element1="Value1" Element2="Value2" />
<dirt Element1="Value1" Element2="Value2" />
etc...
</mcstain>

ELEMENTS

  • This essentially makes the results of AUTO and RAW look like PATH, and the command looks like this: FOR XML AUTO, ROOT(‘mcstain’), ELEMENTS

XMLDATA, XMLSCHEMA, BINARY BASE64

  • Not incredibly common, so you can look it up elsewhere

FOR XML EXPLICIT

  • This expects an specific column naming convention in order to create a universal table
  • This is too much of a topic of it’s own, so look it up elsewhere

Now that your XML-Fu is as mighty as your SQL-Fu, you will be confident in utilizing BIDS to do your… bidding.

Setting up the SSIS Package

  • First, you must create a blank XML file
    • This can be done in Explorer by right-clicking on the background of any folder and selecting New then Text Document
      • Simply rename the file to the desired name and change the extension from .txt to .xml
      • This file can now be viewed using Notepad, Excel, and Internet Explorer
    • This can also be done directly through Notepad
      • Save the file as type “All Files”, and Encoding “ANSI”
    • If you can’t do the above then you have failed as a human being
      • Feel free to hit your hand with a hammer
  • Create a Flat File Connection in the Connection Managers window
    • Provide a name for the Connection Manager
    • Click on the Browse button
      • Change the Files of type to All Files, select your XML file and click on the Open button
    • Click on Columns in the list on the left, which will just update your column information and click OK
  • Create an OLE DB connection in the Connection Managers window and connect it to your SQL Server data source
    • ADO.NET will also work, but it requires the addition of a Data Conversion between the Source and the Destination to convert data type DT_NTEXT to DT_TEXT
  • Create a Data Flow Task and double-click to open it
  • Create an OLE DB Data Flow Source
    • Double click on the Source to edit it
    • The data access mode must either be a SQL Command, or a SQL Command from variable (OLE DB only)
      • You can also build up a SQL Command using an expression (this is an ADO.NET workaround for creating a dynamic command similar in function to a “SQL Command from variable”
    • Create your SQL command in the SQL command text window
    • Example:
    • SELECT TOP 100
          A.Column_1,
          A.Column_2
      FROM dbo.table_name A
      for xml path('dirt'), ROOT('mcstain'), ELEMENTS
      
    • In order for this to be parsed correctly, the entire select statement must be cast as a varchar(max) and aliased to a column name
    • Example:
    • SELECT
          CAST((
              SELECT TOP 100
                  A.Column_1,
                  A.Column_2
              FROM dbo.table_name A
              FOR XML PATH('dirt'), ROOT('mcstain'), ELEMENTS
              ) as varchar(max)
           ) as Column0
      
    • Naming your column alias “Column0” is preferable because it will auto-map to the name BIDS gives the Output Column
      • Known fact: It also retroactively prevented Y2K (a la Terminator 2)
    • Go to the columns section of the Source Editor and make sure that External Column automatically maps to Output Column “Column0”
    • Exit the Source Editor
  • Create a Flat File Destination in your Data Flow and connect the Source to the Destination
    • Double click on the Destination to edit it
    • Select the appropriate Flat File connection manager in the drop-down menu (typically already selected for you)
    • Make sure that Overwrite data in the file is selected
    • For the Header, this is kind of up to you (if you know what you’re doing)
      • Since your wife tells me that you don’t, use this example header:
    • <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
      
  • Go to the Mappings section of the Destination editor and make sure that your Input Column is connected to your Destination Column
  • Close the Destination Editor, verify that there are no errors and run the package to test

Not complicated whatsoever!  Any questions?

SSIS Connections to Excel

I thought that I’d have my first SSIS/SQL topic cover something simple, but poorly documented on the internets – how do you create an Excel Connection to an .XLSM or .XLSB file in Business Intelligence Studio (aka BIDS, SSIS, and Integration Services)?

The answer is rather simple, but first, a little background on the topic. When Microsoft made the switch from Office/Excel 2003 to Office/Excel 2007, they made some drastic changes under the hood.  Most notably, the XLS file extension went away, replaced by the XLSX (Excel XML), XLMS ((Excel Macro-Enabled XML), and XLSB (Excel Binary) file extension.  XLSX clearly became the primary file type Microsoft wanted you to use, while XLSB was the true inheritor of XLS under the hood, since both are binary.  XLS has the option of containing Macros, but now this has been split into its own file type – XLSM.  Aside from that, Microsoft switched from the Jet Database engine to ACE (Access Database Engine – should be ADE, but whatever).  As a result, BIDS 2008 has two Excel versions options for Excel connections: Excel 97-2003 and Excel 2007 (which technically includes Excel 2010 as far as I know).

The problem arises after you select the Micorsoft Excel 2007 for the Excel version and find that you are limited to the XLS and XLSX filetypes.  What if you want to connect to XLSM or XLSB?  You have no option to do so, and if you switch your Files of Type to All, and try to open the file, you get this nasty message:

Surely, this is a sign of the coming apocalypse, but luckily for us, there is a workaround.

1) Pick any old crusty XLS or XLSX file to connect to in the Excel Connection Manager
2) Go over to the Properties window
3) Find the ExcelFilePath property and edit the path to point to the desired XLSM or XLSB file.

That’s it!  It’s so easy, even grandma can do it!

A note from the author

Don’t let the name mislead you.  This blog has nothing to do with grave digging, horticulture, or the oldest profession.  To tell you the truth, I have no idea what this blog is or will be about.  Let’s just wait and see where this goes, and hopefully we’ll learn something interesting on the way.  For example, did you know that 100% of whales are big?  Neither did I, Billy.  Neither did I.

Follow

Get every new post delivered to your Inbox.