
Click & Clone demo files


It was my pleasure to again demonstrate my “Click & Clone” solution, this time at SQL Saturday Southampton 2015 on December 5, 2015. Unfortunately I was plagued by the dreaded “demo gremlins”, but I got things sorted and was able to show the “live demo” part to some of the attendees during the breaks.

Please click here for the previous post describing the background of the “Click & Clone” solution; the latest version of the presentation can be viewed here.

Some of the attendees requested copies of the packages and code, so I am making them available via this blog post.

I have previously published code for building and loading DIM_Date and DIM_Time; similar versions are incorporated into this code release. See that previous blog post here for background.
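
For readers who have not seen that post, the sketch below shows the general shape of a time-of-day dimension; the column names and grain are illustrative only and may not match the supplied code.

```sql
-- Illustrative shape of a time-of-day dimension (one row per second
-- of the day); columns are indicative only, not the supplied schema.
CREATE TABLE dbo.DIM_Time
(
    TimeKey  int     NOT NULL PRIMARY KEY,  -- HHMMSS as an integer, e.g. 83015 = 08:30:15
    [Hour]   tinyint NOT NULL,
    [Minute] tinyint NOT NULL,
    [Second] tinyint NOT NULL,
    AmPm     char(2) NOT NULL               -- 'AM' or 'PM'
);
```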

The code I am providing is as follows:

  1. SSDT-produced dacpac file (SQL Server database project), compatible with SQL Server 2012, for the basic “retail” data warehouse (this includes all elements of the database discussed in the demo, especially the stored procedures called by the SSIS packages)
  2. a folder called “ExternalDatabaseSchemas”, containing two system dacpacs that you must link to when reapplying the database references required by some of the supplied stored procedures
  3. Project.params file to use in the Integration Services project you will need to create to host the packages
  4. CloneToTargetLoader_00001.dtsx SSIS package
  5. PopulateDateAndTimeDims_00002.dtsx SSIS package

Successful use of these items requires a reasonable level of competence with SQL Server 2012, Visual Studio, and SSDT database and SSIS projects.

It is assumed all your activities will be conducted on a development machine. I highly discourage using any part of this solution in production until it has been fully integrated into your site standards and procedures (and, of course, thoroughly tested). As-is, it is not fit for production use.

You can access the compressed ZIP containing the files here.

DEPLOYMENT INSTRUCTIONS

1. Retail Database SQL Server project

  • From Visual Studio (I used version 2012), create a new SQL Server database project
  • from the Solution Explorer, import the dacpac file named “clickAndCloneDatabaseElements_20160104_18-13-08.dacpac”
  • I prefer to use the option that imports the files into folders by object type (tables, storage, security, scripts, etc.)
  • some items are not imported (or are perhaps buried, e.g. database references)

To reinstate the required database references, first copy the ExternalDatabaseSchemas folder onto your local hard drive. I recommend that you obtain the dacpac files contained within (for the master and msdb databases) from your local server instance and replace the ones I have included. This blog post from Schott SQL goes into good detail about this task.

Then, in your database project, create a database reference for each, pointing to the relevant dacpac, and keep the default names and other attributes that are assigned.
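
For context, these references are needed because some of the supplied stored procedures use three-part names into the system databases, which SSDT must resolve at build time. A hypothetical example (not one of the supplied procedures):

```sql
-- Hypothetical procedure, not part of the supplied code: a reference
-- to msdb objects like this forces the project to carry a database
-- reference to the msdb dacpac so the build can resolve the names.
CREATE PROCEDURE admin.GetRecentJobOutcomes
AS
BEGIN
    SELECT j.name, h.run_date, h.run_status
    FROM msdb.dbo.sysjobs AS j
    INNER JOIN msdb.dbo.sysjobhistory AS h
        ON h.job_id = j.job_id
    WHERE h.step_id = 0;  -- step 0 rows record the overall job outcome
END;
```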

Attempt to Build the project and debug as necessary. A publish file is included within the dacpac; you will need to edit it to set the server name and your local values for the SQLCMD variables used. You should also name the target database according to your needs.
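
If you have not used SQLCMD variables in a database project before, they appear in scripts as $(Name) placeholders and are given concrete values by the publish file. A minimal sketch, with an assumed variable name:

```sql
-- Minimal SQLCMD-mode sketch; "EnvironmentName" is an assumed example,
-- not necessarily a variable used in the supplied project.
:setvar EnvironmentName "DEV"

PRINT N'Publishing with EnvironmentName = $(EnvironmentName)';
```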

The supplied Post Deployment script should execute when you run the Publish process, loading the DIM tables with the “Kimball” audit rows used for substitution during ETL backroom processes. You will eventually need to merge your own dimension data into these tables as well, but that is not required as part of this procedure.
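
If you are unfamiliar with the Kimball convention, these are special rows (with reserved negative surrogate keys) substituted into fact rows when a dimension lookup fails. A hedged sketch of the idea, using a hypothetical dimension table:

```sql
-- Sketch of Kimball-style audit/unknown-member rows; DIM_Product and
-- its columns are hypothetical, and the actual keys and descriptions
-- in the supplied post-deployment script may differ.
SET IDENTITY_INSERT dbo.DIM_Product ON;  -- needed only if the surrogate key is an IDENTITY column

INSERT INTO dbo.DIM_Product (ProductKey, ProductName)
VALUES (-1, N'Unknown'),         -- source supplied no value
       (-2, N'Not applicable');  -- attribute does not apply to this row

SET IDENTITY_INSERT dbo.DIM_Product OFF;
```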

This should create the demo retail DW database in your local SQL Server 2012 instance.
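
As a quick sanity check after publishing, you can list the dimension and fact tables (assuming the supplied objects follow the DIM_/FACT_ naming seen in the demo):

```sql
-- Quick post-publish sanity check: list the DIM_ and FACT_ tables.
SELECT s.name AS [schema], t.name AS [table]
FROM sys.tables AS t
INNER JOIN sys.schemas AS s
    ON s.schema_id = t.schema_id
WHERE t.name LIKE N'DIM[_]%' OR t.name LIKE N'FACT[_]%'
ORDER BY s.name, t.name;
```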

Loading of the DIM_Date and DIM_Time tables is performed later on, using one of the included SSIS packages.

2. Integration Services project

  • From Visual Studio, create a new Integration Services project (don’t use the wizard)
  • Copy the two packages I have supplied onto your desktop (or other easily accessible location, but not directly into the project itself)
  • Rename them if you wish to at this stage
  • In your new project, right click in the Solution Explorer, select “add existing package”, repeating to add both the packages to your project (you will get warnings)
  • delete the default package added at project creation time if you wish
  • open the project’s Project.params file and add a temporary parameter called “temp” or similar, to ensure the Project.params file is physically created
  • close this file in Visual Studio
  • using your favourite editor (I use Notepad++), open your new Project.params and the one I supplied (right click the file in the Visual Studio Solution Explorer and view its properties to obtain the physical location of the file)
  • replace the “temp” parameter XML node with all the param nodes in the file I supplied
  • close your editor and reopen the Project.params file in Visual Studio (you should be able to see all the params that can be used with the packages I supplied)
  • create at least one OLEDB connection manager to the database you created previously in Part 1
  • open both packages I supplied (again, you will get warnings), identify all the SQL task objects that have errors, and supply the relevant connection manager details (the one you just created)
  • to get the CloneToTargetLoader_00001 package building and running, open the data flow tasks and set the source and target tables to the same FACT table (i.e. read FACT_Sale and load to FACT_Sale; this is obviously flawed logic, but as you have not yet integrated this into your ETL pipeline, there won’t be any clones to load to your target DB, so it won’t matter)

Once all errors have been resolved, you should be able to execute the “PopulateDateAndTimeDims_00002” package to populate the DIM_Date and DIM_Time tables. The logging inherent in my solution will also occur. Check those tables to ensure the data has been loaded. You can change the begin and end dates for loading DIM_Date in the SQL task within the package.
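
To illustrate what that SQL task is doing conceptually, here is a hedged, set-based sketch of populating DIM_Date between two bounds; the actual column list and logic in the supplied package will differ.

```sql
-- Conceptual sketch only: generate one row per day between two bounds.
-- Column names are illustrative and may not match the supplied DIM_Date.
DECLARE @BeginDate date = '2010-01-01',
        @EndDate   date = '2020-12-31';

WITH Dates AS
(
    SELECT @BeginDate AS DateValue
    UNION ALL
    SELECT DATEADD(DAY, 1, DateValue)
    FROM Dates
    WHERE DateValue < @EndDate
)
INSERT INTO dbo.DIM_Date (DateKey, FullDate, [Year], [Month], [Day], [DayName])
SELECT CONVERT(int, CONVERT(char(8), DateValue, 112)),  -- e.g. 20151205
       DateValue,
       YEAR(DateValue),
       MONTH(DateValue),
       DAY(DateValue),
       DATENAME(WEEKDAY, DateValue)
FROM Dates
OPTION (MAXRECURSION 0);  -- allow more than the default 100 recursions
```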

The main package, CloneToTargetLoader_00001, can also be executed and should immediately enter the delay loop portion of the control flow, repeating this process until it shuts down gracefully, having exceeded the number of times the loop may be executed (default is 3).
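
The real delay loop is implemented in the SSIS control flow, but conceptually it behaves something like the following T-SQL; the table and column names here are assumed purely for illustration.

```sql
-- Conceptual T-SQL rendering of the package's delay loop; the actual
-- loop lives in SSIS control flow, and admin.ReleasedClones is an
-- assumed illustrative table, not part of the supplied schema.
DECLARE @Iteration     int = 0,
        @MaxIterations int = 3;  -- package default

WHILE @Iteration < @MaxIterations
BEGIN
    IF EXISTS (SELECT 1 FROM admin.ReleasedClones WHERE LoadedFlag = 0)
        PRINT N'A clone is released: load it into the target FACT table.';
    ELSE
        WAITFOR DELAY '00:00:30';  -- pause before polling again

    SET @Iteration += 1;
END;
```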

Query the tables admin.ExecutionLog and admin.ExecutionCheckpoint to examine the logging that will have taken place.
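
For example (adjust the ORDER BY to whichever timestamp column the tables actually carry):

```sql
-- Inspect the most recent logging rows; ordering by the first column
-- is a placeholder, substitute the tables' actual timestamp column.
SELECT TOP (50) * FROM admin.ExecutionLog        ORDER BY 1 DESC;
SELECT TOP (50) * FROM admin.ExecutionCheckpoint ORDER BY 1 DESC;
```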

3. Next steps

This completes the basic deployment and execution of the code I have supplied. To fully integrate this into your ETL pipeline solutions, you need to introduce the following:

  • creation of FACT table clones
  • separation of FACT creation from FACT target loading (in the presentation layer); that is, different ETL jobs
  • loading of the clones
  • deployment of the CloneToTargetLoader_00001 package into your job streams to load the clones as they are “released” for target loading

Please refer to my presentation to review the above tactics and the use of clone tables in the ETL pipeline via the “Table name or view name variable – fast load” option in the OLEDB destination object.
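
As a starting point for the first item in the list above, an empty structural clone of a FACT table can be produced with dynamic SQL along these lines; the clone-naming convention shown is illustrative only, and note that SELECT … INTO copies the columns but not indexes or constraints.

```sql
-- Hedged sketch: create an empty structural clone of FACT_Sale.
-- The clone name is an illustrative convention; SELECT INTO with a
-- false predicate copies the column structure but zero rows (and
-- no indexes or constraints).
DECLARE @CloneName sysname = N'FACT_Sale_20160104_1813',
        @sql      nvarchar(max);

SET @sql = N'SELECT * INTO dbo.' + QUOTENAME(@CloneName)
         + N' FROM dbo.FACT_Sale WHERE 1 = 0;';

EXEC sys.sp_executesql @sql;
```

At load time, the clone’s name can then be assigned to the SSIS variable feeding the “Table name or view name variable – fast load” option, which is how the package is pointed at a different clone on each run.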

4. Support

If you have any questions regarding the steps described above, please email me via this website or contact me via Twitter (@ericjlawson). I will attempt to answer these questions, where possible, by posting the question and answer as comments on this blog post. Hopefully this will address common issues for all as they arise.

Should you require significant support to integrate this sort of solution into your existing or new data warehouse solutions, I am available for short and medium term contracts. This could include training, design and coding.

