An easy way to compare all tables’ DDL between two SQL environments

This post continues my story about migrating a data warehouse from the UAT environment to PROD.

A DDL difference between the same source table in the two environments will cause my data warehouse load to fail, so I needed an easy way to compare the DDL of all my source tables across both. There are a lot of good tools with free trials out there to choose from. However, my client has restrictions on installing third-party tools, so I wrote a query to do the comparison instead.

Inspired by an answer from this post, which compares a single table, I wrote the following query, which loops through all of my targeted source tables.

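DECLARE @Table VARCHAR(100);
DECLARE @i int=1;
-- Labels written to the output rows; set them to match the server and database names used in the FROM clauses below
DECLARE @Server1 SYSNAME = N'ServerName01', @DB1 SYSNAME = N'DatabaseNameO1';
DECLARE @Server2 SYSNAME = N'ServerName02', @DB2 SYSNAME = N'DatabaseNameO2';
While @i<= (select max([RANKColumn]) from [dbo].[EvaluationTableList] ) -- this is the list table which contains all the tables you want to validate
Begin
SET @Table=(select [TABLENAME] from [dbo].[EvaluationTableList] where [RANKColumn]=@i)
insert into [dbo].[EnviromentObjectDifference] -- (this is the output table)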
SELECT Table1.ServerName,
Table1.DBName,
Table1.SchemaName,
Table1.TableName,
Table1.ColumnName,
Table1.name DataType,
Table1.Length,
Table1.Precision,
Table1.Scale,
Table1.Is_Identity,
Table1.Is_Nullable,
Table2.ServerName as T2ServerName,
Table2.DBName as T2DBName,
Table2.SchemaName as T2SchemaName,
Table2.TableName as T2TableName,
Table2.ColumnName as T2ColumnName,
Table2.name as T2DataType,
Table2.Length as T2Length,
Table2.Precision as T2Precision,
Table2.Scale as T2Scale,
Table2.Is_Identity as T2Is_Identity,
Table2.Is_Nullable as T2Is_Nullable
FROM
(SELECT @Server1 ServerName,
@DB1 DbName,
SCHEMA_NAME(t.schema_id) SchemaName,
t.Name TableName,
c.Name ColumnName,
st.Name,
c.Max_Length Length,
c.Precision,
c.Scale,
c.Is_Identity,
c.Is_Nullable
FROM [ServerName01].[DatabaseNameO1].sys.tables t
INNER JOIN [ServerName01].[DatabaseNameO1].sys.columns c ON t.Object_ID = c.Object_ID
INNER JOIN sys.types st ON St.system_type_id = c.System_Type_id AND st.user_type_id = c.user_type_id
WHERE t.Name =@Table) Table1
FULL OUTER JOIN
(SELECT @Server2 ServerName,
@DB2 DbName,
SCHEMA_NAME(t.schema_id) SchemaName,
t.name TableName,
c.name ColumnName,
st.Name,
c.max_length Length,
c.Precision,
c.Scale,
c.Is_Identity,
c.Is_Nullable
FROM [ServerName02].[DatabaseNameO2].sys.tables t
INNER JOIN [ServerName02].[DatabaseNameO2].sys.columns c
ON t.Object_ID = c.Object_ID
INNER JOIN sys.types st
ON St.system_type_id = c.System_Type_id
AND st.user_type_id = c.user_type_id
WHERE t.Name = @Table) Table2
ON Table1.ColumnName = Table2.ColumnName
where Table1.ColumnName is null or Table2.ColumnName is null;
SET @i=@i+1
END
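The query above only reports columns that exist on one side. As a rough illustration of the same full-outer-join idea, extended to also catch columns whose attributes differ, here is a minimal Python sketch. It assumes the column metadata from each environment has already been fetched into dictionaries; the function name `diff_columns` and the sample data are hypothetical and not part of the original query.

```python
def diff_columns(env1, env2):
    """Compare two {column_name: attributes} mappings.

    Returns (only_in_env1, only_in_env2, changed), where `changed` maps a
    column name to the pair of attribute dicts that differ.
    """
    only_in_env1 = sorted(set(env1) - set(env2))
    only_in_env2 = sorted(set(env2) - set(env1))
    changed = {
        name: (env1[name], env2[name])
        for name in sorted(set(env1) & set(env2))
        if env1[name] != env2[name]
    }
    return only_in_env1, only_in_env2, changed


# Hypothetical metadata for one table in UAT and PROD.
uat = {
    "CustomerID": {"type": "int", "length": 4, "nullable": False},
    "Email": {"type": "varchar", "length": 100, "nullable": True},
    "LegacyCode": {"type": "char", "length": 5, "nullable": True},
}
prod = {
    "CustomerID": {"type": "int", "length": 4, "nullable": False},
    "Email": {"type": "varchar", "length": 255, "nullable": True},
    "Region": {"type": "varchar", "length": 20, "nullable": True},
}

missing_in_prod, missing_in_uat, changed = diff_columns(uat, prod)
print(missing_in_prod)  # columns present only in UAT
print(missing_in_uat)   # columns present only in PROD
print(sorted(changed))  # columns whose attributes differ
```

In this made-up data, `Email` exists on both sides with a different length, which is exactly the kind of difference that can break a load even though no column is missing.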

Hopefully, you find this helpful.

Thanks,

Your friend, Annie

How to use T-SQL to validate all SSIS packages at once

Recently, I needed to do a data warehouse migration for a client. Because the source databases in the Dev environment may differ from those in the Prod environment, the migrated SSIS packages that build the data warehouse might fail. So the challenge was: how can I validate all of my DW packages (100+) at once?

There are plenty of posts out there on validating one package at a time using the Management Studio GUI (just right-click the package under the SSIS Catalog)

validate SSIS package

Or you can use the T-SQL stored procedure [SSISDB].[catalog].[validate_package]. Here is the explanation of this stored procedure from Microsoft.

To validate everything at once, we can use a loop that executes [SSISDB].[catalog].[validate_package] for each package, one by one.

Here is the code I created and would love to share.

DECLARE @i INT = 1;
DECLARE @validation_id BIGINT;
DECLARE @packageName NVARCHAR(250);
WHILE @i <= (SELECT MAX(package_id)
FROM SSISDB.catalog.packages)
BEGIN
-- Append '.dtsx' only when the catalog name does not already end with it
SET @packageName = (SELECT CASE WHEN NAME LIKE N'%.dtsx' THEN NAME ELSE CONCAT(NAME, N'.dtsx') END
FROM SSISDB.catalog.packages
WHERE package_ID = @i);
-- package_id values can have gaps, so skip ids that do not exist
IF @packageName IS NOT NULL
EXEC [SSISDB].[catalog].[validate_package]
@package_name = @packageName,
@folder_name = N'(the folder name which the packages are in)',
@project_name = N'(the project name which the packages are in)',
@use32bitruntime = 0,
@environment_scope = 'S',
@reference_id = 2,   /* this is the reference id of the environment the package may use. If there is no environment reference, you can assign NULL (and set @environment_scope to 'D') */
@validation_id = @validation_id OUTPUT;
SET @i = @i + 1;
END;
After you run the above statements, you can use the following query to check the validation results.
SELECT  object_name,
                CASE status
                    WHEN 4
                    THEN 'Fail'
                    WHEN 7
                    THEN 'Success'
                END AS 'Status'
FROM SSISDB.catalog.validations;
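The CASE expression above labels only statuses 4 and 7, so any other status comes back as NULL. Here is a small Python sketch for summarizing the results with every status labeled. The code-to-label mapping is my reading of the Microsoft documentation for catalog.validations and should be verified against your SQL Server version; the sample rows and helper names are hypothetical.

```python
# Status codes for SSISDB.catalog.validations, as I understand them from
# the Microsoft documentation (assumption: verify for your version).
VALIDATION_STATUS = {
    1: "Created",
    2: "Running",
    3: "Canceled",
    4: "Failed",
    5: "Pending",
    6: "Ended unexpectedly",
    7: "Succeeded",
    8: "Stopping",
    9: "Completed",
}


def describe_status(code):
    """Return a readable label, falling back to the raw code if unknown."""
    return VALIDATION_STATUS.get(code, f"Unknown ({code})")


def summarize(rows):
    """Count validations per label; rows are (object_name, status) tuples."""
    counts = {}
    for _object_name, status in rows:
        label = describe_status(status)
        counts[label] = counts.get(label, 0) + 1
    return counts


# Hypothetical result rows from the validation query.
sample = [("LoadCustomer.dtsx", 7), ("LoadOrders.dtsx", 4), ("LoadDates.dtsx", 7)]
print(summarize(sample))
```

A summary like this makes it easy to spot validations that are still running or ended unexpectedly, which would otherwise hide behind a NULL.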
Thanks for viewing. I would love to hear any feedback.
Your friend, Annie

Create Dynamics CRM Online – User Activity PowerBI Report

A couple of days ago, a client asked me to build a Dynamics CRM Online user activity dashboard. In the on-premises version of Dynamics CRM, I know this information resides in the audit table. With Dynamics CRM Online, however, I cannot access the backend database directly. Fortunately, there are a few PowerBI content packs for Dynamics CRM Online in the marketplace, so I was able to read through the PBIX file and find out how PowerBI connects to the Dynamics CRM Online data.

PowerBI uses OData as the data source to connect to the Dynamics CRM Online web services, which expose almost all of the data (some important fields and tables that exist in the on-premises version are still missing, which makes it harder to build reports effectively). You just need to change the highlighted piece below to your organization's CRM web name.

CRM Online Audit Report snapshots

 

Odata Tables available

I did find the ‘Audit’ table available. However, there are a few challenges in interpreting the data. I am listing the challenges below, along with how I handled each one.

  1. Initially, the ‘Audit’ table is blank.
    • This is because in CRM Online you need to enable the audit tracking setting. You can follow this blog to enable it.
  2. Every field is in its code or ID format, and there is no mapping table in the OData connection to map a field ID to its value.
    • I checked the content pack provided by Microsoft and found that those mapping tables are hard-coded. So I had to look up the CRM Audit Entity documentation from Microsoft to get the hard-coded mappings for the following fields.
      • Operation – The action that causes the audit – It will be create, delete, or update
      • Action – Actions the user can perform that cause a change
      • Audit Table fields
  3. You can expand the userID field to get information (such as full name, login account, address, etc.) about the user who caused a change. However, this only provides information for users who make changes such as ‘create’, ‘update’, and ‘delete’. For the ‘access’ operation, the userid only shows ‘SYSTEM’ as the user.
    • To get the name of the user who accessed CRM, we need to link the ‘_objectid_value’ field from the Audit table to the ‘UserId’ field from the systemusers table.
  4. The ‘ObjectTypeCode’ field is missing. This key field is not included in the table when the data is pulled through the OData connection, so we cannot find out which entities the changes were made on.
    • Unfortunately, I have not found a solution for this unless the CRM team exposes this field for the CRM Audit Entity.
    • If this field becomes available in the future, we can combine it with the ‘Attributemask’ field to get the column names of the objects that were changed, and build further analysis reports from there.
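To illustrate the lookup described in item 3, here is a minimal Python sketch of joining access events to the systemusers table. Only the field names ‘_objectid_value’ and ‘UserId’ come from the data described above; the `operation` label, `FullName`, `createdon`, the function name, and all sample values are assumptions made for the example.

```python
def access_events_with_user(audit_rows, system_users):
    """Attach the accessing user's name to each 'Access' audit row.

    For access events the audit row's own user shows as SYSTEM, so the
    real user is looked up via `_objectid_value` -> systemusers `UserId`.
    """
    users_by_id = {user["UserId"]: user for user in system_users}
    result = []
    for row in audit_rows:
        if row["operation"] != "Access":  # hypothetical decoded label
            continue
        user = users_by_id.get(row["_objectid_value"])
        result.append({
            "createdon": row["createdon"],
            "user": user["FullName"] if user else "(unknown user)",
        })
    return result


# Hypothetical rows, shaped like the OData tables after decoding the codes.
audit_rows = [
    {"operation": "Access", "_objectid_value": "u-001", "createdon": "2017-10-02T08:15:00Z"},
    {"operation": "Update", "_objectid_value": "a-900", "createdon": "2017-10-02T08:20:00Z"},
    {"operation": "Access", "_objectid_value": "u-002", "createdon": "2017-10-02T09:05:00Z"},
    {"operation": "Access", "_objectid_value": "u-404", "createdon": "2017-10-02T09:30:00Z"},
]
system_users = [
    {"UserId": "u-001", "FullName": "Alex Example"},
    {"UserId": "u-002", "FullName": "Sam Sample"},
]

logins = access_events_with_user(audit_rows, system_users)
for event in logins:
    print(event["createdon"], event["user"])
```

In PowerBI itself the same thing is done with a relationship or a merge between the two tables; the sketch only shows which keys line up.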

For now, with the information we can get, we can start building a dashboard of user login activities.

Final Result.PNG

Thanks.

PowerBI – two ways to dynamically change measures in the same visual

I love that you can always find multiple ways to solve one problem. Inspired by a YouTube video from Guy in a Cube and a blog post on SQLBI.com, I found there are two ways to dynamically change measures in the same visual. The first way is geared towards front-end report developers and uses bookmarks; the second is geared towards data modelers and uses a technique called disconnected tables.

Bookmarks method:

Bookmarks

The basic steps are:

  1. Enable Bookmarks under File -> ‘Options and Settings’ -> Options -> Preview Features -> ‘Bookmarks’
  2. Create one visual, for example a bar chart, with the first measure, for example Revenue by Sales Territory
  3. Copy and paste the visual two times. In the copied visuals, swap in the measure you want to show instead, for example Cost by Sales Territory.
  4. Overlay the three visuals on top of each other.
  5. Go to ‘View’ on the top tool panel and make sure ‘Bookmarks Pane’ and ‘Selection Pane’ are checked
  6. Add one bookmark per measure, and use the Selection Pane to make only that measure's visual visible for the bookmark.
     Bookmark setting
  7. Add the images/shapes which represent the measures you want to choose between, and link each one to the bookmark you created.
     Image link

Disconnected Table method:

This method is geared more towards PowerBI modelers. The idea is to put a field from an independent table (one with no relationship to other tables) on a slicer as your measure choice, and then create a measure that uses the SELECTEDVALUE function to switch dynamically between the underlying measures based on the slicer selection.

Step 1. Create a new table called ‘Measurechoice’ like the image below.

Disconnected table

Step 2. Create a measure called ‘Selected Value’

Selected Value = SWITCH(SELECTEDVALUE('Measurechoice'[Index]),1,[Total Revenue],2,[Total Cost],3,[Total Profit],BLANK())

Step 3. Create a slicer using ‘MeasureChoice’ from the new table created.

Step 4. Create a visual that uses the ‘Selected Value’ measure as its Value
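To make the switching logic concrete, here is a small Python sketch of what the ‘Selected Value’ measure does. It mirrors my understanding that SELECTEDVALUE returns a value only when exactly one distinct value is visible in the slicer column and is blank otherwise; the function names and measure totals are made up for the example.

```python
def selected_value(visible_values, alternate=None):
    """Mimic SELECTEDVALUE: the single visible value, else the alternate."""
    distinct = set(visible_values)
    if len(distinct) == 1:
        return next(iter(distinct))
    return alternate


def selected_measure(visible_indexes, measures):
    """Mimic the SWITCH: pick the measure for the chosen index.

    Returns None (standing in for BLANK()) when nothing or more than
    one choice is selected.
    """
    choice = selected_value(visible_indexes)
    return measures.get(choice)


# Hypothetical totals for 1 = Total Revenue, 2 = Total Cost, 3 = Total Profit.
measures = {1: 1000.0, 2: 600.0, 3: 400.0}

print(selected_measure([2], measures))        # one slicer item selected
print(selected_measure([1, 2, 3], measures))  # nothing selected: all rows visible
```

This is why the visual shows blank until a single item is picked on the slicer; turning on single select for the slicer avoids the blank state.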

Selection Result

Personally, I like the disconnected table method because I don’t need to spend time creating images and linking bookmarks. However, the bookmarks method can also dynamically change the title of the visual and lets you change the layout of each bookmark view. It is up to you to choose whichever fits your reporting needs.

Thanks for spending time reading this blog.

Database table partitioning – Build filegroup and partitions

Step 1

CREATE PARTITION FUNCTION PF_Fact_test (int)
AS RANGE
LEFT FOR VALUES
(20140101,20150101,20160101);
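To see how the three boundary values in Step 1 split the data into four partitions, here is a minimal Python sketch of the mapping. With RANGE LEFT each boundary value belongs to the partition on its left, and with RANGE RIGHT to the partition on its right. The function name `partition_number` is only for illustration.

```python
from bisect import bisect_left, bisect_right

# Boundary values from the partition function in Step 1.
BOUNDARIES = [20140101, 20150101, 20160101]


def partition_number(value, boundaries=BOUNDARIES, range_left=True):
    """Return the 1-based partition number a value falls into.

    RANGE LEFT:  a boundary value is the last value of the partition on
                 its left, i.e. partition k holds (b[k-1], b[k]].
    RANGE RIGHT: a boundary value is the first value of the partition on
                 its right, i.e. partition k holds [b[k-1], b[k]).
    """
    if range_left:
        return bisect_left(boundaries, value) + 1
    return bisect_right(boundaries, value) + 1


for key in (20131231, 20140101, 20140102, 20160101, 20160102):
    print(key, partition_number(key), partition_number(key, range_left=False))
```

One thing the sketch makes visible: with RANGE LEFT and these boundaries, the date key 20140101 lands in the first partition together with everything earlier, while RANGE RIGHT would put it at the start of the second partition.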

Step 2 

CREATE PARTITION SCHEME [PS_Fact_Test] AS PARTITION [PF_Fact_test] TO ([iGPT2014], [iGPT2015], [iGPT2016],[iGPT2017])

Step 3

Assign the Partition Scheme and Partition Function to a table

USE [iGPT]

GO

BEGIN TRANSACTION

CREATE CLUSTERED INDEX [ClusteredIndex_on_PS_Fact_Test_636351103815297616] ON [dbo].[iGPTextraInfo]

([MaxDateAces])WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PS_Fact_Test]([MaxDateAces])

DROP INDEX [ClusteredIndex_on_PS_Fact_Test_636351103815297616] ON [dbo].[iGPTextraInfo]

COMMIT TRANSACTION

Step 4

Load data into the partitioned table

--The following query returns one or more rows if the table PartitionTable is partitioned
SELECT *
FROM sys.tables AS t
JOIN sys.indexes AS i
ON t.[object_id] = i.[object_id]
AND i.[type] IN (0,1)
JOIN sys.partition_schemes ps
ON i.data_space_id = ps.data_space_id
WHERE t.name = 'GPTClaimPendedFact';

 

 

-- The following query returns the boundary values for each partition in the PartitionTable table.
SELECT t.name AS TableName, i.name AS IndexName, p.partition_number,p.rows AS Rows, p.partition_id, i.data_space_id, f.function_id, f.type_desc, r.boundary_id, r.value AS BoundaryValue
FROM sys.tables AS t
JOIN sys.indexes AS i
ON t.object_id = i.object_id
JOIN sys.partitions AS p
ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN sys.partition_schemes AS s
ON i.data_space_id = s.data_space_id
JOIN sys.partition_functions AS f
ON s.function_id = f.function_id
LEFT JOIN sys.partition_range_values AS r
ON f.function_id = r.function_id and r.boundary_id = p.partition_number
WHERE t.name = 'iGPTInquiryFact_WGS' AND i.type <= 1
ORDER BY p.partition_number;

 

 

--The following query returns the name of the partitioning column for the table PartitionTable.
SELECT
t.[object_id] AS ObjectID
, t.name AS TableName
, ic.column_id AS PartitioningColumnID
, c.name AS PartitioningColumnName
FROM sys.tables AS t
JOIN sys.indexes AS i                           ON t.[object_id] = i.[object_id]  AND i.[type] <= 1 -- clustered index or a heap
JOIN sys.partition_schemes AS ps  ON ps.data_space_id = i.data_space_id
JOIN sys.index_columns AS ic             ON ic.[object_id] = i.[object_id]  AND ic.index_id = i.index_id   AND ic.partition_ordinal >= 1 -- because 0 = non-partitioning column
JOIN sys.columns AS c                           ON t.[object_id] = c.[object_id]   AND ic.column_id = c.column_id
WHERE t.name = 'GPTClaimPendedFact' ;

 

 

 

SELECT row_count              ,*
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('dbo.GPTClaimPendedFact') --and row_count > 0
order by partition_number;

SELECT * FROM sys.dm_db_partition_stats;

SELECT SUM(used_page_count) AS total_number_of_used_pages,
SUM (row_count) AS total_number_of_rows
FROM sys.dm_db_partition_stats
WHERE object_id=OBJECT_ID('dbo.GPTClaimPendedFact')    AND (index_id=0 or index_id=1);

New DAX Functions only in SSAS 2016 and Above

SQL Server 2016 Analysis Services (SSAS)*, Power Pivot in Excel 2016, and Power BI Desktop include the following new Data Analysis Expressions (DAX) functions:

Date and Time Functions

CALENDAR Function
CALENDARAUTO Function
DATEDIFF Function

Information Functions

ISEMPTY Function
ISONORAFTER Function

Filter Functions

ADDMISSINGITEMS Function
SUBSTITUTEWITHINDEX Function

Math and Trig Functions

ACOS Function
ACOSH Function
ASIN Function
ASINH Function
ATAN Function
ATANH Function
COMBIN Function
COMBINA Function
COS Function
COSH Function
DEGREES Function
EVEN Function
EXP Function
GCD Function
ISO.CEILING Function
LCM Function
MROUND Function
ODD Function
PI Function
PRODUCT Function
PRODUCTX Function
QUOTIENT Function
RADIANS Function
SIN Function
SINH Function
SQRTPI Function
TAN Function
TANH Function

Statistical Functions

BETA.DIST Function
BETA.INV Function
CHISQ.INV Function
CHISQ.INV.RT Function
CONFIDENCE.NORM Function
CONFIDENCE.T Function
EXPON.DIST Function
GEOMEAN Function
GEOMEANX Function
MEDIAN Function
MEDIANX Function
PERCENTILE.EXC Function
PERCENTILE.INC Function
PERCENTILEX.EXC Function
PERCENTILEX.INC Function
SELECTCOLUMNS Function
XIRR Function
XNPV Function

Text Functions

CONCATENATEX Function

Other Functions

GROUPBY Function
INTERSECT Function
NATURALINNERJOIN Function
NATURALLEFTOUTERJOIN Function
SUMMARIZECOLUMNS Function
UNION Function
VAR

How to create customized color themes for your PowerBI visuals

I find that a nice-looking dashboard or report is just as important as the data itself. In this blog, I want to share some good tricks I learned recently from PowerBI.com and the YouTube channel EnterpriseDNA for creating customized color themes in PowerBI desktop.

By default, the customized theme feature is not enabled in PowerBI desktop. So what you need to do is go to ‘File’ -> ‘Options and Settings’

PowerBI Option Change

then, enable the ‘Custom Report Themes’ in ‘Preview Features’ section.

Custom Report Theme Enable

After restarting your PowerBI desktop, you will see ‘Themes’ under ‘Home’ tab

Themes tab

By choosing ‘Switch Themes’, you can import your customized theme saved as JSON file.

A typical JSON theme file looks like this:

{    "name": "St Patricks Day",
"dataColors": ["#568410", "#3A6108", "#70A322", "#915203", "#D79A12", "#bb7711", "#114400", "#aacc66"],
"background":"#FFFFFF",
"foreground": "#3A6108",
"tableAccent": "#568410" }
From <https://powerbi.microsoft.com/en-us/documentation/powerbi-desktop-report-themes/>

But how can we come up with a color combination that looks good if we are not artists?

Usually, I will create the foundation colors based on company logo or image. Since Halloween is coming soon, let’s use a Halloween picture as an example.

A recommended site, http://palettefx.com/, helps you find all the colors used in an image.

Color Pickers

After that, open Notepad, edit the JSON code I shared above with your colors, and save it as a .json file.

JSON code
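If you would rather generate the theme file than edit it by hand, here is a minimal Python sketch that builds the same JSON structure and checks that every color is a valid hex code. The helper names, the default of reusing the first data color for the foreground and table accent, and the Halloween color values are my own choices for the example, not something PowerBI requires.

```python
import json
import re

HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def build_theme(name, data_colors, background="#FFFFFF",
                foreground=None, table_accent=None):
    """Build a theme dict with the same keys as the sample theme above."""
    if not data_colors:
        raise ValueError("at least one data color is required")
    # Assumption for this sketch: fall back to the first data color.
    foreground = foreground or data_colors[0]
    table_accent = table_accent or data_colors[0]
    for color in [*data_colors, background, foreground, table_accent]:
        if not HEX_COLOR.match(color):
            raise ValueError(f"not a #RRGGBB color: {color!r}")
    return {
        "name": name,
        "dataColors": list(data_colors),
        "background": background,
        "foreground": foreground,
        "tableAccent": table_accent,
    }


def save_theme(theme, path):
    """Write the theme to a .json file that can be imported into PowerBI."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(theme, handle, indent=2)


# Hypothetical palette picked from a Halloween image.
halloween = build_theme(
    "Halloween",
    ["#FF7518", "#1C1C1C", "#6A0DAD", "#8DB600", "#F5F5DC"],
)
print(json.dumps(halloween, indent=2))
```

Calling `save_theme(halloween, "halloween.json")` would write the file to import; the validation step is there because a single mistyped color code is an easy mistake to make when copying values from a color picker.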

Now you can import the JSON file into PowerBI with the ‘Switch Themes’ option we enabled, and use it as your customized theme.

JSON file

Now, you can use your new customized theme to build your PowerBI reports!

There is a PowerBI theme Gallery: https://community.powerbi.com/t5/Themes-Gallery/bd-p/ThemesGallery

Hopefully it helps.

Your friend, Annie

 

From SQL to DAX- ‘Lead’ and ‘Lag’ window functions

In SQL, we have two window functions called LEAD and LAG. With these two functions, you can get the next and previous value of a column, partitioned by and ordered by other columns in a table.

Take the AdventureWorks Sales.SalesOrderHeader table as an example. The following code gives you the previous and next SalesOrderID for each SalesPerson, ordered by OrderDate.

Lead and Lag SQL code

However, these can be expensive functions, because the SQL engine needs to read through and sort the whole table to work out the value for each row where they are called.

It is much faster to use DAX in an SSAS tabular model in this case, where the column-store and VertiPaq compression technologies are built in. To replace the lead and lag functions in DAX, we will use a key DAX function called ‘Earlier’.

Using the same example mentioned above,

You can write the Previous Order ID calculated column like this:

DAX previous and next.PNG

As we know, the ‘Calculate’ function converts the current row context into a filter context for the calculation in its first argument. However, the filter arguments of ‘Calculate’ (the second and following arguments) override any outer filters on the same columns. In this case, the ‘Filter’ function overrides all the filters applied from outside on the ‘SalesOrderHeader’ table; in other words, the calculation in the first argument of ‘Calculate’, MAX(SalesOrderHeader[SalesOrderID]), no longer knows which row of the SalesOrderHeader table it is on. The ‘Earlier’ function reaches back to the outer row context, which lets the filter condition compare each row against the row currently being calculated.

Using the second row of the above screenshot as an example, the DAX calculation of PreviousOrderID column can be explained as:

Find the max SalesOrderID where SalesPersonID equals the SalesPersonID of the current row (retrieved by the ‘Earlier’ function), which is ‘274’, and OrderDate is earlier than the OrderDate of the current row (also retrieved by ‘Earlier’), that is, all the records with an OrderDate before 9/1/2015. Thus, the result is ‘43846’.
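The same row-by-row logic can be sketched in a few lines of Python. This is only an illustration of the rule described above (the max SalesOrderID among earlier orders of the same salesperson), plus a next-order counterpart that I am assuming mirrors it with a minimum over later orders. The column names follow the SalesOrderHeader table, but the rows are made up and are not AdventureWorks data.

```python
from datetime import date


def add_previous_and_next(orders):
    """Add PreviousOrderID and NextOrderID to each order row.

    For each row, look only at rows of the same SalesPersonID, then take
    the max SalesOrderID among earlier OrderDates and the min among
    later OrderDates. None stands in for a blank.
    """
    result = []
    for current in orders:
        same_person = [
            o for o in orders
            if o["SalesPersonID"] == current["SalesPersonID"]
        ]
        earlier = [o["SalesOrderID"] for o in same_person
                   if o["OrderDate"] < current["OrderDate"]]
        later = [o["SalesOrderID"] for o in same_person
                 if o["OrderDate"] > current["OrderDate"]]
        result.append({
            **current,
            "PreviousOrderID": max(earlier) if earlier else None,
            "NextOrderID": min(later) if later else None,
        })
    return result


# Hypothetical orders for two salespeople.
orders = [
    {"SalesPersonID": 1, "OrderDate": date(2015, 7, 1), "SalesOrderID": 100},
    {"SalesPersonID": 1, "OrderDate": date(2015, 9, 1), "SalesOrderID": 105},
    {"SalesPersonID": 1, "OrderDate": date(2015, 10, 1), "SalesOrderID": 110},
    {"SalesPersonID": 2, "OrderDate": date(2015, 8, 1), "SalesOrderID": 103},
]

for row in add_previous_and_next(orders):
    print(row["SalesOrderID"], row["PreviousOrderID"], row["NextOrderID"])
```

Note that this approach, like the calculated column, relies on SalesOrderID increasing with OrderDate; if IDs were assigned out of date order, the max over earlier rows would not necessarily be the immediately preceding order.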

 

Thanks.

Your friend Annie.

 

 

 

How to use Dynamic Management Views against your Power BI Desktop reports

Power BI Desktop contains a local instance of an Analysis Services tabular model. By running Dynamic Management View (DMV) queries against Power BI Desktop, we can get metadata about the Power BI model.

Here are the steps to do so:

Step 1: Open your Power BI report

Step 2: Find your Power BI Analysis Model Instance Port ID.

There are two ways to do that:

Option 1: Open DAX Studio (a great free tool to help you develop DAX), and then connect to the Power BI report you opened.

Get DAX 1.PNG

Then find the local Analysis Services instance address of this Power BI report at the bottom right of the DAX Studio window

Get DAX 2.PNG

Option 2: Find the Power BI temp directory. The port number is stored in the msmdsrv.port.txt file in this folder:
C:\Users\username\AppData\Local\Microsoft\Power BI Desktop SSRS\AnalysisServicesWorkspaces\…\Data
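If you look up the port often, the folder search in Option 2 can be scripted. Below is a minimal Python sketch. It assumes there is one workspace folder per open report directly under AnalysisServicesWorkspaces, each with a Data folder containing msmdsrv.port.txt, and it assumes the file may be saved as UTF-16 text; the function name and these layout details are assumptions to verify on your machine.

```python
import glob
import os


def find_powerbi_ports(workspaces_dir):
    """Return {workspace_folder_name: port} for each port file found.

    `workspaces_dir` is the AnalysisServicesWorkspaces folder. The layout
    <workspace>/Data/msmdsrv.port.txt is an assumption of this sketch.
    """
    pattern = os.path.join(workspaces_dir, "*", "Data", "msmdsrv.port.txt")
    ports = {}
    for port_file in glob.glob(pattern):
        with open(port_file, "rb") as handle:
            raw = handle.read()
        # Null bytes suggest UTF-16 text; otherwise treat it as ASCII.
        if b"\x00" in raw:
            text = raw.decode("utf-16-le", errors="ignore")
        else:
            text = raw.decode("ascii", errors="ignore")
        digits = "".join(ch for ch in text if ch in "0123456789")
        if digits:
            workspace = os.path.basename(
                os.path.dirname(os.path.dirname(port_file))
            )
            ports[workspace] = int(digits)
    return ports
```

You would call it with your own AnalysisServicesWorkspaces path, and then connect SSMS to `localhost:<port>` using the number it returns. If no report is open, the function simply returns an empty dictionary.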

PowerBI Port

 

Step 3: Open SQL Server Management Studio and connect to the local Analysis Services instance

Get DAX 3

 

Step 4: Create a new query against the only database under the local instance

Get DAX 4

Step 5: In the query window, you can then run Dynamic Management View (DMV) queries to monitor your PowerBI local instance.

The ones I use often include the following:

  • This DMV provides the DAX query behind the report:
Select * from $System.discover_sessions
  • This DMV provides all the fields in your model:
Select  * from $system.MDSchema_hierarchies
  • This DMV provides all the measures in your model:
Select * from $System.MDSCHEMA_MEASURES

Find relationships:

Select [ID], [ModelID], [IsActive], [Type], [CrossfilteringBehavior], [FromTableID], [FromColumnID], [FromCardinality], [ToTableID], [ToColumnID], [ToCardinality], [ModifiedTime]
from $SYSTEM.TMSCHEMA_RELATIONSHIPS
Select [ID], [ModelID], [Name]from $SYSTEM.TMSCHEMA_TABLES
Select [ID], [TableID], [ExplicitName] from $SYSTEM.TMSCHEMA_COLUMNS
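The relationship DMV only returns IDs, which is why the table and column queries are listed with it. As a minimal sketch of how the three result sets fit together, the Python below resolves the IDs into readable names. It assumes each result set has been exported as a list of dictionaries keyed by the column names in the queries above; the function name and sample rows are hypothetical.

```python
def describe_relationships(relationships, tables, columns):
    """Turn relationship rows of IDs into 'Table[Column] -> Table[Column]'."""
    table_names = {t["ID"]: t["Name"] for t in tables}
    column_names = {c["ID"]: c["ExplicitName"] for c in columns}
    lines = []
    for rel in relationships:
        from_table = table_names[rel["FromTableID"]]
        from_column = column_names[rel["FromColumnID"]]
        to_table = table_names[rel["ToTableID"]]
        to_column = column_names[rel["ToColumnID"]]
        state = "active" if rel["IsActive"] else "inactive"
        lines.append(
            f"{from_table}[{from_column}] -> {to_table}[{to_column}] ({state})"
        )
    return lines


# Hypothetical rows shaped like the three DMV result sets.
tables = [
    {"ID": 10, "ModelID": 1, "Name": "Sales"},
    {"ID": 11, "ModelID": 1, "Name": "Date"},
]
columns = [
    {"ID": 100, "TableID": 10, "ExplicitName": "OrderDateKey"},
    {"ID": 101, "TableID": 10, "ExplicitName": "ShipDateKey"},
    {"ID": 110, "TableID": 11, "ExplicitName": "DateKey"},
]
relationships = [
    {"ID": 1, "IsActive": True, "FromTableID": 10, "FromColumnID": 100,
     "ToTableID": 11, "ToColumnID": 110},
    {"ID": 2, "IsActive": False, "FromTableID": 10, "FromColumnID": 101,
     "ToTableID": 11, "ToColumnID": 110},
]

for line in describe_relationships(relationships, tables, columns):
    print(line)
```

The same lookups can be done in Power Query by merging the three queries on the ID columns; the sketch is just meant to show which ID joins to which.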

For more DMVs, there is a good post: https://datasavvy.me/2016/10/04/documenting-your-tabular-or-power-bi-model/

Bonus

If you do not have SSMS, you can use Power Query in Excel or Power BI to query those DMVs. Here is a sample M query.

let
    Source = AnalysisServices.Database(TabularInstanceName, TabularDBName, [Query="Select [ID], [TableID], [Name], [Description], [DataSourceID], [QueryDefinition], [Type], [Mode], ModifiedTime from $SYSTEM.TMSCHEMA_PARTITIONS"]),
    #"Renamed Columns" = Table.RenameColumns(Source,{{"ID", "ID"}, {"DataSourceID", "Data Source ID"}, {"QueryDefinition", "Query Definition"}, {"ModifiedTime", "Modified Time"}})
in
    #"Renamed Columns"

Thanks

Your friend, Annie

 

 

Compare Formula Engine VS. Storage Engine in SSAS Tabular

To improve SSAS tabular performance or to improve your DAX queries, it is important to know the difference between the formula engine and the storage engine. Here, I have created a table to help you distinguish the two in Tabular (I need to stress that this is for tabular modeling, not multidimensional, because the back-end technology is quite different). Both engines play vital roles in processing DAX query requests.

FE and SE

| Category | Formula Engine | Storage Engine |
| --- | --- | --- |
| Query received | Interprets the DAX/MDX formula | Handles simple requests (xmSQL) sent by the formula engine |
| Target data | Iterates over datacaches produced by the storage engine (datacaches are in-memory tables) | Iterates over compressed data in the VertiPaq column stores |
| Result | Produces the result set and sends it back to the requestor | Produces datacaches and sends them back to the formula engine |
| Thread | Single-threaded | Multi-threaded |
| Cache utilization | No | Yes |
| Area of focus for performance tuning | Check the physical plan for bottlenecks | Check the xmSQL query for bottlenecks |

 
