SQL Database Engine Blog

Analysing inventory of Managed Instances using T-SQL


Sometimes you need to know how many Managed Instances you have created in the Azure cloud. Although you can find all information about your Azure SQL Managed Instances in the Azure portal or via APIs (ARM, PowerShell, Azure CLI), it is sometimes hard to list all instances and search them by specific criteria. In this post you will see how easily you can load the list of your Managed Instances and build an inventory of your resources.

Problem

Imagine that you have a large number of Managed Instances and you need to know how many instances you have; in what regions, subnets, and virtual networks they are placed; how much compute and storage is allocated to each of them; etc. Analyzing an inventory of Managed Instances might be hard if you use only PowerShell.

Solution

The Azure Resource Manager (ARM) API enables you to get a list of your managed instances, with all necessary properties, as a JSON object. All you need to do is load this JSON object into a database table and analyze it using the standard T-SQL language.

As an example, you can list all of your Managed Instances using Azure CLI:

az login

az account set --subscription a7c7b824-xxxx-xxxx-xxxx-e6b827082m1a

az sql mi list > mi.txt

This command will return a list of your managed instances within the subscription as one JSON array stored in the file mi.txt.

Now you need to take the content of mi.txt, connect to an Azure SQL Database, Managed Instance, or SQL Server 2016 or higher (where you have JSON support), and use the OPENJSON function to load this JSON into a table or query it.

An example of a query that reads data from JSON is shown in the following code:

declare @json nvarchar(max) = N'<<put the content of mi.txt here>>';

select *,
       vNet = SUBSTRING(subnetId,
                        PATINDEX('%/virtualNetworks/%', subnetId) + 17,
                        PATINDEX('%/subnets/%', subnetId) - PATINDEX('%/virtualNetworks/%', subnetId) - 17),
       subnet = SUBSTRING(subnetId,
                          PATINDEX('%/subnets/%', subnetId) + 9,
                          200),
       [Number of instances in this subnet] = count(*) over (partition by subnetId)
from openjson(@json)
with (name nvarchar(400), storageSizeInGb int, vCores tinyint,
      location varchar(30), subnetId nvarchar(4000),
      tier varchar(20) '$.sku.tier', hardware varchar(8) '$.sku.family',
      licenseType varchar(20), resourceGroup varchar(100), state varchar(20))

This query returns all information about your Managed Instances, which can be loaded into a table and analyzed using standard SQL.
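If you want to keep this inventory around, you can persist the parsed rows into a table and run aggregates over it. Continuing from the snippet above, a minimal sketch (the table name dbo.ManagedInstanceInventory is a hypothetical choice):

-- Persist the parsed list into a table
SELECT name, location, subnetId, vCores, storageSizeInGb, tier, hardware, licenseType, resourceGroup, state
INTO dbo.ManagedInstanceInventory
FROM openjson(@json)
WITH (name nvarchar(400), storageSizeInGb int, vCores tinyint,
      location varchar(30), subnetId nvarchar(4000),
      tier varchar(20) '$.sku.tier', hardware varchar(8) '$.sku.family',
      licenseType varchar(20), resourceGroup varchar(100), state varchar(20));

-- Example analysis: compute and storage allocated per region
SELECT location, COUNT(*) AS instances, SUM(vCores) AS totalVCores, SUM(storageSizeInGb) AS totalStorageGb
FROM dbo.ManagedInstanceInventory
GROUP BY location;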


September 2018 Leaderboard of Database Systems contributors on MSDN


Congratulations to our September top 10 contributors! Alberto Morillo and Visakh Murukesan maintain their top positions.

This Leaderboard initiative was started in October 2016 to recognize the top Database Systems contributors on MSDN forums. The following continues to be the points hierarchy (in decreasing order of points):

Announcing Public Preview of Accelerated Database Recovery


Today we are excited to announce the public preview of Accelerated Database Recovery!

Accelerated Database Recovery (ADR) is a new SQL Server Engine feature (available in Azure SQL, Azure SQL Data Warehouse, and in the upcoming version of SQL Server on-premises) that greatly improves database availability, especially in the presence of long running transactions, by completely redesigning the current SQL Server recovery process from the ground up. The primary benefits of ADR are:

Fast and consistent Database Recovery
With ADR, long running transactions do not impact the overall recovery time, enabling fast and consistent database recovery irrespective of the number of active transactions in the system or their sizes.

Instantaneous Transaction rollback
With ADR, transaction rollback is instantaneous, irrespective of the time the transaction has been active or the number of updates it has performed.

Aggressive Log Truncation
With ADR, the transaction log is aggressively truncated, even in the presence of active long running transactions, which prevents it from growing out of control.

For more details, please refer to the Azure SQL Database documentation.

Who should consider Accelerated Database Recovery

– Customers that have workloads with long running transactions.
– Customers that have seen cases where their active transaction log is growing significantly.
– Customers that have experienced long periods of database unavailability due to SQL Server long running recovery.

How to learn more and participate in the Public preview

Please send an email to adr@microsoft.com to learn more and try out Accelerated Database Recovery (ADR). In the e-mail, include the name of your logical server (for single databases, elastic pools, and Azure Data Warehouse). Since this is a preview feature, your testing server should be a non-production server.

SQL Server IaaS Extension Query Service for SQL Server on Azure VM


The SQL Server IaaS Extension is installed by default on Azure virtual machines deployed from SQL Server-based images in the Azure Marketplace. The extension manages SQL Server configuration on the VM, including SQL Server connectivity, storage configuration, automated backup, automated security patching, and Azure Key Vault (AKV) integration. The SQL IaaS Extension automates all these administrative tasks and enables monitoring and management through the Azure portal without any need to log in to the VM.

Starting with SQL Server IaaS Extension version 2.0, two Windows services are created on the VM:

1- Microsoft SQL Server IaaS Agent: the main service for the SQL Server IaaS Extension; it runs as the Local System account.

2- Microsoft SQL Server IaaS Query Service: a helper service for the SQL Server IaaS Extension that runs SQL queries against the SQL Server instance on the VM; it runs as an NT Service account.

The reason behind adding the new Query Service is to run the SQL IaaS Extension with the least-privileged accounts on the VM. The SQL Server IaaS Agent Service needs Local System rights to be able to install and configure SQL Server, attach disks, enable storage pools, and manage automated security patching of Windows and SQL Server.

The SQL Server IaaS Query Service does not need Local System rights, as it only executes T-SQL for the automated administrative tasks. It is started with an NT Service account that is a sysadmin on the SQL Server instance. The SQL Server IaaS Extension powers the SQL Server configuration blade in the Azure portal; if you lower the SQL Server permissions for the NT Service\SQLIaaSExtensionQuery account, you will not be able to successfully use the SQL Server configuration blade in the portal.
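If you want to confirm the permission level of the query service account on your instance, a quick check (the account name is taken from the paragraph above):

-- Returns 1 if the extension's query service account is a member of the sysadmin role
SELECT IS_SRVROLEMEMBER('sysadmin', 'NT Service\SQLIaaSExtensionQuery') AS is_sysadmin;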

Meet the Azure SQL Database team at PASS Summit 2018


The Azure SQL Database engineering team (@azuresqldb), represented by Lindsey Allen, Joachim Hammer, Ajay Jagannathan, Joe Sack, Borko Novakovic, Mine Tokus, Kevin Farlee, Xiaochen Wu, Alice Kupcik, Jakub Szymaszek, Shreya Verma, Perry Skountrianos, Alain Dormehl, and Mirek Sztajno, will be out in full force, speaking at several sessions at the PASS Summit 2018 conference (the largest technical conference for the Microsoft data platform community). The conference will take place Nov 6-9, 2018, in Seattle, WA.

The team will present several sessions and share some of the cool new innovations in Azure SQL Database to help you with your cloud journey. In some of these sessions, we will also share success stories from customers that have successfully deployed and are running their mission-critical workloads on Azure SQL Database. Feel free to stop by to attend and learn from our sessions, interact with the team, share your feedback, and ask any questions that you may have about your ongoing projects with Azure SQL Database.

Here is a complete list of sessions that we will be presenting.

Breakout sessions:

Date | Time | Session | Room | Speaker | Co-presenters
2018/11/07 | 10:15 – 11:30 | Introducing Azure SQL Database Hyperscale | TCC Tahoma 5 | Ajay Jagannathan | Xiaochen Wu, Kevin Farlee, Lindsey Allen
2018/11/07 | 13:30 – 14:45 | Improving Availability in SQL Server and Azure SQL Database | 606 | Perry Skountrianos | Mirek Sztajno
2018/11/07 | 16:45 – 18:00 | SQL DB Security Overview | 606 | Joachim Hammer | Mirek Sztajno, Alice Kupcik
2018/11/08 | 10:15 – 11:30 | Meet the Modern SQL Server: Graph and Machine Learning Services | 606 | Nellie Gustafsson | Shreya Verma
2018/11/08 | 10:15 – 11:30 | What’s New – Query Performance Insights | 2AB | Pedro Lopes | Joseph Sack
2018/11/08 | 10:45 – 12:00 | Azure SQL DB Managed Instances – Built to Easily Modernize Application Data Layer | TCC Skagit 4 | Borko Novakovic
2018/11/08 | 13:30 – 14:45 | Enhanced SQL Server on Azure VM Service | TCC Yakima 1 | Mine Tokus
2018/11/09 | 09:30 – 10:45 | SQL DB Managed Instance – Best Practices and Lessons Learned | TCC Tahoma 5 | Dimitri Furman | Mike Weiner, Kun Cheng

Theater sessions:

Date | Time | Session | Room | Speaker
2018/11/07 | 14:00 – 14:20 | Use Azure SQL Data Sync to build a Hybrid SQL data platform | Microsoft Booth #505 | Xiaochen Wu
2018/11/08 | 10:00 – 10:20 | Confidential computing with Always Encrypted with secure enclaves in SQL Server | Microsoft Booth #505 | Jakub Szymaszek
2018/11/08 | 14:30 – 14:50 | Use Azure SQL Data Sync to build a Hybrid SQL data platform (repeat session) | Microsoft Booth #505 | Xiaochen Wu
2018/11/09 | 11:00 – 11:20 | Confidential computing with Always Encrypted with secure enclaves in SQL Server (repeat session) | Microsoft Booth #505 | Jakub Szymaszek

In addition to these sessions, you can interact with various members of our engineering team at

  • Birds of a feather luncheon – Azure SQL Database table (Dining Hall 4EF, W/Th – 11:30-1:30, Fri 12-2)
  • Microsoft Data Clinic (4C – W/Th 9:30-6, Fri 10-2)
  • Azure SQL Database booth (4B – Exhibit Hall W/Th 9:45-3:30, Fri 10:30-2)

Introducing Scalar UDF Inlining


Last year, SQL Server 2017 and Azure SQL Database introduced query processing improvements that adapt optimization strategies to your application workload’s runtime conditions. These improvements included batch mode adaptive joins, batch mode memory grant feedback, and interleaved execution for multi-statement table-valued functions.

In SQL Server 2019 preview, we are further expanding query processing capabilities with several new features under the Intelligent Query Processing (QP) feature family.  In this blog post we’ll discuss Scalar T-SQL UDF Inlining, one of these Intelligent QP features that is now available in public preview with SQL Server 2019 CTP 2.1.

T-SQL UDFs are an elegant way to achieve code reuse and modularity across SQL queries. Some computations (such as complex business rules) are easier to express in imperative UDF form. UDFs help in building up complex logic without requiring expertise in writing complex SQL queries. Despite these benefits, their inferior performance discourages or even prohibits their use in many situations.

The goal of the Scalar UDF inlining feature is to improve performance of queries that invoke scalar UDFs, where UDF execution is the main bottleneck.

Why are scalar UDFs slow today?
Years ago, when scalar UDFs were introduced¹, they opened a way for users to express business logic using familiar constructs such as variable assignments, IF-ELSE branching, loops, etc. Consider the following scalar UDF that, given a customer key, determines the service category for that customer. It arrives at the category by first computing the total price of all orders placed by the customer using a SQL query, and then uses IF-ELSE logic to decide the category based on the total price.

CREATE OR ALTER FUNCTION dbo.customer_category(@ckey INT)
RETURNS CHAR(10) AS
BEGIN
       DECLARE @total_price DECIMAL(18,2);
       DECLARE @category CHAR(10);
      
       SELECT @total_price = SUM(O_TOTALPRICE) FROM ORDERS WHERE O_CUSTKEY = @ckey;
   
       IF @total_price < 500000
              SET @category = 'REGULAR';
       ELSE IF @total_price < 1000000
              SET @category = 'GOLD';
       ELSE
              SET @category = 'PLATINUM';
       RETURN @category;
END

This is very handy as a UDF because it can now be used in multiple queries, and if the threshold values need to be updated, or a new category needs to be added, the change must be made only in the UDF. Now, consider a simple query that invokes this UDF.

-- Q1: 
SELECT C_NAME, dbo.customer_category(C_CUSTKEY) FROM CUSTOMER;

The execution plan for this query in SQL Server 2017 (compatibility level 140 and earlier) is as follows:

As the plan shows, SQL Server adopts a simple strategy here: for every tuple in the CUSTOMER table, invoke the UDF and output the result. This strategy is quite naïve and inefficient. Such queries end up performing poorly for the following reasons:

  • Iterative invocation: UDFs are invoked in an iterative manner, once per qualifying tuple. This incurs additional costs of repeated context switching due to function invocation. UDFs that execute SQL queries in their body are especially severely affected.
  • Lack of costing: During optimization, only relational operators are costed, while scalar operators are not. Prior to the introduction of scalar UDFs, other scalar operators were generally cheap and did not require costing. A small CPU cost added for a scalar operation was enough.
  • Interpreted execution: UDFs are evaluated as a batch of statements, executed statement-by-statement. Note that each statement itself is compiled, and the compiled plan is cached. Although this caching strategy saves some time as it avoids recompilations, each statement executes in isolation. No cross-statement optimizations are carried out.
  • Serial execution: SQL Server does not use intra-query parallelism in queries that invoke UDFs. There are several reasons for this, and it might be a good topic for another blog post.

What changes with the new Scalar UDF inlining feature?
With this new feature, scalar UDFs are transformed into scalar expressions or scalar subqueries which are substituted in the calling query in place of the UDF operator. These expressions and subqueries are then optimized. As a result, the query plan will no longer have a user-defined function operator, but its effects will be observed in the plan, like views or inline TVFs. To understand this better, let’s first consider a simple example.

-- Q2 (Query with no UDF): 
SELECT L_SHIPDATE, O_SHIPPRIORITY, SUM (L_EXTENDEDPRICE *(1 - L_DISCOUNT))
FROM LINEITEM INNER JOIN ORDERS ON O_ORDERKEY = L_ORDERKEY
GROUP BY L_SHIPDATE, O_SHIPPRIORITY ORDER BY L_SHIPDATE

This query computes the sum of discounted prices for line items and presents the results grouped by the shipping date and shipping priority. The expression L_EXTENDEDPRICE * (1 - L_DISCOUNT) is the formula for the discounted price of a given line item. It makes sense to create a function that computes it, so that it can be used wherever a discounted price needs to be computed.

-- Scalar UDF to encapsulate the computation of discounted price
CREATE FUNCTION dbo.discount_price(@price DECIMAL(12,2), @discount DECIMAL(12,2))
RETURNS DECIMAL (12,2) AS
BEGIN
       RETURN @price * (1 - @discount);
END

Now, we can modify query Q2 to use this UDF as follows:

-- Q3 (Query with UDF): 
SELECT L_SHIPDATE, O_SHIPPRIORITY, SUM (dbo.discount_price(L_EXTENDEDPRICE, L_DISCOUNT))
FROM LINEITEM INNER JOIN ORDERS ON O_ORDERKEY = L_ORDERKEY
GROUP BY L_SHIPDATE, O_SHIPPRIORITY ORDER BY L_SHIPDATE

Due to the reasons outlined earlier, Q3 performs poorly compared to Q2. Now, with scalar UDF inlining, SQL Server substitutes the scalar expression directly into the query, and thereby overcomes the limitations of UDF evaluation. The results of running this query² are shown in the table below:

Query | Execution time
Q2 (no UDF) | 1.6 seconds
Q3 without inlining | 29 minutes 11 seconds
Q3 with inlining | 1.6 seconds
As we can see, Q3 without inlining is prohibitively slow compared to Q2. But with scalar UDF inlining, the performance of Q3 is on par with Q2, with almost no overheads at all! We get all the benefits of UDFs, without compromising on query performance. Moreover, observe that there were no modifications made to the query or the UDF; it just runs faster!

What about more complex, multi-statement scalar UDFs?
With scalar UDF inlining, SQL Server can now inline multi-statement UDFs as well. Let us consider the function dbo.customer_category and query Q1 given above to understand how this works. For query Q1, the query plan with the UDF inlined is shown below.

Here are some key observations from the above plan:

  1. SQL Server has inferred the implicit join between CUSTOMER and ORDERS and made that explicit via a Join operator.
  2. SQL Server has also inferred the implicit GROUP BY O_CUSTKEY on ORDERS and has used the Index Spool and Stream Aggregate to implement it.
  3. SQL Server is using parallelism across all operators.

Depending upon the complexity of the logic in the UDF, the resulting query plan might also get bigger and more complex. As we can see, the operations inside the UDF are now no longer a black box, and hence the query optimizer is able to cost and optimize those operations. Also, since the UDF is no longer in the plan, iterative UDF invocation is replaced by a plan that avoids function call overhead.

What are the advantages of scalar UDF inlining?
As described above, scalar UDF inlining enables users to use scalar UDFs without worrying about performance overhead, thereby encouraging users to build modular, reusable applications.

In addition to resulting in set-oriented, parallel plans for queries with UDFs, this feature has another advantage. Since scalar UDFs are no longer interpreted (i.e. executed statement-by-statement), it enables optimizations such as dead code elimination, constant folding and constant propagation. Depending upon the UDF, these techniques might lead to simpler, more efficient query plans.

What kind of scalar UDFs are inlineable?
A fairly broad set of scalar UDFs is currently inlineable. There are a few limitations, such as the T-SQL constructs allowed in the UDF; please refer to this page for a complete description of the “inlineability” of scalar UDFs.

How do I know if my scalar UDF is inlineable?
The sys.sql_modules catalog view includes a property called “is_inlineable”, which indicates whether a UDF is inlineable or not. A value of 1 indicates that it is inlineable, and 0 indicates otherwise. This property will also have a value of 1 for inline table-valued functions, since they are inlineable by definition.
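For example, the following sketch lists the scalar and inline table-valued functions in the current database together with this property:

SELECT OBJECT_SCHEMA_NAME(m.object_id) AS schema_name,
       OBJECT_NAME(m.object_id) AS function_name,
       m.is_inlineable
FROM sys.sql_modules AS m
JOIN sys.objects AS o ON o.object_id = m.object_id
WHERE o.type IN ('FN', 'IF');  -- scalar and inline table-valued functions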

When is scalar UDF inlining beneficial? And when is it not?
As mentioned earlier, this feature is most beneficial when UDF execution is the main bottleneck in a query. If the bottleneck is elsewhere, there may not be any benefits. For instance, if a scalar UDF is invoked only a few times in the query, then inlining might not lead to any gains. There could be a few other scenarios where inlining might not be beneficial. Inlining can be turned off for such UDFs using the INLINE=OFF option in the CREATE/ALTER FUNCTION statement.
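As a sketch, here is how the dbo.discount_price function from earlier could be opted out of inlining using that option:

CREATE OR ALTER FUNCTION dbo.discount_price(@price DECIMAL(12,2), @discount DECIMAL(12,2))
RETURNS DECIMAL(12,2)
WITH INLINE = OFF  -- opt this UDF out of scalar UDF inlining
AS
BEGIN
       RETURN @price * (1 - @discount);
END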

I’d like to test this new feature.  How do I get started?
This feature is enabled by default under database compatibility level 150. To enable the public preview of scalar UDF inlining in SQL Server 2019 CTP 2.1, enable database compatibility level 150 for the database you are connected to when executing the query:

USE [master];
GO
ALTER DATABASE [DatabaseName] SET COMPATIBILITY_LEVEL = 150;
GO
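You can verify the compatibility level of your databases with a quick query:

SELECT name, compatibility_level
FROM sys.databases
WHERE name = N'DatabaseName';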

I’d like to know more about this feature. Where can I find more details?
More information about this feature can be found on this page. We will be updating this space with more links to documentation and examples as they become available. The underlying techniques that make this feature work are described in a recent research publication titled “Froid: Optimization of Imperative Programs in a Relational Database”.

Where can I provide feedback?
If you have feedback on this feature or other features in the Intelligent QP feature family, please email us at IntelligentQP@microsoft.com.

Footnotes:

  1. Scalar UDFs have been around for a while in SQL Server, at least since SQL Server 2000!
  2. These numbers are based on a TPC-H 10GB CCI dataset, running on a machine with dual processor (12 core), 96GB RAM, backed by SSD. The numbers include compilation and execution time with a cold procedure cache and buffer pool. The default configuration was used, and no other indexes were created.

Public preview of derived tables and views on graph tables in MATCH queries


SQL Server 2017 and Azure SQL Database introduced native graph database capabilities for modeling many-to-many relationships. The first implementation of SQL Graph introduced support for nodes to represent entities, edges to represent relationships, and a new MATCH predicate to support graph pattern matching and traversal.

We will be further expanding the graph database capabilities with several new features. In this blog we will discuss one of those features, now available for public preview in Azure SQL Database and SQL Server 2019 CTP 2.1: the use of derived tables and views on graph tables in MATCH queries.

Graph queries on Azure SQL Database now support using view and derived table aliases in the MATCH syntax. To use these aliases in MATCH, the view or derived table must be created either on a single node or edge table (which may or may not have filters on it) or on a set of node or edge tables combined together using the UNION ALL operator. The ability to use derived table and view aliases in MATCH queries can be very useful in scenarios where you are looking to query heterogeneous entities, or heterogeneous connections between two or more entities, in your graph.

In this post we will discuss a few examples, based on a graph schema created in the database, to see how derived tables and views can be used to query heterogeneous associations in a graph. The scripts to create the required graph schema can be downloaded from here. In this graph, we see two types of heterogeneous associations between entities:

  1. Heterogeneous Nodes: a node is connected to two or more node types via the same edge in the graph. For example, consider that WideWorldImporters would like to find all the customers who bought a StockItem from them. Customers of WideWorldImporters can be either an individual Customer or a Supplier, and both Customer and Supplier are connected to StockItem via the same edge type, bought. Hence, the query will need to find a Supplier or a Customer who bought a StockItem from WideWorldImporters.
  2. Heterogeneous Edges: two nodes or entities in a graph are connected to each other via two or more relationships or edges. For instance, in the WideWorldImporters example above, to find a Supplier who operates in a given City, the query has to find a Supplier who either is locatedIn or takes deliveryIn that City.

These types of queries are generally implemented in the relational model by keeping an extra type or ID column on one of the tables. Queries extract the required information or rows of data based on the value of this extra type or ID column. But as the schema and application evolve with more data and relationships, writing queries that involve joins and filters on multiple ID or type columns may not be trivial. With derived table/view support in graph queries, users can write such queries easily using simple MATCH syntax. In the following section, we will look at some examples to understand how this can be done.

Querying heterogeneous edges

Consider that WideWorldImporters wants to find all the Suppliers that operate in a City. As we discussed earlier, this means they have to find all the suppliers who either are located in or take delivery in the given city. WideWorldImporters can define a view on the heterogeneous relationships involved and then use the view alias in the MATCH query as follows:

CREATE VIEW OperatesIn AS
SELECT *, 'located' AS relation FROM locatedIn
UNION ALL
SELECT *, 'delivery' FROM deliveryIn
GO

Now they can use the OperatesIn view in the following query, and in other queries that involve querying the same relationships.

SELECT SupplierID, SupplierName, PhoneNumber, relation
FROM Supplier,
City,
OperatesIn
WHERE MATCH(Supplier-(OperatesIn)->City)
AND City.CityName = 'San Francisco'

This query will return information about all the suppliers who operate in San Francisco.

Querying heterogeneous nodes connected via same edge

WideWorldImporters wants to find all of their customers located in San Francisco. This means they have to find all the Customers (distributors or organizations) and Suppliers who are locatedIn San Francisco. They can create a view to combine all the heterogeneous types of customers into one entity as follows:

CREATE VIEW Customer AS
SELECT SupplierID AS ID,
SupplierName AS NAME,
SupplierCategory AS CATEGORY
FROM Supplier
UNION ALL
SELECT CustomerID,
CustomerName,
CustomerCategory
FROM Customers
GO

Now, to find all customers locatedIn San Francisco, they can run the following MATCH query:

SELECT Customer.ID, Customer.NAME, Customer.CATEGORY
FROM Customer,
City,
locatedIn
WHERE MATCH(Customer-(locatedIn)->City)
AND City.CityName = 'San Francisco'

Querying heterogeneous nodes and edges

Extending the scenario from the first example above, let’s consider that WideWorldImporters now wants to find all the customers (distributors, organizations, or suppliers) who operate in San Francisco. Note that here both the involved edges (locatedIn and deliveryIn) and the involved customer nodes (Supplier and Customers) are heterogeneous. Since WideWorldImporters has already created the Customer and OperatesIn views, they can write the following MATCH query to get the desired results:

SELECT Customer.ID, Customer.NAME, Customer.CATEGORY
FROM Customer,
City,
OperatesIn
WHERE MATCH(Customer-(OperatesIn)->City)
AND City.CityName = 'San Francisco'

Nested derived tables or views on node or edge tables

Assume that WideWorldImporters wants to find the customers or suppliers who are a store or supplier of novelty goods, toys, or gifts. They can create the following views on the Supplier and Customers tables to filter the required rows:

CREATE VIEW Novelty_Supplier AS
SELECT SupplierID,
SupplierName ,
SupplierCategory ,
ValidTo
FROM Supplier
WHERE SupplierCategory LIKE '%Novelty%' OR SupplierCategory LIKE '%Toy%'
GO

CREATE VIEW Novelty_Customer AS
SELECT CustomerID,
CustomerName,
CustomerCategory,
ValidTo
FROM Customers
WHERE CustomerCategory LIKE '%Novelty%' OR CustomerCategory LIKE '%Gift%'
GO

Now, they want to find all the stores or suppliers of novelty goods, toys, or gifts that operate in San Francisco and have purchased ‘White Chocolate Snow Balls 250g’ from WideWorldImporters. They also want to make sure that the supplier or customer still has a valid membership with WideWorldImporters. The following query gathers all this information:

SELECT Name, ID, Category
FROM
(SELECT SupplierID AS ID, SupplierName AS Name,
SupplierCategory AS Category, ValidTo
FROM Novelty_Supplier WHERE ValidTo > getdate()
UNION ALL
SELECT CustomerID, CustomerName, CustomerCategory, ValidTo
FROM Novelty_Customer WHERE ValidTo > getdate()) AS NoveltyCust,
StockItems,
bought,
OperatesIn,
City
WHERE MATCH(City<-(OperatesIn)-NoveltyCust-(bought)->StockItems)
AND StockItemName = 'White chocolate snow balls 250g'
AND City.CityName = 'San Francisco'
GO

Conclusion

The ability to use view and derived table aliases in a MATCH query makes many scenarios easier. For example, for fraud detection in banking, finance, or insurance organizations, one often needs to find the heterogeneous relationships that a given customer shares with other customers in the organization. Derived tables on node or edge tables make writing those queries easy, and these derived tables can be used in several places in an application for different types of queries.

It is now possible to use view and derived table aliases within a MATCH query in Azure SQL Database. For this feature to work, the view or derived table alias must be defined on

  • Either a node or an edge table with some filter(s) on it
  • Or a view or derived table which combines several node or edge tables together using the UNION ALL operator.

Combining node or edge tables using other operators, such as JOIN, is possible, but such view and derived table aliases cannot be used inside a MATCH query.

Next Steps

You can now use derived table and view aliases within a graph MATCH query in Azure SQL Database and SQL Server 2019 CTP 2.1. Please give it a try and send us your feedback.

What Azure permissions are required to create SQL Managed Instance?


Azure SQL Managed Instance is a fully managed SQL Server instance hosted in the Azure cloud and placed in your Azure VNet. Users who create instances need certain permissions. In this post you will see the minimal permissions required to create a managed instance.

If you are the owner of your Azure subscription, you can create Azure SQL Managed Instances and configure all the required network settings. However, if you want to delegate these actions to someone without giving them full rights, you need to assign specific permissions to their role.

The minimal set of permissions that a role must have in order to create new managed instances is:

  • Microsoft.Resources/deployments/*
  • Microsoft.Sql/managedInstances/write
  • Microsoft.Sql/servers/write (a temporary requirement that will be removed soon)

A role with these permissions can create new instances in an existing, already-configured subnet (i.e., a subnet where at least one instance has been deployed in the past). However, this role cannot create instances in a new subnet, because it doesn’t have the permissions necessary to configure the network. If you want to grant permissions to configure a managed instance in an empty subnet, you need to add the following permissions to the role:

  • Microsoft.Network/networkSecurityGroups/write
  • Microsoft.Network/routeTables/write
  • Microsoft.Network/virtualNetworks/subnets/write
  • */join/action

These permissions enable the role to create the required network security group and route table, and to configure the subnet with these objects.

You can add these permissions to existing roles, or create a new role using something like the following PowerShell script:

Connect-AzureRmAccount
Select-AzureRmSubscription '......'
$role = Get-AzureRmRoleDefinition -Name Reader
$role.Id = $null                # clear the built-in role id so a new custom role is created
$role.Name = "SQL Managed Instance Creator"
$role.Description = "Lets you create Azure SQL Managed Instance in the prepared network/subnet with virtual cluster."
$role.IsCustom = $true
$role.Actions.Add("Microsoft.Resources/deployments/*")
$role.Actions.Add("Microsoft.Sql/managedInstances/write")
$role.Actions.Add("Microsoft.Sql/servers/write")   # temporary requirement, see above
$role.Actions.Add("*/join/action")
$role.AssignableScopes.Clear()                     # a custom role cannot keep the built-in "/" scope
$role.AssignableScopes.Add("/subscriptions/<your subscription id>")
New-AzureRmRoleDefinition $role

Since these permissions can change, always check the latest documentation to find the latest rules.


Diagnostic Data for Synchronous Statistics Update Blocking


Consider the following query execution scenario:

  • You execute a SELECT query that triggers an automatic synchronous statistics update.
  • The synchronous statistics update begins execution and your query waits (is essentially blocked) until the fresh statistics are generated.
  • Query compilation and execution do not resume until the synchronous statistics update operation completes.

During this time, there are no external signs via common troubleshooting channels that the query is specifically waiting for the synchronous statistics update operation to complete. If the statistics update takes a long time (due to a large table and/or a busy system), there is no easy way to determine the root cause of the high duration.

This is not an uncommon scenario and up until now there has been a lack of obvious telemetry surfaced to the customer that helps them (or Microsoft customer support) diagnose the root cause of this type of slow-running query.

In SQL Server 2019 CTP 2.1 (and coming soon to Azure SQL Database), we have introduced new diagnostic data to help troubleshoot this specific scenario…

When a query is blocked behind a synchronous statistics update, the command column in sys.dm_exec_requests will now show ‘SELECT (STATMAN)’ while the statistics update is happening in the background, and will revert to the original command name after the statistics update operation is finished.

Additionally, the new WAIT_ON_SYNC_STATISTICS_REFRESH wait type will measure aggregated wait time (blocks) on synchronous statistics updates. This wait time accumulation will be available in the sys.dm_os_wait_stats dynamic management view.
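Both signals are easy to query; here is a small sketch (the column choices are illustrative):

-- Requests currently blocked behind a synchronous statistics update
SELECT session_id, command, wait_type, wait_time, total_elapsed_time
FROM sys.dm_exec_requests
WHERE command = 'SELECT (STATMAN)';

-- Accumulated wait time for the new wait type
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'WAIT_ON_SYNC_STATISTICS_REFRESH';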

We believe these two small changes will help address a significant diagnostic gap. For feedback or questions, please reach out to us at IntelligentQP@microsoft.com.


Create Azure SQL Managed Instance using Azure CLI


The Azure command-line interface (CLI) is a set of commands that you can use to create and manage Azure resources. In this post you will see how you can create a Managed Instance using Azure CLI.

The Azure CLI is Microsoft’s cross-platform command-line experience for managing Azure resources. You can use it in your browser with Azure Cloud Shell, or install it on macOS, Linux, or Windows and run it from the command line.

Azure CLI enables you to create new managed instances using the az sql mi create command.

Prerequisite

The prerequisite for this sample is to prepare the Azure resource group, virtual network, and subnet where the Managed Instance will be created, using the instructions described here. The easiest way to set up the environment is to use an ARM template deployment, as described in this post.

Make sure that you select the subscription that you will use, with a command like the following:

az account set --subscription b9c7a824-4bde-06c0-9778-e7c2a70573e1

Replace the value b9c7a824-4bde-06c0-9778-e7c2a70573e1 with your subscription id.

Creating Managed Instance

The command az sql mi create will create a new Managed Instance:

az sql mi create -n jovanpop-temp-mi -u myMiAdmin -p Stron9Pwd1234 \
           -g mi_group -l "West Central US" \
           --vnet-name mi_vnet --subnet mi_subnet

In this command you need to specify the name of the new managed instance (-n), the admin username (-u) and password (-p), the resource group where it should be placed (-g), the location (data center) where the instance will be placed (-l), and the VNet/subnet where the instance will be configured.

If the command succeeds, you will see the properties of the created Managed Instance in the output as JSON text. Note that the newly created Managed Instance will not be shown under Azure deployments.

You can also specify the following properties of new Managed Instance in this command:

  • -c: the number of cores that will be assigned to the instance
  • --storage: the storage size, expressed in GB
  • --license-type: can be BasePrice or LicenseIncluded
  • --tier: GeneralPurpose or BusinessCritical
  • --family: the hardware family, which can be Gen4 or Gen5

A command that specifies all these properties is shown in the following example:

az sql mi create -n jovanpop-temp-mi -u myMiAdmin -p StrongPwd1234 \
            -g mi_group \
            -l "West Central US" \
            --vnet-name mi_vnet --subnet mi_subnet \
            -c 8 --storage 256 --tier BusinessCritical \
            --license-type LicenseIncluded --family Gen4

When this command finishes, you can get the details of the created instance using the following command:

az sql mi show -n jovanpop-temp-mi -g mi_group

Modify Azure SQL Database Managed Instance using Azure CLI


Azure SQL Managed Instance is a fully managed SQL Server Database Engine hosted in the Azure cloud. With Managed Instance you can easily add or remove the cores associated with the instance and change its reserved size. You can use Azure CLI to easily manage the size of the instance and automate this process.

The Azure CLI is Microsoft’s cross-platform command-line experience for managing Azure resources. You can use it in your browser with Azure Cloud Shell, or install it on macOS, Linux, or Windows and run it from the command line.

Azure CLI enables you to modify the properties of existing Managed Instances using the az sql mi update command. If you don’t have a managed instance, you can find more information here about creating an instance using Azure CLI.

Prerequisite

Make sure that you select the subscription that you will use, with a command like the following:

az account set --subscription b9c7a824-4bde-06c0-9778-e7c2a70573e1

Replace the value b9c7a824-4bde-06c0-9778-e7c2a70573e1 with your subscription id.

Modify instance

Now you are ready to change the characteristics of the instance using az sql mi update command:

az sql mi update -g mi_group -n jovanpop-temp-mi --storage 64

This command can be very useful when you need to scale the instance up or down by changing the assigned storage size limit or the number of cores.

The parameters that can be provided to this command are:

  • -g: resource group
  • -n: name of the instance
  • --storage: new max instance storage size
  • --capacity: new number of cores that will be assigned to the instance
  • --admin-password: new administrator password
  • --license-type: new license type, which can be BasePrice or LicenseIncluded
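For example, the following command (with illustrative values) scales the instance to 16 cores and switches it to Azure Hybrid Benefit (BasePrice) licensing:

az sql mi update -g mi_group -n jovanpop-temp-mi --capacity 16 --license-type BasePrice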

Note that the update process is not instant. For bigger instances, Azure might need to move your databases to a virtual machine that has enough resources to host your instance with the new size.

Azure SQL Managed Instance Business Critical tier is Generally Available


We are happy to announce the general availability of the Business Critical tier in Azure SQL Managed Instance, an architectural model built for high-performance and IO-demanding databases.

After a five-month public preview period, the Azure SQL Managed Instance Business Critical service tier is generally available.

The Azure SQL Managed Instance Business Critical tier is built for high-performance databases and applications that require low IO latency of 1-2 ms on average, with up to 100K IOPS, achieved using the fast local SSD storage that this tier uses for database files.

The Azure SQL Managed Instance Business Critical tier is a cluster of SQL Server database engines with one primary read/write replica, one free-of-charge read-only replica, and two hidden replicas that ensure 99.99% availability.

High availability is implemented using SQL Server Always On technology, which ensures that changes made on the primary replica are replicated to at least two replicas before the transaction is committed; the changes are synchronously written into the log files of the secondary nodes.

With Business Critical General availability, we are also enabling a set of new features:

You can reduce the cost of your Managed Instances by up to 80 percent with the Azure Hybrid Benefit and the new reserved capacity pricing, which is now available for the Business Critical service tier. For dev/test workloads, we recommend our Enterprise Dev/Test or Pay-As-You-Go Dev/Test pricing to save up to 55 percent off license-included rates.

You can find more information about Managed Instance in the general availability announcement blog post and on the Azure SQL Database documentation page.

Recreate dropped database on Azure SQL Managed Instance


Azure SQL Database Managed Instance is a fully managed PaaS service that provides advanced disaster-recovery capabilities. Even if you accidentally drop a database, or someone drops your database as part of a security attack, Managed Instance enables you to easily recover the dropped database.

Azure SQL Managed Instance performs automatic backups of your database every 5-10 minutes. If anything happens to your database, even if someone drops it, your data is not lost. Managed Instance enables you to easily re-create the dropped database from the automatic backups. Just make sure that you have the following prerequisites:

  • PowerShell version >= 5.1.17763.134
  • AzureRM.Resources  version >=  6.7.3

I had some hard-to-debug problems executing this script on a machine with a lower version of the AzureRM library, so it is better not to experiment with earlier versions.

Now, let’s imagine that someone dropped your database. Below you can find the PowerShell script that can restore it.

Recreate your dropped database

Before you run this script, you need to log in to your Azure account and select the subscription where your database was dropped, using a script like the following:

Login-AzureRmAccount
$subscriptionId = "cd827379-9270-0791-....."

Select-AzureRmSubscription -SubscriptionId $subscriptionId

Now you are ready to restore your database.
First, populate information about the instance and database (in this case AdventureWorksDW) that you want to recover:

$subscriptionId = "cd827379-9270-0791-....."
$resourceGroup = "rg_recovery"
$location = "West Central US"
$managedInstanceName = "jovanpop-try-re-create-db"
$deletedDatabaseName = "AdventureWorksDW"
$targetDatabaseName = "NewAdventureWorksDW"

In this example, dropped AdventureWorksDW on the instance jovanpop-try-re-create-db in West Central US region will be re-created as NewAdventureWorksDW database. Then, you can run the following script that uses these variables to recreate it:

$db = Get-AzureRmResource -ResourceId "/subscriptions/$subscriptionId/resourceGroups/$resourceGroup/providers/Microsoft.Sql/managedInstances/$managedInstanceName/restorableDroppedDatabases" -ApiVersion "2017-03-01-preview" |
         Where-Object { $_.Properties.databaseName -eq $deletedDatabaseName } | 
         Sort-Object -Property @{Expression={$_.Properties.deletionDate}; Ascending = $False} | 
         Select-Object  -First 1

Write-Host "Database $($db[0].Properties.databaseName) created on $($db[0].Properties.creationDate) dropped on $($db[0].Properties.deletionDate)"

$properties = New-Object System.Object
$properties | Add-Member -type NoteProperty -name CreateMode -Value "PointInTimeRestore"
$properties | Add-Member -type NoteProperty -name RestorePointInTime -Value $db.Properties.deletionDate
$properties | Add-Member -type NoteProperty -name RestorableDroppedDatabaseId -Value $db.ResourceId
New-AzureRmResource `
        -Location $location -Properties $properties `
        -ResourceId "subscriptions/$subscriptionId/resourceGroups/$resourceGroup/providers/Microsoft.Sql/managedInstances/$managedInstanceName/databases/$targetDatabaseName" `
        -ApiVersion "2017-03-01-preview" `
       -Force 

As a result, a new database called NewAdventureWorksDW will be created as a copy of the dropped AdventureWorksDW database. Just note that some properties, such as backup retention, will be reset to their default values.

NOTE: The drop time reported in the list might not be correct, due to a known issue with the duration of the last log backup that should be fixed by the end of February. The reported drop time might lag behind the actual drop time that should be used to re-create the database. If this script doesn’t work in your case, subtract a few minutes, or enter the actual date when you deleted the database if you know it.


Getting started with Azure SQL Managed Instance


Azure SQL Managed Instance is a fully managed PaaS version of SQL Server, hosted in the Azure cloud and placed in your own VNet with a private IP address. In this post, I will briefly explain how to configure and create a Managed Instance, including the network environment, how to migrate your databases, and how to manage your databases after migration.

I will explain the following topics in this article:

  • Configure network environment where Managed Instance will be created
  • Create Managed Instance
  • Assess your databases to check whether they can be migrated
  • Migrate your databases
  • Manage your databases after migration

Configuring network environment

Managed Instance is placed in an Azure VNet, so you need to create an Azure VNet and a subnet where the instance will be placed. Although the VNet/subnet can be configured automatically when the instance is created, it is good to create them as a first step, because then you can configure the parameters of the VNet.

The easiest way to create and configure the network environment is to use an Azure Resource Manager deployment template that will create and configure your network and the subnet where the instance will be placed. You just need to press the Azure Resource Manager deploy button and populate the form with parameters. As an alternative, you can use the PowerShell script described here.

If you already have a VNet and subnet where you would like to deploy your Managed Instance, you need to make sure that they satisfy the networking requirements. Use this PowerShell script to verify that your subnet is properly configured. The script will not just validate your network and report issues; it will also tell you what should be changed and offer to make the necessary changes in your VNet/subnet. Run it if you don’t want to configure your VNet/subnet manually, and also after any major reconfiguration of your network infrastructure. If you want to create and configure your own network, read the Managed Instance documentation and this guide.

Creating managed instance

Once you have prepared the network environment, you can create your first Managed Instance. The easiest way to create it is to use the Azure portal and configure all the necessary properties. If you have not created the network environment as described in the previous step, the Azure portal can do it for you; the only drawback is that it will configure it with some default parameters that you cannot change later. As an alternative, you can use PowerShell, PowerShell with an ARM template, or Azure CLI.

Just make sure that you have a subscription type that is allowed to create the instances.

Connecting to Managed Instance

Once you have created your Managed Instance, you need a way to connect to it. Remember that Managed Instance is your private service, placed on a private IP inside your VNet, so you cannot just connect via some public IP (this might change in the future). There are several ways to set up a connection to Managed Instance:

  • Create an Azure virtual machine, with SSMS and other apps installed that can be used to access your Managed Instance, in a subnet within the same VNet where your Managed Instance is placed. The VM cannot be in the same subnet as your Managed Instance.
  • Set up a point-to-site connection on your computer, which enables you to “join” your computer to the VNet where the Managed Instance is placed and use the Managed Instance like any other SQL Server in your network.
  • Connect your local network using ExpressRoute or a site-to-site connection.

Assessing your databases

Now that you have prepared your Managed Instance, you can start migrating your databases from SQL Server to the cloud.

The first thing you need to do is ensure that there are no critical differences between your SQL Server and Managed Instance. You can find a high-level list of the features supported in Managed Instance here, and details and known issues here.

Instead of reading documentation and searching for incompatibilities, it is easier to install the Data Migration Assistant (DMA). This tool will analyze your database on SQL Server and find any issues that could block migration to Managed Instance, such as the existence of FILESTREAM, multiple log files, etc. If you resolve these issues, your databases are ready to move to Managed Instance.

Another way is to script your empty database using SSMS or SSDT, try to create all the objects on Managed Instance, and check whether there are any errors; but DMA is much easier to use.

Database Experimentation Assistant (DEA) is another useful tool that can record your workload on SQL Server and replay it on Managed Instance, so you can determine whether there will be any performance issues if you migrate. The technical characteristics of Managed Instance are documented here, but DEA will enable you to more easily check whether your instance fits your performance needs.

Migrating databases

Finally, you can start migrating your databases from SQL Server to Managed Instance. There are several ways to move your database:

  • Native RESTORE functionality, which enables you to create a backup of your database, upload it to Azure blob storage, and RESTORE the database from the blob storage (see the sketch after this list). This is probably the fastest approach for migration, but it requires downtime because your database cannot be used until you restore it on Managed Instance.
  • Data Migration Service, which can migrate your database with minimal downtime.
  • Exporting and importing your database as a .bacpac file, or using the bcp tool; there is no big advantage to these methods compared to RESTORE/DMS, except when .bacpac export is integrated into your DevOps pipeline.
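A minimal sketch of the native restore path, assuming the backup has already been uploaded to a blob container (the storage account, container, and file names are hypothetical):

-- Credential that lets the instance read the backup (SAS token without the leading '?')
CREATE CREDENTIAL [https://mystorageaccount.blob.core.windows.net/backups]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<SAS token>';

-- Managed Instance restores directly from URL and places the files automatically
RESTORE DATABASE WideWorldImporters
FROM URL = 'https://mystorageaccount.blob.core.windows.net/backups/WideWorldImporters.bak';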

You can migrate up to 100 databases to a single Managed Instance.

Managing databases after migrations

Many management functions such as backups are handled by Managed Instance and don’t require your involvement. However, there are some best practices, tools, and scripts that you might add to your Managed Instance:

  • The sp_Blitz procedure from the Brent Ozar First Responder Kit can help you identify issues in your database. One example: Managed Instance doesn’t run DBCC CHECKDB on your databases, because this is a resource-consuming operation; instead, Managed Instance checks every backup and files a warning to the Azure SQL team if any corruption is detected. It might still be good to run DBCC CHECKDB periodically yourself.
  • The maintenance scripts developed by Ola Hallengren and the Microsoft Adaptive Index Defrag scripts can help you keep your indexes up to date. Currently, Managed Instance doesn’t automatically rebuild your indexes.
  • Apply storage performance best practices and considerations for General Purpose service tier recommended by Dimitri Furman.
  • Find more How-to guides that can help you configure your Managed Instance.
  • Install Microsoft PowerShell and Azure CLI modules that can help you configure your Managed Instance.
  • dbatools is a powerful PowerShell library that helps you manage SQL Server, and many of its scripts can be used on Managed Instance.

Azure Stream Analytics now supports Azure SQL Database as reference data input


Reference data is a dataset that is static or slow-changing in nature, which you can correlate with real-time data streams to augment the data. Azure Stream Analytics leverages versioning of reference data to augment streaming data with the reference data that was valid at the time the event was generated. One of the most popular customer requests has been the ability to use Azure SQL Database as reference data input for Stream Analytics.

We are excited to announce that this capability is now available in public preview.

For more information and to get started with using Azure SQL Database as your reference data set for Azure Stream Analytics, see this announcement.


We are moving!


This blog is in the process of being migrated to one of the new consolidated SQL Server and Azure SQL Database blogs on the Microsoft TechCommunity website. Once the migration is complete, we will post the new URL along with instructions for how to navigate the new blog and update your RSS feeds. Until then, feel free to keep reading here, but keep in mind that any comments that are added may not make it over to the new blog. You will be able to re-post them there once we are live. Stay tuned and see you on the new site soon!

Measuring file IO performance on Managed Instance using QPI


Checking file performance, and comparing performance characteristics between SQL Server database engines in the Azure cloud and your on-premises environment, can be tricky and requires some deeper knowledge of dynamic management (DM) objects. In this post, you will see how to use an open-source library that can help you analyze and compare file performance.

SQL Server and Azure SQL Managed Instance enable you to measure the IO characteristics of your database files using the sys.dm_io_virtual_file_stats DM object. However, be aware that this DM object returns cumulative values that should be sampled, so you need to calculate the differences in IO stats between two points in time. If you want to extract this information from the DM object yourself, the well-known Paul Randal and Erin Stellato file-latency scripts (referenced below) are a good starting point.

If you don’t have such scripts prepared, you can use the following open-source library: https://github.com/JocaPC/qpi, which contains some useful predefined views that can help you analyze your file latency. This is a helper T-SQL library that enables you to:

  1. Take snapshots of sys.dm_io_virtual_file_stats.
  2. Calculate values such as IO throughput, latency, etc. The calculations are the same as, or similar to, those in the Paul & Erin scripts.
  3. Go back in time and examine the latency at a previous snapshot; this is a small additional piece of functionality compared to the Paul & Erin scripts.

Prerequisite – install QPI

As a first step, install the QPI library on your Managed Instance/SQL Server. Go to the installation section and choose the version of the QPI library matching your SQL Server version. Only SQL Server 2016 and later versions are supported, because the library uses temporal tables (introduced in SQL Server 2016) to store the history of IO statistics.

This is a plain T-SQL script, so you can review it to make sure that there is nothing dangerous inside.

The script will add a set of views and procedures in the qpi schema in your database.

Analyzing IO statistics

First you need to take a snapshot of the current values in sys.dm_io_virtual_file_stats DM object using the following command:

EXEC qpi.snapshot_file_stats;

This is the baseline for IO statistics; from it you can get the cumulative/average values up to this point. Now keep your workload running until the moment when you want to measure IO performance using the following view:

SELECT * FROM qpi.file_stats;

In the results of this view you can find information about the size of each file, IOPS, throughput, latency, etc.
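Putting the pieces together, a typical measurement session looks like the following sketch (the five-minute delay is just an illustrative placeholder for your real workload):

EXEC qpi.snapshot_file_stats;   -- take the baseline snapshot

WAITFOR DELAY '00:05:00';       -- let the workload run for a while

SELECT * FROM qpi.file_stats;   -- IOPS, throughput, and latency since the snapshot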

You can also get the IO statistics in the current database:

SELECT * FROM qpi.db_file_stats;

The results are shown on the following image:

Most of the columns are self-explanatory, but there is a difference between the following two pairs:

  • read_latency_ms/write_latency_ms represent the average latency to complete a request end-to-end.
  • read_io_latency_ms/write_io_latency_ms represent the average latency to complete a request in the IO subsystem. Ideally these should be the same as read_latency_ms/write_latency_ms. If there is a difference, then the SQL Server Database Engine cannot keep up with processing IO requests, or it is introducing some IO throttling.

You can also get the file statistics from some point in time in the past (make sure that you took a snapshot before that point in time):

SELECT * FROM qpi.file_stats_as_of( <some date> );

If you are comparing IO performance between Managed Instance and your on-premises server or an Azure SQL VM (IaaS), compare the results from this query in order to get consistent results and a fair comparison.

Conclusion

QPI is a set of useful scripts that can help you analyze the performance of your database files more easily.

QPI is an open-source library; if you find any issue in its functions, feel free to file a bug or submit a pull request with a fix.

Reduced recompilations for workloads using temporary tables across multiple scopes

$
0
0

SQL Server 2019 introduces several performance optimizations that improve performance with minimal changes required to your application code. In this blog post we’ll discuss one such improvement, available in CTP 2.3: reduced recompilations for workloads using temporary tables across multiple scopes.

In order to understand this improvement, we’ll first go over the current behavior in SQL Server 2017 and prior versions. When referencing a temporary table with a DML statement (SELECT, INSERT, UPDATE, DELETE), if the temporary table was created by an outer-scope batch, the DML statement is recompiled each time it is executed.

The following example illustrates this behavior:

In the outer procedure we:

  1. Create a temporary table
  2. Call a stored procedure
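A minimal sketch of what the outer procedure might look like (the name OutProc is taken from the test description below; the table shape and inner procedure name are assumptions):

CREATE OR ALTER PROCEDURE dbo.OutProc
AS
BEGIN
    CREATE TABLE #temp (col1 INT);  -- 1. create a temporary table
    EXEC dbo.InnerProc;             -- 2. call a stored procedure that references #temp
END;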

The inner stored procedure definition is as follows:
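Again as a sketch (the procedure name InnerProc and the column are assumptions):

CREATE OR ALTER PROCEDURE dbo.InnerProc
AS
BEGIN
    INSERT INTO #temp (col1) VALUES (1);  -- 1. insert a row into the outer-scope temp table
    SELECT col1 FROM #temp;               -- 2. return the row from the temp table
END;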

For the inner stored procedure, we have two DML statements that reference the temporary table created in the outer scope where we:

  1. Insert a row into the temporary table.
  2. Return the row from the temporary table.

We created the temporary table in a different scope from the DML statements, and in the existing implementation (pre-SQL Server 2019), we don't "trust" that the temporary table schema hasn't been materially changed, so we recompile the associated DML statements each time they are executed. This additional recompilation activity increases CPU utilization and can decrease overall workload performance and throughput.

Starting with SQL Server 2019 CTP 2.3, we will perform additional lightweight checks to avoid unnecessary recompilations:

  • We will check if the outer-scope module used for creating the temporary table at compile time is the same one used for consecutive executions.
  • We will keep track of any data definition language (DDL) changes made at initial compilation and compare them with DDL operations for consecutive executions.

The end result is a reduction in unwarranted recompilations and associated CPU-overhead.

The figure below shows test results from 16 concurrent threads, each executing the "OutProc" stored procedure 1,000 times in a loop. The Y-axis represents the number of occurrences, with the blue line representing Batch Requests/sec and the green line representing SQL Re-Compilations/sec:

When the feature was enabled, for this example we saw:

  • Improved throughput, as represented by Batch Requests/sec (blue line, second hump).
  • Shorter overall workload duration.
  • Reduced recompilations, as represented by SQL Re-Compilations/sec (green line) showing only a small increase at the beginning of the second test.
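
To reproduce a similar test, one option is to drive the procedure from 16 concurrent sessions; the following is a sketch, and the ostress command line and names are assumptions for illustration:

-- Run from 16 concurrent sessions, for example with ostress (RML utilities):
--   ostress.exe -S<server> -d<database> -Q"EXEC dbo.OutProc" -n16 -r1000
-- or run this loop in each session:
DECLARE @i int = 0;
WHILE @i < 1000
BEGIN
    EXEC dbo.OutProc;
    SET @i += 1;
END;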

This feature is enabled by default in SQL Server 2019 CTP 2.3 under all database compatibility levels. As of this writing, this feature is also enabled in Azure SQL Database under database compatibility level 150, although it will soon be applied across all database compatibility levels (matching the behavior of SQL Server 2019).

If you have feedback on this feature or other query processing features, please email us at IntelligentQP@microsoft.com.

Analyzing wait statistics on Managed Instance

Wait statistics can help you understand why query duration is long and identify the queries that are waiting on something in the database engine. In this post, I will show you how to identify why your workload is waiting and which queries are waiting on some resource.

Azure SQL Managed Instance enables you to find out why queries are waiting on resources using the following views:

  • sys.dm_os_wait_stats, which returns instance-level wait statistics
  • sys.query_store_wait_stats, which returns query-plan wait statistics per database.

This information can be found using the DMVs/Query Store directly. However, to make the analysis easier, I'm using the free/open-source QPI library. The library is not a prerequisite, but it simplifies the analysis because it has pre-defined views that join all the necessary Query Store views to fetch the information. Since the library is open source, you can copy the queries from the view definitions and use them even without installing the whole library.

In addition, the library takes snapshots of the wait stats, enabling you to see wait statistics from the past.

To install the QPI library, go to the installation section and download the SQL script for your version of SQL Server (Azure SQL and SQL Server 2016+ are supported because the library depends on Query Store views).

Analyze wait statistics

The first thing you need to do is take a snapshot of the wait statistics, or at least reset them, because sys.dm_os_wait_stats accumulates wait statistics since the instance start or since the last time you reset the stats. In QPI, you can use the following procedure to reset the wait statistics:

exec qpi.snapshot_wait_stats

This procedure will reset wait statistics on your instance and the Managed Instance will start collecting new wait statistics.

While your workload is running, you can read the wait statistics values from the qpi.wait_stats view:
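
For example, reading all wait statistics collected since the snapshot:

SELECT * FROM qpi.wait_stats;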

This view returns the wait statistics along with their categories. Here you can see that the main wait statistic on my instance is INSTANCE_LOG_RATE_GOVERNOR, categorized as Log Rate Governor. If you don't know what a particular wait type means, you can follow the URL in the info column to the SQLskills site for more details.

The category is important because, if you want to find the queries affected by a wait type, you need to use the category in Query Store and not the actual wait type name – wait statistics in Query Store are not recorded per individual wait type.

The mapping between wait types and wait categories is documented in the SQL Server documentation, and these rules are built into the library, as illustrated below. This matters because the category is the only link between global wait statistics and Query Store wait statistics.
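
To illustrate, a fragment of such a mapping might look like this (the rules shown are a small subset of the documented mapping; the full set of wait types per category is defined in the documentation):

select wait_type,
       case
           when wait_type like 'LCK_M%' then 'Lock'
           when wait_type like 'PAGEIOLATCH%' then 'Buffer IO'
           when wait_type like '%LOG_RATE_GOVERNOR' then 'Log Rate Governor'
           -- ... remaining rules per the documented mapping
           else 'Other'
       end as category
from sys.dm_os_wait_stats;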

If you want to see top queries affected by this wait category, you can use qpi.query_wait_stats view and filter the wait statistics per category:
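
A sketch of such a query (the category column name and the literal value are assumptions based on the description above):

SELECT *
FROM qpi.query_wait_stats
WHERE category = 'Log Rate Governor'; -- assumed column and value names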

In this case you can see that I'm hitting the log rate limit on the Managed Instance and that this is probably caused by an index rebuild statement. If you have multiple databases, you need to run this query in each of them because Query Store is configured per database.

These views enable you to easily troubleshoot issues on your instance and drill down to the queries that might be causing problems or that are affected by some wait type.

QPI is an open-source library and is not an official part of SQL Server/Azure SQL DB. If you find any issue in the library, you can file a bug or submit a PR with the change.

Sending resource alerts on Managed Instance using db_mail

One of the biggest issues you might experience on Managed Instance is reaching the storage limit or finding out that you don't have enough CPU. In that case you would need to get a bigger instance; however, this is not an instant operation. In this post, you will see how you can monitor resource usage and send email alerts when there is a risk of reaching the limits.

Azure SQL Managed Instance enables you to define how many resources you want to provision in terms of max storage and max cores. If you reach the storage limit, several issues will happen:

  • Your databases go into a "read-only" state because nothing can be written to the log/data files.
  • Some read queries cannot run because they might require tempdb to grow to store temporary objects.
  • CHECKPOINT cannot flush data from memory into the data (mdf) files.

Generally, this is a situation you want to avoid. Although you can always upgrade the service tier and get more storage or CPU, this is not an instant operation in Managed Instance because a new host VM must be provisioned and injected into your network.

Therefore, it is important to constantly monitor your instance and check whether you are approaching the storage limits.

Managed Instance provides a system view in the master database, called sys.server_resource_stats, that contains information about CPU usage and reserved/used storage, enabling you to monitor usage in near real time:

-- most recent 5-minute snapshot; the second column is a 0-1 storage usage ratio
select top 1 avg_cpu_percent,
       storage_space_used_mb / reserved_storage_mb as storage_used_ratio
from master.sys.server_resource_stats
order by start_time desc;

This view contains 5-minute usage snapshots for the Managed Instance over the last two weeks, so you can simply take the most recent row to see the current CPU/storage usage. You can periodically run this query on the Managed Instance and take action if the values get too high.

Sending alerts using db_mail

As an alternative, you can enable the db_mail feature and send email alerts directly from the Managed Instance once you are close to the resource limits. If you haven't configured email on Managed Instance, take a look at this post.

The following script will send you an email if CPU usage is above 95% or storage usage is above 90%:

declare @cpu_perc float, @storage_perc float;
declare @instance nvarchar(200) = @@SERVERNAME;

-- avg_cpu_percent is reported on a 0-100 scale, so the storage ratio
-- is multiplied by 100 to put both values on the same scale
select top 1 @cpu_perc = avg_cpu_percent,
             @storage_perc = 100. * storage_space_used_mb / reserved_storage_mb
from master.sys.server_resource_stats
order by start_time desc;

if (@cpu_perc > 95 or @storage_perc > 90)
begin

declare @msg nvarchar(max) = CONCAT('You are reaching the compute/storage limits of your instance ', @instance, ':
Storage: ', @storage_perc, '%
CPU usage: ', @cpu_perc, '%
Consider upgrading the instance.');
exec msdb.dbo.sp_notify_operator 
	@profile_name = N'AzureManagedInstance_dbmail_profile', 
	@name = N'DevOps team', 
	@subject = N'Azure SQL Instance - Storage limit alert', 
	@body = @msg;
end

The CPU limit might not be critical, and you might prefer to look at the average CPU usage over the last few hours, as in the query below. However, if you are reaching the storage limit, you should act faster and add more storage or free up some space on your instance.
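
A sketch of such a query (it assumes that the start_time values in sys.server_resource_stats are in UTC):

select avg(avg_cpu_percent) as avg_cpu_last_4_hours
from master.sys.server_resource_stats
where start_time > dateadd(hour, -4, getutcdate());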

You can easily automate the alert by putting the script above into a SQL Agent job that runs every 15 minutes and sends the email if you are reaching the limit, as sketched below.
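
A minimal sketch of such a job, using the standard msdb procedures (the job and schedule names are hypothetical; put the alert script into @command):

EXEC msdb.dbo.sp_add_job @job_name = N'Resource limit alert';
EXEC msdb.dbo.sp_add_jobstep @job_name = N'Resource limit alert',
    @step_name = N'Check resource usage',
    @command = N'<the alert script above>';
EXEC msdb.dbo.sp_add_jobschedule @job_name = N'Resource limit alert',
    @name = N'Every 15 minutes',
    @freq_type = 4,            -- daily
    @freq_interval = 1,
    @freq_subday_type = 4,     -- units of minutes
    @freq_subday_interval = 15;
EXEC msdb.dbo.sp_add_jobserver @job_name = N'Resource limit alert';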
