About Me

I now work for Microsoft Federal in Chevy Chase, MD.

Dedicated to providing customer-driven, results-focused solutions to the complex business problems of today... and tomorrow.

At SQLTrainer.com, LLC  we understand that the technical challenges faced by businesses today are much greater in both scope and complexity than they have ever been. Businesses today are faced with understanding both local IT infrastructures as well as cloud-based technologies.

What is SQLTrainer.com?

Founded in 1998 by Ted Malone, SQLTrainer.com is a technical consulting, training and content development firm dedicated to the following core principles:

  • Technology Alone is NOT the Answer! - Implementing a particular technology because it is interesting or "cool" will not solve customer problems.
  • Technology Solutions do NOT need to be Overly Complex! - Many times developers and technical practitioners will attempt to build the most clever solution possible. While this serves to stroke the egos of those involved, it doesn't provide a maintainable solution.
  • Consultants Should be Mentors First! - When looking to hire an external consultant, businesses should look to the consultant who's willing to train themselves out of a paycheck.

Why the name, SQLTrainer.com?

SQL (pronounced See-Quell) stands for Structured Query Language, which is at the heart of every modern-day relational database system. Since many technology solutions today rely on some form of database storage or interaction, it was only logical to find a way to incorporate SQL into the name of the organization. Given that one of our core principles is to be a mentor/training above everything, the name SQLTrainer made sense. Since we also wanted to represent our embracing of the cloud, it seemed logical to add the ".com", referring to the biggest "cloud" of them all.

Live Feeds

Thursday, July 24, 2014 1:00:00 PM

Today, we are making available a new version of SQL Server Migration Assistant (SSMA), a free tool to help customers migrate their existing Oracle databases to SQL Server 2014. Microsoft released SQL Server 2014 earlier this year, after months of customer testing, with features such as In-Memory OLTP to speed up transaction performance, In-Memory Columnstore to speed up query performance, and other great hybrid cloud features such as backup to cloud directly from SQL Server Management Studio and the ability to utilize Azure as a disaster recovery site using SQL Server 2014 AlwaysOn.

Available now, the SQL Server Migration Assistant version 6.0 for Oracle databases, greatly simplifies the database migration process from Oracle databases to SQL Server. SSMA automates all aspects of migration including migration assessment analysis, schema and SQL statement conversion, data migration as well as migration testing to reduce cost and reduce risk of database migration projects. Moreover, SSMA version 6.0 for Oracle databases brings additional features such as automatically moving Oracle tables into SQL Server 2014 in-memory tables, the ability to process 10,000 Oracle objects in a single migration, and increased performance in database migration and report generation.

Many customers have realized the benefits of migrating their database to SQL Server using previous versions of SSMA. For example:

SSMA for Oracle is designed to support migration from Oracle 9i or later version to all editions of SQL Server 2005, SQL Server 2008, SQL Server 2008 R2, and SQL Server 2012 and SQL Server 2014. The SSMA product team is also available to answer your questions and provide technical support at ssmahelp@microsoft.com

To download SSMA for Oracle, go here. To evaluate SQL Server 2014, go here.  

Thursday, July 24, 2014 10:00:00 AM

If you’re considering backing up your SQL Server database to the cloud, there are many compelling reasons. Not only will you have an offsite copy of your data for business continuity and disaster recovery purposes, but you can save on CAPEX by using Microsoft Azure for cost-effective storage.  And now, you can choose to backup to Microsoft Azure even for databases that aren’t running the latest version of SQL Server – creating a consistent backup strategy across your database environment. 

SQL Server has these tools and features to help you back up to the cloud:

  • In SQL Server 2014, Managed Backup to Microsoft Azure manages your backup to Microsoft Azure, setting backup frequency based on data activity.  It is available inside the SQL Server Management Studio in SQL Server 2014.
  • In SQL Server 2012 and 2014, Backup to URL provides backup to Microsoft Azure using T-SQL and PowerShell scripting.
  • For prior versions, SQL Server Backup to Microsoft Azure Tool enables you to back up to the cloud all supported versions of SQL Server, including older ones.  It can also be used to provide encryption and compression for your backups – even for versions of SQL Server that don’t support these functions natively.

To show you how easy it is to get started with SQL Server Backup to Microsoft Azure Tool, we’ve outlined the four simple steps you need to follow:

Prerequisites: Microsoft Azure subscription and a Microsoft Azure Storage Account.  You can log in to the Microsoft Azure Management Portal using your Microsoft account.  In addition, you will need to create a Microsoft Azure Blob Storage Container:  SQL Server uses the Microsoft Azure Blob storage service and stores the backups as blobs. 

Step 1: Download the SQL Server Backup to Microsoft Azure Tool, which is available on the Microsoft Download Center.

Step 2: Install the tool. From the download page, download the MSI (x86/x64) to your local machine that has the SQL Server Instances installed, or to a local share with access to the Internet. Use the MSI to install the tool on your production machines. Double click to start the installation. 

Step 3: Create your rules.  Start the Microsoft SQL Server Backup to Microsoft Azure Tool Service by running SQLBackup2Azure.exe.  Going through the wizard to setup the rules allows the program to process the backup files that should be encrypted, compressed or uploaded to Azure storage. The Tool does not do job scheduling or error tracking, so you should continue to use SQL Server Management Studio for this functionality.

On the Rules page, click Add to create a new rule.    This will launch a three screen rule entry wizard.

The rule will tell the Tool what local folder to watch for backup file creation. You must also specify the file name pattern that this rule should apply to.

To store the backup in Microsoft Azure Storage, you must specify the name of the account, the storage access key, and the name of the container.  You can retrieve the name of the storage account and the access key information by logging into the Microsoft Azure management portal.

At this time, you can also specify whether or not you wish to have the backup files encrypted or compressed.

Once you have created one or more rules, you will see the existing rules and the option to Modify or Delete the rule.

Step 4: Restore a Database from a Backup Taken with SQL Server Backup to Microsoft Azure Tool in place. The SQL Server Backup to Microsoft Azure Tool creates a ‘stub’ file with some metadata to use during restore.  Use this file like your regular backup file when you wish to restore a database.  SQL Server uses the metadata from this file and the backup on Microsoft Azure storage to complete the restore. 

If the stub file is ever deleted, you can recover a copy of it from the Microsoft Azure storage container in which the backups are stored.  Place the stub file into a folder on the local machine where the Tool is configured to detect and upload backup files.

That’s all it takes!  Now you’re up and running with Backup to and Restore from Microsoft Azure.

To learn more about why to back up to the cloud, join Forrester Research analyst Noel Yuhanna in a webinar on Database Cloud Backup and Disaster Recovery.  You’ll find out why enterprises should make database cloud backup and DR part of their enterprise database strategy. 

The webinar takes place on Tuesday, 7/29 at 9 AM Pacific time; register now.

Thursday, July 17, 2014 10:00:00 AM

In this overview and demo, we will show you what sentiment analysis is and how to build a quick mashup that combines real-time access to multiple data sources using tools from Microsoft.

Sentiment analysis is one of the hottest topics in the Big Data space. Sentiment analysis is the process of analyzing customer comments and feedback from Facebook, Twitter, Email, and more. The purpose of the analysis is to understand the overall sentiment the customer is trying to convey. This could be a negative sentiment, when the customer is unhappy with a company or its product. Neutral sentiment, when the customer is only mentioning a company or product, in passing, without a good or a bad feeling. The last is positive sentiment, when a customer is happy or excited about a company or its product.

Traditionally sentiment analysis was complicated because it required a mixture of very complex platforms and tools. Each component required for sentiment analysis was offered from a different company and required a large amount of custom work. The difficulty is further exasperated by hard-to- achieve business requirements. When we discuss sentiment analysis there are 3 key business requirements we see repeated:

  • Real-time access
  • Full granular data set (structured & unstructured)
  • BI and SQL front-end

Real-time Access

In the case of real-time access, business users need access to fresh data. In the world of social media, customer sentiment can change rapidly. With images and videos quickly being posted with re-tweets and Facebook ‘like’ capabilities, a good or bad aspect of a company’s product can go viral in minutes. Business users need to have the ability to analyze data as it comes in, in real-time. We will show in our overview video and demo, how we can utilize Microsoft’s StreamInsight technology for real-time data analysis and complex-event processing.

Full Granular Data Set

In the case of full granular data, in practice we have seen that using a traditional database system can hinder development. This is because a lot of the data that comes in for sentiment analysis such as email, is in a semi-structured or unstructured format. This means the data is not easily modeled into a database. The data does not come in a simple row/column format. Thus we utilize our Big Data technology that is meant for this type of data:  HDInsight (Hadoop). HDInsight is essentially Hortonworks Data Platform running on Windows. In our case we utilize HDInsight to land all of the data, in its raw original format, into the distributed file system HDFS. This allows us to ingest any kind of data, regardless of structure, and store that data online for further analysis at low cost. The Hadoop software is open-source and readily available.

BI and SQL Front-End

The most important area around delivering sentiment analysis to the business is access, making sure we are able to provide the data both in real-time (and high-fidelity) within the tools that our business users know and love. Previously when our customers were doing sentiment analysis on Hadoop systems, BI and SQL access was not available. This was not because the tools could not integrate with Hadoop systems. This was because they could not scale or have the same level of functionality. Some BI users have chosen Hive ODBC in Hadoop, which many claim to be slow and ‘buggy’. Instead here we utilize one of our flagship technologies: PolyBase. With PolyBase we expose the data in Hadoop, and relational SQL Server, with one T-SQL query. What this means is users can use BI tools like Excel, SSAS, or other 3rd party tools. They can then utilize PolyBase within Analytics Platform System (APS) to query that data either in Hadoop, or Parallel Data Warehouse (SQL Server), or mash up the data from both systems!

How It Works

Now we will show you how to use all of the tools from the SQL Server data platform to achieve sentiment analysis. This will allow you to quickly deploy and meet all 3 business requirements through a set of tools and platforms that are very easy to use, fully integrated, and ‘just work’ together.

Let’s get started with the first video (~5 minutes) where we present sentiment analysis using Microsoft technologies. We show you how sentiment analysis works, and how the Microsoft products fit. We then follow up by discussing the architecture in detail surrounding StreamInsight, HDInsight, and Analytics Platform System.

Watch the overview video:


In the second video (~7 minutes), we show you sentiment analysis in action. The demo will include a full sentiment-analysis engine running in real-time against Twitter data along with a web dashboard. We then stream Twitter data to both HDInsight and Parallel Data Warehouse. Finally, we end the demo by showcasing PolyBase, our flagship technology. With Polybase we can do data mashups combining data from relational and non-relational systems. We will use Polybase to write standard T-SQL queries against this data to determine tweet analytics and how social sentiment is fairing for our marketing campaigns and products.

Watch the demo video:

Wednesday, July 16, 2014 4:14:00 PM

We are pleased to see Microsoft acknowledged by Forrester Research as a Leader in The Forrester Wave™: Agile Business Intelligence Platforms, Q3 2014.  

We are happy to see what we believe to be an affirmation in our approach and in the strength of our technologies. Our placement in this report reflects both high scores from our clients for product vision, as well as for client feedback collected as part of the customer survey. Forrester notes that “Microsoft received high client feedback scores for its agile, business user self-service and [advanced data visualization] ADV functionality. Clients also gave Microsoft BI a high score for its product vision”. This feedback from our customers is especially gratifying to see.

Microsoft is delivering on our vision of making business intelligence more agile and accessible through the tools that people use every day. With the accessibility of Excel and the recent release of Power BI for Office 365, we aim to lower the barrier of entry for users and reduce the complexity of deploying business intelligence solutions for IT. Using Microsoft’s business intelligence solution, companies such as MediaCom have reduced time to reporting from weeks to days, Carnegie Mellon is using data to reduce energy consumption by 30%, and Helse Vest is combining hospital data to visualize trends in real time.

We appreciate the recognition of our software in this report. Above all, we value our customer’s voice in helping shape and validate this approach.

Thursday, July 10, 2014 12:35:00 PM

Microsoft SQL Server 2014 brings new performance and scalability gains by introducing In-Memory OLTP.  In-Memory OLTP contains tables and indexes optimized for in memory. Transactions execute under lock-free algorithms to provide linear scalability and Transact-SQL stored procedures can be compiled in native machine code for maximum efficiency in processing.

Working with SQL Server customers on In-Memory OLTP engagements, a common pattern emerged around the desire for increased performance and scalability when using ASP.NET session state. Some early adopters modified their SQL Server objects to take advantage of In-Memory OLTP for ASP.NET session state, with great success. To learn more, read the bwin.party case study “Gaming site can scale to 250,000 requests per second and improve player experience”. To further enhance this scenario, we have created a new provider to make it easier for customers to take advantage of SQL Server In-Memory OLTP when using ASP.NET session state.

This ASP.NET session state provider is fully optimized for In-Memory OLTP by calling natively compiled Transact-SQL stored procedures and by creating all tables as memory-optimized. The functionality of the provider was tested both internally and by external customers. The results showed the implementation was able to provide some significant gains at scale levels which would have previously exhibited a bottleneck on the database.

NOTE: While some testing has been done before the release, we recommend executing your own testing and validation to understand how this implementation behaves in your specific environment.

Getting Started

Setting up the provider requires two steps, installing the provider into the ASP.NET application and creating the In-Memory OLTP database and objects in Microsoft SQL Server 2014.

The provider and scripts can be accessed in two ways:

1. The package has been uploaded to NuGet: https://www.nuget.org/packages/Microsoft.Web.SessionState.SqlInMemory/

2. The source code is also accessible through CodePlex: https://msftdbprodsamples.codeplex.com/releases/view/125282

NuGet Installation

Download the ASP.NET Session State Provider for SQL Server In-Memory OLTP from the NuGet gallery by running the following command from the Visual Studio Package Manager Console:

PM> Install-Package Microsoft.Web.SessionState.SqlInMemory

More information about the NuGet package can be found here:


Installing the package will do the following things:

  • Add references to the ASP.NET Session State Provider assembly.
  • Add to the web.config file a customProvider equals to "SqlInMemoryProvider", where the connectionString attribute needs to be updated.
    <?xml version="1.0" encoding="utf-8"?>
        <sessionState mode="Custom" customProvider="SqlInMemoryProvider">
            <add name="SqlInMemoryProvider"
                 connectionString="data source=sqlserver;initial catalog=ASPStateInMemory;User ID=user;Password=password;" />
  • Adds an ASPStateInMemory.sql file that includes the script for creating the SQL Server database configured to support In-Memory OLTP.

Setting up In-Memory OLTP Database and objects

Open the T-SQL script file "ASPStateInMemory.sql" and update the 'CREATE DATABASE' statement to replace the 'FILENAME' attributes to specify a path that will exist in your SQL Server machine where the memory-optimized filegroup should exist. For further considerations on placement of this filegroup see Books Online section Creating and Managing Storage for Memory-Optimized Objects

  NAME = ASPStateInMemory, FILENAME = 'D:\SQL\data\ASPStateInMemory_data.mdf'
  NAME = ASPStateInMemory_xtp, FILENAME = 'D:\SQL\data\ASPStateInMemory_xtp'

After updating the 'FILENAME' attributes, run the entire script for creating the In-Memory tables and the natively compiled stored procedures.

Additionally, create a periodic task in SQL Server to run the stored procedure 'dbo.DeleteExpiredSessions'. This procedure removes the expired sessions and frees up the memory consumed.

NOTE: The memory-optimized tables are created with a durability of SCHEMA_ONLY to optimize for performance. If session data durability is required, then change the 'DURABILITY' attribute from 'SCHEMA_ONLY' to 'SCHEMA_AND_DATA'. More information can be found in Books Online sections Defining Durability for Memory-Optimized Objects and Durability for Memory-Optimized Tables.


SQL Server In-Memory OLTP has shown to greatly improve the performance of ASP.NET session state applications. This provider allows customers to optimize ASP.NET web farms to take advantage of SQL Server In-Memory OLTP using a packaged solution with ease.

For further considerations on session state with In-Memory OLTP, along with other solution patterns which have shown success with SQL Server In-Memory OLTP, please reference the whitepaper: In-Memory OLTP – Common Workload Patterns and Migration Considerations.  

Download the Microsoft SQL Server 2014 Evaluation and see how in-memory processing built into SQL Server 2014 delivers breakthrough performance.

Tuesday, July 1, 2014 10:00:00 AM

This blog post was authored by: Murshed Zaman, AzureCAT PM and Sumin Mohanan, DS SDET

With the advent of SQL Server Parallel Data Warehouse (the MPP version of SQL Server) V2 AU1 (Appliance Update 1), PDW got a new name: the Analytics Platform System [Appliance] or APS. The name changed with the addition of Microsoft’s Windows distribution of Hadoop (HDInsight or HDI) and PDW sharing the same communication fabric in one appliance. Customers can buy an APS appliance with PDW or with PDW and HDI in configurable combinations.

Used in current versions of PDW, Polybase is a technology that allows PDW users to query HDFS data. SQL users can quickly get results from Hadoop data without learning Java or C#.

Features of Polybase include:

  1. Schematization of Hadoop data in PDW as external tables
  2. Querying Hadoop data
  3. Querying Hadoop data and joining with PDW tables
  4. High speed export and archival of PDW data into Hadoop
  5. Creating persisted tables in PDW from Hadoop data 

In V2AU1 Polybase improvements include:

  1. Predicate push-down for queries in Hadoop as Map/Reduce jobs
  2. Statistics on Hadoop data in PDW

Another new feature introduced in PDW V2AU1 is the capability to query data that resides in Microsoft Azure Storage Accounts. Just like HDFS data, PDW can place a schema on data in Microsoft Azure Storage Accounts and move data from PDW to Azure and back.

The APS with these new features and improvements has become a first-class citizen in analytics for any type of data. Any company that has Big Data requirements and wants a highly scale-out Data Warehouse appliance can use APS.

Here are four cases that illustrate how different industries are leveraging APS:

One: Retail brand vs. Name brand

Retail companies that use PDW who also want to harvest and curate data from their social analytics sites. This data provides insights into their products and understand the behaviors of the customers. Using APS, the company can offer the right promotion at the right time and to the right demographics. Data also allows the companies to find brand recommendation coming from a friend, relative or a trusted support group that can be much more effective than marketing literature alone. By monitoring and profiling social media, these companies can also gain a competitive advantage.

Today’s empowered shoppers want personalized offers that appeal to their emotional needs. Using social media retailers offer promotions that are tailored to individuals using real-time analytics. This process starts by ranking blogs, forums, Twitter feed and Facebook posts for predetermined KPIs revealed in these posts and conversations. Retail organizations analyze and use the data to profile shoppers to personalize future marketing campaigns. Measureable or sale data reveals the effectiveness of the campaign and the whole process starts again with the insight gained.

In this example, PDW houses the relational sale data and Hadoop houses the social emotions. PDW with built in HDI region gives the company the arsenal to analyze both data sources in a timely manner to be able to react and make changes.  

Retail store APS diagram:

Two: Computer Component Manufacturing

Companies that generate massive amounts of electronic test data can get valuable insights from APS. Test data are usually a good candidate for Hadoop due to its key-value type (JSON or XML) structure.

One example in this space is a computer component manufacturer. Due to the volume, velocity and variety of these (ie: Sort/Class) data a conventional ETL process can be very resource expensive. Using APS, companies can gain insight from their data by putting the semi-structured (key-value pair) data into an HDI-Region and other complementary structured data sources (ie: Wafer Electrical Test) into PDW. With the Polybase query feature these two types of data can easily be combined and evaluated for success/failure rates.

Computer Component Manufacturing Diagram:

Three: Game Analytic Platform for online game vendors

The PDW with HDI regions can offer a complete solution for online game companies, to derive insights from their data. MMORPG’s (Massively Multiplayer Online Role Playing Games) are good examples where APS can deliver value.  Game engines produce many transactional data (events like which avatar got killed in the current active game) and a lot of semi-structured data such as activity logs containing chat data and historical logs. PDW is well-suited to loading the transactional data in to the PDW workload and semi-structured data to the HDI region of APS. The data can then be used to derive insights such as: 

  1. Customer retention - Discovering when to give customers offers and incentives to keep them in the game
  2. Improving game experience - Discovering where customers are spending more time in the game, and improving in-game experience
  3. Detecting fraudulent gaming activities

Currently these companies deal with multiple solutions and products to achieve the goal. APS provides a single solution to power both their transactional and non-transactional analytics.

Four: Click stream analysis of product websites for targeted advertisement.

In the past, a relational database system was sufficient to satisfy the data requirements of a medium-scale production website. Ever-increasing competition and advancements in technology have changed the way in which websites interact with customers. Apart from storing data that customers explicitly provide the company, sites now record how customers interact with their website.  As an example, when a registered user browses a particular car model, additional targeted advertisements and offers can be sent to the user.

This scenario can be captured using collected clickstream data and the Hadoop eco-system. APS acts as the complete solution to these companies by offering the PDW workload to store and analyze transactional data, combined with HDI region to derive insights from the click-stream data.

This solution also applies to Third party companies that specialize in targeted advertising campaigns for their clients.

While “Big Data” is a hot topic, we very often receive questions from customers about the actual use cases that apply to them and how they can derive new business value from “Big Data.” Hopefully these use cases highlight how various industries can truly leverage their data to mine insights that deliver business value in addition to showcasing how traditional data warehouse capabilities work together with Hadoop

Visit the Microsoft Analytics Platform System page to learn more. 

Wednesday, June 25, 2014 10:00:00 AM

This blog post will detail how APS gives users the ability to:

  • Leverage Power Query, Power Pivot, and Power Map at massive scale
  • Iteratively query APS, adding BI on the fly
  • Combine data seamlessly from PDW, HDI, and Azure using PolyBase

The Microsoft Analytics Platform System (APS) is a powerful scale out data warehouse solution for aggregating data across a variety of platforms. In Architecture of the Microsoft Analytics Platform System and PolyBase in APS - Yet another SQL over Hadoop solution?, the base architecture of the platform was defined. Here we’ll build on this knowledge to see how APS becomes a key element of your BI story at massive scale.

Let’s first start with a business case. Penelope is a data analyst at a US based restaurant chain with hundreds of locations across the world. She is looking to use the power of the Microsoft BI stack to get insight into the business – both in real time and aggregate form for the last quarter. With the integration of APS with Microsoft BI stack, she is able to extend her analysis beyond simple querying. Penelope is able to utilize the MOLAP data model in SQL Server Analysis Services (SSAS) as a front end to the massive querying capabilities of APS. Using the combined tools, she is able to:

  • Quickly access data in stored aggregations that are compressed and optimized for analysis
  • Easily update these aggregations based on structured and unstructured data sets
  • Transparently access data through Excel’s front-end

Using Excel, Penelope has quick access to all of the aggregations she has stored in SSAS with analysis tools like Power Query, Power Pivot, and Power Map. Using Power Map, Penelope is able to plot the growth of restaurants across America, and sees that lagging sales in two regions, the West Coast and Mid-Atlantic, are affecting the company as a whole.

After Penelope discovers that sales are disproportionately low on the West Coast and in the Mid-Atlantic regions, she can use the speed of APS’ Massively Parallel Processor (MPP) architecture to iteratively query the database, create additional MOLAP cubes on the fly, and focus on issues driving down sales with speed and precision using Microsoft’s BI stack. By isolating the regions in question, Penelope sees that sales are predominantly being affected by two states – California and Connecticut. Drilling down further, she uses Power Chart and Power Pivot to breakdown sales by menu item in the two states, and sees that the items with low sales in those regions are completely different.

While querying relational data stored in APS can get to the root of an issue, by leveraging PolyBase it becomes simple to also take advantage of the world of unstructured data, bringing additional insight from sources such as sensors or social media sites. In this way Penelope is able to incorporate the text of tweets relating to menu items into her analysis. She can use PolyBase’s predicate pushdown ability to filter tweets by geographic region and mentions of the low selling items in those regions, honing her analysis. In this way, she is able to discover that there are two separate issues at play. In California she sees customers complaining about the lack of gluten free options at restaurants, and in Connecticut she sees that many diners find the food to be too spicy.

Iterative Analytics

So how did Penelope use the power of APS to pull in structured data such as Point of Sale (POS), inventory and ordering history, website traffic, and social sentiment into a cohesive, actionable model? By using a stack that combines the might of APS, with the low time to insight of Excel - let’s breakdown the major components:

  • Microsoft Analytics Platform System (APS)
  • Microsoft HDInsight
  • Microsoft SQL Server Analysis Services (SSAS)
  • Microsoft Excel with Power Query, Power Pivot and Power Map

Loading Data in APS and Hadoop

Any analytics team is able to quickly load data into APS from many relational data sources using SSIS. By synchronizing the data flow between their production inventory and POS systems, APS is able to accurately capture and store trillions of transactional rows from within the company. By leveraging the massive scale of APS (up to 6 PB of storage), Penelope doesn’t have to create the data aggregates up front. Instead she can define them later.

Concurrently, her team uses an HDInsight Hadoop cluster running in Microsoft Azure to aggregate all of the individual tweets and posts about the company alongside its menus, locations, public accounts, customer comments, and sentiment. By storing this data in HDInsight, the company is able to utilize the elastic scale of the Azure cloud, and continually update records with real-time sentiment from many social media sites. With PolyBase, Penelope is able to join transactional data with the external tables containing social sentiment data using standard TSQL constructs.

Creating the External Tables

Using the power of PolyBase, the development team can create external tables in APS connected to the HDInsight instance running in Azure. In two such tables, Tweets and WordCloud, Twitter data is easily collected and aggregated in HDFS. Here, the Tweets table is raw data with an additional sentiment value and the WordCloud table is an aggregate of all words used in posts about to the company.

Connecting APS and SSAS to Excel

Within Excel, Penelope has the ability to choose how she would like to access the data. At first she uses the aggregations that are available to her via SSAS – typical sales aggregates like menu items purchases, inventory, etc. – through PowerQuery.

But how does Penelope access the social sentiment data directly from APS? Simple, by using the same data connection tab, Penelope can directly connect to APS and pull in the sentiment data using PolyBase.

Once the process is complete, tables pulled into Excel, as well as their relationships, are shown as data connections.

Once the data connection is created, Penelope is able to create a report using PowerPivot with structured data from the Orders table and the unstructured social sentiment data from HDInsight in Azure.

With both data sets combined in Excel, Penelope is able to then create a Power Map of the sales data layered with the social sentiment. By diving into the details, she can clearly see issues with sentiment from customers in Connecticut and California.

To learn more about APS, please visit http://www.microsoft.com/aps.

Drew DiPalma – Program Manager – Microsoft APS
Drew is a Program Manager working on Microsoft Analytics Platform System.  His work on the team has covered many areas, including MPP architecture, analytics, and telemetry.  Prior to starting with Microsoft, he studied Computer Science and Mathematics at Pomona College in Claremont, CA. 

Tuesday, June 24, 2014 10:00:00 AM

The cloud is creating radical changes in how information technology is architected.  And its next big target just might be business continuity. Explosive data growth and new application types are causing enterprises to consider cloud as a strategic alternative to growing on-premises storage for backup and disaster recovery.

Many enterprises view backup and disaster recovery (DR) as an IT function that seems to provide little value, even though they must do it to support critical applications. Exponential data growth, shrinking backup windows, static budgets, and increasing deployments of business critical mobile, cloud and web applications are changing the backup and DR requirements. Traditional on-premises approaches for backup and DR cannot keep up with this explosive demand for new database administration requirements.

Join Forrester Research analyst Noel Yuhanna in this webinar as we talk about cloud and database trends, and why enterprises should make database cloud backup and DR part of your enterprise database strategy.

Register now for this webcast on improving business continuity by going to the cloud! The event takes place Wednesday July 2 at 9:00 AM Pacific.


 Noel Yuhanna
Principal Analyst Serving Enterprise Architecture Professionals
Forrester Research

Noel serves Enterprise Architecture Professionals. He primarily covers database management systems (DBMSes), infrastructure-as-a-service (IaaS), data replication and integration, data security, data management tools, and related online transaction processing issues. His current primary research focus is on customer usage experiences and broad industry trends of DBMS, IaaS, data security, enterprise data grids, outsourcing, information life-cycle management, open source databases, and other emerging database technologies.

Thursday, June 19, 2014 10:00:00 AM

Last year we announced the support of SQL Server AlwaysOn Availability Groups on Microsoft Azure Infrastructure Services.

We’re excited to announce that AlwaysOn Availability Groups are now supported between Microsoft Azure Regions. Today we updated our official documentation describing how to configure these.

AlwaysOn Availability Groups on Microsoft Azure Infrastructure Services

Availability Groups, released in SQL Server 2012 and enhanced in SQL Server 2014, detect conditions impacting SQL Server availability (e.g. SQL service being down or losing connectivity).  When detecting these conditions, the Availability Group fails over a group of databases to a secondary replica. In the context of Azure Infrastructure Services, this significantly increases the availability of these databases during Microsoft Azure’s VM Service Healing (e.g. due to physical hardware failures), platform upgrades, or your own patching of the guest OS or SQL Server.

To ensure SQL Server high availability on Azure Infrastructure Services, you configure an Availability Group, generally with 2 replicas (1 primary, 1 secondary) for automatic failover and a Listener. The replicas correspond to SQL Server instances hosted by separate Virtual Machines within the same Azure Virtual Network (VNET). The Listener is a DNS name that client applications, inside or outside the VNET (inside or outside of Microsoft Azure), can use in their connection string to connect to the primary replica of the Availability Group. This is illustrated in the figure below:

AlwaysOn Availability Groups between Microsoft Azure Regions

Availability Groups are now supported between different Azure Regions. Any regions available today (4 in United States, 2 in Europe, 2 in Asia Pacific, 2 in Japan, and 1 in Brazil).

This builds on top of Microsoft Azure’s new support to connect VNETs in different Azure regions via secure tunnels. After connecting 2 or more VNETs, their VMs can connect to each other, and even join the same Windows domain, as if they were part of the same VNET.

Having Availability Groups spanning two or more Azure regions enables two important SQL Server scenarios on Azure Infrastructure Services: disaster recovery and geo-distributed read scale-out.


Scenario 1: SQL Server Disaster Recovery

In this scenario, an Availability Group is expanded with one or more secondary replicas in a different Azure region. This allows quickly recovering SQL Server from a situation impacting a full Azure region (e.g. a gateway hardware failure). This also allows testing disaster recovery processes when desired.

The scenario is depicted in the figure below. An availability group has been configured with 2 replicas (primary P and secondary S1) for automatic failover and a Listener within the virtual network VNET1 in Region 1 (e.g. West US). This guarantees high availability of SQL Server in case of failures within the region.  A secure tunnel has been configured between VNET1 and another virtual network VNET2 in Region 2 (e.g. Central US). The availability group has been expanded with a third replica (S2) configured for manual failover in this VNET to enable disaster recovery in case of failures impacting Region1. Finally, the Listener has been configured to route connections to the primary replica, irrespective of which region hosts it. This allows client applications connect to the primary replica, with the same connection string, after failing over between Azure regions.


Scenario 2: SQL Server Geo-Distributed Read Workloads

In this scenario, an Availability Group is expanded with one or more readable secondary replicas in one or more different Azure regions. This allows offloading read workloads from the primary replica to readable secondary replicas in Azure regions that are closer to the source of the read workloads (e.g. reporting or BI apps).

This not only reduces the utilization of resources (CPU, memory, IO) at the primary replica, saving them for write workloads (e.g. OLTP), but also reduces the response time of the read workloads by reducing network latency and leveraging dedicated resources.

The scenario is depicted in the figure below. As before, an Availability Group has been configured with 2 replicas (primary P and secondary S1) for automatic failover and a Listener within the virtual network VNET1 in Region 1 (e.g. Central US). This guarantees high availability of SQL Server in case of failures within the region. 

Two secure tunnels have been configured between VNET1 and two other Virtual Networks: VNET2 in Region 2 (e.g. East US) and VNET3 in Region 3 (e.g. West US). The availability group has been expanded with two readable secondary replicas, one on each Azure region: S2 on Region 2 and S3 on Region 3.

Client applications, inside or outside of Azure, can connect to the closest readable secondary replica to run read workloads. For example, a Reporting App connects to the secondary replica S2 within the same Azure Region 2, and BI App connects to the secondary replica S4 from on-premise via a public endpoint.


Remember that the secondary replicas on the remote regions can be failover targets, so they can support disaster recovery besides serving read workloads. In addition, they can be used to take backups, this allows offloading backups from the primary replica to reduce resource utilization, and maintaining backups outside the operational region, if needed for compliance reasons.


Wednesday, June 18, 2014 1:00:00 PM

Microsoft Azure Infrastructure Services offers many ways to optimize your on-premises data and data platform projects from development and test of new SQL Server applications to migrating existing on-premises SQL Server instances on the latest images of SQL Server to cost effective hybrid scenarios such as database backup and extended business continuity.  So why is Microsoft Azure Infrastructure Services the ideal place to implement such scenarios?

The latest Images of SQL Server always available

Let’s start with having the latest images of SQL Server as soon as they are made generally available including tuned images for select workloads like an optimized data warehousing image.  An example is Microsoft Azure offered SQL Server 2014 CU1 update at the end of April, the day it was made generally available.  This image is still pending to be uploaded on many other service providers.  In addition critical updates for SQL Server are enabled by default, ensuring that your SQL Server running in an Azure VM always receives these upon release. 

Additional VM Sizes to choose from and more continually being added

With recently added larger VM sizes including the A8 (8 vCPUs, 56GBmem) and A9 (16 vCPUs, 112GB) virtual machines you know have access to more cores and more memory for your larger SQL Server workloads.  You can scale up these large VMs even further with the unique in-memory OLTP design architecture in SQL Server 2014 that removes database contention to allow you to utilize more vCPUs in parallel for increased number of concurrent users and significantly increase transactional performance.

Cross regional connectivity support for extended business continuity and global BI 

Now with the new cross regional Virtual Network connectivity amongst Azure datacenters you can extend the SQL Server business continuity scenario even further by being able to select which Azure datacenters to place AlwaysOn secondaries.  With up to 8 secondaries available with SQL Server 2014, you are not only enhancing business continuity by placing additional asynchronous secondaries but also improving global BI performance by offloading BI reporting from your primary to the closest Azure secondary.  This allows you to take advantage of the global scale that Microsoft Azure datacenters offers along with the cost savings through economies of scale.

Business Continuity with lower RTO and RPO with ExpressRoute

Azure ExpressRoute offers customers a secure dedicated connection to Microsoft Azure.  The connection offers much higher speeds than standard Azure connections as well as greater isolation when it comes to data security.  This means fasters SQL Server backups to Azure storage using the SQL Server cloud backup capability offered in SQL Server 2012 and enhanced in SQL Server 2014.  In addition to improved cloud backup, you can significantly improve hybrid business continuity by combining SQL Server AlwaysOn with ExpressRoute.  The significantly faster and reliable connection between your primary on-premises and secondary in Azure you can improve your recovery point object (RPO) by reducing potential for data loss in asynchronous mode and also improve your recovery time objective (RTO).

The above capabilities in Microsoft Azure make it an ideal environment for you to maximize the benefits of SQL Server hybrid scenarios.  Ready to create your own virtual machine? Check out the resources listed below, and look for upcoming technical blogs to follow on the topics discussed here. 

Try Microsoft Azure

Learn more about Virtual Machines

Read how Amway and Lufthansa leveraged Microsoft SQL Server 2014 and Windows Azure

Site Map | Printable View | © 2008 - 2014 SQLTrainer.com, LLC | Powered by mojoPortal | HTML 5 | CSS | Design by mitchinson