About Me

I now work for Microsoft Federal in Chevy Chase, MD.

Dedicated to providing customer-driven, results-focused solutions to the complex business problems of today... and tomorrow.

At SQLTrainer.com, LLC we understand that the technical challenges businesses face today are far greater in both scope and complexity than ever before. Businesses must understand both local IT infrastructures and cloud-based technologies.

What is SQLTrainer.com?

Founded in 1998 by Ted Malone, SQLTrainer.com is a technical consulting, training and content development firm dedicated to the following core principles:

  • Technology Alone is NOT the Answer! - Implementing a particular technology because it is interesting or "cool" will not solve customer problems.
  • Technology Solutions do NOT need to be Overly Complex! - Developers and technical practitioners often attempt to build the cleverest solution possible. While this strokes the egos of those involved, it doesn't produce a maintainable solution.
  • Consultants Should be Mentors First! - When looking to hire an external consultant, businesses should look to the consultant who's willing to train themselves out of a paycheck.

Why the name, SQLTrainer.com?

SQL (pronounced See-Quell) stands for Structured Query Language, which is at the heart of every modern-day relational database system. Since many technology solutions today rely on some form of database storage or interaction, it was only logical to find a way to incorporate SQL into the name of the organization. Given that one of our core principles is to be a mentor/trainer above everything else, the name SQLTrainer made sense. Since we also wanted to represent our embrace of the cloud, it seemed logical to add the ".com", referring to the biggest "cloud" of them all.

Live Feeds

Tuesday, July 28, 2015 2:00:00 PM

The SQL Server engineering team is committed to a monthly public preview release rhythm for SQL Server 2016. Following last month's CTP 2.1 release, we are excited to announce the immediate availability of the SQL Server 2016 CTP 2.2 release for download. This incremental release includes new capabilities for Query Store, Stretch Database, the core engine, temporal tables, MDS, and Reporting Services, along with engine scalability improvements. Learn about these improvements below.

Stretch Database enables you to dynamically stretch cold transactional data to Azure SQL Database, so your operational data is always at hand, no matter the size, and you can benefit from the low cost of Azure. Improvements in this release include:

  • Row Level Security (RLS) enabled
  • Stretch Database Advisor now available to analyze existing database tables, discovering and evaluating candidates for stretch by adjustable table size thresholds
    • Bundled with SQL Server 2016 Upgrade Advisor Preview 1, Stretch Database Advisor is available for download here or through the Web Platform Installer
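As a hedged sketch only (CTP-era syntax that may change in later previews; the database, table, and server names are illustrative, not from this announcement), enabling Stretch for a database and a cold table looks roughly like this:

```sql
-- Illustrative only: exact Stretch Database syntax may differ between previews.
EXEC sp_configure 'remote data archive', 1;  -- allow Stretch at the instance level
RECONFIGURE;

-- Point the database at an Azure SQL Database server (placeholder name).
ALTER DATABASE MyDatabase
    SET REMOTE_DATA_ARCHIVE = ON (SERVER = 'myserver.database.windows.net');

-- Mark a cold table for migration to Azure.
ALTER TABLE dbo.ColdOrders ENABLE REMOTE_DATA_ARCHIVE WITH (MIGRATION_STATE = ON);
```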

First released in SQL Server 2008 R2, Master Data Services (MDS) is the SQL Server solution for master data management. We are making significant investments in SQL Server 2016 to advance strategic capabilities in the data management space. Improvements in this release include:

  • Entity sync between models allows you to set up a sync relationship to sync an entity from another model. Steps:
    1. Go to Admin, Sync Entity page
    2. Click Add
    3. Choose target model, version and entity, source model, version and entity
    4. Choose sync type to be on-demand or auto sync
    5. Click Save
  • SCD Type-2 support enables creation of SCD type-2 subscription view for Member transaction log type entities. Steps:
    1. Go to Admin, Entity page
    2. Choose transaction log type
    3. If transaction log type is Member, go to Integration, Create Views page
    4. When creating a view, choose History View as the format type to create an SCD Type-2 view
  • Compound key index support lets you include custom attributes in an index to improve performance or enforce constraints. NOTE: In this release, Entity Based Staging batches need to be started by calling stored procedures directly instead of using the web UI. Steps:
    1. Go to Admin, Entity page
    2. Click Add under Custom Indexes
    3. Choose the columns and click Save to create the index


The query “flight recorder,” Query Store, captures current and historical query plans and execution metrics, enabling you to easily monitor and troubleshoot query performance issues. Query Store has been available since the first SQL Server 2016 community technology preview. Improvements in this release include:

  • Query Store now automatically switches to READ_ONLY mode when it reaches the defined maximum size limit and stops collecting new query plans and runtime stats. You can detect this by looking at readonly_reason in sys.database_query_store_options; the value 65536 indicates that Query Store has reached the defined size limit.
  • Query Store UI enhancements and bug fixes
  • No forced-plan recompiles after the MAX_PLANS_PER_QUERY limit is hit. The max_plans_per_query value can be examined in sys.database_query_store_options. NOTE: Forced-plan recompiles can carry a performance overhead.
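These states can be checked from T-SQL. The view and column names below come directly from the text; the query assumes a database with Query Store enabled:

```sql
-- Inspect Query Store state; readonly_reason = 65536 means the size limit was hit.
SELECT actual_state_desc,
       readonly_reason,
       max_plans_per_query,
       flush_interval_seconds
FROM sys.database_query_store_options;
```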

Temporal tables enable handling and analyzing database records that change over time. Improvements in this release include:

  • Full support for columns of type ROWVERSION (TIMESTAMP), including UPDATE operations on a ROWVERSION column in a temporal table
  • COLUMNPROPERTY exposes the ‘ishidden’ property:
    SELECT COLUMNPROPERTY(OBJECT_ID('dbo.sample_table'), 'SysStartTime', 'ishidden')
  • Several improvements in SQL Server Management Studio:
    • Syntax highlighting for temporal keywords
    • Transact-SQL client side validations
    • Script Table as > DROP To now includes the DROP script for the history table
    • SSMS surfaces IsHidden information in column Properties dialog
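As a sketch of the temporal syntax these improvements build on (the table and columns are illustrative, chosen to match the dbo.sample_table / SysStartTime names used in the COLUMNPROPERTY example above; exact preview syntax may differ), a system-versioned table with HIDDEN period columns looks roughly like:

```sql
-- Illustrative sketch: a system-versioned table whose period columns are HIDDEN.
CREATE TABLE dbo.sample_table
(
    Id int NOT NULL PRIMARY KEY CLUSTERED,
    Payload nvarchar(100) NOT NULL,
    SysStartTime datetime2 GENERATED ALWAYS AS ROW START HIDDEN NOT NULL,
    SysEndTime   datetime2 GENERATED ALWAYS AS ROW END   HIDDEN NOT NULL,
    PERIOD FOR SYSTEM_TIME (SysStartTime, SysEndTime)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.sample_table_History));
```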

Query Execution provides improved diagnostics for memory grant usage. The following new XEvents were added to facilitate better diagnostics. Showplan XML is extended to include memory grant usage per thread and iterator (additions in the “RunTimeCountersPerThread” element).

  • query_memory_grant_blocking
  • query_memory_grant_resource_semaphores
  • query_memory_grant_usage (details on ideal vs granted vs used memory)
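A hedged sketch of wiring one of these events into an Extended Events session (the session name and ring-buffer target are illustrative; only the event name comes from the list above):

```sql
-- Capture the new memory grant usage event for quick inspection.
CREATE EVENT SESSION [MemoryGrantDiag] ON SERVER
    ADD EVENT sqlserver.query_memory_grant_usage
    ADD TARGET package0.ring_buffer;

ALTER EVENT SESSION [MemoryGrantDiag] ON SERVER STATE = START;
```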

Core engine scalability improvements dynamically partition thread-safe memory objects by NUMA node or by CPU. This enables higher scalability of high-concurrency workloads running on NUMA hardware.

  • Thread-safe memory objects (of type CMemThread) are dynamically promoted to be partitioned by NUMA node or by CPU based on workload characteristics and contention. In SQL Server 2012 and 2014, trace flag 8048 is needed to promote memory objects partitioned by node to be partitioned by CPU. This improvement not only eliminates the need for the trace flag but also determines partitioning dynamically based on contention.

DBCC CHECKDB improvements in this release include:

  • Validation of persisted computed columns and filtered indexes. Persisted computed columns are frequently used, and DBCC CHECKDB can take a long time to complete on tables that contain them. This improvement makes persisted computed column validation an option under EXTENDED_LOGICAL_CHECKS.
  • Performance improvements when validating a table with thousands of partitions.
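As a sketch (the database name is illustrative), opting in to the extended validation described above looks like:

```sql
-- EXTENDED_LOGICAL_CHECKS now covers persisted computed columns and filtered indexes.
DBCC CHECKDB (N'MyDatabase') WITH EXTENDED_LOGICAL_CHECKS;
```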

Reporting Services added treemap and sunburst charts. Report authors can now create two additional chart types:

  • Treemap
  • Sunburst charts
Wednesday, July 22, 2015 11:00:00 AM

Revised on July 28

UPDATE: The issue with the initial release of SQL Server 2016 CTP 2.2 that caused downtime during rolling upgrades across an AlwaysOn Availability Group has been corrected, and the download has been replaced.  The download link now points to an updated build of 13.0.407.1 or higher.  An upgrade to this build is recommended for all users of SQL Server 2016, including AlwaysOn users. For more information, visit the CTP 2.2 Release Note.

The SQL Server 2016 Upgrade Advisor Preview and SQL Server 2016 Community Technology Preview (CTP) 2.2 are now available for download! This first look at SQL Server 2016 Upgrade Advisor showcases a new platform for the database upgrade tool.  It also introduces the Stretch Database Advisor, enabling customers to identify data for online, transparent cold storage in Azure. In SQL Server 2016 CTP 2.2, part of our new rapid preview model, we have made enhancements to several features that customers can try in their SQL Server 2016 development and test environments.

In SQL Server 2016 CTP 2.2, available for download today, customers will see enhancements in several key areas:

  • Users are able to add security policies including row-level security to a table that has been stretched to Azure using Stretch Database
  • Master Data Services now allows compound keys, slowly changing dimension (SCD) Type 2, and syncing an entity between models
  • SQL Server Reporting Services (SSRS) will now support two additional chart types, treemap and sunburst

CTP 2.2 also includes improvements to Query Store, Query Execution, DBCC CHECKDB performance, and temporal tables.

SQL Server 2016 Upgrade Advisor Preview

SQL Server 2016 Upgrade Advisor Preview is a standalone tool that enables users of prior versions to find certain breaking and behavior changes as well as deprecated features in the database engine.  Upgrade Advisor Preview also helps with the adoption of Stretch Database by analyzing existing database tables for fit with the new functionality based on criteria that the user selects, using the Stretch Database Advisor. Future releases will include enhancements to the upgrade rules and feature adoption advisors.

SQL Server 2016 Upgrade Advisor Preview is available on the Microsoft Download Center and through the Web Platform Installer. We invite you to try it, and to provide feedback via the built-in tool.

Download SQL Server 2016 CTP 2.2 preview today!

As the foundation of our end-to-end data platform, SQL Server 2016 is the biggest leap forward in Microsoft's data platform history, with real-time operational analytics, rich visualizations on mobile devices, built-in advanced analytics, and new advanced security technology, available both on-premises and in the cloud.

To learn more about the release, visit the SQL Server 2016 preview page.  To experience the new, exciting features in SQL Server 2016 and the new rapid release model, download the preview or try the preview using a virtual machine in Microsoft Azure, and start evaluating the impact these new innovations can have for your business.  Also, be sure to share your feedback on the new SQL Server 2016 capabilities using Microsoft’s Connect tool.  We look forward to hearing from you!

Friday, July 10, 2015 5:10:00 PM

Lighting up interactive insights on big data

By T. K. “Ranga” Rengarajan, corporate vice president, Data Platform

Today, we’re announcing the public preview of Spark for Azure HDInsight and the upcoming general availability of Power BI on July 24th. These investments support our commitment to help more people maximize their data dividends with interactive visualizations on big data.

Big data is changing the way organizations deliver value to their stakeholders. For example, Real Madrid brings soccer matches closer to its 450 million fans, and Ultra Tendency projects the health impact of nuclear contamination in Japan.  Here at Microsoft, we’re thrilled to help fuel that innovation with data solutions that give customers simple but powerful capabilities. This is something we’ve done with Azure HDInsight by making Hadoop easier to provision, manage, customize, and scale. Azure HDInsight is a fully managed Hadoop service that includes 24x7 monitoring and enterprise support across the broadest range of analytic workloads.


Introducing Spark for Azure HDInsight

Today, we go further with this vision by providing our customers the best environment to run Apache Spark. Spark is one of the most popular big data projects known for its ability to handle large-scale data applications in memory, making queries up to 100 times faster. Spark lets users do various tasks like batch and interactive queries, real-time streaming, machine learning, and graph processing - all with the same common execution model. With Spark for Azure HDInsight, we offer customers more value with an enterprise ready Spark solution that’s fully managed and has a choice of compelling and interactive experiences.

  • Choice of compelling interactive experiences: Microsoft empowers users and organizations to achieve more by making data accessible to as many people as possible.
    • We have out-of-the-box integration with Power BI for interactive visualizations over big data. Because both are powered by the cloud, you can deploy a Spark cluster and visualize its data in Power BI within minutes, without investing in hardware or complex integration.
    • Data scientists can use popular notebooks like Zeppelin and Jupyter (iPython) to do interactive analysis and machine learning to create narratives that combine code, statistical equations, and visualizations that tell a story about the data with Spark for Azure HDInsight.
    • Microsoft offers flexibility to use BI tools like Tableau, SAP Lumira, and QlikView so you can leverage existing investments.

  • Enterprise Spark: Integrating Spark with Azure HDInsight ensures that it is ready to meet the demands of your mission critical deployments because Azure is always-on, has hyper-scale, and is enterprise-grade. With a 99.9% service level agreement at general availability, you can ensure continuity and protection against catastrophic events. As demands grow, create larger clusters with your choice of SSD and RAM allocation to process big data on demand. Microsoft also has built-in integration with other parts of Azure, like Event Hubs, for building Streaming and IoT related applications.

  • Fully Managed: With Spark for Azure HDInsight, you can get started quickly with a fully managed cluster. This includes 24x7 monitoring and enterprise support for peace of mind. You also have the elasticity of a cloud solution, so you can scale your solution up or down easily, and only pay for the power that you use.

In addition to Spark, we announced the upcoming general availability of Power BI on July 24. Power BI is a cloud-based business analytics service that enables anyone to visualize and analyze data with greater speed, efficiency, and understanding. It connects users to a broad range of live data through easy-to-use dashboards, provides interactive reports, and delivers compelling visualizations that bring data to life. Power BI has out-of-the-box connectors to Spark, enabling users to do interactive visualizations on top of big data.  For more information on this announcement, read James Phillips’ blog post.

Microsoft continues to make it easier for customers to maximize their data dividends with our data platform and services. It’s never been easier to capture, transform, mash-up, analyze and visualize any data, of any size, at any scale, in its native format using familiar tools, languages and frameworks in a trusted environment on-premises and in the cloud.

Experience interactive insight on big data today with Spark for Azure HDInsight and Power BI.

-  Ranga

Thursday, July 9, 2015 11:00:00 AM

We are pleased to announce the availability of Datazen Publisher for Windows 7 Preview.

Datazen Publisher is the single point for creation and publishing of rich, interactive visualizations. You can simply connect to the Datazen Server to access your on-premises SQL Server or other enterprise data sources, easily create beautiful visualizations for any form factor, and then publish them for access by others on all major mobile platforms.

Datazen Publisher for Windows 7 offers feature parity with the Datazen Publisher app, and the UI has been optimized for desktop scenarios. You can now create and publish data visualizations based on your personal preference (mouse & keyboard or touch) or the device you are working on.

The Datazen Publisher for Windows 7 Preview is available for download on the Microsoft Download Center.  Please visit the Windows Store to install the Datazen Publisher for Windows 8 and Windows 8.1+.

For more information about how Datazen can help your organization get insights from your on-premises enterprise data please visit: http://www.microsoft.com/en-us/server-cloud/products/sql-server-editions/sql-server-enterprise.aspx#sqlmobilebi

Wednesday, July 1, 2015 11:00:00 AM

Today we are pleased to announce availability of the Community Technology Preview (CTP) of the Microsoft JDBC 4.2 Driver for SQL Server! The driver provides robust data access to Microsoft SQL Server and Microsoft Azure SQL Database for Java-based applications.

The JDBC Driver for SQL Server is a Java Database Connectivity (JDBC) type 4 driver that implements full compliance with the JDBC specifications 4.1 and 4.2 and supports Java Development Kit (JDK) version 1.8. There are several additional enhancements available with the driver.  The updated XA Transaction feature includes new timeout options for automatic rollback of unprepared transactions. And, the new SQLServerBulkCopy class enables developers to quickly copy large amounts of data into tables or views in SQL Server and Azure SQL Database databases from other databases.

To use the SQLServerBulkCopy class, the basic flow is: connect to the source, then connect to the destination SQL Server or Azure SQL Database, create the SQLServerBulkCopy object, and call writeToServer.  The sample code below illustrates how the new class works:

// Required imports (assumed; the original sample omitted them)
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import com.microsoft.sqlserver.jdbc.SQLServerBulkCopy;
import com.microsoft.sqlserver.jdbc.SQLServerBulkCopyOptions;

// Obtain data from the source by connecting and loading it into a ResultSet
Connection sourceConnection = DriverManager.getConnection(connectionUrl);
String SQL = "SELECT * FROM SourceTable";
Statement stmt = sourceConnection.createStatement();
ResultSet resultSet = stmt.executeQuery(SQL);

// Prepare for the bulk copy operation with a separate connection to the destination
// (this sample reuses the same connectionUrl for source and destination)
Connection destConnection = DriverManager.getConnection(connectionUrl);

// Default options are shown here
SQLServerBulkCopyOptions options = new SQLServerBulkCopyOptions();
options.setBatchSize(0);
options.setBulkCopyTimeout(30);
options.setCheckConstraints(false);
options.setFireTriggers(false);
options.setKeepIdentity(false);
options.setKeepNulls(false);
options.setTableLock(true);
options.setUseInternalTransaction(false);

SQLServerBulkCopy operation = new SQLServerBulkCopy(destConnection, options);
operation.setDestinationTableName("DestinationTable");

// Column mappings may be specified by ordinal, by name, or a mix of both
operation.addColumnMapping(1, 1);
operation.addColumnMapping(2, "test2");
operation.addColumnMapping("test3", 3);
operation.addColumnMapping("test4", "test4");

// Perform the copy
operation.writeToServer(resultSet);

// Finished
operation.close();

For more details about what is currently supported in the bulk copy feature preview, please refer to What’s New in the JDBC driver.

The JDBC driver is part of SQL Server and the Microsoft Data Platform’s wider interoperability program, with drivers for PHP 5.6, Node.js, JDBC, ODBC and ADO.NET already available.

You can download the JDBC 4.2 driver preview here. We invite you to explore the latest the Microsoft Data Platform has to offer via a trial of Microsoft Azure SQL Database or by trying the new SQL Server 2016 CTP. We look forward to hearing your feedback about the new driver. Let us know what you think on Microsoft Connect.

Wednesday, June 24, 2015 11:00:00 AM

Following the release of SQL Server 2016 CTP 2.0 last month, the SQL Server engineering team is excited to announce the immediate availability of CTP 2.1 release for download. The release includes improvements for three new innovations releasing in SQL Server 2016 - Stretch Database, Query Store, Temporal - and Columnstore Index, introduced in SQL Server 2012:

  1. Stretch Database, which enables transparent stretching of warm and cold OLTP data to Microsoft Azure in a secure manner without requiring any application changes, includes the following fixes:
    1. Data migration does not trigger lock escalation in stretched tables, so no timeouts for INSERT or SELECT operations
    2. Automatic encryption and validation requirement of remote server certification, preventing “man-in-the-middle” security attacks
    3. Ability to run INSERT statement against updatable views created on top of stretch tables
  2. Query Store, the “flight recorder” which stores historical query plans and their performance characteristics, allowing DBAs to monitor and analyze plans and force a specific query plan on regression, includes:
    1. Parse statistics avg_parse_duration, last_parse_duration, avg_parse_cpu_time, last_parse_cpu_time removed from sys.query_store_query view
    2. The minimum allowed value for the flush_interval_seconds parameter, 60 seconds, is now verified in the ALTER DATABASE statement
    3. Naming in sys.database_query_store_options is now aligned with the actual parameters in ALTER statements (flush_interval_seconds and operation_mode)
    4. Query Store is disabled on master and tempdb, and an error message is thrown ("Cannot perform action because Query Store cannot be enabled on system database master (tempdb)")
    5. The force_failure_count parameter is now cleared after a plan is forced
  3. Temporal, which enables handling and analyzing database records that change over time, includes:
    1. Support for computed columns
    2. Support for marking one or both period columns with the HIDDEN flag, allowing frictionless migration for existing applications. The following applies to HIDDEN period columns:
        1. The column is not returned in SELECT * statements
        2. INSERT statements without a column list do not expect inputs for HIDDEN columns
        3. Hidden columns must be explicitly included in all queries that directly reference the temporal table or other objects that reference it (views, for example)
        4. BULK INSERT scripts that worked with the non-temporal table (prior to adding system-versioning and hidden period columns) will continue to work, and hidden columns will be auto-populated
        5. The is_hidden flag is set to 1 in the sys.columns view

  4. Columnstore Index includes:
    1. Improved seek performance
    2. Improved scan performance with partitioned tables
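The hidden period column behaviors in item 3 above can be verified with a quick check (the table name dbo.Orders is illustrative; the is_hidden column comes from the sys.columns view noted in the list):

```sql
-- SELECT * omits HIDDEN period columns; the catalog still reports them.
SELECT * FROM dbo.Orders;  -- period columns are not returned

SELECT name, is_hidden
FROM sys.columns
WHERE object_id = OBJECT_ID('dbo.Orders');  -- is_hidden = 1 for the period columns
```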

 

 

Wednesday, June 24, 2015 11:00:00 AM

From Tiffany Wissner, Senior Director, Data Platform

In the SQL Server CTP 2 blog, T.K. Ranga Rengarajan, CVP of SQL Server engineering, discussed bringing cloud-first innovations to SQL Server 2016. This includes the rapid preview model that exists in Microsoft Azure today.  With the release of SQL Server 2016 Community Technology Preview (CTP) 2.1, customers can for the first time experience the rapid preview model in their on-premises SQL Server 2016 development and test environments.  This born-in-the-cloud model means customers don’t have to wait the several months between traditional CTPs for the latest updates from Microsoft engineering, and can reach production faster.  The frequent updates are also engineered to the same quality as traditional major CTPs, so customers don’t have to be concerned about build quality.  In addition, customers have the flexibility to choose which CTP they deploy in their development and test environments and are not forced to upgrade to the most recent preview release.

In SQL Server 2016 CTP 2.1, available for download today, customers will see improvements to the Stretch Database technology released in CTP 2, along with improvements to the Query Store, Temporal Database and in-memory columnstore.  To learn more about the specific improvements for each feature please view the SQL Server 2016 Community Technology Preview 2.1 blog.

In addition to the new CTP model, SQL Server Management Studio (SSMS) is now being offered as a stand-alone install outside of the SQL Server release.  This means Azure SQL database customers can use SSMS for the new features being released in cloud as well as their on-premises SQL Server.  SSMS as a standalone tool will also follow the rapid preview model with frequent updates to keep pace with the new features being released in Azure SQL Database.  You can download the first Preview release of the standalone SQL Server Management Studio (SSMS) June 2015 Preview, here. To learn more about the features included please take a look at SQL Server Management Studio June 2015 Release Blog.

Download SQL Server 2016 CTP 2.1 preview today!

As the foundation of our end-to-end data platform, with this release we continue to make it easier for customers to maximize their data dividends. With SQL Server 2016 you can capture, transform, and analyze any data, of any size, at any scale, in its native format —using the tools, languages and frameworks you know and want in a trusted environment on-premises and in the cloud.

To learn more about the release, visit the SQL Server 2016 preview page.  To experience the new, exciting features in SQL Server 2016 and the new rapid release model in place download the preview or trial the preview using a virtual machine in Microsoft Azure and start evaluating the impact these new innovations can have for your business.  Also, be sure to share your feedback on the new SQL Server 2016 capabilities using Microsoft’s Connect tool.  We look forward to hearing from you!

Wednesday, June 24, 2015 10:30:00 AM

By Tiffany Wissner, Senior Director, Data Platform

At the Build conference in April, we announced Azure SQL Data Warehouse, our new enterprise-class elastic data warehouse-as-a-service. Today, we’re pleased to announce that Azure SQL Data Warehouse is open for Limited Public Preview.  

With growing data volumes, you have been telling us that you want to take advantage of the cost-efficiencies, elasticity and hyper-scale of cloud for your large data warehouses. You also need for that data warehouse to work with your existing infrastructure and tools, utilize your current skills and integrate with many sources of data. With Azure SQL Data Warehouse, we deliver:

  • The first enterprise-class elastic data warehouse that includes the separation of compute and storage, enabling customers to pay for what they need, when they need it
  • The ability to pause the database so you only pay for commodity storage costs
  • A full SQL Server experience that includes PolyBase, which allows you to combine queries over your structured and unstructured data using the skills you have today
  • Hybrid options – your data, your platform, your choice

Azure SQL Data Warehouse is based on SQL Server’s massively parallel processing architecture currently available only in the Analytics Platform System (APS), and integrates with existing data tools including Power BI for data visualization, Azure Machine Learning for advanced analytics, Azure Data Factory for event processing and Azure HDInsight, Microsoft’s 100% Apache Hadoop managed big data service.

First enterprise-class elastic cloud data warehouse

This is the first elastic cloud data warehouse to offer the enterprise-class features our customers and partners expect and need, like full indexing, stored procedures/functions, partitions, and columnar indexing. Without these features, organizations would have to rewrite their existing applications and workflows at significant cost and with slower time to market. This is also the first fully managed cloud data warehouse to offer the compatibility SQL users need to migrate to the cloud.  Azure SQL Data Warehouse will be supported by a rich ecosystem of partners that support SQL Server today. Find out more about the partners we are working with from Garth Fort, GM of the enterprise partner group at Microsoft, here.

And, because Azure SQL Data Warehouse independently scales compute and storage, users only pay for query performance as they need it.  Unlike other cloud data warehouses that require hours or days to resize, the service allows customers to grow or shrink query power in seconds. Because you can scale compute costs separately from storage costs, costs are easier to forecast than with competitive offerings.

Pause compute for lower costs

Elastic pause enables a customer to shut down the compute infrastructure while persisting the data and only paying for data storage. Customers can schedule pausing compute usage either through Azure SQL Data Warehouse or Azure Scheduler to optimize the cost of the service.

Leveraging existing SQL Server skills with Hadoop

With the incredible growth of all types of data, the need to combine structured and unstructured data is essential. With PolyBase, we offer the ability to combine data sets easily. SQL Data Warehouse can query semi-structured data stored in blob storage using familiar T-SQL, making it easy to gain insights from various data types.
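As an illustrative sketch only (the external table dbo.ClickstreamExternal and its blob-storage setup are hypothetical, not from this announcement), a PolyBase query joining relational and semi-structured data might look like:

```sql
-- Join a relational table with an external table defined over blob-storage data.
SELECT c.CustomerId, e.EventType, e.EventTime
FROM dbo.Customers AS c
JOIN dbo.ClickstreamExternal AS e   -- external table over Azure blob storage
    ON c.CustomerId = e.CustomerId;
```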

Hybrid data warehouse – your data, your platform, your choice

Deploying on-premises data warehouse solutions can take weeks to months; Azure SQL Data Warehouse takes seconds to provision. Cloud-born data is more readily ingested into Azure SQL Data Warehouse. Other technologies, such as Azure Data Factory and Power BI (available as a connector to Azure SQL Data Warehouse today), provide the data management gateway that makes bringing data from on-premises sources to the cloud much simpler.  Azure SQL Data Warehouse also uses the same management tools as the APS appliance to make hybrid management easier.

 

One of the use cases for SQL Data Warehouse is the ability to cost-effectively explore new insights and analytical workloads without impacting production platforms. This enables customers and partners to engage in new development and testing scenarios based on real-world requirements, without the overhead of production supportability and management constraints.

The initial public preview is designed for data warehouses in the 5-10 TB range to give users the ability to start testing and providing feedback on the service. You can sign up now, and as we ramp the preview, new customers will be notified as they are accepted.

As with this Azure SQL Data Warehouse release, we continue to make it easier for customers to maximize their data dividends.  With our data platform you can capture, transform, and analyze any data, of any size, at any scale, in its native format —using the tools, languages and frameworks you know and want in a trusted environment on-premises and in the cloud.

To learn more about Azure SQL Data Warehouse and sign-up for the Public Preview, click here.

Wednesday, June 10, 2015 6:00:00 AM

Microsoft is committed to continuous innovation to make Azure the best cloud platform for running hyper-scale big data projects. This includes an existing Hadoop-as-a-service solution (Azure HDInsight), a hyper-scale repository for big data (Azure Data Lake), and Hadoop infrastructure-as-a-service offerings from Hortonworks and Cloudera. This week, Hortonworks also announced their most recent milestone with Hortonworks Data Platform 2.3, which will be available on Azure this summer.

Today, we are excited to announce that MapR will also be available this summer as an option for customers to deploy Hadoop from the Azure Marketplace. MapR is a leader in the Hadoop community and offers the MapR Distribution including Hadoop, which includes MapR-FS, an HDFS- and POSIX-compliant file store, and MapR-DB, a NoSQL key-value store. The distribution also includes core Hadoop-ecosystem projects such as Hive, Impala, SparkSQL, and Drill, as well as MapR Control System, a comprehensive management system. When MapR is available in the Azure Marketplace, customers will be able to launch a full Hadoop cluster based on MapR as an Azure Virtual Machine with a few clicks. Together with Azure Data Lake, SQL Server, and Power BI, this will allow organizations to build big data solutions quickly and easily using the best of Microsoft and MapR.

Our partnership with MapR allows customers to use the Hadoop distribution of their choice while getting the cloud benefits of Azure. It is also a sign of our continued commitment to make Hadoop more accessible to customers by supporting the ability to run big data workloads anywhere – on hosted VMs and managed services in the public cloud, on-premises, or in hybrid scenarios.

We are very excited to be on this journey of making big data more readily accessible, and we hope you join us for the ride!

T.K. “Ranga” Rengarajan

Corporate Vice President, Data Platform at Microsoft

Tuesday, June 9, 2015 10:00:00 AM

Guest post by Rohit Bakhshi, Product Manager at Hortonworks Inc.

Over the past two quarters, Hortonworks has been able to attract over 200 new customers.  We are feeding the hunger our customers have shown for Open Enterprise Hadoop over the past two years.  We are seeing truly transformational business outcomes delivered through the use of Hadoop across all industries.  The most prominent use cases are focused on:

  • Data Center Optimization – keeping 100% of the data at up to 1/100th of the cost while enriching traditional data warehouse analytics
  • 360° View of Customers, Products, and Supply Chains
  • Predictive Analytics – delivering behavioral insight, preventative maintenance, and resource optimization
  • Data Discovery – exploring datasets, uncovering new findings, and operationalizing insights

What we have consistently heard from our customers and partners, as they adopt Hadoop, is that they would like Hortonworks to focus our engineering activities on three key themes: Ease of Use, Enterprise Readiness, and Simplification.  During the first half of 2015, we have made significant progress on each of these themes and we are ready to share what we’ve done thus far.  Keep in mind there is much more work to be done here and we plan on continuing our efforts throughout the remainder of 2015.

This week Hortonworks proudly announced Hortonworks Data Platform (HDP) 2.3, which delivers a breakthrough new user experience along with increased enterprise readiness across security, governance, and operations. HDP 2.3 will be available simultaneously on Windows Server and Linux, and will be supported for deployment on Azure Infrastructure as a Service (IaaS) virtual machines.

In addition, we are offering an enhancement to our support subscription called Hortonworks SmartSense™.

Breakthrough User Experience

HDP 2.3 eliminates much of the complexity of administering Hadoop and improves developer productivity. We employ a truly Open Source and Open Community approach with Apache Ambari to put a new face on Hadoop for the administrator, developer, and data architect.

We started this effort with the introduction of Ambari 1.7.0, which delivered an underlying framework to support the development of new web-based Views. Now we would like to share some of the progress we’ve made leveraging that framework to deliver a breakthrough user experience for both cluster administrators and developers.

Enabling the Data Worker

With HDP 2.3 we focus on the SQL developer, providing an integrated experience that supports SQL query building, displays a visual “explain plan”, and extends the debugging experience when using the Tez execution engine. A screenshot of what we’ve developed is shown below:

With the Hive View in Ambari, developers now have a web-based tool to develop, debug, and interact with remote clusters in Azure. Ambari’s web-based tooling allows admins to securely and easily manage their clusters on Azure without having to log in and edit configuration files on the remote cluster.
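As an illustration, this is the kind of statement a developer might iterate on in the Hive View (the table and column names here are hypothetical), with EXPLAIN used to inspect the Tez execution plan that the View renders visually:

```sql
-- Hypothetical sales table; EXPLAIN surfaces the Tez plan
-- that the Hive View displays as a visual graph.
EXPLAIN
SELECT region, COUNT(*) AS order_count
FROM sales
WHERE order_date >= '2015-01-01'
GROUP BY region;
```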

In addition to the SQL builder, we are providing a Pig Latin Editor which brings a modern browser-based IDE experience to Pig, as well as a File Browser for loading file datasets into HDFS.

HDP 2.3 brings an entirely new user experience for Apache Falcon, our Data Lifecycle Management component. The new Falcon UI allows you to search and browse processes that have executed, visualize lineage and setup mirroring jobs to replicate files between clusters and cloud storage - allowing enterprises to seamlessly backup data to Azure Blob Storage.

Smart Configuration

For the Hadoop Operator, we provide Smart Configuration for HDFS, YARN, HBase, and Hive. This entirely new user experience within Ambari is guided and more digestible than ever before. 

Shown below is the new configuration panel for Hive:

 

YARN Capacity Scheduler

The YARN Capacity Scheduler provides workload management across application types and tenants in a shared HDP cluster. HDP 2.3 delivers a new experience for configuring workload management policies.
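Under the hood, the policies Ambari manages are standard Capacity Scheduler properties. A minimal sketch of what such a configuration looks like in capacity-scheduler.xml, assuming two hypothetical tenant queues named analytics and etl:

```xml
<!-- capacity-scheduler.xml: split cluster capacity between two tenant queues.
     Queue names and percentages are illustrative. -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>analytics,etl</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.analytics.capacity</name>
  <value>60</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.etl.capacity</name>
  <value>40</value>
</property>
```

Sibling queue capacities must sum to 100; Ambari's new configuration experience validates and visualizes these settings rather than requiring hand-edited XML.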


Customizable Dashboards

In HDP 2.3 we have developed customizable dashboards for a number of the most frequently used components. This allows each customer to develop a tailored experience for their environment and decide which metrics they care about most. Shown below is the HDFS Dashboard:

Enterprise Readiness: Enhancements to Security, Governance, and Operations

HDP 2.3 delivers new encryption of data at rest, extends the data governance initiative with Apache Atlas, and drives forward operational simplification for both on-premises and cloud-based deployments.

This release expands the fault tolerance capabilities of the platform to withstand failures - with high availability configuration options for Apache Storm, Apache Ranger, and Apache Falcon that power many mission critical applications and services.

HDP 2.3 delivers a number of significant security enhancements. The first is HDFS Transparent Data at Rest Encryption, a critical feature for Hadoop that we have been testing extensively as part of an extended technical preview. To support it, Apache Ranger provides a key management service (KMS) that implements the Key Management Provider API and can serve as a central key service for Hadoop. There is more work to be done on encryption at rest, but we are confident that a core set of use cases is ready for customers to adopt, and we will continue to expand the capabilities and remove limitations over the coming months.
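For context, enabling an HDFS encryption zone follows a pattern like the one below. This is a sketch, not a full walkthrough: it assumes a running cluster whose KMS (such as the Ranger KMS) is already configured, and the key and path names are examples.

```shell
# Create an encryption key in the configured KMS.
hadoop key create mykey -size 256

# Create a directory and mark it as an encryption zone bound to that key.
hdfs dfs -mkdir /secure
hdfs crypto -createZone -keyName mykey -path /secure

# Verify: files written under /secure are now encrypted transparently.
hdfs crypto -listZones
```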

Other important additions to Apache Ranger include centralized authorization for Apache Solr and Apache Kafka. Security administrators can now define and manage security policies and capture security audit information for HDFS, Hive, HBase, Knox, and Storm, along with Solr and Kafka.

Shifting to data governance: we launched the Data Governance Initiative (DGI) in January 2015 and delivered the first set of technology, along with an incubator proposal to the Apache Software Foundation, in April. Now the core set of metadata services is being delivered with HDP 2.3. This is the first step on a journey to address data governance in a holistic way for Hadoop. The initial capabilities ease data discovery, with a focus on Hive, and establish a strong foundation for future additions as we look to cover Kafka and Storm and to integrate dynamic security policies based on the available metadata tags.

In addition to the new user interface elements described earlier, Apache Falcon enables Apache Hive database replication in HDP 2.3. Previously, Falcon supported replication of files (and incremental Hive partitions) between clusters, primarily for disaster recovery scenarios. Now customers can use Falcon to replicate Hive databases, tables, and their underlying metadata, complete with bootstrapping and reliable application of transactions to targets.

Finally, on the operations front, the pace of Apache Ambari innovation continues to astonish. As part of HDP 2.3, Ambari arrives with support for a significantly wider range of component deployment and monitoring than ever before. This includes the ability to install and manage Accumulo, DataFu, Mahout, and the Phoenix Query Server (Tech Preview), along with expanded support for configuring the NFS Gateway capability of HDFS. In addition, Ambari now supports rack awareness, allowing you to define and visualize your data topology by rack.

We introduced automation for rolling upgrades in Ambari 2.0, but that was primarily focused on applying maintenance releases to a running cluster. Now Ambari expands its reach to support rolling upgrades for feature-bearing releases as well, automating your move from HDP 2.2 to HDP 2.3.

Following the general availability of HDP 2.3, Cloudbreak will also become generally available.  Since the acquisition of SequenceIQ, the integrated team has been working hard to complete the deployment automation for public clouds like Microsoft Azure.

With Cloudbreak, operators will be able to seamlessly deploy elastic HDP clusters to Azure IaaS virtual machines. HDP will efficiently utilize Azure resources, with policy-based autoscaling to expand and contract clusters based upon actual usage metrics.
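The idea behind policy-based autoscaling can be sketched as a simple control loop: compare a load metric against scale-up and scale-down thresholds and adjust the node count within configured bounds. The sketch below is illustrative only; it is not Cloudbreak's actual API, and the metric and threshold names are hypothetical.

```python
# Illustrative sketch of policy-based autoscaling, in the spirit of what
# Cloudbreak automates for HDP clusters on Azure. Names and thresholds
# are hypothetical, not Cloudbreak's real configuration.

def scaling_decision(pending_containers, node_count,
                     scale_up_at=10, scale_down_at=0,
                     min_nodes=3, max_nodes=20, step=2):
    """Return the new node count for the cluster given current load."""
    if pending_containers > scale_up_at and node_count < max_nodes:
        # Work is queueing up: grow the cluster, capped at max_nodes.
        return min(node_count + step, max_nodes)
    if pending_containers <= scale_down_at and node_count > min_nodes:
        # Cluster is idle: shrink it, but never below min_nodes.
        return max(node_count - step, min_nodes)
    return node_count

print(scaling_decision(pending_containers=25, node_count=19))  # 20 (capped)
print(scaling_decision(pending_containers=0, node_count=5))    # 3 (floor)
print(scaling_decision(pending_containers=5, node_count=10))   # 10 (no change)
```

A real policy engine would evaluate metrics over a time window and apply cooldown periods to avoid thrashing, but the expand/contract-within-bounds logic is the core of it.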

Operators will be able to deploy using a Cloudbreak web interface as well as a RESTful API.

Proactive Support with Hortonworks SmartSense™

In addition to all of the tremendous platform innovation, Hortonworks is proud to announce Hortonworks SmartSense which adds proactive cluster monitoring and delivers critical recommendations to customers who opt into this extended support capability.  The addition of Hortonworks SmartSense further enhances Hortonworks’ world-class support for Hadoop.

Hortonworks support subscription customers simply download the Hortonworks Support Tool (HST) from the support portal and deploy it to their cluster. HST then collects configuration and other operational information about the HDP cluster and packages it into a bundle. After the customer uploads this bundle, the Hortonworks support team analyzes it using more than 80 distinct checks performed across the underlying operating system, HDFS, YARN, MapReduce, Tez, and Hive components.
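To give a flavor of what such a check looks like, here is a hedged sketch of one hypothetical rule. The function name, thresholds, and the rule-of-thumb itself are illustrative inventions, not Hortonworks' actual SmartSense checks.

```python
# Hypothetical example of the kind of configuration check a support-bundle
# analysis might run. The heuristic below (roughly 1 GB of NameNode heap per
# million HDFS blocks) is a common community rule of thumb, used here purely
# for illustration.

def check_namenode_heap(bundle):
    """Flag clusters whose NameNode heap looks undersized for the block count."""
    heap_mb = bundle.get("namenode_heap_mb", 1024)
    blocks = bundle.get("hdfs_block_count", 0)
    needed_mb = max(1024, (blocks // 1_000_000) * 1024)
    if heap_mb < needed_mb:
        return ("WARN", f"NameNode heap {heap_mb} MB; recommend >= {needed_mb} MB")
    return ("OK", "NameNode heap within recommended range")

status, detail = check_namenode_heap({"namenode_heap_mb": 2048,
                                      "hdfs_block_count": 5_000_000})
print(status)  # WARN: 5M blocks suggest ~5120 MB of heap
```

A real analysis platform would run dozens of such rules over the uploaded bundle and roll the results up into prioritized recommendations.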

Of course, there is much more in HDP 2.3 that I didn’t cover here! There has been meaningful innovation within Hive, including support for UNION within queries and the use of interval types in expressions, plus additional improvements for HBase, Phoenix, and more. But for now I’ll leave those for subsequent blog posts that will cover them in more detail.

In closing, I would like to thank the entire Hortonworks team and the Apache community - including Microsoft developers - for the hard work put in over the past six to eight months.  That hard work is about to pay off in a big way for folks adopting Hadoop today as much as it will delight those who have been using Hadoop for years.

© 2008 - 2015 SQLTrainer.com, LLC