Toad World Blog

Azure Data Factory and SSIS compared

Mar 1, 2017 11:00:24 AM by James Serra

I see a lot of confusion when it comes to Azure Data Factory (ADF) and how it compares to SSIS.  It is not simply “SSIS in the cloud”.  See What is Azure Data Factory? for an overview of ADF, and I’ll assume you know SSIS.  So how are they different?

SSIS is an Extract-Transfer-Load tool, but ADF is a Extract-Load Tool, as it does not do any transformations within the tool, instead those would be done by ADF calling a stored procedure on a SQL Server that does the transformation, or calling a Hive job, or a U-SQL job in Azure Data Lake Analytics, as examples.  Think of it more as an orchestration tool.  SSIS has the added benefit of doing transformations, but keep in mind the performance of any transformations depends on the power of the server that SSIS is installed on, as the data to be transformed will be pushed to that SSIS server.  Other major differences:

  • ADF is a cloud-based service (via ADF editor in Azure portal) and since it is a PaaS tool does not require hardware or any installation.  SSIS is a desktop tool (via SSDT) and requires a good-sized server that you have to manage and you have to install SQL Server with SSIS
  • ADF uses JSON scripts for its orchestration (coding), while SSIS uses drag-and-drop tasks (no coding)
  • ADF is pay-as-you-go via an Azure subscription, SSIS is a license cost as part of SQL Server
  • ADF can fire-up HDInsights clusters and run Pig and Hive scripts.  SSIS cannot
  • SSIS has a powerful GUI, intellisense, and debugging.  ADF has a basic editor and no intellisense or debugging
  • SSIS is administered via SSMS, while ADF is administered via the Azure portal
  • SSIS has a wider range of supported data sources and destinations
  • SSIS has a programming SDK, automation via BIML, and third-party components.  ADF does not have a programming SDK, has automation via PowerShell, and no third-party components
  • SSIS has error handling.  ADF does not
  • ADF has “data lineage“, tagging and tracking the data from different sources.  SSIS does not have this

Think of ADF as a complementary service to SSIS, with its main use case confined to inexpensively dealing with big data in the cloud.

Note that moving to the cloud requires you to think differently when it comes to loading a large amount of data, especially when using a product like SQL Data Warehouse.

More info:

Azure Data Factory vs SSIS

 

Tags: SQL Azure

James Serra

Written by James Serra

James is currently a Senior Business Intelligence Architect/Developer and has over 20 years of IT experience. James started his career as a software developer, then became a DBA 12 years ago, and for the last five years he has been working extensively with Business Intelligence using the SQL Server BI stack (SSAS, SSRS, and SSIS). James has been at times a permanent employee, consultant, contractor, and owner of his own business. All these experiences along with continuous learning has helped James to develop many successful data warehouse and BI projects. James has earned the MCITP Business Developer 2008, MCITP Database Administrator 2008, and MCITP Database Developer 2008, and has a Bachelor of Science degree in Computer Engineering.