Data Architecture as a Service (DAaaS)

31 Jan 2023 | AgileData Product, Blog

TL;DR

Data Architecture as a Service (DAaaS), is it Buzzwashing or not?

As is often the case, it depends on your point of view.

Our point of view? It’s a real thing.

Shane Gibson - AgileData.io

Buzzwashing Sucks!

At AgileData we are not great fans of the constant #Buzzwashing that happens in the data technology space.  For us Buzzwashing is where a vendor positions their product as supporting the latest wave of buzz words, but under the covers it is the same old product they have had for a while.

We saw that behaviour with the Data Mesh buzz words in the wave of technology change over the last couple of years, and we saw it with the Big Data buzz words in the previous Hadoop technology wave.

It is why you will never see AgileData talk about how we support Data Mesh.  To us the team topology change pattern in Data Mesh is far more important than the technology patterns to enable distributed ways of working.  We enable teams to work based on any team topology, so in theory that means we support Data Mesh, but we don’t want to be one of those #Buzzwashing companies.

Eckerson Group Rocks!

I have personally been a big, big fan of the Eckerson Group’s thoughts and content for a long time.  I would probably only rate Mark Rittman higher in the clear and useful data content space.

I still use the “Silver Service vs Self-Service” positioning and article from Eckerson when I am creating a Data Blueprint for customers and need to explain the consequences of implementing a self-service-only pattern for Information Consumers (some people just want to be served the data).

DAaaS – mmmmm yup!

Eckerson Group has started talking about Data architecture-as-a-service (DAaaS).

I am not a great fan of the term itself, as we think it is more than just data architecture that is required to reduce the complexity of managing data, but we are aligned with the core principles and patterns they describe to explain what DAaaS is.

To quote them:

“DAaaS is not a new concept. Architects and engineers have long created development templates to foster reuse, accelerate development, and improve accuracy and efficiency. But DAaaS commercializes this approach, baking templates and automation into GUI-based development environments geared to business users.

DAaaS is a metadata-driven approach that auto-generates code and documentation. It also abstracts the underlying platform so data architects and engineers can change or update the platform without impacting business users. On the whole, DAaaS promises to govern self-service, foster reuse, reduce errors, and speed development.”

Ok, this ticks some boxes for us.

If you want to read the full research report from the Eckerson Group then download it from this link https://www.eckerson.com/register?content=data-architecture-as-a-service-empowering-the-business-to-build-compliant-pipelines

The report is sponsored by Coalesce (as in the software company, not the dbt Labs conference), which has exclusive permission to syndicate its content.

DAaaS Criteria

Eckerson Group outlines the criteria for DAaaS products.  Let’s apply an AgileData lens to those criteria.

“Configurable. Architects or data engineers can configure the product using templates, blocks, or other constructs to simplify the development of data pipelines.”

Yes and No.

Yes to the patterns of configuration and simplification, which are built in as a core part of AgileData.  Analysts can configure the AgileData Product using the building blocks we provide.

No to the target user.  Ours is a Business or Data Analyst, not an Architect or Data Engineer.

“Multi-code. The product supports no-code, low-code, and all-code development environments so that it can be used by all types of users.”

Yes and No.

We see ourselves as only supporting low code. Our aim is to enable the analyst to do the work without ever having to write code, and we are hyper focussed on enabling those analysts, not the whole breadth of data Personas.  But we are experienced in the data domain and therefore realistic enough to know we will always need the ability to drop in custom code blocks to change the data in a way we have not built a sexy wizard for (yet), so that’s what we enable.

We describe our pattern as low code. 

“Metadata-driven. The product automatically generated SQL code that developers can view and modify, if desired.”

Yes and No. 

AgileData automatically generates the SQL code needed to get the data job done, so yes to that pattern.   

We don’t want the analysts hacking that code, so no to that pattern.  Strongly protecting the underlying patterns that are used to design, change and validate the data is one of the core ways we keep the cloud analytics database costs so low, which enables us to pass that cost saving on to our customers.
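To make the metadata-driven pattern concrete, here is a minimal sketch in Python.  It is not AgileData’s actual code: the FilterRule class, the generate_sql function and the table and column names are all made up for illustration.  The idea is simply that the analyst supplies config, and a protected template turns that config into SQL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterRule:
    """Hypothetical rule config: the analyst supplies values, never SQL."""
    source_table: str
    target_table: str
    columns: tuple
    filter_column: str
    filter_value: str


def generate_sql(rule: FilterRule) -> str:
    """Render the SQL for a rule from its config using one fixed template."""
    column_list = ", ".join(rule.columns)
    # Double up single quotes so the filter value stays a valid SQL literal.
    literal = rule.filter_value.replace("'", "''")
    return (
        f"CREATE OR REPLACE TABLE {rule.target_table} AS\n"
        f"SELECT {column_list}\n"
        f"FROM {rule.source_table}\n"
        f"WHERE {rule.filter_column} = '{literal}'"
    )


# Illustrative config only; these table names are assumptions.
rule = FilterRule(
    source_table="history.customer",
    target_table="consume.active_customer",
    columns=("customer_id", "customer_name"),
    filter_column="status",
    filter_value="active",
)
sql = generate_sql(rule)
print(sql)
```

Because the template lives in one function, the analyst can change what the rule does by changing config, but cannot change how the SQL is shaped.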

“Universal updates. Developers can update a template in one place and ripple changes automatically to all solutions that leverage the code.”

Yes and No.

As a SaaS product we do universal updates natively.  For example, when we change one of the rule templates in the AgileData product, let’s say we are deploying a new version of the “Filter Wizard”, we use a staggered deployment process to deploy this universally across all our customer tenancies.  So yes to that pattern.

The complexity and potential impact of environment-wide changes invoked by an analyst changing a fundamental code template scares the dickens out of us.  Imagine the impact on your reporting tools if an analyst made a global change to the templates to switch table prefixes to table suffixes (not to mention the cost of your Cloud Analytics Database compute to rebuild the tables).  Or imagine the implication for the cloud computing costs if they altered a template that used a merge statement leveraging partitions to reduce the cost, and changed it to use a truncate and create statement that caused a full rebuild of the tables every time new data was collected.  So no to that pattern for now.
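A rough back-of-the-envelope sketch shows why that second template change hurts.  The numbers below are invented and the model is deliberately crude: it only counts rows written per load and assumes the partitioned merge touches nothing but the newly arrived rows.  Real cloud database pricing is more involved than this.

```python
def rows_written_full_rebuild(existing_rows: int, rows_per_load: int, loads: int) -> int:
    """Truncate and create: every load rewrites the whole (growing) table."""
    total = 0
    table_size = existing_rows
    for _ in range(loads):
        table_size += rows_per_load
        total += table_size
    return total


def rows_written_incremental(rows_per_load: int, loads: int) -> int:
    """Partitioned merge: every load writes only the newly arrived rows."""
    return rows_per_load * loads


# Made-up workload: a 1 million row table receiving 10,000 new rows a day for 30 days.
full = rows_written_full_rebuild(existing_rows=1_000_000, rows_per_load=10_000, loads=30)
incremental = rows_written_incremental(rows_per_load=10_000, loads=30)

print(f"full rebuild: {full:,} rows written")
print(f"incremental:  {incremental:,} rows written")
print(f"ratio:        {full / incremental:.1f}x")
```

Under these assumptions the full rebuild writes 34,650,000 rows against 300,000 for the incremental pattern, and that is for a single table.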

As we get better at making those global template changes ourselves and rolling them out across all our customer tenancies, we will find ways to build the belts and braces needed into the product to provide this as a self service capability in the future, and allow technical personas to define their own rule templates.  We will probably start by providing this capability to our partner magicians first so we can iterate the patterns.

“Platform agnostic. The product runs on multiple data platforms, adjusting SQL output as needed, and architects can migrate platforms without rewriting code.”

No. 

The cost of being able to run on any cloud platform is too high.  You either have to engineer it to the lowest common denominator, aka an architecture based on containerisation not serverless, or maintain multiple code bases that allow you to leverage the unique capabilities of each cloud provider.  We see ourselves as a SaaS company not a PaaS company, and just like SaaS products like Salesforce, we abstract the technical complexity away from our customers, so they shouldn’t care what cloud platform we utilise.

This does raise a question in my head, is DAaaS more PaaS centric than SaaS centric if it needs to be Platform agnostic?

“Connectors. The tool connects to multiple source systems and loads it into multiple targets.”

Yes and No.

We collect data from multiple Systems of Capture, but we only load it into AgileData (and we use BigQuery under the covers). So yes to the multiple source systems.

We could say that we do Reverse ETL / Data Activation to meet the multiple target pattern, but that’s not what we think the intent is and we don’t want to Buzzwash it. So no to multiple targets.

“Orchestration. The product executes transformations and tasks according to a workflow designed within a directed acyclic graph.”

Yes.

When you create a Change Rule (transformation code), that rule is included and visible in our lineage map, which is the equivalent of a dynamically generated DAG. When new data arrives in AgileData, all the rules that are dependent on that data are executed.  So while we operate a very different UI pattern to the typical pattern of creating independent DAGs / pipelines that bound a bunch of data transformations, we provide the equivalent capability through our Rules UI, rules engine and dynamic DAG/Orchestration patterns.
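One way to picture a dynamically generated DAG is the sketch below.  It is an illustration only, not our rules engine: the RULES dictionary, the rule names and the table names are hypothetical, and it leans on Python’s standard graphlib module for the ordering.  The DAG is never drawn by hand; it is derived from what each rule reads and writes.

```python
from graphlib import TopologicalSorter

# Hypothetical change rules: each reads some input tables and writes one output.
RULES = {
    "clean_orders": {"inputs": {"landing.orders"}, "output": "history.orders"},
    "clean_customers": {"inputs": {"landing.customers"}, "output": "history.customers"},
    "customer_orders": {
        "inputs": {"history.orders", "history.customers"},
        "output": "consume.customer_orders",
    },
}


def build_dag(rules):
    """Map each rule to the set of rules that must run before it."""
    producers = {spec["output"]: name for name, spec in rules.items()}
    return {
        name: {producers[table] for table in spec["inputs"] if table in producers}
        for name, spec in rules.items()
    }


def rules_to_run(rules, arrived_table):
    """Return the rules downstream of a table, in a valid execution order."""
    affected = set()
    changed = {arrived_table}
    # Keep sweeping until no further rule is pulled in by a changed table.
    grew = True
    while grew:
        grew = False
        for name, spec in rules.items():
            if name not in affected and spec["inputs"] & changed:
                affected.add(name)
                changed.add(spec["output"])
                grew = True
    order = TopologicalSorter(build_dag(rules)).static_order()
    return [name for name in order if name in affected]


print(rules_to_run(RULES, "landing.orders"))
```

When new data lands in landing.orders, only clean_orders and then customer_orders are selected; clean_customers is left alone because nothing it depends on has changed.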

“Data lineage. The tool tracks lineage of data elements from source to target, both at a table level and column level.”

Yes.

We track the lineage of data at a table, column and rule level.  The Data Map provides the UI to see it at a table and rule level. We have yet to design a UI we like to expose the lineage at a column level that meets our criteria of simplicity. 
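As a rough illustration of tracking lineage at all three levels, the sketch below stores one edge per column with the rule that produced it, then rolls those edges up to table level.  The Edge structure, the sample edges and the function names are assumptions made for this example, not the way AgileData stores lineage.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str  # "layer.table.column" the data came from
    target: str  # "layer.table.column" the data was written to
    rule: str    # the rule that moved or changed it


# Hypothetical lineage edges recorded as rules run.
EDGES = [
    Edge("landing.orders.amount", "history.orders.amount", "clean_orders"),
    Edge("history.orders.amount", "consume.sales.total_amount", "sum_sales"),
    Edge("landing.orders.order_id", "history.orders.order_id", "clean_orders"),
]


def upstream_edges(edges, column):
    """Walk the edges backwards from a column to everything that feeds it."""
    found = []
    seen = set()
    frontier = [column]
    while frontier:
        current = frontier.pop()
        for edge in edges:
            if edge.target == current and edge.source not in seen:
                seen.add(edge.source)
                found.append(edge)
                frontier.append(edge.source)
    return found


def table_of(column):
    """Strip the column name, leaving "layer.table"."""
    return column.rsplit(".", 1)[0]


def table_level(edges):
    """Roll column edges up to distinct (source table, target table, rule) triples."""
    return sorted({(table_of(e.source), table_of(e.target), e.rule) for e in edges})


for edge in upstream_edges(EDGES, "consume.sales.total_amount"):
    print(f"{edge.target} <- {edge.source} (rule: {edge.rule})")
print(table_level(EDGES))
```

Storing the finest grain (column plus rule) means the table and rule views are just roll-ups, which leaves the column view as a UI question rather than a data question.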

“Data catalog. Users can search and reuse existing data pipelines or data assets to speed development and promote reuse”

Yes.

Reuse is a core pattern within AgileData.  And in our opinion any data tool that doesn’t have a catalog capability to browse or search for the things it creates and holds should go back to the drawing board.  So of course we built a magical data catalog capability into our product.

“Documentation. The tool automatically generates documentation to speed onboarding of new developers and support auditors.”

Yes.

As everything an analyst does in AgileData is stored in what we call “Config”, any other analyst can see it, and if you grant them access so can the Auditors.  And given it’s a core part of the AgileData platform, the documentation is never orphaned from the system.

Scorecard

As I went through and responded with how AgileData meets each of the DAaaS criteria, I ended up with a lot more Yes and No combos than I expected.

The question seems to be not do we meet the criteria patterns, but do we implement it in the way it is described.  As always the devil is in the details.

If I look at the 10 core DAaaS patterns of:

  1. Configurable
  2. Multi-code
  3. Metadata-driven
  4. Universal updates
  5. Platform agnostic
  6. Connectors
  7. Orchestration
  8. Data lineage
  9. Data catalog
  10. Documentation

We agree 9 of these core patterns are required for the new wave of bundled data platforms that will start appearing in 2023, and that we support them, without any worry that we are #Buzzwashing.

In a webinar The Eckerson Group hosted they used this slide as a summary:

[Slide: Eckerson webcast, Data Architecture as a Service]

Again we believe we meet all the characteristics and deliver all the benefits outlined in this slide, apart from “Change target platforms easily”.  But again we have worked long enough in the data domain to see small fortunes spent “re-platforming” the technology and delivering little or no business benefit.  The focus should always be delivering business value.

So are we DAaaS?

Does that mean we are a DAaaS product, if we don’t match all the detailed technical patterns described by the Eckerson Group?  Well that’s really up to them to decide.

From an AgileData point of view, we are pretty happy with the patterns we used under the covers to power our SaaS Data Product in a simply magical way (but we are always iterating them regardless).

Photo by Webstacks on Unsplash


Keep making data simply magical

AgileData.io provides both a Software as a Service product and a recommended AgileData Way of Working.  We believe you need both to deliver data in a simply magical way.

There is a core set of patterns we believe are table stakes for any Data Platform, so of course we have built those patterns into the AgileData Product.

AgileData.io