TD:LR

Data Mesh 4.0.4 is only available for a very short time. Please ensure you scroll to the bottom of the article to understand the temporal nature of the Data Mesh 4.0.4 approach.

ADI - AgileData.io

This article was published on 1st April 2022 as an April Fools Joke.

ADI - The Joker Fish

What is Data Mesh 4.0.4?

The big book of Data Mesh has just been released, but at AgileData we are way ahead of the curve and we were already working on the next generation, modern, big data mesh stack version, which we have coined Data Mesh 4.0.4.

Where Data Mesh 1.0 describes four core principles of:

domain-driven ownership of data
data as a product
self-serve data platform
federated computational governance

Here at AgileData we believe these principles are already legacy and a new, modern set of principles is required. Being recognised as thought leaders in the data mesh space (at least within our own team Google Meet calls), we have revised these principles with a 4.0.4 version.

The new Data Mesh 4.0.4 principles are:

attribute-level ownership of data
AI neural net data delivery
5 star silver service data platform
sheriff based governance

Attribute-level ownership of data (AOOD)

Domains are big and complex

One of the problems with domain-driven design is that domains often contain a lot of data. We might break the big elephant down into smaller elephants using domain design, but they are still elephants nonetheless, and often still a pain in the butt.

Think of domains based on organisational silos: Sales, HR, Operations and so on. Those domains contain lots of data: customers, orders, employees and leave request data, to name but a few.

Think of domains based on core business processes that cross organisational silos, for example "customer orders product from an employee in a store". Again, lots and lots of data.

The poor old domain team will be very busy working with all that data, and that's not fair on the team.

Then apply a lens of master data, golden records or single answer to a single question. Do we have domain teams on domain teams, aka master data domain teams?

Those teams would have to talk to lots of other teams, and we all know how much time is wasted in meetings between people. It's so much better when teams work in complete isolation.

This domain driven design thing seems big, complex and involves lots of work.

At AgileData we have developed a much easier Data Mesh 4.0.4 approach called Attribute-level data design.

Attribute-level data design approach

In this approach, each data team looks after one attribute and one attribute only. For example, customer date of birth.

This team is responsible for collecting the customer date of birth from any source system or application where it is captured. The team then load this attribute data into their own data repository, ideally creating a set of source specific Operational Data Stores (ODS) and then a Master Data Store with a single conformed view of the attribute.

The good news with this approach is that, as lots of people share the birth date of 1st April 2022, you will save heaps on data storage, as it will only be stored once no matter how many customers have this date of birth.
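For the curious, here is a minimal sketch of what "stored once, no matter how many customers" might look like. It is only an illustration under our own assumptions: the `AttributeStore` class and the customer ids are made up for this example and are not part of any real AgileData product.

```python
from collections import defaultdict
from datetime import date


class AttributeStore:
    """Hypothetical single-attribute store: each distinct value is kept once."""

    def __init__(self):
        # Maps each distinct value to the set of customer ids that share it.
        self._owners = defaultdict(set)

    def add(self, customer_id, value):
        """Record that a customer has this value (the value itself is stored once)."""
        self._owners[value].add(customer_id)

    def distinct_values(self):
        """Number of distinct values actually stored."""
        return len(self._owners)

    def customers_with(self, value):
        """Sorted list of customer ids sharing the given value."""
        return sorted(self._owners.get(value, set()))


store = AttributeStore()
for customer_id in ("c1", "c2", "c3"):
    store.add(customer_id, date(2022, 4, 1))
store.add("c4", date(1990, 7, 15))

print(store.distinct_values())  # 2
```

Four customers, two stored dates. Of course you still have to store which customers share each date, so the savings may be a little less than "heaps".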

Let's look at some of the benefits of the Attribute-level data design approach.

Attributes are easy to identify

Customer date of birth is customer date of birth. There is very little argument on that definition, so it's a very simple thing to define, with no cross-domain boundary arguments on that one.

Hyper-specialisation

Dates of birth can be stored in multiple date formats, 2022-04-01 or 01/04/2022, but fear not. The customer date of birth attribute-level ownership team will become experts in the various date formats and will be hyper-specialist experts in this work in next to no time.
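As a taste of the hyper-specialist work involved, the sketch below normalises the two formats mentioned above into a single date. It assumes only those two formats exist and that 01/04/2022 is day-first; the function name `parse_date_of_birth` is our own invention for illustration.

```python
from datetime import date, datetime

# Assumed formats, taken from the examples in the text:
# ISO (2022-04-01) and day-first (01/04/2022).
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date_of_birth(raw):
    """Try each known format in turn and return a datetime.date."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue  # Not this format, try the next one.
    raise ValueError(f"unrecognised date of birth format: {raw!r}")


print(parse_date_of_birth("2022-04-01") == parse_date_of_birth("01/04/2022"))  # True
```

A real team would presumably need more formats than this, which is exactly why they get to be hyper-specialists.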

We will also see new hyper-specialised roles appear, such as customer date of birth data engineer, to join the other roles that have appeared such as Analytics Engineer.

Lots of free time = happy team

There aren't that many customer date of birth attributes in an organisation and that will leave the team with lots of time to do other things they enjoy. They can spend more time on refining their Wordle technique or finding funny cat videos. And we all know the importance of a happy and relaxed team.

Gosh, they could always spend more time working on their next personal start up idea, removing the need to do it after hours and in the weekends.

One potential downside

One potential downside of this approach is that we end up with lots and lots of teams, and that introduces a challenge in how you co-ordinate them all. We are working on this right now and in our next version, Data Mesh 20230401, we plan to release the new Hive attribute-level data design approach.

After all, bees seem to be able to work together so easily, so why can't millions of data peeps?

AI neural net data delivery (ANNDD)

Data as a product means you still have to go shopping

One of the problems of Data as a Product is the customer still has to go shopping to get the data they want.

They have to go into the data product catalog, they have to search and find the data product they need, and they have to look at the data product description to make sure it's the data they want. Then they have to do the heavy lifting to get the data product they want and transport the data to where they can use it. It's as painful as going to the grocery store to get food.

While we can see value in the data product being nicely packaged up for them, rather than a set of raw data ingredients, we think there is a better way.

We thought Data Mesh 4.0.4 could be the Instacart of data, where personal shoppers could put the data you need together and deliver it right to you. But you still needed to go to the effort to browse and select the data products. We thought it could be the UberEats of data, where you pick an outcome, like you select a ready made takeaway with UberEats, and somebody would deliver the data product to you, but we decided that was so 2021.

At AgileData we have developed a much better Data Mesh 4.0.4 approach called AI neural net data delivery.

AI knows what you want

Data consumers wear "a cap connected to an electroencephalography (EEG) machine" which reads their brain waves and determines which data product will deliver the data that best matches their current thoughts.

An industry-leading AI neural net machine learning engine reads the data consumer's brain waves to determine their deepest data desires. This technology has yet to be developed, but we believe a number of vendors are about to publish vendor-washing articles claiming their current legacy technology does this very thing.

Storage is cheap

This data is streamed directly into the data consumer's brain, removing the need to store the data on a digital device, such as a laptop, tablet or smart phone. It's a little-known fact that the brain has a massive amount of unused storage: "This storage capacity is an amount over 74 Terabytes (just in the cerebral cortex alone)".

A truly human mesh

In another industry first, data is transferred via a human mesh. Each data mesh cap communicates directly with all the other data mesh cap wearers, ensuring blazingly fast network connectivity.

One small downside with the current data mesh cap technology is that each data mesh cap wearer has to hold a tin foil lined umbrella above their head to focus the connectivity. We expect Data Mesh Caps version 20240401 to be released in the future to remove this need, as there have been complaints from some data consumers that they get a sore arm after a few hours of holding the umbrella up.

However, we have seen a massive take-up of the Data Mesh Cap in the United Kingdom, primarily because it rains so often in the UK that people there are used to constantly holding an umbrella up anyway.

5 star silver service data platform (5SSSDD)

Self Service is a buffet of yuck

Some people love self service, you only have to go to a buffet style restaurant to see how much.

However, there are a few problems with the self-service buffet style approach.

One problem is you are not in charge of what you are presented with, you have to choose from what you are given. Maybe you prefer Vegan, but the buffet is nothing but fried food.

Another problem is that sometimes you put together combinations on your plate that are just wrong, and then you feel like you have to eat them. Deep-fried Mars bar, anyone?

Photograph: Danny Lawson/PA

And so it is with self service data platforms: you are given what somebody else thinks is a good platform and forced to use it. Even worse, the platform could have been designed and built by a team of project managers and contractors who have never worked in the data space and will bugger off once it has been built. Or it could be a legacy technology platform that has been vendor washed as supporting Data Mesh 4.0.4, and somebody fell for the ruse and bought it.

Luckily, Data Mesh 4.0.4 solves these problems with a 5 star silver service data platform.

Just like your own personal data butler

Your personal data butler is on 24x7 call to meet your every data platform need, providing a 5 star silver service.

Need to wrangle, mash and blend your data to provide a single view of your customers? Your data butler will whip that up in a jiffy and deliver to you the most delicious data cocktail on a silver platter.

Need to understand your cost to acquire? Again, your data butler will calculate that for you and bring it to you in their best attire.

Need to slice and dice your data? Not anymore. Your butler has the world's best set of Ginsu knives and will deliver your data to you in the finest of slivers.

Have a problem with the quality of your data? Fear not, your data butler will take the offending data dish away and immediately deliver you a replacement with the freshest, cleanest data you have ever seen.

Why would you lift a finger serving yourself when you can just sit back and enjoy your 5 star silver service?

Free like beer or free like puppies?

While the 5 star silver service data platform is not a cheap service, it is probably still cheaper than the cost of the 20 data engineers you need to cobble together the 25 different open source data platform Jenga components in your bespoke Modern Data Self Service Data Platform Stack (MDSSDPS).

Sheriff based governance (SBG)

Meetings, meetings everywhere, but not a decision is made

We all know what data governance is like: endless committee meetings with lots of people talking waffle, and no real decisions being made.

Federated data governance is even worse, we still have lots of meetings where things are "decided" and then we have to try and encourage the domain teams and the platform teams to play nicely by the vague rules the committee made up.

It's like the United Nations talking about climate change, but for data.

Our last Data Mesh 4.0.4 principle of Sheriff based governance solves all these problems.

Introducing the data sheriff

"If the sheriff sounds like something from the American frontier, that’s because it is. The role of sheriff goes back to England where sheriffs were usually appointed by the Crown and other officials to oversee the laws of the shire, or county. Duties included tax collection and running a local militia, also called the posse comitatus—citizens who would moonlight as law enforcement." https://theappeal.org/the-power-of-sheriffs-an-explainer/

Attribute-level team enforcer

Each of the attribute-level data teams has its own data Sheriff. The Sheriff is there to ensure the team complies with all the governance rules that have been set, regardless of the quality of the rule. It's kind of a "shoot first, ask questions later" pattern.

If a team member does not comply with the rules, that team member is "removed".

Team vs team enforcement

If an attribute-level data team or a 5 star silver service platform team believes one of the other teams is not following the Sheriff code of conduct, they can call the other team out.

In that scenario the Sheriffs from both teams enter into a quick draw showdown using the latest day's Wordle puzzle.

The losing team gets removed and that team's attribute is also removed from the data platform. One of the key platform features you will need for this to happen is column-level lineage, which is great news as we have finally found a use for that often asked for feature.
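To show why column-level lineage matters here, below is a rough sketch of working out what else has to go when a losing team's attribute is removed. Everything in it is hypothetical: the column names, the `LINEAGE` mapping and the `columns_to_remove` function are made up for illustration, and a real platform would read lineage from its own metadata.

```python
from collections import deque

# Hypothetical lineage: each column maps to the columns derived directly from it.
LINEAGE = {
    "ods.customer.date_of_birth": ["master.customer.date_of_birth"],
    "master.customer.date_of_birth": [
        "mart.customer.age",
        "mart.customer.birthday_month",
    ],
    "mart.customer.age": ["report.customers_by_age_band"],
}


def columns_to_remove(losing_attribute, lineage):
    """Return the losing attribute plus every column downstream of it."""
    seen = {losing_attribute}
    queue = deque([losing_attribute])
    while queue:
        column = queue.popleft()
        # Walk to each directly derived column we have not visited yet.
        for child in lineage.get(column, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


for column in sorted(columns_to_remove("ods.customer.date_of_birth", LINEAGE)):
    print(column)
```

A breadth-first walk like this is one simple way to do it. The `seen` set also stops the walk from looping forever if two Sheriffs ever manage to create circular lineage.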

Visual Governance Rules

To help the teams and each team's Sheriff understand what rules need to be enforced, we are adopting a communication pattern that has been proven to work across many shared workplace kitchens.

Just like the old "your mother doesn't work here" signs, data governance signs will be posted in all team areas. For example, "no test, no deploy" or "thou shalt document thy code".

Unfortunately, we realise that with the advent of remote working as a result of COVID, your mother may actually work in the same place as you do, and that physical posters in your physical offices may not be visible when you are working remotely from home. But don't worry, we plan to resolve this problem in the Data Mesh 20230401 update.

It's the 1st of April, peeps!

AgileData.io

Keep making data simply magical

A delicious April fools dessert

A fool is an English dessert. Traditionally, fruit fool is made by folding pureed stewed fruit into sweet custard.