The 3 patterns of AgileData AI

22 Dec 2023 | AgileData Network, AgileData Product, Blog

TL;DR

Having AI embedded in your product has become table stakes, it seems.

I have been thinking about our approach to AI in our product for a while and landed on 3 patterns that I use as a reference.

  1. Ask AI
  2. Assisted AI
  3. Automated AI

Shane Gibson - AgileData.io

In one of our regular catch-up sessions with an AgileData Network Partner this week, they mentioned that the question of AI in data was constantly being raised in their conversations with prospective customers.

Having AI embedded in your product has become table stakes, it seems.

I have been thinking about our approach to AI in our product for a while and landed on 3 patterns that I use as a reference.

  1. Ask AI
  2. Assisted AI
  3. Automated AI

Ask AI

Ask AI is where a human asks a question and the machine replies. This is repeated until the human gets what they need, and then the human goes and undertakes the data work.

This is the now-typical “ChatGPT” pattern. The user starts a conversation, asks a question, gets a reply and asks the next question.

Let's use two examples, answering business questions and understanding data, to highlight this pattern.

For answering business questions, the typical example is the Text to SQL pattern, where the Information Consumer uses natural language to ask how many customers there were last month, and the machine creates some SQL, runs it and returns the answer. Then the Information Consumer asks their next question and gets the next response.

Then they move on to take the actions required to create an outcome and hopefully that outcome results in business value.
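To make the Text to SQL loop concrete, here is a minimal Python sketch. It is an illustration only, not how AgileData implements it: the `generate_sql` function is a hypothetical stand-in for an LLM call and returns a canned query, and the `customers` table with its `signup_month` column is an assumed schema.

```python
import sqlite3


def generate_sql(question: str) -> str:
    """Hypothetical stand-in for an LLM call that turns a question into SQL.

    A real version would send the question and the table schema to a model.
    A canned answer keeps this sketch runnable without any external service.
    """
    canned = {
        "how many customers were there last month?":
            "SELECT COUNT(*) FROM customers WHERE signup_month = '2023-11'",
    }
    return canned[question.lower()]


def ask(conn: sqlite3.Connection, question: str) -> int:
    """Generate SQL for the question, run it, and return the single answer."""
    sql = generate_sql(question)
    # Guard: only run read-only queries that came back from the model.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError(f"refusing to run non-SELECT statement: {sql}")
    return conn.execute(sql).fetchone()[0]


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with a tiny, made-up customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, signup_month TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "2023-10"), (2, "2023-11"), (3, "2023-11")],
    )
    return conn


if __name__ == "__main__":
    print(ask(demo_connection(), "How many customers were there last month?"))
```

The human still drives every step here: they ask, read the answer, and decide what to ask or do next.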

For understanding data, an example would be a Data Engineer collecting the data from a System of Capture. The Data Engineer passes the schema for all the tables to the LLM and asks it to indicate which system the data comes from, and the LLM responds with something like Shopify. The Data Engineer can then ask what the likely Ontology for this data is, and then what the likely Taxonomy is, all based on that schema.

They then move on to the manual data work of designing the Data Model and creating the code to transform the data and load it into the data model.

The key to this pattern is that the human asks a question, gets some information, asks another question, gets more information and finally does the data work themselves.

The value of the pattern is it reduces the cognition required to do the data work.

Assisted AI

Assisted AI is where the machine helps the human do the data task that is required, without the need for the human to ask.

Let's use the same two examples, answering business questions and understanding data.

For answering business questions, an example is an Information Consumer using a Viz tool to graph data. The tool looks at the shape of the data and suggests the best graph type that should be used to visualise data of that shape. The Information Consumer can accept that recommendation and use that graph type or override it and create whatever graph type they want.
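A recommendation like this does not need a large model. The sketch below is a hypothetical rule-based version in Python: the column kinds (`temporal`, `numeric`, `categorical`), the rules and the function names are all assumptions made for illustration, not the behaviour of any particular Viz tool.

```python
from typing import Dict, Optional


def suggest_chart(column_kinds: Dict[str, str]) -> str:
    """Suggest a chart type from the kinds of columns being plotted.

    The rules are illustrative assumptions, not a definitive mapping.
    """
    kinds = sorted(column_kinds.values())
    if kinds == ["numeric", "temporal"]:
        return "line"      # a measure over time
    if kinds == ["categorical", "numeric"]:
        return "bar"       # a measure per category
    if kinds == ["numeric", "numeric"]:
        return "scatter"   # two measures against each other
    return "table"         # fall back when no rule matches


def choose_chart(column_kinds: Dict[str, str],
                 override: Optional[str] = None) -> str:
    """The human can accept the suggestion or override it."""
    return override or suggest_chart(column_kinds)


if __name__ == "__main__":
    shape = {"order_date": "temporal", "revenue": "numeric"}
    print(choose_chart(shape))                  # accept the suggestion
    print(choose_chart(shape, override="bar"))  # override it
```

The `override` argument is the important part: the machine suggests, but the human keeps the final say.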

For understanding data, an example is a Data Engineer who needs to create transformation code to define new Concepts. The machine looks at the schema and data examples from the Systems of Capture and suggests that the Data Engineer creates Concepts for Customers, Orders, Products, Order Lines and Payments. It also suggests the unique business keys that can be used to define each of those Concepts. The Data Engineer can accept those recommendations and the transformation code is created for them to review and publish.
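One simple way to suggest candidate business keys is to look for columns whose sample values are all present and all unique. The Python sketch below assumes exactly that heuristic; the function name and the sample rows are made up for illustration, and uniqueness in a sample is only evidence, which is why the human reviews the suggestion.

```python
from typing import Any, Dict, List

# Made-up sample rows, as might be collected from an orders table.
SAMPLE_ORDERS: List[Dict[str, Any]] = [
    {"order_id": "O1", "customer_id": "C1", "status": "paid"},
    {"order_id": "O2", "customer_id": "C1", "status": "paid"},
    {"order_id": "O3", "customer_id": "C2", "status": "refunded"},
]


def suggest_business_keys(rows: List[Dict[str, Any]]) -> List[str]:
    """Return the columns whose sample values are non-null and unique."""
    if not rows:
        return []
    candidates = []
    for column in rows[0]:
        values = [row.get(column) for row in rows]
        if None not in values and len(set(values)) == len(values):
            candidates.append(column)
    return candidates


if __name__ == "__main__":
    print(suggest_business_keys(SAMPLE_ORDERS))
```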

The key to this pattern is the machine is making a recommendation which the human can accept or ignore during the process of doing the data work.

The value of the pattern is it reduces the time and the cognition required to do the data work.

Automated AI

Automated AI is where the machine does the work and the human is no longer involved.

For answering business questions, an example would be where the machine observes all the business questions that are asked and answered over time, determines the intent and outcome of those questions, and then generates and sends answers to new questions directly to the Information Consumer, or sends an action directly to the System of Action.

For example, it might determine that Information Consumers are constantly looking for the number of abandoned carts, so it will send a daily alert with the number of abandoned carts on the ecommerce store to the relevant Information Consumer. Or it might go one step further and push the abandoned carts to the System of Action, which will send each customer an email reminding them that they have left something in the cart.
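The daily alert itself can be very simple once the machine knows what to send. The Python sketch below assumes a cart counts as abandoned when it has not been checked out and has been idle for 24 hours; that definition, the field names and the message format are all assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List


def abandoned_carts(carts: List[Dict[str, Any]], now: datetime,
                    idle: timedelta = timedelta(hours=24)) -> List[Dict[str, Any]]:
    """Return carts that are not checked out and have been idle for `idle`."""
    return [
        cart for cart in carts
        if not cart["checked_out"] and now - cart["updated_at"] >= idle
    ]


def daily_alert(carts: List[Dict[str, Any]], now: datetime) -> str:
    """Build the alert text that would be sent to the Information Consumer."""
    count = len(abandoned_carts(carts, now))
    return f"Abandoned carts on {now:%Y-%m-%d}: {count}"


if __name__ == "__main__":
    now = datetime(2023, 12, 22, 9, 0)
    carts = [
        {"cart_id": 1, "checked_out": False, "updated_at": datetime(2023, 12, 20, 9, 0)},
        {"cart_id": 2, "checked_out": True, "updated_at": datetime(2023, 12, 20, 9, 0)},
        {"cart_id": 3, "checked_out": False, "updated_at": datetime(2023, 12, 22, 8, 0)},
    ]
    print(daily_alert(carts, now))
```

Pushing the list returned by `abandoned_carts` to a System of Action, rather than just counting it, would be the "one step further" version.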

For understanding data, an example would be the first time data is collected from a new System of Capture. The machine will determine the relevant Concepts that should be created, automatically create the Data Model and Transformation code to create them, publish that code and execute it so the data is ready for consumption.

The key to this pattern is that the machine is doing the data work and the human is not involved at any step in the process.

The value of the pattern is it removes the need for the human to spend time doing the data work and removes the need for the human to have the experience and the cognition required to do the data work.

An AgileData Platform and App view of these three patterns

I often joked in the old waves of Data Mining and then Data Science that 90% of all analytical models were “Group Bys”, 5% were linear regression and only a small percentage were advanced techniques like neural networks.

I still think the same is true in the AI wave.

We did some work a while ago in the AgileData Platform using KMEANS to determine load anomalies. We found that by using some Group By patterns we were able to detect the same load anomalies, but at a much cheaper compute cost. Obviously we need to keep experimenting and testing this, and at some stage we will find the use case where we need to move to more advanced patterns.
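As a rough idea of what a Group By style check can look like, the Python sketch below counts rows per load date and flags any load whose row count is far from the median. This is an assumed heuristic for illustration, not the actual check used in the AgileData Platform; the 50% tolerance and the sample data are made up.

```python
from collections import Counter
from statistics import median
from typing import Dict, Iterable

# Made-up sample: one entry per loaded row, labelled with its load date.
SAMPLE_LOAD_DATES = (
    ["2023-12-01"] * 100
    + ["2023-12-02"] * 102
    + ["2023-12-03"] * 98
    + ["2023-12-04"] * 10
)


def load_anomalies(load_dates: Iterable[str],
                   tolerance: float = 0.5) -> Dict[str, int]:
    """Flag load dates whose row count is far from the median row count.

    Counter does the work of `SELECT load_date, COUNT(*) ... GROUP BY load_date`.
    """
    counts = Counter(load_dates)
    typical = median(counts.values())
    return {
        load_date: rows
        for load_date, rows in counts.items()
        if abs(rows - typical) > tolerance * typical
    }


if __name__ == "__main__":
    print(load_anomalies(SAMPLE_LOAD_DATES))
```

A check like this is a single aggregation pass over the load history, which is where the cheaper compute cost comes from compared with training a clustering model.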

So here is where my thinking is at with the 3 patterns of AI in the AgileData Product and App.

AgileData Ask AI

We have experimented multiple times with this pattern over the five years we have been building out AgileData.

Initially we used some NLP patterns, then we leveraged Google Cloud's alpha Q&A service. Each time, the level of effort and cost to build this was higher than we wanted to spend, so it was deprioritised.

Like most companies who provide data tooling and technology, we are now of course experimenting with LLMs to see how we can embed automated text-to-SQL capabilities and, more importantly, to see how we can ensure it does not hallucinate and give the Information Consumers invalid answers to their Business Questions.

The emergence of LLMs and Generative AI tools and patterns means we can finally implement this capability in the AgileData App at a lower level of effort and cost. We built the first version leveraging OpenAI and ChatGPT, and as we are closely bound to Google Cloud we have implemented the latest experiments using Vertex AI and Bison.

We are also experimenting with how we can use Ask AI to help understand where to go in the AgileData App to get a data task done, and to reduce the cognition needed to use the App.

We will keep iterating on these to improve usability, expand the use cases and reduce the chances of the dreaded hallucinations (aka when the machine makes shit up and provides answers the user cannot trust).

AgileData Assisted AI

We have had ADI, our version of a chatbot, in the AgileData App since day one. Originally she was your typical chatbot “phone menu”, where she presented a bunch of menu options.

Then she became a feedback loop. When you are doing a task in the AgileData App, she explains what will happen next and asks you to confirm. If the thing you are about to do is semi-destructive, she will explain the impact in more detail and ask you to confirm again.

We will move her to provide more recommendations and assistance over time, so when you are doing a data task in the AgileData App, she will recommend things that will reduce the time or cognition you need to complete that task.

The balance will be making her useful and not useless and annoying like Microsoft Clippy was.

AgileData Automated AI

Again, we have been automating data tasks in the AgileData Platform and App since day one.

We do automated anomaly detection to notify you when the data collection looks out of whack.

We automatically check the Data Design is still valid whenever new data is loaded into it. For example, we check that the unique business keys for all Concepts are still unique. Originally you had to manually add a Trust Rule for this each time you defined a new business Concept. Now we do it automagically in the Platform, so you don’t need to make that effort (or remember to do it).
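A uniqueness check like this is a classic Group By. The Python sketch below shows the general shape using SQLite; the `customer` table, the `customer_id` key and the function names are assumptions for illustration and say nothing about how Trust Rules are actually implemented in the AgileData Platform.

```python
import sqlite3
from typing import List, Tuple


def duplicate_keys(conn: sqlite3.Connection, table: str,
                   key_column: str) -> List[Tuple[str, int]]:
    """Return each business key that appears more than once, with its count.

    Identifiers cannot be bound as query parameters, so this sketch assumes
    the table and column names come from a trusted source.
    """
    sql = (
        f'SELECT "{key_column}", COUNT(*) FROM "{table}" '
        f'GROUP BY "{key_column}" HAVING COUNT(*) > 1 '
        f'ORDER BY "{key_column}"'
    )
    return conn.execute(sql).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory Concept table with one duplicated key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (customer_id TEXT, name TEXT)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [("C1", "Ada"), ("C2", "Grace"), ("C2", "Grace H.")],
    )
    return conn


if __name__ == "__main__":
    # An empty result would mean the business key is still unique.
    print(duplicate_keys(demo_connection(), "customer", "customer_id"))
```

Running a query of this shape after every load, for every Concept, is the kind of work that is easy to automate and easy for a human to forget.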

You can also ask for the Concepts and Details to be automatically designed, generated and loaded from a System of Capture table, and the AgileData Platform will automatically do that Data Work with no input required from you. We will extend this out to enable it to do the same automated work across multiple tables from a single System of Capture and then across multiple systems.

Group By vs Deep Learning

Often the automation we have built is based on simple patterns, simple code and simple Group Bys, not large complex deep learning models.

But in my view that’s not the point. Who cares what the kitchen looks like, as long as I get the meal and experience I wanted?

And that is why we focus on the three patterns of AgileData AI:

  1. Ask AI
  2. Assisted AI
  3. Automated AI

Each of these is bound by the way a user interacts with the machine: asking questions, getting recommendations, or having the work done for them (aka no interaction needed).

How we build the magic in the backend is our problem.

Keep making data simply magical

AgileData is focussed on removing the complexity of managing data in a simply magical way.

