AI Data Agents with Joe Reis
Guests
Resources
Join Shane Gibson as he chats with Joe Reis about the potential adoption of GenAI and LLMs in the way data teams work.
Listen on your favourite Podcast Platform
| Apple Podcast | Spotify | Google Podcast | Amazon Audible | TuneIn | iHeartRadio | PlayerFM | Listen Notes | Podchaser | Deezer | Podcast Addict |
Podcast Transcript
Read along you will
Shane: Welcome to the Agile Data Podcast. I’m Shane Gibson.
Joe: And I’m Joe Reis.
Shane: Hey Joe, good to have you on the show again. Today I wanted to cover off this whole area of Data AI agents or GenAI / LLM.
And the reason I’ve got you on is because I, you’re highly opinionated and cool bullshit out rather than do the hype cycle stuff. You’re independent in that you don’t actually get paid by the vendors or us or anybody else like some of the analyst companies do. And more importantly, you run two podcasts, you’re constantly at conferences presenting and talking to people, you do a shit ton of reading.
If I had anybody that would be able to give me an overview or a broad view of what’s actually going on in the market, . When we rip in, what I’m going to use is a bit of a framing. I want to talk about Gen AI. And for me, I particularly want to talk about LLMs,
so my definition for Gen AI for this session is large language models and how they could be used for data and how they are being used and, what’s bullshit and what’s not. Let’s start with a broad question. What are you seeing, what are the themes in the market you’re seeing around this idea of Gen AI, LLMs, and the data domain?
Joe: If you didn’t know any better, you would think that every company and vendor and open source project is an AI. Company or project these days. I was just at Snowflake summit last week and I came up with the observation that it was interesting. There’s two, everyone’s basically slapping AI on to everything.
I think there’s kind of two camps. There’s the, I made a post about this too, where I feel like there’s companies out there that really believe this is the next game changing technology. And this is, every business is going to have to adapt or die. And then the other end of the spectrum, I see, it’s a pretty big group of companies that I saw, not just at Snowflake, but other places.
Yeah, this is, you have to do this because if you don’t, it’s there’s a lot of FOMO and it’s not good optics if you’re seeing down the sidelines of this. There’s the other camp that’s doing, generative AI because it just checks the box, you know what I mean?
But then he, you talk to some of these people, I’m not saying this represents all of them, but some of the viewpoints I heard over and over was this is it’s just a toy, but we have to do it, we have to play the game. So I think that those, that’s definitely the observations I saw at Snowflake Summit.
I expect Databricks to be very similar though. They tend to be a lot more kind of AI heavy, but for like actual AI use cases. So we can dive into what I, how to distinguish those, but that’s what I’m seeing, not just at Snowflake Summit last week, but I would say, in conversations and travels around the world.
It’s everybody’s, keen on quote AI for whatever reason. It was in Saudi Arabia back in March. Was it March? Yeah, whatever, March. I can’t remember. I travel a lot. Anyway it was what was that conference? Leap. So it’s outside of Riyadh. It’s like 215, 000 attendees. From all over the world.
It was a big conference big. And AI was the, the big topic there too, right? And the Saudi government throws 40 billion into an AI fund to fund a new initiative. And you see this, in places like Dubai as well, where it was back in the fall of last year, where. They have a ministry of AI and they have a minister of AI.
And and if any, you would ask around, the people who are maybe at the ministry and might be in charge of having to try and do this stuff. There’s, I think there’s a lot of confusion still in what’s hype and what’s. What are the use cases for this stuff? And so I think to answer another part of your question, where are we?
What am I seeing? On the consumer end of things, with the users, what I'm noticing is that last year the theme was definitely: let's imagine what we could do with this. Let's figure out what the world looks like if we were to implement this, much in the same way as the internet back when the web was becoming more popular. There were a lot of the same kinds of conversations. On one end, you had a lot of utopian visions. On the other, it was everyone trying to commercialize it as quickly as possible.
Nowadays the internet is so ubiquitous. We're talking while you're in New Zealand and I'm in the US, and that wasn't possible back in the nineties, at least not like this. But the conversations right now are very similar for users and consumers: if we're going to implement this, what are we going to do? So last year was definitely the year of imagination.
I think this year is the year of POCs, trying to test it out and seeing where it actually has a use case in companies. And then I think next year is going to be the year where the rubber hits the road. Either you start seeing companies get some traction with this stuff, or, I guess we'll see where it goes from there.
But I think you have an idea, you've been around the block a bit. When the hype cycle fizzles out, either there have been proven-out use cases or, I don't know, the hype cycle unwinds and people move on to other stuff.
Shane: I’m intrigued by that. We’ve seen hype cycles, I’m old enough to see them around the OLAP cubes, Tableau and then, the big data doopie wave and then, data mesh was a bit of a hype y cycle for a while and now data products and data contracts, which I actually think will become sticky because I think they add value.
And then, With the downturn, we all thought we’re going back to consolidation and, end to end products rather than best of breed features. But then the AI waves hit and now we see all the VC money going in. Stealth after stealth companies starting to come out in the data space.
And a lot of the time they seem to be just putting lipstick over, ChatGPT, LLM, and creating these agents with no moats, but I’ll come back to that. So I’m intrigued about whether this is just a hype cycle and data and analytics, or as you said, there’s been some changes in the world.
Things like the internet, because I remember the day when we had dial up. I remember the days before we had that. Hell, I’m old enough to remember when we had video players with remote controls that actually had a wire to the video player. We’ve seen the internet, we’ve seen, the iPhone or , smartphone,
we’ve seen a whole lot of step changes that have changed everything we do. And the real question for me is the LLM one of those, or is it a Hadoop?
Joe: I think there’s a lot of utility to it when getting to use cases specific to data, we’re recording this June 11th or something, but this is also the week of WWDC, which is Apple’s conference. And Apple just announced a slew of Apple intelligence integrations.
So your iPhone. Assuming you have an iPhone 15 Pro or above or an M series Mac will you’ll be able to take advantage of that stuff. So I’m curious to see what that looks like. In these cases that he said there, I was like, being able to generate images and, summarize texts and reminders and all this kind of stuff and calendar stuff.
So I don’t know. Maybe it’s a good productivity assistant. I use personally, I use chat GPT every day that in Gemini Pro, just because, as I said, on other podcasts, do I use it more as a conversational agent to poke holes in my ideas? I don’t use it to write. I think that, I don’t know, defeats the purpose.
Plus it comes up with weird. Results anyway, so you don’t know you want to copy and paste that kind of stuff. But so my workflows, I use it all the time for just ideating stuff. I think it’s awesome for that. It’s like having a really smart friend that sometimes lies to you.
Shane: I think that’s one of the keys is that the deterministic, which means that you can ask it the same question five times, and you will get variability of answers. And in the data space, what we really want is we ask the same question, Five times. And we get the same answer five times. How many customers do we have right now?
And if we haven’t refreshed the data, the answer should be 42, 42, 42, 42, not 42, 43, 926, 1, 233. So there’s this idea that actually the LLMs are completely the wrong technology for data, and that comes back to the use cases. One of the things we found is our partners get asked by their potential customers, what are we doing in the AI space?
And this is not because the customers or the organizations actually have a need for it. It's now, like you said, just table stakes; they want to know that we're doing something with it. And so I had to think about what I would do in the LLM space, and I had to come up with a set of patterns, because that's how I always think.
So I came up with three patterns that I understand. I talk about Ask AI, Assisted AI, and Automated AI. Ask AI is that chatbot behaviour: ask something, get an answer, ask something else, get another answer, and then go and do the work yourself.
It's helping you think and talk, being that friend, but you're making the decision, you're doing the action. Assisted AI for me is that co-pilot behaviour: it's watching what I'm doing and helping me do it better, but I'm still making the call. And then Automated AI, or autonomous, is where I'm not even involved.
The thing's just doing it for me in the background, and it may tell me, but there's no human interaction. So let's go through those patterns. The first one is Ask AI, and what I'm seeing pretty much every data vendor and every stealth startup doing there is text-to-SQL.
Natural language, let me ask a question, write the SQL, give me the data answer. What are you seeing in that space?
Joe: Oh, that’s, if you talk about table stakes, I feel like that’s something that everybody’s. Trying to do, or, at a minimum has included some sort of functionality into their products, as far as I can tell, everyone’s got a co pilot, for example. Everyone’s got, some sort of LLM search.
So when we talk about that, it’s it is table stakes though. It’s if the if the donut chart became like this hot thing that everyone was interested in donut charts, I’m just going to, throw that out there. But then, every tool now has to support 3d donut charts, right?
It’s not special anymore. It’s just. What flavor are you going to use? What color are you going to have the kind of just like flies onto your PowerPoint presentation like back in the nineties and 2000s what flavor are you going to use? So I think that’s where we are with the search.
Snowflake was interesting last week. Cause they unveiled I can’t remember the name of the product. Snowflake analysts, I think for lack of a better way of describing it, but they have their own LLM now. And so you can actually ask questions of your data in Snowflake. So all of a sudden I thought about all the vendors out there, like ThoughtSpot and others like that.
And I was like, okay, I think that might be a bit of a concern. Cause that’s literally what they do is search powered BI.
Shane: They bought Mode, right? So after years of saying dashboards are dead, they went and bought a dashboarding company. Maybe they saw that the moat for natural language search was disappearing.
Joe: Yeah, it’s an interesting one. And then, there’s a lot of other LLM powered vendors at that, stuff like Summit and I I didn’t want to ask. I didn’t want to be mean about it, but I could definitely tell that it was, it’s an announcement that you don’t want to hear when you’re, the vendor that you’re partnered with and you’re at their summit basically just knocks you off with their own version of your product.
Shane: But we saw that with OpenAI. We saw people starting to build all these agent capabilities on top of OpenAI, and then all of a sudden they came out with the ability to build your own agents in their tool, and it just killed the market.
Joe: That’s what happens when you build on top of a platform like that. It’s they have all the intelligence of what’s going on. What do you think is going to happen to you? That’s how it works when you’re too dependent on the large vendor. But it is what it is.
It’s like you say, OpenAI did that. And that’s been the case since since we’ve had large monolithic kingdoms that we’re attached to. Yeah, so I think that’s an interesting one. I so to tell AI what to do or ask AI what to do that’s an interesting one.
They are they tell AI what to do is an interesting sort of I think angle that is going to have some interesting impacts on software development and application development. I know Andrew Ng is working on like a lot of agentic workflows right now. He actually just announced open an open source version of something that he did over the weekend.
I know Andrew spends all of his weekend time just nerding out on stuff, so this might have been a project he did on Sunday. Basically, you give one agent some instructions to write the code, another agent will proof it, check it, test it, an agent will deploy it, and you get the whole feedback loop, right?
In this case I think it was language translation, with agents checking each other's work, but the whole point is you could extend that to use cases where you have various workflows that need to integrate, and the success rate on what he's been building out has been promising.
There are others like that too. I know Microsoft has their own dev co-pilot, I can't remember the name of it, but there was a white paper on it about a month or two ago, and that was another example of, okay, this is where it's all going.
These are the workflows. We can talk more about the impacts of that, but I think that's an area where we're going to see a lot more activity.
Shane: One of the other use cases you see come up a lot is the idea of using the LLM to populate the descriptions of all your data. We all know that one of the biggest pains in the butt about implementing a catalog tool was buying it, turning it on, and then integrating it with your source systems
so you could actually get the data in there. But after that, some humans had to go and actually write all the context about that data, because it never existed. I looked at that use case and I was like, okay, that makes sense. But then I went one step further and said, but hold on.
What we're going to do is get the machine to give us the descriptive metadata for all our data, and then we're going to ask the machine a text-to-SQL question, and it's going to use that descriptive metadata as the primary input to its answer. So now what we've got is pretty much the machine making shit up, then answering a question using the stuff it made up. And then you go, oh, that's fine, the vendors will surface it as a suggestion: here's the descriptive metadata, you have to approve it. But what we know as humans is that when people have a drossy job like that, they just hit the approve button nine times out of ten, because they don't see the impact of that decision.
I can see using LLMs for actions. I was listening to a thing about BabyAGI and the idea that you could give it a list of tasks, that goes into a vector database, and then whenever you ask a question it looks at the stuff you've pre-seeded and says, oh, if you ask me that question I need to go and do these things, and it actually does the work for you.
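A toy sketch of that pre-seeded task-list idea. A real BabyAGI-style setup would use an embedding model and a vector database; the keyword overlap below is a deliberately crude stand-in for that similarity lookup, just to show the shape of "match the incoming question to a stored task list".

```python
# Pre-seeded task lists, keyed by an example question they should handle.
TASK_LISTS = {
    "how many customers do we have": ["refresh customer table", "run count query", "format answer"],
    "why did revenue drop last month": ["pull revenue by month", "compare to forecast", "list top variances"],
}

def retrieve_task_list(question: str) -> list[str]:
    """Pick the stored task list whose key shares the most words with the question
    (a crude stand-in for embedding similarity against a vector database)."""
    q_words = set(question.lower().split())
    best_key = max(TASK_LISTS, key=lambda k: len(q_words & set(k.split())))
    return TASK_LISTS[best_key]

print(retrieve_task_list("How many customers do we have right now?"))
# ['refresh customer table', 'run count query', 'format answer']
```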
Shane: But that reinforcement model where the LLM is making shit up about our data and then using it to answer our questions, I still haven’t got my head around how dangerous that is.
Joe: Oh, it could be very dangerous if there's no enforcement of rules or any sense of correctness, right? I know Juan Sequeda has done quite a bit of research on the hallucination effect of LLMs on text-to-SQL, for example, and it's horrendous. It's horrible.
I think it was low double-digit accuracy, just on its own. And I've seen other benchmarks of accuracy like that as well, and it ain't great. If you just use regular ChatGPT or OpenAI, yeah, it's going to suck out of the box. So you definitely have to give it some context.
Yeah, it's still early days with this stuff, and I wouldn't use it in my own business. The question I always ask around the world is: would you stick an LLM on top of your corporate data set today? And the second part of that question is: would you bet your job that it's going to work awesomely?
Nobody raises their hand.
Shane: I suppose the third question is: would you let it do your job for you, so it answers on your behalf without you even seeing the answer? Would you trust it that much?
Joe: Yeah. And I don’t, I don’t get the impression people are at that point yet. But you’re starting to see some indications of things like, knowledge graphs and ontology mappings and stuff like that, taxonomies even, other other structures really semantically, or I think it was another one people are, pretty keen on that could offer some benefits, so yeah it’s still early days and all this stuff, but I, I’m definitely not going to say like we just need to throw LLMs away and avoid them. I think that they’re remarkably awesome technologies. But we’re just trying to figure it out. Like I was reading the the Anthropic paper where they were probing the inner workings of LLMs and trying to figure out how to nudge it according to the weights.
That it used that attaches to concepts, right? So in this case, like the Golden Gate Bridge or something, it was somehow fond of, and so they kept nudging it and so forth. So I think we’re just starting to figure out how these things work. The fact that like the, the experts who create these models don’t also know how they work,
Shane: That goes back to neural nets in the data mining days, right? You threw your data into a neural net and it gave you an answer that looked reasonable, but you had no idea why; it was not explainable. Back then we tested it. We didn't care that much.
Joe: I was just writing about that in my book, the early days of neural nets and stuff. I think it was the forties, actually, when it was first introduced as a concept, so it's been a while. So it's interesting. But the problem that's always existed with neural nets in particular is that they're a black box.
You don't know what's going on.
Shane: What we used to do a lot, when we were testing the different models and lift charts and all that stuff, was often we'd see the best result out of a neural net, but we couldn't explain it. So we'd actually take the customer through a decision tree, and we'd say, look, the decision tree is telling us roughly the basis of the model.
It’s not the model, but at least it’s explainable. So I think we will see LLMs starting to give us an explainability path where it’s not actually what it’s doing, but it’s giving us enough of an insight that we can nod and go, yeah, that’s what I’d do. Yeah, maybe we’ll see. So let’s go back to that.
So one of the questions I’ve been struggling with this text to SQL is, we trust analysts that don’t know what they’re doing half the time. Yeah, I’ve seen analysts doing averages of averages. I’ve seen people take data that has absolutely no quality and treat it as the truth. So we’ve always had this trust in these data people who may or may not be doing the right work may or may not be using data that has quality, but we believe their answers.
So how is Text to SQL any different, is it because we don’t trust the machine or because we know it hallucinates that actually we trust it less?
Joe: I think because there’s always been a degree of fact checking that you can do with a human analyst. You can say tell me your assumptions behind this, right? And I think there’s a certain level of domain expertise that’s implied with an analyst that you probably don’t get with Texas SQL. A junior analyst, for example, might perform, I think, technically at the same level as chat GPT. And there’s that but which one would you be able to ask questions about ask questions from, and get a reliable answer back? I guess also, how would you know that you’re getting the right answer? So that implies some level of domain expertise on your own end when you’re asking these questions.
Cause I guess you could flip the coin and do the Turing test and say, okay, is it an analyst or is it Shachibuti that gave me this this answer?
I don’t know. It’s a fun game to play, just thought of that, but I think, so that’s that’s where we are though. I think that, the belief is that these things these LLMs and so forth could work.
Replace analysts and software engineers. And maybe they could actually, I don’t, I’m not going to discount that. I think in the next five years, you might actually see that to a large degree, or at least supplementing the work in a much different way than we think so. But as of today, I don’t know. That’s a good question.
I guess the Turing test is one way to maybe help evaluate that question. But it’s an interesting question too, because with data and analytics, you pointed out earlier that data sets are sometimes horrendous and sometimes you don’t know the horrendous. As I always say, data has been a silent killer in that sense where you don’t know if it’s good or not sometimes, and nobody does, and so you just use it, and I guess then people just start saying it’s directionally correct maybe it’s not completely accurate, but it’s believable,
Shane: When we started adopting Agile in the data world, we started to pick up the idea of acceptance tests. So let's say we're going to calculate how many customers we have, and we want to take a test-driven development approach; what we actually do is write down the answer of how many customers we have, and then write the code until it gives us that answer.
But you go to a stakeholder and say, how many customers have we got? I need to write that as a test. And their answer typically is: but that's what I'm asking you to calculate. So you get this problem where you don't actually know what ground truth is. One thing I am seeing in the text-to-SQL space that I find interesting is that a lot of the vendors now are actually producing graphs as a reinforcement feedback mechanism for the answer.
So how many customers have we got? We’ve got a thousand, and here’s them by region, here’s them by product type, so that a subject matter expert can look at those graphs and go, they look wonky. One customer in northern and everybody else in southern, we know that’s not right.
Joe: It’s like a CAPTCHA test for for your data, that’s interesting.
Shane: Yeah, and I don’t know whether we’re doing that because they found it’s a great pattern. For actually getting the answer. Or , in our toolkit we’ve always had graphs and we love graphs, so why don’t we just give them back graphs as a best way of reinforcing it. I’m with you. I think there is lots of opportunity.
I think the models being non deterministic make them incredibly dangerous. We haven’t figured out how to solve the problem. And, maybe iPhone moment. Maybe we don’t for ask, AI, we’ll find out , but you do it for anything to do with text.
I’m like you, I use a chat GPT and Gemini every day because for textual work, it makes me faster, I can ask it questions. I can have a chat to it effectively as a pair. And I find I end up writing better content or writing a better answer to
a question I have.
We’ve got to say the adoption of it means it has value. Otherwise we would have given it up. Like we have so many technologies before it.
Joe: But yeah, it’s not like Bitcoin or blockchain where, the blockchain paper came out in what, 2008, I think it was sometime around then. But there’s still not that many tangible use cases for it that you can think of. I think you’re, there’s a lot of grasping at it and a lot of grifting, but that’s not the same as having it.
Actually, there’s been time to play out where there would have been something 16 years later, right? A lot of people made money on it, but that’s not the same as it having utility. A lot of people made money on blockchain and web three, cause it was a, cause it attracted a lot of grifters.
Shane: I know lots of large consulting companies that made money on Hadoop because they tried to make every small business look like Amazon or Google. Not a great thing; it wasn't fit for purpose for those organizations, but it was the hype cycle.
Joe: But the interesting thing with generative AI, the thing that I like about it: I've started a new business, a new publishing company I'm going to be announcing soon, and I think it's an exciting time to start a new business. I'm going to be using it for content, but also for all sorts of other use cases. I'm trying to figure out how I could create workflows where I won't need to hire people.
How much AI can I include in my company so it's just an AI-powered company, right? There's always been this speculation that somebody's going to come up with a billion-dollar company and it's going to be one person running it. I don't know, I'm willing to at least take a stab at that. I think it's pretty cool.
So it empowers you, because now, instead of trying to emulate the big tech companies with your Hadoop stack, I can possibly do the work of several people I don't need to hire anymore, because I have workflows and generative AI just saves me that much time and money.
So I think that's the goal, and that's why a lot of companies are hot on this. But it also ignores the utility you can get from traditional machine learning, which has a ton of utility as well. Having a background in machine learning is, I think, at least good for me.
Now I can think about, okay, now that I have a new business, how would this work?
Shane: I think that background of science, where you have a hypothesis and you're going to test it, and if it proves value you keep it and if it doesn't you throw it away, that constant, quick experimentation, is the true idea of agility. And then what you end up doing is paying for tokens instead of people, and they start becoming expensive, as we're all learning. Let's move on to Assisted AI, which is effectively co-pilots. We've got co-pilots for SQL development and for software development, and we're starting to look at them differently for particular data tasks. So here's the example I use, because this is what we've done in our product. From my point of view, we played with text-to-SQL. I looked at it and I went, nah, we're not going to push that anywhere near a production capability.
I just don’t trust it. I don’t think it has the value yet. I looked at the augmentation of descriptive metadata and I could see how it saves time. But again, I was really worrying about that reinforcement and people being lazy. So I’m like, I’m not touching that bloody thing again for now. And what we ended up with is data quality rules,
Shane: So the problem I wanted to solve was: like all data platforms, we've got the default data quality rules, not null, is it a phone number, but as you'll know, everybody's data is of variable quality and you end up having to create these custom tests. The process I wanted to optimize, because I can't code, was the workflow that I had.
The workflow was: I look at the data and go, yeah, there's a data quality problem; I need to write a test that runs all the time to tell me about it. That's what we call a custom trust rule. Nine times out of ten, it's some form of string-based thing I needed to test for rather than a number.
So then I'd go into ChatGPT and say, I've got data in this format, here's some example data, tell me how to test it. It'll come back with a regex statement, which it sometimes hallucinates, so then I'd go into a regex testing tool on the web, put dummy data against the regex, and make sure it executes. When that was all happy, I'd go back and actually write it as a test. And I looked at that and said: lots of steps, lots of handoffs, how do we collapse that?
So we ended up using the LLM, and we're on Google, so it's Gemini. It looks at the data, it tells me what the data looks like, it says I think this is a test you should write, here's the code for the test, here's what the test is doing, and here it is executing to make sure it runs and gives you the answer.
Then it's: tell me what you want to change. And that saved me a shit ton of time. So for me, Assisted AI is those use cases where a human's doing some work and the machine is making that work faster or giving them hints, but they're still finally going, yes, that's acceptable, so the human's still in the loop.
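A minimal sketch of that collapsed workflow, with the LLM call again left as a hypothetical call_llm function: profile a sample of the column, ask for a regex, then execute the suggested regex against the same sample before anything is shown, so the does-it-even-run check happens in one place instead of across ChatGPT and a regex-testing website. The human still approves the rule at the end.

```python
import re
from typing import Callable

def suggest_trust_rule(sample_values: list[str], call_llm: Callable[[str], str]) -> dict:
    """Ask the LLM for a validation regex for this column, then prove it executes
    against the sample so the human reviews a tested suggestion, not a guess."""
    prompt = (
        "Here are sample values from one column:\n"
        + "\n".join(sample_values[:20])
        + "\nReply with only a Python regex that valid values should fully match."
    )
    pattern = call_llm(prompt).strip()
    compiled = re.compile(pattern)          # raises immediately if the regex is malformed
    failures = [v for v in sample_values if not compiled.fullmatch(v)]
    return {
        "suggested_regex": pattern,
        "sample_size": len(sample_values),
        "sample_failures": failures,        # shown to the human before they approve the rule
    }
```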
So what have you seen? What other use cases around that idea of Assisted AI are you seeing in the market, or hearing about on the podcasts or at those conferences?
Joe: Yeah, it seems like every company now has a co-pilot integrated, right? Even Apple just announced they're doing a co-pilot for Xcode, just yesterday I think they dropped that. So everyone's going to have a co-pilot of some sort. And there are some vendors I've seen that have AI-enabled co-pilots for making data pipelines,
which I think is interesting, right? I want to create this view from these tables in Snowflake; this might have some JSON here
that you have to flatten out in Snowflake, this might have some other stuff joined together, and so forth. That's a data pipeline. I thought that was interesting. But the interesting thing with co-pilots is: if I'm going to use a prompt to generate code, and then I do a manual review or a pull request with somebody, do I have to include the prompt they used?
Shane: That’s the context.
Joe: It’s a, now you’re doing prompt reviews. With people. And so it’s okay, is this like the best prompt you could use? If you’re pushing code to production as a team still, which I think a lot of people still are like you’ll co pilot to generate the code.
But that doesn’t mean I’m going to let this thing run willy nilly. I still want to have somebody else review it, which is so common on engineering teams. And so I think that’s interesting, but the, but these workflows didn’t have the capability of the workflows and these tools didn’t have the capability for prompt reviews.
So I was like, so how would you know that what you would have written You know, is can be repeated because the last thing you want in data pipelines, for example, you talked about determinism, 42, 42, not a bunch of random numbers. You don’t want your data pipelines to be the equivalent of a random number generator, where it just makes up a different workflow every single time, like that would not
be ideal.
Shane: okay. So what you’re saying though is it’s config driven pattern. So you’re pushing the prompt to the production and it’s generating the SQL every time versus using the prompt to create the SQL and then pushing the SQL as the thing that’s committed and
Joe: Either/or, because some of these tools have both capabilities. I can use a prompt to create a pipeline, or I can get the code that creates a pipeline, either one. In either case, though, now it's: okay, what was the quality of the prompt that you used?
That was one of the things that came to my mind, and my recommendation to one vendor was: you have to have a way of at least tracking the prompts that are being used for this workflow, right? If I'm asking it to generate code for me, what exactly did I ask it?
Shane: Auditability and traceability should be core to what we do, so every time you write a prompt it should be recorded. But I would take that scenario, where you're pushing the prompt to production and the prompt is generating the SQL on demand every time it runs, and call that Automated AI, because there's no human involved in the generation of that code, and that non-deterministic behaviour is going to kill you, maybe.
You've got to solve it.
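A minimal sketch of that auditability point: record every prompt next to the artefact it produced. The JSONL file and field names here are just illustrative; the idea is simply that nothing generated gets committed or run without a record of exactly what was asked.

```python
import hashlib
import json
import datetime
from pathlib import Path

AUDIT_LOG = Path("prompt_audit_log.jsonl")

def record_prompt(prompt: str, generated_sql: str, model: str, author: str) -> str:
    """Append a traceable record of the prompt and what it generated.
    Returns a short hash that can be referenced from the pull request or pipeline config."""
    entry = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "author": author,
        "model": model,
        "prompt": prompt,
        "generated_sql": generated_sql,
        "sql_hash": hashlib.sha256(generated_sql.encode()).hexdigest()[:12],
    }
    with AUDIT_LOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry["sql_hash"]
```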
Joe: That’s also where I think where the agentic workflows are going to be coming in handy too, is they’ll just do the checking for you, right? But there’s a high level of trust you have to have in that whole system to work. But. I’m pretty convinced that’s where everything is going is, with software dev and everything else.
There’s a really interesting article that I think, I can’t remember who it was from some venture firm. I think it was actually venture. I can’t remember. Anyway, it was. It was really good cause they had a good description of basically, we have I think the consequence of this will be very fascinating.
They talked about basically how, software development is one of these things where it’s so expensive. There’s actually a ton of technical debt in society as a result of the amount of software that’s not written because it’s labor intensive and and so forth. But now with generative AI, it opens up, the, I guess the cost of developing software could theoretically drop to almost zero.
Shane: Oh look, the first person who generates your UI front end off Figma designs is going to kill it. If you want to be a billion-dollar company of one, write that puppy. Because of the amount of waste: the front-end design we do in Figma to make things perfect, and then having to pay somebody else to generate the actual UI
to execute that design, repeatedly. That is a form of waste in my head,
and it kills you. So yeah. The other thing about that generated code is there are a bunch of people out there who are highly opinionated on this, and just to call it out, thank you for last Friday's five-minute post about don't be a dick.
It's good that, when these things happen and they happen badly, people call it out. The reason I mention that is I did a post on LinkedIn around this area, and somebody came back, one of those opinionated people, and they didn't cross the line like the posts you talked about, but they got close.
Their argument was: the LLM could never write code as optimized as what I can write. Here's an example, it hasn't done the indexing and that kind of stuff. And I'm like, I don't care right now. I've got firepower in the back end and I can afford to pay for that firepower to run non-optimized code, as long as it's accurate.
So we’ve got to make sure we’re focusing on the thing the human can add. Now, again, there’s times you have to optimize that code. For a whole bunch of reasons, but that’s when, you use a different pattern. And so talking about that, the problem I have with co pilots and data right now is they’re so broad in that they’re just writing code.
Whereas if we think about it, they should be solving micro problems. So that example you said of create some views in Snowflake. We should be far more targeted. It should be create some dimensional views or create one big table views or create an activity schema view or, create a view for Power BI in direct query mode where I want the response to be under 5 seconds for a million rows and 200 columns.
Because that’s the things that can do is actually when we Give it that framework, those set of patterns and that constraints. It’s really good at staying within that boundary to a degree until it hallucinates. So what’s your thought about that? Going from these broadbush co pilots to really micro use cases.
Joe: I do agree with that a lot, actually, and I think it's a good thing to bring up. It forces you to write better prompts too, because it's a lot more specific. And I do agree there are definitely optimizations you can get that would otherwise be really hard, because you don't have the context of all the systems. Or maybe you don't need it, maybe you're like this guy
who's just so amazing at writing code that he doesn't need anything, he could write it in Vim off the top of his head, and that's super cool. But I'm curious to see where that goes. I think it's a good idea. It could open up a lot, because if you think about the workloads on a data team now, they're often asked to
solve things with a very blunt instrument because they're short on bandwidth and time. At best, most data teams start out with good intentions and start out solving great problems, but from what I've seen the backlog increases to the point where you can't dig out, and then you're just going for the easiest wins you can get to, or the loudest voices.
What I would be interested in is whether you could go through and actually automate the clearing out of a lot of tickets, because those tend to be very specific. You don't write broad tickets; or maybe people do, but I think it's really bad form.
Tickets and issues tend to be very granular and very specific. So to your point, I think that's an area where you could actually get a lot of real value.
Shane: That’s a really interesting use case. Especially in mixed modal. One of the things we’re helping a customer with was they are moving from a legacy data warehouse platform to a new modern data stack. And you know how that goes, it’s been around for a while, they’ve got 1200 reports, a whole lot of data and, undocumented to a degree, and they’re going to figure out what they can decommission, what’s got value, and they had a bunch of people working on it for six months to try and, bore the ocean and figure out what the most important stuff is.
what they determined was the reports, in the legacy tool was the old copy of the report. So report A, number of customers, report B, number of customers with a filter by region, report C, number of customers in the last 12 months. So a lot of duplication. And what we did was we ran the report definitions through stuff we were working on to show them what was similar and what was different.
But at the same time, we use some of our internal stuff and we actually ran images of our reports through to the LLM as images and got some feedback. It was amazing what it did. It blew me away. It whacked our costs up massively, it was, we learned very quickly that multimodal image stuff costs you more than text.
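A minimal sketch of that report-deduplication idea, using plain token overlap (Jaccard similarity) over report definitions rather than an LLM or embeddings; it's often enough to surface the report-B-is-report-A-with-a-filter cases before spending money on multimodal calls.

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Similarity between two report definitions based on shared tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def near_duplicates(report_defs: dict[str, str], threshold: float = 0.5) -> list[tuple[str, str, float]]:
    """Return pairs of reports whose definitions look like copies of each other."""
    pairs = []
    for (name_a, def_a), (name_b, def_b) in combinations(report_defs.items(), 2):
        score = jaccard(def_a, def_b)
        if score >= threshold:
            pairs.append((name_a, name_b, round(score, 2)))
    return sorted(pairs, key=lambda p: -p[2])

reports = {
    "Report A": "select count(distinct customer_id) from customers",
    "Report B": "select count(distinct customer_id) from customers where region = 'north'",
    "Report C": "select sum(revenue) from orders group by month",
}
print(near_duplicates(reports))   # [('Report A', 'Report B', 0.56)]
```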
But there’s an interesting idea there if a user will often upload a ticket with a screenshot of the problem. This report, that number, with a red circle. So again, there’s a really interesting assisted AI use case there, isn’t there? Where it just scans it and says, looking at that image, this is what looks like the problem, looking at the code base, this is where we think the problem is data person go in and see if we’ve got it right and fix it.
Joe: I think that’s awesome. Yeah. And I, I’m really excited about stuff like Chachapiti’s 4. 0, for example. I think that, at least demoed really well. I still need to try it. A lot of the features haven’t been released yet, so I was thinking about just how do I use it to help my kids with their homework?
And that kind of stuff. They had this awesome demo of Sal Khan where he was with this kid and it was basically walking through how to solve a, I can’t remember if it was a trigonometry problem or something like that, but. That was pretty cool, right? Just looks at what you’re doing and says, Oh, here’s some suggestions on how you could fix it.
And so that’s, I think to your point with multimodal, that’s also the stuff I’m excited about is like what opportunities to actually open up in the business, because right now we’ve only been talking about text, but there’s a lot of other stuff, right? Explaining, things, audio or video maybe looking at your reports and saying, here’s what I would do just given the context of your business that I’ve learned about.
I think there’s a lot of really cool stuff that I could, I think it will be happening. I don’t think it’s a question of if
Shane: We can steal from our software brethren. There are a lot of tools that record user screens in a software application: as you're building it, you plug some stuff in so you can see what people are using in your product and where they're struggling. A lot of those are now using, and I'll say AI because I'm not sure what they're using under the covers, something that will actually go through and say, on this screen, this percentage of the users seem to be struggling in that area.
How often do we actually do usability testing on our reports? Never. It's funny, our reports are effectively small apps, and we do none of the software engineering work on them. Very rarely are our reports code; we're always dragging and dropping or something, and we never version them.
We do a little bit now. We don't do any of the testing, or the recording and logging of who's used what, and we certainly don't use some of that software engineering stuff. So it would be interesting bringing in those other techniques that we haven't used in data before.
Joe: Oh yeah. I think that’s what I’m saying. I think it opens up a whole new world. Cause for the longest time, especially just reading the the original business intelligence paper, it came out in 1958.
First time the term was used. It’s not a new concept. That’s a retiree on pension right now.
Shane: Semantic layers, right? We’ve seen those before a few times, they had value and
we keep losing them.
Joe: yeah. And so it’s just all this stuff. And I’m okay, so we’ve had this workflow that’s existed for a long time this paradigm of looking at the world and how to view it. And it’s been successful for the most part, but what else is there out there that we haven’t?
Thought about what, how is there another way that we could be looking at our business? Apart from the traditional ways that you either successful with, or you mostly struggle with to be more to the point with a lot of companies, like data warehousing and BI, for example. Why is it in this day and age that we’re still struggling with this problem?
It’s been around for ages. Like literally a long time. Bill came up with that a while ago and moved on from that actually a long time ago too. You’re like, this is pretty boring. I’m going to go focus on text, not structure data. But why are we still, why are we still struggling with these problems that have been around for decades?
And yet we’re also at this inflection point where we’re introducing like these extremely complicated, almost alien workflows into our businesses. That’s the dichotomy I’m still trying to wrap my head around really is you have all the problems of the past namely in analytics and so forth.
And, I’ve been through a lot of efforts of, trying to shoehorn machine learning into companies. Most part of it doesn’t work. They’re not ready for it for a lot of reasons. But now we’re at this point where it’s okay, but then you have a generative AI, which I said, it’s alien. I literally mean that it feels like you’re talking like a space alien sort of came to the earth.
And now you’re trying to learn how to talk to it and it knows everything. And sometimes it lies to you and so forth, but it’s an interesting one. But now you’re trying to shoehorn that into businesses too. But the dichotomy is obviously that a lot of the classic data problems, we haven’t even figured those out.
So can this help move that along? I think that’s maybe the only option you have. The other roaches haven’t worked.
Shane: It’s an interesting question. Again, I’ve been doing this for 30 years and I’m with you for 30 years. We’ve used different technologies to try and solve the same problem. And we haven’t, we still have disconnected data teams. Stakeholder asks for a number. It still takes us, you Days, weeks, months, depending on if we have access to it to give them the number back.
We can’t prove the numbers right nine times out of 10. Will this change it? I don’t know. I go back to the iPhone moment. Yeah, cause there were smartphones before that didn’t change the way we behave. So maybe it is, the technology will generate this new thing that we’ve never thought about. I just don’t know what it is, right? So I’m still focused on.
OK, I’m not going to do that big thinking. I’m just going to focus on small problems where it can take time. Things that you state me three hours or when I had to go and learn about it and let me do them in a minute or 10 minutes or teach me something I didn’t know. Silence.
Joe: Those are scale problems. If you and me tried to look at a billion pictures and label the cats in the pictures, we would die before that happened.
Jotting it down by hand: cat, no cat, hot dog, no hot dog. We would literally run out of time. A machine learning algorithm has been able to do that for ages, and it can do it at scale: a million, a billion images, not a problem, it'll do it.
Problems like that are where it's good. So with companies, for example, you talk about the team that's migrating the data. That's a classic example of something where machine learning could be a good fit: it's at a scale where it's going to take humans way too much time to do, and it's not really value-add at the end of the day. It isn't. And the other part of it is you now have technology that, with AI, is really good at either synthesizing, creating new data sets based on input, or summarizing. Those are the two massive skill sets it has. This is perfect, right? So I wrote an article back in November about this, where I felt like
AI is probably how a lot of the data management problems that have plagued the business world for ages get solved, because it's going to operate at such a scale. It would have to work, I would say, because if this doesn't work, I'm not sure what will.
If you're not throwing machine learning or AI at the problem, what are you going to do, throw more people at it? Or throw data governance rules at a company and tell them to go do that? Is that going to solve the problem? Are we going to put another post on LinkedIn complaining that the business doesn't get data and we just need to fight harder to get the business to accept us?
I think those are all stupid things to be talking about at this point. We have technology that can be tweaked and honed to start solving these data management problems at scale.
Shane: Oh, and I think, while we’re talking about assisted AI, before we get on to automated, yeah, there is some dystopian views out there and they are highly possible. You obviously know about Amazon Turk, you can go and put all those images there and people in third world countries that get paid, bugger all, go and do all this manual work to go and do what a machine can do now.
And, effectively. That market’s going to disappear whether you agree it’s a good thing or a bad thing. It’s gone that income for those people is going to get taken off the table. That could happen to the data people. If I’m a stakeholder and I go hold on, how’s this work?
I ask a question, even with these tools, it still takes a couple of weeks before my JIRA ticket gets picked up and then somebody looks at it and then, yeah, there’s an analyst involved, but actually I don’t trust their numbers any more than I trust the machine. Frig it, I’m just going to go and you do it myself in 10 seconds.
And the number will be good enough because I’m making a decision and a quick decision on reasonable data is better than a, a long decision on 100 percent data, nine times out of 10, right? There’s certain use cases where that’s not true. So that speed to market, I think is potentially the thing that will kill data teams,
Joe: yeah. It could, it’ll either replace them or hopefully evolve them becoming just do more of what they should be good at, which is being domain experts in the business that happened to be also good at, with data, I think that’s like that’s the role that I have when I got my first job was actually doing that.
I had good data skills, but I was working in the business. I wasn’t, not that we called ourselves data. I wouldn’t use that word that often. Cause it wasn’t really relevant to the people I was working with. They’re like, yeah, that’s cool. You do data. Nobody cares. But what I would like to do is increase my sales.
What I would like to do is effectively allocate my marketing budget of millions of dollars a week to, the most impact. Can you help me with that? I don’t really care how you do it. So I think that’s the role that people are going to be more evolving into in data. But I think you’re absolutely right.
There’s, there might be infrastructure engineers behind the scenes at companies that require it to maintain the infrastructure. But a lot of that’s, a lot of that’s already been sassified away a decade or more ago anyway snowflake, you don’t manage the infrastructure with that you, they do.
And yeah, it’ll be an interesting one, the notion of a data team, though, I think I’ve always had trouble with it. I feel in one hand it’s they’re necessary. On the other hand, it’s depending on the type of company you’re at, and it’s a broad generalization for me to say this, but what, what do most of them do?
Shane: They deal with the fact that when we're building our own software products, we don't care about the data, so we leave half the job undone and somebody else has to pick it up. And when we're buying 12 different SaaS platforms, we don't worry about how we're going to integrate the data between them to answer questions on the combined data.
Data teams end up picking up the technical debt for those two decisions, which is: we're going to put stuff in quickly and ignore the next bit that we need to do. So let's get to the last one, Automated AI, and this is the one I'm intrigued with, because originally the definition I'd use is: the machine does it and we're not involved.
The example I used to use was anomaly detection, and I use this one particularly because data comes into the platform, the machine looks at it, and it determines that the data coming in is out of a certain range, so there's an anomaly. When we first started building that, we were clever and we used k-means and some clustering, and that was all good.
That cost us a bit of money to run, and we were hyper-focused on it. In the end we moved back to a bunch of group-bys, because actually it gave us pretty much the same answer and it was a shit ton cheaper to run. Now we could push that through an LLM and maybe it'll give us a better fit?
I don't know. But the thing is, it's still not automated, it's still assisted, right? It's telling me that something's wrong and then I've got to go and fix it. Automated AI would be: it saw an anomaly, it actually went out and fixed the data, reloaded it without me ever doing anything, and told me it fixed the problem with that load.
And the whole reverse ETL wave and all that hype bullshit actually will come true with LLMs, because what we want from Automated AI is for it to actually take the action for us. We don't want it to tell us the customer's churning so that we can go and figure out a save campaign, so we can then type something into an automated email system to send them the offer.
We want it to go and say: that person looks like they're going to leave, we know the next task is to send them a save campaign, email sent, we're tracking their response. Which, if you think about it, is what we tried to do in data mining and machine learning five, ten years ago: automating that workflow,
giving people that instant recommendation on an e-commerce site. So I think we're going to come back to that, and I think that's where Automated AI will take us.
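A minimal sketch of that automated action loop; score_churn, send_save_campaign, and log_outcome are hypothetical stand-ins for a churn model, an email platform, and a tracking store, because the point is the shape of the loop (score, act, record) rather than any specific product.

```python
from typing import Callable, Iterable

def run_save_campaign(
    customers: Iterable[dict],
    score_churn: Callable[[dict], float],              # hypothetical churn model
    send_save_campaign: Callable[[dict], str],         # hypothetical email/offer API, returns a message id
    log_outcome: Callable[[dict, float, str], None],   # hypothetical tracking store
    threshold: float = 0.8,
) -> int:
    """Score every customer, act automatically on the high-risk ones, and record
    what was done so a human can audit it afterwards rather than approve it beforehand."""
    actions = 0
    for customer in customers:
        risk = score_churn(customer)
        if risk >= threshold:
            message_id = send_save_campaign(customer)
            log_outcome(customer, risk, message_id)
            actions += 1
    return actions
```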
Joe: I love this version of the future, by the way. This is a world I want to live in, where things are just done on my behalf and then maybe you tell me what you did, if I need to know about it. My first experience with this, where it really hit home, was when I was working at an IoT company.
Companies kept asking us for real-time reports on stuff, and I was like, tell me what you're going to do with this report that I give you. What's the action you're going to take? So they'd describe the action, and I'm like, how about I just take the action for you when that happens,
and I just give you a report on the things I did? That would be awesome, and it's going to save a ton of time. That's when I had the realization, this was maybe 2015 or something like that: okay, this is how the world needs to work, right?
In other words, if you can automate something, heuristically or with AI, it doesn't really matter, just automate the action that you're going to take. Especially as we move more towards what I call the live data stack in my book, where it's a convergence of applications, data, ML, all in real time with a real-time feedback loop.
And this is happening as we speak; I think this is the next shoe to drop. As you get more towards that world, what you just described is necessary, right? Automated ML or automated heuristic workflows, that's how it's going to have to work, because things are happening at such a speed that humans can't respond, right?
But that's always how it's been. If you look at the nature of things like charts, for example, and analytics: analytical data is the only data that's actually meant for human consumption, i.e. it's something that you look at, try to understand, and make a decision off. But most operational analytics, for example, isn't like that.
What are you going to do if you see a chart that updates every nanosecond? How would you possibly be able to respond to anything that happens in that timeframe? A machine does that. And that's always how it's been, even since the days of mainframes: you did that because you were trying to do things at a scale humans couldn't, faster than humans could do them. That's been around since the fifties and sixties, so the desire to automate stuff has always been there. Now you just have machine learning, so we can do things that can't be done heuristically.
So I love that world, Shane. That's the world I would love to live in.
Shane: I mean, the operational analytics one I used to love was contact centers, when they were humans and you had the big TV on the wall with the call times, the response times, the answer times, the people in the queue. That was all great for knowing you're in the shit, but what do you do about it?
Joe: Exactly.
Shane: You’re going to, you’re going to go and get 10 more people just to turn up off shifts into the call center now to take that spike because, you are an important customer, but we have an unusual amount of of volume right now. You’re probably going to do some predictive stuff to guess what that spike is, but you may get it right or wrong.
What are you going to do? You’re going to cut people off and pretend you didn’t answer it. You’re going to tell the agents to answer the question quickly. Yeah. What action are you actually going to take when that graph that your queue, your queue volumes have gone through the roof and that potentially again is, with things like LLMs, can you ask the customer to tell you what their problem is and then push them to another channel,
that multi channel has always been the Nirvana for these days.
Joe: Or replace the customer service agents with an LLM or whatever and just be done with it, which is what a lot of companies are trying to do. It's either triaging or outright replacing the workflow, but yeah, that's where it's all going.
I have no disagreements with that. But that's the interesting thing: if this stuff does take off and companies are able to integrate it at scale, what happens to all the employees? I guess that's the central economic question around the world, but we're going to find out real soon.
So I think it's exciting in some ways to be able to watch all of this unfold in quote-unquote real time, no pun intended, because it's not too often you get to live through an inflection point like this. And it's going to happen fast; if it happens, it's going to happen real fast.
But it's going to happen slow too, in the sense that for the companies that have adopted it and where it's a good fit, things are going to happen quick. For the companies where the data is an absolute horror show, unless we find a way to use an AI agent to automate the fixing of that data,
yeah, I don't know what happens to those companies.
Shane: My suggestion is there will be an iPhone moment with this. I think we've seen with the LLMs, from the amount of usage that we and everybody else get out of them, that there is value there; we just don't know what the final value is, what that unique thing is that will come out. And with that level of uncertainty, if you're an organization trying to make an investment, you're in a hell of a world right now, because everything's experimental.
You've got to actually look to use it, otherwise you will be left behind. Although, I remember when I was at Oracle, the line was "you're an e-commerce business or you're out of business", and there's still a shit ton of businesses that aren't e-commerce businesses and have been around all that time.
But I think this will give you competitive advantage at some stage. So the question is, how do you make small bets? I go back to: if there's something that takes a team an hour, and the AI data agent makes it ten minutes and doesn't increase your risk, why wouldn't you invest in that? That just makes sense.
But just realize that if it's a VC-funded stealth company, potentially it won't be around in six months' time for that use case, and there'll be a bunch of those; like you said, the platforms will take over those use cases, because why wouldn't they? And then, if you're a data person, almost go back to being a BA, go back to that.
You talked about that breadth of understanding the business and understanding data; now introduce the idea of understanding language. Figure out how to use those prompts, natural language, to help you bridge those two domains. Then it's just another tool in your toolkit, another thing that adds value; use it if it does, and don't if it doesn't.
Joe: That’s just it too. It is just another tool in the toolkit. It’s not a it’s not a godsend. And in some cases it’s better to. If else statements will do just fine in your code, right? Classical machine learning will do just fine. In other cases, generative AI will do right. So yeah, it’s just, it’s, it is another tool.
I’m glad you say that. Cause I think a lot of people conflate generative AI with all of machine learning and AI, and these are not the same things. Like literally not the same things. But. I think there’s, as I posted today too, there’s, for companies out there, I would say just, especially around the consumer end of buying these tools, just beware.
There’s a lot of charlatans out there, a lot of snake oil. Salesmen who are trying to pitch you on a lot of things that aren’t true, right? This is a very complex topic as a complicated field. And there’s a lot of people out there who want to grift and take your money.
So just make sure you understand who you’re talking to, right? Not everyone’s an expert in this stuff. I would say not a lot of people are, if you imagine, okay, this stuff’s been around, chat GPT exploded on the scene about a year and a half ago. Okay. Then suddenly within a year, everyone’s an expert in this field.
Are you kidding me? Most people I know that they claim to be experts. They read like Wikipedia on something or barely even did that. And they read the first paragraph and now they like can barely spell rag. Or AI. So just understand who you’re talking to. Go with people who have, experience in this field who have done the work.
I think before and after the, chat GPT, cause there’s a lot of value to having been around the data space before all the generative AI hype. Cause you just know how gnarly the problems are in a lot of companies. And they’re, it’s a junk show at most companies to put it very politely.
So
Shane: I’ll reinforce that in terms of, when you get those demos and the demos look beautiful with their curated data and their curated prompts, then the theory is this is Nirvana. You can just throw your data at it into the magic sorting hat and it’ll give you an answer. Make them do it because I guarantee you ask me to do that.
I’m gonna go, I’ll do it, but I don’t trust anything I get back. So let’s be upfront about that because I know it’s non-deterministic. I know it just doesn’t work yet.
Joe: And that should be the answer. When you hear an answer like the one Shane just gave, it's an honest answer, right? It's not a bullshit answer. And that's the kind of person you want to go work with. You don't want to go work with the person who's like, yeah, this would totally work out of the box,
don't worry about it, even though your data is an absolute mess. It's
Shane: Magic happens here. If it does, that's good, right? Because then I can be that billion-dollar company and retire, which won't happen, or I can go find another career. I need to know now. So if somebody's got the magic sorting hat and magic does happen here, just let me know so I can stop.
That'd be good. Excellent, good chat as always. I think we should probably get together in a year's time and do the post mortem on Gen AI, LLMs, and AI data agents and see where they're at.
Joe: I think that’d be great. Cause it’ll be around the time when next snowflake summit and next Databricks is happening. So I’m, as I put on my post today, I’m very curious to see like where things are the next these next two summits next year, the same next year.
Find a recap. I think that’d be awesome.
Shane: Key prediction from you then: next year, will Databricks Summit and Snowflake Summit be on the same dates or different dates?
Joe: It worked out pretty well this year, so hopefully they're just in different weeks, for the vendors' sake. I think that's super convenient. I did it last year when they were in the same week and that sucked: Vegas and then San Francisco. But no, I think it worked really well this year, and Lord knows San Francisco needs all the help it can get in terms of tourism right now.
Shane: Let’s do it in the air. Let’s take the transcript from this one and in a year’s time put it through whatever LLM we’ve got at the time saying if we were going to run a podcast that tells us a year later what actually happened, what would you predict? And then we’ll have a chat and we’ll compare it and see whether what it thought we would talk about is what we actually talked about.
That’d be intriguing.
Joe: That’d be awesome.
Shane: Excellent. Alright, thanks for your time as always, and I hope everybody has a simply magical day.
Joe: All right. Take care.