Why assumptions are just that
When we first sketched out our plans for AgileData, we were pretty clear about what AgileData would do and what it wouldn’t. Those assumptions didn’t last long.
Combine with magic
We knew we wanted to focus on what we call the middle of the data supply chain. We want to make it easy and magical to combine and enhance data, no matter where you get it from or how complex or ugly it is. We wanted to empower people who don’t code, who aren’t data engineers or data scientists, to be able to do this. That was a problem we believed had yet to be solved with simplicity.
Collect and Present
We also knew that data would need to be collected so we could do our magic and the results of our magic would need to be made available via pretty visualisations, reports and as inputs into hardcore analytical models.
We decided we would leave it to our customers to collect the data and to present the data the way they wanted. We of course had some logic behind these decisions. Did I say logic? Well, let’s say assumptions based on our experience.
You decide things based on what you know
Our experience is based on working with large enterprise organisations.
In that enterprise world the Information Technology (IT) part of the organisation typically makes it very difficult to connect to the systems where data is entered (we call these systems of record, or data factories). And enterprise organisations have a myriad of siloed data factories that we need to connect to and collect data from.
Let the experts do what they are good at
So we reasoned we would work on the basis that those IT teams would collect the data from the data factories and land it into a trusted area, let’s call it a “data lake” to be “modern”. Typically the IT teams have already done this work before and have some enterprise-grade software they have spent a fortune buying and maintaining to achieve this task. The benefit to them is they maintain “control” of this data, have a “visible” security gate to control access to their systems, and see us as less of a “threat” to their ecosystems.
The benefit to us was we wouldn’t have to deal with their myriad of complexities; we could pick the three most popular cloud data lake platforms, Google Cloud Storage, AWS S3 and Azure Blob Storage, as a safe landing zone. The customers’ IT teams could dump the organisation’s data into the data lake, and we would connect to it, collect it from there and copy it into AgileData to do our magic.
No plan survives the first battle
Of course our first couple of customers weren’t large enterprise organisations. They didn’t have a dedicated IT team. They didn’t have organisational silos that “protected” their data silos.
What they did have was one core system of record and a series of manual spreadsheets that held all the organisation’s data.
Reuse previous patterns, build new ones when required
We already had a set of capabilities to collect data that had been entered into spreadsheets; it was what we used to test the initial build of AgileData. So we reused these for our new customers.
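At its core, a spreadsheet collector boils down to parsing the rows of an exported file into records that can be landed. A minimal sketch of the idea (the function name and record shape are illustrative assumptions, not AgileData’s actual implementation):

```python
import csv
import io

def collect_spreadsheet(csv_text: str) -> list[dict]:
    """Parse the text of a spreadsheet exported as CSV into a list of
    dict records, one per row, keyed by the header row.

    Illustrative sketch only: a real collector would also handle file
    encodings, type coercion, empty rows and schema drift.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]

# Example: two rows entered by hand into a spreadsheet, exported as CSV
rows = collect_spreadsheet("id,name\n1,Widget\n2,Gadget\n")
print(rows)  # [{'id': '1', 'name': 'Widget'}, {'id': '2', 'name': 'Gadget'}]
```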
We then built out a set of patterns to collect data from Shopify and an on-premise SQL Server database, which was what we needed for our first customers. We have built these in a way that we can extend out the pattern to add more features and leverage it for other SaaS source systems and on-premise databases as we need to.
We then had to extend out our data collectors to allow our next set of customers to “push” data programmatically to us. This has ended up being the proverbial Swiss Army knife and our most used data collection pattern.
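The essence of a push pattern is that the customer sends us their records and we land them with some audit metadata, rather than us reaching into their systems. A minimal sketch, assuming a JSON-array payload and an `_loaded_at` audit column (both illustrative, not AgileData’s actual API):

```python
import json
from datetime import datetime, timezone

def ingest_push(payload: str, landing: list) -> dict:
    """Hypothetical handler for a push-style data collector.

    The customer posts a JSON array of records; each record is stamped
    with a load timestamp and appended to a landing zone (here just a
    list standing in for cloud storage). Envelope fields are assumed
    for illustration only.
    """
    records = json.loads(payload)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    loaded_at = datetime.now(timezone.utc).isoformat()
    for record in records:
        record["_loaded_at"] = loaded_at  # audit column added on ingest
        landing.append(record)
    return {"status": "ok", "records_received": len(records)}

# Example: a customer pushes two rows from their system of record
landing_zone = []
result = ingest_push('[{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]', landing_zone)
print(result["records_received"])  # 2
```

One reason this pattern is so reusable is that the contract is tiny: anything that can produce JSON, from a SaaS webhook to a cron job, can push to it.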
We then had to extend out our data collectors to include manual file upload, Google Analytics, Xero and Quickbooks for our next set of customers.
Inspect, adapt, iterate
We started out by saying we wouldn’t provide the capability to collect data; we would rely on our customers to do that. That decision rested on a series of assumptions drawn from our years of previous experience.
As soon as we found those assumptions to be invalid, we inspected and adapted our approach to solve the problem with simplicity.
When we are lucky enough to work with our first large enterprise organisation we will need to have another look at whether we extend our collection capability to cover their systems, implement our original “data lake” pattern or come up with something completely new.
We are good at managing change, it’s one of the core principles of agile and part of our AgileData DNA. I can’t wait to see what other assumptions we have made that we will end up iterating on.