Blog

Because sharing is caring

5 core Data Collection Patterns

At AgileData, delivering our Fractional Data Service has revealed the diverse challenges of integrating data from varied organisations, industries, and systems. To scale effectively, we’ve adopted five core data collection patterns based on our “Define it Once, Reuse it Often” (DORO) principle:

1. Push
2. Pull
3. Stream
4. Share
5. File Drop

These patterns are supported by a toolkit of tested technologies like Dataddo, Meltano, and Google services, allowing us to solve new data challenges quickly. Our approach ensures flexibility and scalability, always starting with the question: Push, Pull, Stream, Share, or File Drop?

AgileData Feature #01 – Marketplace

Information Consumers can quickly search, find and access the Information Products they need to answer their business questions.

This feature enables Information Consumers to search and find all the available dashboards, reports and analytical models in the AgileData App, regardless of which third-party Last Mile tool they were created in.

It also enables the Information Consumer to quickly open that report or dashboard directly from the Marketplace, when those reports are accessible by a web URL.

NZ Scaleup AgileData achieves Google Cloud Ready – BigQuery Designation

AgileData has achieved Google Cloud Ready – BigQuery designation, streamlining data management for customers and partners. This certification confirms the integration’s functionality and reliability, reducing complexity through a low-code interface. By leveraging Google Cloud’s infrastructure and BigQuery, AgileData empowers business leaders to rapidly gain insights and make informed decisions efficiently.

The last (for now) of our #AgileDataDiscover summaries!

Shane Gibson and Nigel Vining completed a 30-day public experiment using a Large Language Model for legacy data warehouse discovery. They confirmed it was feasible, viable, and valuable, securing their first paying customer. They’re showcasing their progress at Big Data London in September under their new product, AgileData Disco.

#AgileDataDiscover weekly wrap No.5

We are in the final phase of building a new product, AgileData Disco, aimed at efficiently discovering and documenting data platforms. We are exploring various Go-to-Market strategies like SLG and PLG. Pricing strategies include options like pay per output or subscription models. We are building in public to gather feedback and refine our approach.

#AgileDataDiscover weekly wrap No.4

We review feedback, highlight emerging use cases like legacy data understanding, data governance, and automated data migration. New patterns are needed for moving from prototype to MVP. Challenges include managing tokens, logging responses, and secure data handling. The GTM strategy focuses on Partner/Channel Led Growth.

#AgileDataDiscover weekly wrap No.3

We focus on developing features such as secure sign-in, file upload, data security, and access to Google’s LLM. Challenges include improving the menu system and separating outputs into distinct screens for clarity. Feedback drives our iterative improvements.

#AgileDataDiscover weekly wrap No.2

We discuss the ongoing development of a new product idea, emphasising feasibility and viability through internal research (“McSpikeys”). Initial tests using LLMs have been promising, but strategic decisions lie ahead regarding its integration. We are grappling with market validation and adjusting our workflow for optimal experimentation.

#AgileDataDiscover weekly wrap No.1

We are tackling challenges in migrating legacy data platforms by automating data discovery and migration to reduce costs significantly. Our approach includes using core data patterns and employing tools like Google Gemini for comparative analysis. The aim is to streamline data handling and enable collaborative governance in organisations. Follow our public build journey for updates.

Introducing Hai, AgileData 2024 Data Intern

I’m Hai, a name that intriguingly means “hi” in English. Originally from Vietnam, I now find myself in Australia, studying Data Science and embracing an internship at AgileData.io. This journey is not just about academic growth but also about applying my knowledge in practical, impactful ways. Join me as I explore the blend of technology and community, aiming to make a difference through data.

Defining self-service data

Everybody wants self-service data, but what do they really mean when they say that?

If we gave them access to a set of highly nested JSON data and said “help yourself”, would that be what they expect?

Or do they expect self service to mean being able to get information without asking a person to get it for them?

Or are they expecting something in between?

I ask them which of the five simple self-service patterns they want, to find out which form of self service they are after.

There are 3 strategic / macro data use cases

I often ask which of these three macro data use cases the organisation believes are its priorities for achieving its business strategy:

Providing data to Customers
Supporting Internal Processes
Providing data to External Organisations

Each of these three strategic / macro data use cases comes with specific data architectures and data work, and also impacts the context of how you would design your agile data ways of working.

Building the Data Plane while flying it

In the data domain you typically have to balance between building the right thing and building the thing right.

The days of being able to spend 6 months or a year on “Sprint Zero” creating your data platform have gone.

One team I worked with called it “building the airplane as you fly it”.

Here are 5 patterns I have seen data teams adopt to help them do this.

2024 the year of the Intelligent Data Platform

AI was the buzzword for 2023 and it will continue to be the buzzword for 2024.

I have been thinking about our approach to AI in our product for a while and landed on 3 patterns that I use as a reference.

Ask AI
Assisted AI
Automated AI

Adopting these patterns moves a data platform from being a manual data platform towards one that can do some of the data work for you.

An Intelligent Data Platform.

The Art of Data: Visualisation vs Storytelling

Data visualization is like painting with data, using charts and graphs to make trends and patterns easy to understand. It’s great for presenting data objectively.

Data storytelling weaves a narrative around data, adding context, engaging emotions, and inspiring action. It’s perfect for persuading stakeholders.

Demystifying the Semantic Layer

The semantic layer is your mystical bridge between complex data and meaningful business insights. It acts as a translator, converting technical data into a language you understand. It works through metadata, simplifying queries, promoting consistency, and enabling self-service analytics. This layer fosters collaboration, empowers customization, and adapts to changes seamlessly. With the semantic layer’s power, you can decipher data mysteries, conjure insights, and make decisions with wizard-like precision. Embrace this enchanting tool and let it elevate your data sorcery to new heights.

Understanding Concepts, Details, and Events: The Fundamental Building Blocks of AgileData Design

AgileData App UX Capability Maturity Model

Reducing the complexity and effort to manage data is at the core of what we do. We love bringing magical UX to the data domain as we do this.

Every time we add a new capability or feature to the AgileData App or AgileData Platform, we think how could we just remove the need for a Data Magician to do that task at all?

That magic is not always possible in the first, or even the third iteration of those features.

Our AgileData App UX Capability Maturity Model helps us to keep that “magic sorting hat” goal at the top of our mind, every time we add a new thing.

This post outlines what that maturity model is and how we apply it.

Unveiling the Magic of Change Data Collection Patterns: Exploring Full Snapshot, Delta, CDC, and Event-Based Approaches

Change data collection patterns are like magical lenses that allow you to track data changes. The full snapshot pattern captures complete data at specific intervals for historical analysis. The delta pattern records only changes between snapshots to save storage. CDC captures real-time changes for data integration and synchronization. The event-based pattern tracks data changes triggered by specific events. Each pattern has unique benefits and use cases. Choose the right approach based on your data needs and become a data magician who stays up-to-date with real-time data insights!
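
To make the difference between the full snapshot and delta patterns concrete, here is a minimal sketch that compares two full snapshots and keeps only what changed. The function name `snapshot_delta` and the key-to-row dictionary shape are assumptions for illustration only, not how the AgileData platform does it.

```python
from typing import Any, Dict

# Hypothetical snapshot shape: business key -> row of attributes.
Snapshot = Dict[str, Dict[str, Any]]


def snapshot_delta(previous: Snapshot, current: Snapshot) -> Dict[str, Snapshot]:
    """Return only the rows that differ between two full snapshots."""
    inserted = {k: row for k, row in current.items() if k not in previous}
    deleted = {k: row for k, row in previous.items() if k not in current}
    updated = {
        k: row
        for k, row in current.items()
        if k in previous and previous[k] != row
    }
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


yesterday = {
    "c1": {"name": "Ann", "tier": "gold"},
    "c2": {"name": "Bob", "tier": "silver"},
}
today = {
    "c1": {"name": "Ann", "tier": "platinum"},
    "c3": {"name": "Cy", "tier": "bronze"},
}

delta = snapshot_delta(yesterday, today)
```

Storing `delta` instead of `today` is what saves storage: unchanged rows are never written twice.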

The challenge of parsing files from the wild

In this instalment of the AgileData DataOps series, we’re exploring how we handle the challenges of parsing files from the wild. To ensure clean and well-structured data, each file goes through several checks and processes, similar to a water treatment plant. These steps include checking for previously seen files, looking for matching schema files, queuing the file, and parsing it. If a file fails to load, we have procedures in place to retry loading or notify errors for later resolution. This rigorous data processing ensures smooth and efficient data flow.
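
The steps named above (check for previously seen files, look for a matching schema, queue, parse, retry or record an error) could be sketched roughly as follows. Every name here (`accept`, `drain`, the in-memory `schemas` registry, matching a schema by filename prefix) is a hypothetical stand-in chosen for illustration. The real pipeline is not described in enough detail to reproduce.

```python
import csv
import hashlib
import io
from collections import deque
from typing import Deque, Dict, List, Set, Tuple

seen_hashes: Set[str] = set()                                  # fingerprints of files already accepted
schemas: Dict[str, List[str]] = {"customers": ["id", "name"]}  # hypothetical schema registry
queue: Deque[Tuple[str, bytes]] = deque()                      # files waiting to be parsed
errors: List[str] = []                                         # failures kept for later resolution


def accept(filename: str, content: bytes) -> bool:
    """Queue a file unless it was seen before or has no matching schema."""
    digest = hashlib.sha256(content).hexdigest()
    if digest in seen_hashes:
        return False
    if filename.split("_")[0] not in schemas:  # assumed convention: prefix names the schema
        errors.append(f"{filename}: no matching schema")
        return False
    seen_hashes.add(digest)
    queue.append((filename, content))
    return True


def parse(filename: str, content: bytes) -> List[Dict[str, str]]:
    """Parse CSV bytes and check the header against the expected schema."""
    expected = schemas[filename.split("_")[0]]
    reader = csv.DictReader(io.StringIO(content.decode("utf-8")))
    if reader.fieldnames != expected:
        raise ValueError(f"header {reader.fieldnames} does not match {expected}")
    return list(reader)


def drain(max_attempts: int = 2) -> List[Dict[str, str]]:
    """Parse every queued file, retrying a failure before recording an error."""
    rows: List[Dict[str, str]] = []
    while queue:
        filename, content = queue.popleft()
        for attempt in range(1, max_attempts + 1):
            try:
                rows.extend(parse(filename, content))
                break
            except ValueError as exc:
                if attempt == max_attempts:
                    errors.append(f"{filename}: {exc}")
    return rows


accept("customers_a.csv", b"id,name\n1,Ann\n")      # queued
accept("customers_b.csv", b"id,name\n1,Ann\n")      # same bytes: skipped as already seen
accept("customers_c.csv", b"id,fullname\n2,Bob\n")  # queued, but the header will not match
loaded = drain()
```

The retry only pays off for transient failures. A header mismatch like the one in `customers_c.csv` fails both attempts and ends up in `errors` for someone to resolve later.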

The Magic of Customer Segmentation: Unlocking Personalised Experiences for Customers

Customer segmentation is the magical process of dividing your customers into distinct groups based on their characteristics, preferences, and needs. By understanding these segments, you can tailor your marketing strategies, optimize resource allocation, and maximize customer lifetime value. To unleash your customer segmentation magic, define your objectives, gather and analyze relevant data, identify key criteria, create distinct segments, profile each segment, tailor your strategies, and continuously evaluate and refine. Embrace the power of customer segmentation and create personalised experiences that enchant your customers and drive business success.

Magical plumbing for effective change dates

We discuss how to handle change data in a hands-off filedrop process. We use the ingestion timestamp as a simple proxy for the effective date of each record, allowing us to version each day’s data. For files with multiple change records, we scan all columns to identify and rank potential effective date columns. We then pass this information to an automated rule, ensuring it gets applied as we load the data. This process enables us to efficiently handle change data, track data flow, and manage multiple changes in an automated way.
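
One way to picture the column scan is a simple scoring pass: for each column, measure how many values parse as dates, then nudge the score up when the column name hints at a change date. The date formats, name hints, and the 0.5 bonus below are all assumptions made for this sketch, not the rules the platform applies.

```python
from datetime import datetime
from typing import Dict, List, Tuple

DATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S", "%d/%m/%Y")  # assumed formats
NAME_HINTS = ("effective", "updated", "modified", "changed")  # assumed name hints


def is_date(value: str) -> bool:
    """True when the value parses with any of the assumed date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def rank_effective_date_columns(rows: List[Dict[str, str]]) -> List[Tuple[str, float]]:
    """Score each column by its share of date-like values plus a name bonus."""
    if not rows:
        return []
    scores = []
    for column in rows[0]:
        values = [row[column] for row in rows]
        date_share = sum(is_date(v) for v in values) / len(values)
        if date_share == 0:
            continue  # not a candidate at all
        bonus = 0.5 if any(hint in column.lower() for hint in NAME_HINTS) else 0.0
        scores.append((column, date_share + bonus))
    return sorted(scores, key=lambda item: item[1], reverse=True)


rows = [
    {"id": "1", "created": "2024-01-01", "updated_at": "2024-03-01 10:00:00"},
    {"id": "1", "created": "2024-01-01", "updated_at": "2024-03-05 09:30:00"},
]
ranking = rank_effective_date_columns(rows)
```

Here `updated_at` ranks first, so it would be the column handed to the automated rule, while `created` stays as a fallback candidate.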

Amplifying Your Data’s Value with Business Context

The AgileData Context feature enhances data understanding, facilitates effective decision-making, and preserves corporate knowledge by adding essential business context to data. This feature streamlines communication, improves data governance, and ultimately, maximises the value of your data, making it a powerful asset for your business.

New Google Cloud feature to Optimise BigQuery Costs

This blog explores AgileData’s use of Google Cloud, specifically its BigQuery service, for cost-effective data handling. As a bootstrapped startup, AgileData incorporates data storage and compute costs into its SaaS subscription, protecting customers from unexpected bills. We constantly seek ways to minimise costs, utilising new Google tools for cost-saving recommendations. We argue that the efficiency and value of Google Cloud make it a preferable choice over other cloud analytic database options.

Data as a First-Class Citizen: Empowering Data Magicians

Data as a first-class citizen recognizes the value and importance of data in decision-making. It empowers data magicians by integrating data into the decision-making process, ensuring accessibility and availability, prioritising data quality and governance, and fostering a data-centric mindset.

To whitelabel or not to whitelabel

Are you wrestling with the concept of whitelabelling your product? We at AgileData have been there. We discuss our journey through the decision-making process, where we grappled with the thought of our painstakingly crafted product being rebranded by another company.

Data Consulting Patterns with Joe Reis

Dive into the world of data consulting with Shane Gibson and Joe Reis on the Agile Data Podcast. Explore their journey from traditional employment to successful data consulting, covering client acquisition, business models, financial management, reputation, sales strategies, employee management, and work-life balance.

The Enchanting World of Data Modeling: Conceptual, Logical, and Physical Spells Unraveled

Data modeling is a crucial process that involves creating a shared understanding of data and its relationships. The three primary data model patterns are conceptual, logical, and physical. The conceptual data model provides a high-level overview of the data landscape, the logical data model delves deeper into data structures and relationships, and the physical data model translates the logical model into a database-specific schema. Understanding and effectively using these data models is essential for business analysts and data analysts to create efficient, well-organised data ecosystems.

Cloud Analytics Databases: The Magical Realm for Data

Cloud Analytics Databases provide a flexible, high-performance, cost-effective, and secure solution for storing and analysing large amounts of data. These databases promote collaboration and offer various choices, such as Snowflake, Google BigQuery, Amazon Redshift, and Azure Synapse Analytics, each with its unique features and ecosystem integrations.

Unveiling the Definition of Data Warehouses: Looking into Bill Inmon’s Magicians Top Hat

In a nutshell, a data warehouse, as defined by Bill Inmon, is a subject-oriented, integrated, time-variant, and non-volatile collection of data that supports decision-making processes. It helps data magicians, like business and data analysts, make better-informed decisions, save time, enhance collaboration, and improve business intelligence. To choose the right data warehouse technology, consider your data needs, budget, compatibility with existing tools, scalability, and real-world user experiences.

Martech – The Technologies Behind the Marketing Analytics Stack: A Guide for Data Magicians

Explore the MarTech stack based on two different patterns: marketing application and data platform. The marketing application pattern focuses on tools for content management, email marketing, CRM, social media, and more, while the data platform pattern emphasises data collection, integration, storage, analytics, and advanced technologies. By understanding both perspectives, you can build a comprehensive martech stack that efficiently integrates marketing efforts and harnesses the power of data to drive better results.

Unveiling the Magic of Data Clean Rooms: Your Data Privacy Magicians

Data clean rooms are secure environments that enable organisations to process, analyse, and share sensitive data while maintaining privacy and security. They use data anonymization, access control, data usage policies, security measures, and auditing to ensure compliance with privacy regulations, making them indispensable for industries like healthcare, finance, and marketing.

5E’s

As Data Consultants, your customers are buying an outcome based on one of these patterns – effort, expertise, experience or efficiency.

We outline what each of these are, how they are different to each other and how to charge for delivering them.

Agile-tecture Information Factory

Defining a Data Architecture is a key pattern when working in the data domain.

It’s always tempting to boil the ocean when defining yours. Don’t!

And once you have defined your data architecture, find a way to articulate and share it with simplicity.

Here is how we articulate the AgileData Data Agile-tecture.

DataOps: The Magic Wand for Data Magicians

DataOps is a magical approach to data management, combining Agile, DevOps, and Lean Manufacturing principles. It fosters collaboration, agility, automation, continuous integration and delivery, and quality control. This empowers data magicians like you to work more efficiently, adapt to changing business requirements, and deliver high-quality, data-driven insights with confidence.

ELT without persisted watermarks ? not a problem

We no longer need to manually track the state of a table: when it was created, when it was updated, which data pipeline last touched it. All these data points are available through a simple call to the logging and BigQuery APIs. Under the covers, the Google Cloud platform is already tracking everything we need: every insert, update, delete, create, load, drop and alter is being captured.
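
As a rough sketch of the idea, the decision to reload a table can be made from platform metadata alone, with no watermark table of our own. The `TableState` class and `needs_refresh` function are hypothetical names for illustration. In the google-cloud-bigquery Python client, the `modified` property of the table object returned by `Client.get_table` is one place such a timestamp could come from, but that call is left out here so the example runs without credentials.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TableState:
    """Hypothetical stand-in for table metadata returned by the platform."""
    name: str
    modified: Optional[datetime]  # None when the table does not exist yet


def needs_refresh(source: TableState, target: TableState) -> bool:
    """Decide whether to reload the target using metadata only, no stored watermark."""
    if source.modified is None:
        return False  # nothing to load from
    if target.modified is None:
        return True   # target has never been built
    return source.modified > target.modified


source = TableState("landing.orders", datetime(2024, 5, 2, 8, 0, tzinfo=timezone.utc))
target = TableState("consume.orders", datetime(2024, 5, 1, 23, 0, tzinfo=timezone.utc))
refresh = needs_refresh(source, target)
```

Because the source changed after the target was last written, `refresh` is true and the pipeline would reload. Nothing had to be persisted between runs to reach that answer.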

What is Data Lineage?

TL;DR The AgileData mission is to reduce the complexity of managing data. In the modern data world there are many capability categories, each with their own specialised terms, technologies and three letter acronyms. We...

Data Mesh 4.0.4

TL;DR Data Mesh 4.0.4 is only available for a very short time. Please ensure you scroll to the bottom of the article to understand the temporal nature of the Data Mesh 4.0.4 approach. This article was published on 1st...

Data Observability Uncovered: A Magical Lens for Data Magicians

Data observability provides comprehensive visibility into the health, quality, and reliability of your data ecosystem. It dives deeper than traditional monitoring, examining the actual data flowing through your pipelines. With tools like data lineage tracking, data quality metrics, and anomaly detection, data observability helps data magicians quickly detect and diagnose issues, ensuring accurate, reliable data-driven decisions.

Agile DataOps

TL;DR Agile DataOps is where we combine the processes and technologies from DataOps with a new agile way of working, to reduce the time taken and increase the value of the data we provide to our customers. What's in a...

The “Killer” Feature

One feature to rule them all. As product managers we are always looking for the next “killer feature” for our product. You know the one, that feature that will become the magical thing that will have customers flooding...

3 types of product features

Our UX/UI journey is accelerating. We are currently full steam into the development of the initial User Interface for AgileData.io. The team have done some awesome work on the UX designs for a bunch of the core screens,...

Why we founded AgileData

My co-founder Nigel and I have been working in the data and analytics domain for over 30 years (well I have, he is slightly younger). We have both held multiple roles through these years, Nigel primarily in the...

AgileData App

Explore AgileData features, updates, and tips

Network

Learn about consulting practices and good patterns for data-focused consultancies

DataOps

Learn from our DataOps expertise, covering essential concepts, patterns, and tools

Data and Analytics

Unlock the power of data and analytics with expert guidance

Google Cloud

Imparting knowledge on Google Cloud's capabilities and its role in data-driven workflows

Journey

Explore real-life stories of our challenges, and lessons learned

Product Management

Enrich your product management skills with practical patterns

What Is

Describing data and analytics concepts, terms, and technologies to enable better understanding

Resources

Valuable resources to support your growth in the agile, and data and analytics domains

AgileData Podcast

Discussing combining agile, product and data patterns.

No Nonsense Agile Podcast

Discussing agile and product ways of working.

App Videos

Explore videos to better understand the AgileData App's features and capabilities.
