Because sharing is caring
In this instalment of the AgileData DataOps series, we’re exploring how we handle the challenges of parsing files from the wild. To ensure clean and well-structured data, each file goes through several checks and processes, similar to a water treatment plant. These steps include checking for previously seen files, looking for matching schema files, queuing the file, and parsing it. If a file fails to load, we have procedures in place to retry the load or raise an error notification for later resolution. This rigorous data processing ensures smooth and efficient data flow.
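To make those steps concrete, here is a minimal Python sketch of that water-treatment-style flow. The content hashing, suffix-based schema lookup, and retry loop are illustrative assumptions, not our actual implementation:

```python
import hashlib
from pathlib import Path

# Hypothetical schema registry and seen-file cache for illustration only.
SCHEMAS = {".csv": {"delimiter": ",", "columns": ["id", "value"]}}
SEEN_HASHES: set[str] = set()

def check_and_queue(path: Path, queue: list[dict]) -> None:
    """Dedupe by content hash, match a schema, then queue the file for parsing."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    if digest in SEEN_HASHES:          # previously seen file: skip it
        return
    SEEN_HASHES.add(digest)
    schema = SCHEMAS.get(path.suffix)  # look for a matching schema definition
    if schema is None:
        raise ValueError(f"no schema registered for {path.name}")
    queue.append({"path": path, "schema": schema, "attempts": 0})

def load_with_retry(job: dict, parse, notify, max_attempts: int = 3) -> None:
    """Retry a failed load a few times, then raise a notification for later resolution."""
    last_error: Exception | None = None
    while job["attempts"] < max_attempts:
        job["attempts"] += 1
        try:
            parse(job["path"], job["schema"])  # caller supplies the real parser
            return
        except Exception as err:
            last_error = err
    notify(job["path"], last_error)            # caller supplies the notifier
```

Passing the parser and notifier in as callables keeps the sketch self-contained while leaving the real work to whatever the platform plugs in.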
Customer segmentation is the magical process of dividing your customers into distinct groups based on their characteristics, preferences, and needs. By understanding these segments, you can tailor your marketing strategies, optimize resource allocation, and maximize customer lifetime value. To unleash your customer segmentation magic, define your objectives, gather and analyze relevant data, identify key criteria, create distinct segments, profile each segment, tailor your strategies, and continuously evaluate and refine. Embrace the power of customer segmentation and create personalised experiences that enchant your customers and drive business success.
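If you want to experiment with segmentation yourself, one simple starting point is quantile-based RFM scoring. The pandas sketch below uses made-up customer data, scores, and segment labels; treat it as a toy, not a recipe:

```python
import pandas as pd

# Hypothetical per-customer summary of the segmentation criteria:
# recency, frequency, and monetary value.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5, 6],
    "days_since_last_order": [5, 40, 90, 3, 200, 30],
    "orders_per_year": [24, 6, 2, 30, 1, 8],
    "annual_spend": [1200, 300, 80, 2500, 20, 450],
})

# Score each criterion into terciles (1 = low, 3 = high); recency is negated
# so that more recent customers score higher.
customers["r_score"] = pd.qcut(-customers["days_since_last_order"], 3, labels=[1, 2, 3]).astype(int)
customers["f_score"] = pd.qcut(customers["orders_per_year"], 3, labels=[1, 2, 3]).astype(int)
customers["m_score"] = pd.qcut(customers["annual_spend"], 3, labels=[1, 2, 3]).astype(int)

# Combine the scores into distinct, profileable segments (labels are made up).
def label(row):
    total = row.r_score + row.f_score + row.m_score
    if total >= 8:
        return "champions"
    if total >= 5:
        return "steady"
    return "at risk"

customers["segment"] = customers.apply(label, axis=1)
print(customers[["customer_id", "segment"]])
```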
Immerse yourself in the magical world of data with AgileData’s ‘Ask a Quick Question’ capability. Perfectly designed for data analysts and business analysts who need to swiftly extract insights from data, this capability facilitates quick data queries and rapid exploratory data analysis.
TL;DR: In mid 2023 I was lucky enough to present at The Knowledge Gap on the Information Product Canvas. The Information Product Canvas is an innovative pattern designed to capture data requirements visually and repeatably, making it easier for both stakeholders...
We discuss how to handle change data in a hands-off filedrop process. We use the ingestion timestamp as a simple proxy for the effective date of each record, allowing us to version each day’s data. For files with multiple change records, we scan all columns to identify and rank potential effective date columns. We then pass this information to an automated rule, ensuring it gets applied as we load the data. This process enables us to efficiently handle change data, track data flow, and manage multiple changes in an automated way.
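As a rough illustration of the column-scanning idea, the sketch below scores every column by how many of its values parse as dates and ranks the candidates. A real ranking would weigh additional signals such as column names and cardinality; the sample data is hypothetical:

```python
import pandas as pd

def rank_effective_date_candidates(df: pd.DataFrame) -> list[tuple[str, float]]:
    """Scan all columns and rank how plausibly each one is an effective date.

    Score = fraction of values that parse as dates; unparseable values
    become NaT via errors="coerce" and count against the column.
    """
    scores = []
    for col in df.columns:
        parsed = pd.to_datetime(df[col], errors="coerce")
        score = float(parsed.notna().mean())
        if score > 0:
            scores.append((col, round(score, 2)))
    return sorted(scores, key=lambda s: s[1], reverse=True)

rows = pd.DataFrame({
    "customer_id": ["A1", "B2", "C3"],
    "status": ["open", "closed", "open"],
    "updated_at": ["2023-05-01", "2023-05-02", "2023-05-03"],
})
print(rank_effective_date_candidates(rows))  # "updated_at" ranks first
```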
In an insightful episode of the AgileData Podcast, Shane Gibson hosts Ahmed Elsamadisi to delve into the evolving world of data modeling, focusing on the innovative concept of the Activity Schema. Elsamadisi, with a rich background in AI and data science, shares his journey from working on self-driving cars to spearheading data initiatives at WeWork. The discussion centers on the pivotal role of data modeling in enhancing scalability and efficiency in data systems, with Elsamadisi highlighting the limitations of traditional models like star schema and data vault in addressing complex, modern data queries.
Unveiling the Secrets of Data Quality Metrics for Data Magicians: Ensuring Data Warehouse Excellence
Data quality metrics are crucial indicators in a data warehouse that measure the accuracy, completeness, consistency, timeliness, and uniqueness of data. These metrics help organisations ensure their data is reliable and fit for use, thus driving effective decision-making and analytics.
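To ground a few of those dimensions, here is a minimal pandas sketch that computes completeness, uniqueness, and timeliness over a table. The column names and the one-day freshness threshold are assumptions for illustration:

```python
import pandas as pd

def quality_metrics(df: pd.DataFrame, key: str) -> dict[str, float]:
    """Sketch of three warehouse-style quality metrics.

    completeness: share of non-null cells across the table
    uniqueness:   share of key values that are not duplicated
    timeliness:   share of rows loaded within the last day (assumes a load_ts column)
    """
    completeness = float(df.notna().mean().mean())
    uniqueness = float((~df[key].duplicated(keep=False)).mean())
    age = pd.Timestamp.now(tz="UTC") - pd.to_datetime(df["load_ts"], utc=True)
    timeliness = float((age < pd.Timedelta(days=1)).mean())
    return {"completeness": completeness, "uniqueness": uniqueness, "timeliness": timeliness}

demo = pd.DataFrame({
    "customer_key": ["A", "B", "B"],
    "amount": [10.0, None, 7.5],
    "load_ts": ["2024-01-01T00:00:00Z"] * 3,
})
print(quality_metrics(demo, key="customer_key"))
```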
The AgileData Context feature enhances data understanding, facilitates effective decision-making, and preserves corporate knowledge by adding essential business context to data. This feature streamlines communication, improves data governance, and ultimately, maximises the value of your data, making it a powerful asset for your business.
This blog explores AgileData’s use of Google Cloud, specifically its BigQuery service, for cost-effective data handling. As a bootstrapped startup, AgileData incorporates data storage and compute costs into its SaaS subscription, protecting customers from unexpected bills. We constantly seek ways to minimise costs, utilising new Google tools for cost-saving recommendations. We argue that the efficiency and value of Google Cloud make it a preferable choice over other cloud analytic database options.
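For readers who want to watch their own BigQuery spend, a small script over the standard INFORMATION_SCHEMA.JOBS_BY_PROJECT view can attribute bytes billed to users. The region qualifier and the on-demand rate below are assumptions you should adjust for your project:

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

# Assumed on-demand rate; check current BigQuery pricing for your region.
PRICE_PER_TIB_USD = 6.25

client = bigquery.Client()
sql = """
    SELECT user_email, SUM(total_bytes_billed) AS bytes_billed
    FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
    WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
    GROUP BY user_email
    ORDER BY bytes_billed DESC
"""
for row in client.query(sql).result():
    cost = (row.bytes_billed or 0) / 2**40 * PRICE_PER_TIB_USD
    print(f"{row.user_email}: ~${cost:.2f} billed over the last 7 days")
```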
Imparting knowledge on Google Cloud's capabilities and its role in data-driven workflows
Describing data and analytics concepts, terms, and technologies to enable better understanding
In the world of agile, there are three common testing techniques we can use to improve our testing practices and help enable automated testing.
Shane Gibson chats to Tammy Leahy about how she helped scale the data teams she leads.
Join Shane and guest Eric Broda as they discuss data products.
TL;DR: … you don’t always need to use DAGs to orchestrate. Previously we talked about how we use an ephemeral Serverless architecture based on Google Cloud Functions and Google PubSub Messaging to run our customer data...
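As a sketch of the DAG-less idea, each Cloud Function can simply do its step and publish a completion event that triggers whichever function subscribes next, so the pipeline emerges from events rather than a central DAG. The topic name, payload shape, and environment variable here are hypothetical:

```python
import base64
import json
import os

import functions_framework          # pip install functions-framework
from google.cloud import pubsub_v1  # pip install google-cloud-pubsub

publisher = pubsub_v1.PublisherClient()

@functions_framework.cloud_event
def run_step(cloud_event):
    """Handle one Pub/Sub-triggered step, then announce completion.

    The next step subscribes to the "step-completed" topic (a made-up name),
    so no orchestrator needs to know the whole graph.
    """
    payload = json.loads(base64.b64decode(cloud_event.data["message"]["data"]))
    result = {"table": payload["table"], "status": "loaded"}  # real work goes here
    topic = publisher.topic_path(os.environ["GCP_PROJECT"], "step-completed")
    publisher.publish(topic, json.dumps(result).encode("utf-8"))
```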
TL;DR: When we dreamed up AgileData and started white-boarding ideas around architecture, one of the patterns we were adamant we would leverage was Serverless. This post explains why we were adamant and what...
This is the first of a series of articles detailing how we built a platform to make data fun and remove complexity for our users
In 2022 Shane Gibson was lucky enough to present “Analysts can model democratising data modeling” at the Knowledge Gap Conference.
Watch the presentation.
TL;DR: I talk to Scott Hirleman on the Data Mesh Radio podcast about my thoughts on Data Mesh and the need for reusable patterns in the data & analytics domain. My opinion on Data Mesh: I am not a fan of the current...
Marketing Analytics involves analysing data from various channels, such as social media, email, and websites, to assess the performance of marketing efforts.
Product Analytics focuses on understanding and improving user experience and satisfaction with digital products or services.
TL;DR: AgileData’s mission is to reduce the complexity of managing data. In the modern data world there are many capability categories, each with their own specialised terms, technologies and three letter acronyms. We...
Join Shane and guest Laura Bell as they discuss how you can eat the security elephant in an agile way.
TL;DR: Data Mesh 4.0.4 is only available for a very short time. Please ensure you scroll to the bottom of the article to understand the temporal nature of the Data Mesh 4.0.4 approach. This article was published on 1st...
Data catalogs are comprehensive inventories of an organisation’s data assets, helping data analysts and information consumers to quickly find, understand, and utilise relevant information. They foster collaboration, maintain data governance, and ensure compliance.
Early in 2022 Shane Gibson was lucky enough to talk to the Catalog and Cocktails podcast crew about agile in the data domain. Watch or listen to the episode.
There is a lot of vendor washing going on. A lot of data vendors are vendor washing their technologies to pretend they enable “Data Mesh”, as they are punting on Data Mesh being the new thing for 2022. I think they are...
Data observability provides comprehensive visibility into the health, quality, and reliability of your data ecosystem. It dives deeper than traditional monitoring, examining the actual data flowing through your pipelines. With tools like data lineage tracking, data quality metrics, and anomaly detection, data observability helps data magicians quickly detect and diagnose issues, ensuring accurate, reliable data-driven decisions.
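One of the simplest observability checks is flagging a daily row count that drifts far from its history. A minimal sketch, assuming a z-score threshold of 3 and made-up counts:

```python
from statistics import mean, stdev

def looks_anomalous(history: list[int], today: int, z_threshold: float = 3.0) -> bool:
    """Flag today's row count if it sits more than z_threshold standard
    deviations from the historical mean — one simple observability check."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > z_threshold

daily_row_counts = [10_120, 9_980, 10_250, 10_060, 10_190]
print(looks_anomalous(daily_row_counts, 4_300))  # True: volume dropped sharply
```

Real observability tooling layers many such checks (freshness, schema drift, distribution shifts) over lineage so an alert points at the pipeline step that caused it.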
Join Shane and guest Benn Stancil as they discuss the difference between the age-old analyst role and the new emerging role of analytics engineer (amongst a few other interesting things).
Join Shane and guest Brian McMillian as they discuss the art of architecture in an agile data world. We discuss 3 things: 1. the 4x approach, 2. data vault, 3. everything is code.
Join Shane and guest Raphael Branger as they discuss combining agile ways of working with the world of data and business intelligence (BI).
Join Shane and special guest Shaun McGirr as they discuss the combining of agile ways of working with analytics teams.
TL;DR: AgileData's mission is to reduce the complexity of managing data. A large part of modern data complexity is selecting, implementing and maintaining a raft of different technologies to provide your "Modern Data...