AgileData DataOps

Magical DataOps insights from our Chief Data Plumber

The magic of DocOps

TL;DR Patterns like DocOps provide massive value by increasing collaboration across team members and automating manual tasks. But it still requires a high level of technical skill to work in a DocOps way. For the AgileData App and Platform, we want to deliver those...

The challenge of parsing files from the wild

In this instalment of the AgileData DataOps series, we’re exploring how we handle the challenges of parsing files from the wild. To ensure clean and well-structured data, each file goes through several checks and processes, similar to a water treatment plant. These steps include checking for previously seen files, looking for matching schema files, queuing the file, and parsing it. If a file fails to load, we have procedures in place to retry loading or notify errors for later resolution. This rigorous data processing ensures smooth and efficient data flow.
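The checks described above can be sketched roughly as follows. This is a minimal illustration, not the AgileData implementation: the `FileDropPipeline` class, the `parse_csv` helper, matching schemas by file name prefix, and the limit of three attempts are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FileDropPipeline:
    """Hypothetical sketch of the checks a dropped file passes through."""
    schemas: dict                      # assumed: file name prefix -> expected columns
    seen: set = field(default_factory=set)
    queue: list = field(default_factory=list)
    errors: list = field(default_factory=list)
    max_attempts: int = 3              # assumed retry limit

    def receive(self, name: str, content: bytes) -> str:
        # 1. Skip files whose exact content has been seen before.
        fingerprint = hashlib.sha256(content).hexdigest()
        if fingerprint in self.seen:
            return "duplicate"
        self.seen.add(fingerprint)
        # 2. Look for a matching schema (here: matched by file name prefix).
        schema = next((cols for prefix, cols in self.schemas.items()
                       if name.startswith(prefix)), None)
        # 3. Queue the file for parsing.
        self.queue.append((name, content, schema))
        return "queued"

    def drain(self, parse) -> list:
        """Parse every queued file, retrying failures up to max_attempts."""
        loaded = []
        while self.queue:
            name, content, schema = self.queue.pop(0)
            for attempt in range(1, self.max_attempts + 1):
                try:
                    loaded.append((name, parse(content, schema)))
                    break
                except ValueError as exc:
                    if attempt == self.max_attempts:
                        # Give up and record the error for later resolution.
                        self.errors.append((name, str(exc)))
        return loaded


def parse_csv(content: bytes, schema):
    """Toy parser: split comma-separated lines and check the header."""
    lines = content.decode("utf-8").strip().splitlines()
    if not lines:
        raise ValueError("empty file")
    header = lines[0].split(",")
    if schema is not None and header != schema:
        raise ValueError(f"header {header} does not match schema {schema}")
    return [dict(zip(header, line.split(","))) for line in lines[1:]]
```

In this sketch a file that keeps failing ends up in `errors` rather than blocking the queue, which mirrors the idea of notifying errors for later resolution.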

Magical plumbing for effective change dates

We discuss how to handle change data in a hands-off filedrop process. We use the ingestion timestamp as a simple proxy for the effective date of each record, allowing us to version each day’s data. For files with multiple change records, we scan all columns to identify and rank potential effective date columns. We then pass this information to an automated rule, ensuring it gets applied as we load the data. This process enables us to efficiently handle change data, track data flow, and manage multiple changes in an automated way.
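To make the column scan concrete, here is one way the ranking step might look. It is a sketch under stated assumptions: the accepted date formats, the column name hints, and the scoring (variety of values plus a name bonus) are invented for illustration and are not taken from the AgileData platform.

```python
from datetime import datetime

# Assumed date formats and name hints; a real scan would likely cover more.
DATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S", "%d/%m/%Y")
NAME_HINTS = ("effective", "updated", "modified", "changed", "date")


def parse_date(value):
    """Return a datetime if the value matches a known format, else None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except (ValueError, TypeError):
            continue
    return None


def rank_effective_date_columns(rows):
    """Return the columns that look like dates, best candidate first."""
    if not rows:
        return []
    scores = {}
    for column in rows[0]:
        parsed = [parse_date(row.get(column)) for row in rows]
        valid = [p for p in parsed if p is not None]
        if len(valid) < len(rows):
            continue  # every value must parse for the column to qualify
        # Columns whose values vary between records are better change markers.
        variety = len(set(valid)) / len(rows)
        hint = 1.0 if any(h in column.lower() for h in NAME_HINTS) else 0.0
        scores[column] = variety + hint
    return sorted(scores, key=scores.get, reverse=True)
```

The top-ranked column could then be handed to the load rule as the effective date, with the ingestion timestamp as the fallback when no column qualifies.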

New Google Cloud feature to Optimise BigQuery Costs

This blog explores AgileData’s use of Google Cloud, specifically its BigQuery service, for cost-effective data handling. As a bootstrapped startup, AgileData incorporates data storage and compute costs into its SaaS subscription, protecting customers from unexpected bills. We constantly seek ways to minimise costs, utilising new Google tools for cost-saving recommendations. We argue that the efficiency and value of Google Cloud make it a preferable choice over other cloud analytic database options.

ELT without persisted watermarks? Not a problem

We no longer need to manually track the state of a table: when it was created, when it was updated, or which data pipeline last touched it. All of these data points are available through a simple call to the Logging and BigQuery APIs. Under the covers, Google Cloud Platform is already tracking everything we need: every insert, update, delete, create, load, drop, and alter is captured.
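As a rough illustration of replacing a persisted watermark with platform metadata, the sketch below compares the last-modified timestamps of two tables. It assumes a client object whose `get_table(table_id)` returns an object with a `modified` timestamp, as the `google-cloud-bigquery` `Client` does. The `needs_refresh` function and the table IDs in the usage note are hypothetical, and the sketch covers only table metadata, not the Logging API mentioned above.

```python
def needs_refresh(client, source_table_id: str, target_table_id: str) -> bool:
    """Return True when the source changed after the target was last written.

    No watermark is stored by the pipeline: both timestamps come from the
    table metadata that the platform already maintains.
    """
    source = client.get_table(source_table_id)
    target = client.get_table(target_table_id)
    # `modified` is the last-modified timestamp tracked for each table.
    return source.modified > target.modified
```

With the real library, `client` would typically be created with `bigquery.Client()` after `from google.cloud import bigquery`, and the call might look like `needs_refresh(client, "my-project.landing.orders", "my-project.history.orders")`. A missing target table raises an error in this sketch and would need handling in practice.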

Agile DataOps

TL;DR Agile DataOps is where we combine the processes and technologies from DataOps with a new agile way of working, to reduce the time taken and increase the value of the data we provide to our customers. What's in a...

