I can write a bit of code faster

01 Oct 2022 | AgileData Network, AgileData Way of Working, Blog

TL;DR

Getting data tasks done involves a lot more than just bashing out a few lines of code to get the data into a format you can give to your stakeholder/customer.

Unless of course it really is a one off and then bash away.  But I bet if that information provides value to your stakeholder/customer then the stakeholder/customer will come back for more.  They always do.

Shane Gibson - AgileData.io

Just shut up and let me code, alright

One thing we have known for a long time is developers/engineers love to code.

It is what they have trained for, it is what they do all day, it is what they enjoy doing all day and it is also what they are good at.

For some it is about solving problems via code, for others it is about the art of crafting awesome and beautiful code itself. 

At AgileData our focus is to reduce the complexity of managing data, and we are focussed on solving that for people who don’t/can’t/won’t code.

So we have always known that developers/engineers are not our target persona; the target persona we aim to help is a data literate analyst.

It’s just a couple of simple lines of code

When we talk to an analyst we can typically group the analyst into one of three personas:

  • Low level of data literacy: they do not work in the data domain very often
  • Data literate, but can’t code: they work in the data domain on a regular basis but don’t write or maintain code; Microsoft Excel or another tool is their best friend
  • Data literate and they can code: when they want to solve a data problem they open a coding window and start bashing the keyboard

When we show the third analyst persona (or an engineer for that matter) how we solve data problems using the AgileData low code environment we often hear something along the lines of:

“I could learn your product, how to point and click through it to do the data work, but it is much quicker for me to just type in a few lines of code in my code editor to get the task done”.

(To be fair we still have work to do to reduce the number of clicks required to get each data task done, something we are focussed on.)

Tasks to be done

Let’s look at that statement, “get the task done”.

If the task you have been asked to do is bash some data as a one off and give your stakeholder/customer a one off answer, then hell yes, if you can bash some code quickly to get the task done, bash away.  If you can wrangle some data in Excel and create a couple of formulas to get the one off task done, then wrangle away.

But we have all experienced the scenario where it is not really a one off task; once we have given the data or information to the consumer they will want an updated version of this information in the future.  And so we end up having to manually repeat the one off task.

There are also a few other tasks we should do even when it is a “one off” task that needs to be done.  

We should probably test the data we produced so we make sure the consumer can trust it.  We should probably document the work we did in case we get asked a question about it in the future, or somebody else has to look at it when we are not around.  We should probably back it up so we don’t lose the data or the code.  And if we are asked to make a change we should probably version the data and the code so we know what was changed.

Will the real data tasks to be done please stand up

The founders of AgileData have spent years as consultants helping get these data tasks done for organisations.  We have defined processes, practices and patterns to get these tasks done in a safe way.  We have built bespoke platforms and code for organisations that automate these tasks, reducing the effort and risk of doing them repeatedly.

For the first couple of years building out the AgileData product we were focussed on crafting the code that automated these tasks in our platform, as we knew we were going to need them.

Let’s look at the real data tasks that need to be done when managing data to provide information to a stakeholder or customer who is going to make a business decision or take some action based on that information.

Versioning

Many of the data tasks I am about to describe will force you to make changes to the original code you bashed together.  If you have spent time with engineers you will know you should probably create versions of your code as you change it, to save your arse when something unexpected happens.

And that will introduce you to the Versioning task to be done.

You might take the simple path and just copy and rename the code before you change it.  Hopefully you will use a simple naming pattern like 1.1, 1.2, 1.3 or 2022-09-01, 2022-09-02 etc.  Names such as: 

“Salesforce Active Customer Count – test delinquent customers 2021 – shanes attempt 2 – Tuesday – final version dont delete” 

make it a little hard to determine which is the current version when you return to edit it six months after you have written it.

You might go one step further and use Onedrive, Dropbox or another file storage tool and rely on it keeping versions of your code for you.

Or you might use a source control system such as git, and you will spend your time understanding how to push, pull, merge, commit etc.  At least you won’t have to write the code to do the versioning, you will just leverage the git capability.  You will however spend an exorbitant amount of time trying to decide if you should use a branching pattern or not.

You will hit the problem where somebody else checks out your code and edits it at the same time you did, and you lose the race to merge your changes back into the git repository before they do, leaving you with the dreaded merge conflict resolution task.

You will look at options to version your data as well as your code and you will decide not to go down that path, as there are few visible patterns available to help you version data.

Documentation 

You will lightly document your code; after all, code is self-describing. You will put a few notes in your code as comments if you have time.  Six months later you will need to make a change to that code and the note of “filter customer records” will tell you what you did (although the code line of “where customer_type != ‘gold’” tells you that anyway), but the note gives you no clue why gold customers are excluded or even what makes a customer gold.  At some stage you will leave the organisation for good, or go on holiday, and one of your peers will be tasked with making the change and will need to interpret your cryptic notes.

And that will introduce you to the Documentation task to be done.

You might create better notes and comments in your code.  You might use a separate tool to document what you have done, piles of Word files anyone?  You might use a wiki, confluence or SharePoint site to write your documentation so anybody can find, read and update it.  You might create code that parses the comments in your code and automagically creates/updates the documentation.
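If you do go down the parse-the-comments path, a minimal sketch of the idea might look like the following.  It assumes a folder of SQL files and a "-- doc:" comment convention, both of which are made up for the example.

```python
# Minimal sketch: pull "-- doc:" comments out of a folder of SQL files and
# write them into a single markdown page. The folder layout and the comment
# convention are assumptions for illustration only.
from pathlib import Path

def build_docs(sql_dir: str, output_file: str) -> None:
    lines = ["# Data code documentation", ""]
    for sql_file in sorted(Path(sql_dir).glob("*.sql")):
        lines.append(f"## {sql_file.name}")
        for line in sql_file.read_text().splitlines():
            stripped = line.strip()
            if stripped.startswith("-- doc:"):
                # keep only the human readable part of the comment
                lines.append(stripped.removeprefix("-- doc:").strip())
        lines.append("")
    Path(output_file).write_text("\n".join(lines))

if __name__ == "__main__":
    build_docs("sql", "data_docs.md")
```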

You will definitely over document at the beginning in your initial rush of enthusiasm and then pare that back to just what you think you will need as the novelty wears off.

Orchestration

At some stage you will have bashed out a few different chunks of code, and these chunks of code have created useful data, and when you are asked to do the next data task you are going to be tempted to reuse some of that predefined data to save time.  Hell it makes sense to reuse that data compared to doing it all again from scratch.

And that will introduce the Orchestration task to be done.  

You will need to write code to orchestrate your other code, so it runs in a certain order and manages the dependencies between the data and the code.

For example, run code1 which creates table1 and then once that has successfully completed run code2 that uses table1 to create table2.

At some stage your code will fail, but the dependent code will execute regardless of the failure. You will need to add the ability to manage failures in your orchestration code.
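To make that concrete, here is a minimal Python sketch of what that orchestration code tends to grow into: each step declares what it depends on, and a failure upstream stops the dependent steps from running.  The step functions and table names are placeholders for the example.

```python
# Minimal orchestration sketch: run steps in dependency order and skip any
# step whose upstream dependency failed. The step functions are placeholders
# for whatever code actually creates table1 and table2.
def create_table1():
    print("running code1 -> table1")

def create_table2():
    print("running code2 -> table2 (reads table1)")

# (step name, step function, list of steps it depends on)
PIPELINE = [
    ("table1", create_table1, []),
    ("table2", create_table2, ["table1"]),
]

def run_pipeline():
    succeeded = set()
    for name, step, depends_on in PIPELINE:
        if not all(dep in succeeded for dep in depends_on):
            print(f"skipping {name}: an upstream dependency failed")
            continue
        try:
            step()
            succeeded.add(name)
        except Exception as err:
            print(f"{name} failed: {err}")

if __name__ == "__main__":
    run_pipeline()
```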

Scheduling

The information you provided has added value to your stakeholder/customer, so you will be asked to provide it on a regular basis.  You are going to get pretty sick of sitting there manually running the code you wrote each day, or even running it every hour.

And that will introduce the Scheduling task to be done.

You will need to write code that submits or executes your code to run at a certain time of day or certain day of the week.

For example, run at 7am each business day, but not if it is a public holiday.  

Don’t forget to factor in when the data is available in the System of Capture and the impact of the orchestration code you have already written.  No point scheduling the code if the new data isn’t available or scheduling the code to run out of order with the other code it is dependent on.
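A minimal sketch of the "7am each business day, but not a public holiday" check might look like the following.  The holiday dates are placeholders, and you would still need something like cron or a cloud scheduler to actually invoke it at 7am.

```python
# Minimal scheduling guard: only run the pipeline on a weekday that is not a
# public holiday. The holiday list is a placeholder; a real schedule would
# also need to check that the System of Capture has new data available.
from datetime import date

PUBLIC_HOLIDAYS = {date(2022, 12, 26), date(2023, 1, 2)}  # example dates only

def should_run(today: date) -> bool:
    is_weekday = today.weekday() < 5          # Monday=0 .. Friday=4
    is_holiday = today in PUBLIC_HOLIDAYS
    return is_weekday and not is_holiday

if __name__ == "__main__":
    if should_run(date.today()):
        print("run the pipeline")  # e.g. call the orchestration code
    else:
        print("not a business day, skipping")
```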

Logging & Monitoring

You will start by manually executing your code, sitting there watching it do its beautiful thing and revelling in the successful completion message at the end of the run.  Then you will get bored and wander off for a coffee or tea while it’s running, or start multi-tasking while it runs.  Or you will have already created the Scheduling and/or Orchestration code and so won’t be there to watch the code run.

And that will introduce the Logging & Monitoring task to be done.

You will need to write code that monitors your code as it runs and records when it completed successfully or when it failed.  You might create code that monitors the execution logs (you are creating and storing execution logs, right?!) and uses those to determine a success or failure, or you may create code that requires the execution code to push events to the monitoring system.

This second event pushing pattern will require you to edit the original code to push those events, so hopefully you have completed your versioning task already and can manage these changes safely.
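As a rough illustration of the event pushing pattern, the sketch below wraps each step so it records started, succeeded and failed events.  Writing the events to a local JSON-lines file stands in for whatever log store you actually use.

```python
# Minimal monitoring sketch using the "push events" pattern: each run records
# a start event, an end event, and any failure detail, rather than relying on
# someone watching the code run.
import json
import traceback
from datetime import datetime, timezone

LOG_FILE = "run_events.jsonl"

def log_event(step: str, status: str, detail: str = "") -> None:
    event = {
        "step": step,
        "status": status,  # "started", "succeeded" or "failed"
        "detail": detail,
        "at": datetime.now(timezone.utc).isoformat(),
    }
    with open(LOG_FILE, "a") as f:
        f.write(json.dumps(event) + "\n")

def monitored(step_name, step_fn):
    log_event(step_name, "started")
    try:
        step_fn()
        log_event(step_name, "succeeded")
    except Exception:
        log_event(step_name, "failed", detail=traceback.format_exc())
        raise

if __name__ == "__main__":
    monitored("create_table1", lambda: print("running code1 -> table1"))
```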

Testing

You will either get stuck in an endless code review trying to work out why the data you have just created doesn’t look right or the information consumer will approach you with those immortal words “the data looks weird and wrong”.  After endless hours investigating you will determine you forgot to put in a where clause or the System of Capture you collected the data from did a sneaky little change without telling you.

And that will introduce you to the Testing task to be done.

You will need to write code, that executes code, that tests your code and/or data.  Ideally you will use a config driven pattern where you can define your tests in a natural language (as config) and your testing code uses this config to automate the creation of the validation code and execute it, and the results of the test are stored somewhere.  Or you will not have time and you will just manually write test code each time you write data code and closely couple the two.
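A minimal sketch of the config driven pattern is below: the tests are plain config, and the validation SQL is generated from that config.  The table and column names are made up, and run_query is a placeholder for however you execute SQL on your platform.

```python
# Minimal config-driven testing sketch: tests are defined as config and the
# validation SQL is generated from it. Table names, column names and
# run_query() are placeholders for illustration.
TESTS = [
    {"table": "customers", "column": "customer_id", "check": "not_null"},
    {"table": "customers", "column": "customer_id", "check": "unique"},
]

def build_sql(test: dict) -> str:
    table, column = test["table"], test["column"]
    if test["check"] == "not_null":
        return f"select count(*) from {table} where {column} is null"
    if test["check"] == "unique":
        return (f"select count(*) from (select {column} from {table} "
                f"group by {column} having count(*) > 1)")
    raise ValueError(f"unknown check: {test['check']}")

def run_tests(run_query) -> None:
    for test in TESTS:
        bad_rows = run_query(build_sql(test))
        status = "PASS" if bad_rows == 0 else f"FAIL ({bad_rows} bad rows)"
        print(f"{test['table']}.{test['column']} {test['check']}: {status}")
```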

Alerting

You will either be drinking coffee or watching cat videos while your code executes and will miss when it finishes.  Or you will have completed the Orchestration and Scheduling tasks to be done and so won’t be up at 3:08am to see it finish.

And that will introduce you to the Alerting task to be done.

You will write code that mirrors the patterns of the Monitoring task to be done, assuming you have completed that already.  You will parse the logs or push events to identify code which has successfully completed and a set of statuses that provide fine grained information on what completed actually means.  You will then push these notifications to your favourite collaboration tool, aka Slack, Teams or email and eagerly open those tools upon waking to check how the rest of your day is going to play out (for the first month or so anyway).

If you’re stretched for time you might just write them out to an alert file and promise yourself that you will open and review it first thing each day (after your first coffee of course).
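A minimal sketch of that alerting code might read the run events produced by the monitoring task and push any failures to a Slack incoming webhook, falling back to the alert file when no webhook is configured.  The webhook URL and the log file name are assumptions for the example.

```python
# Minimal alerting sketch: read the monitoring events, find the failures and
# push them to a Slack incoming webhook. The webhook URL is a placeholder;
# with no webhook configured the alerts go to a local file instead.
import json
import urllib.request

LOG_FILE = "run_events.jsonl"
SLACK_WEBHOOK_URL = ""  # e.g. "https://hooks.slack.com/services/..." (placeholder)

def failed_steps():
    with open(LOG_FILE) as f:
        for line in f:
            event = json.loads(line)
            if event["status"] == "failed":
                yield event

def send_alert(text: str) -> None:
    if SLACK_WEBHOOK_URL:
        payload = json.dumps({"text": text}).encode()
        request = urllib.request.Request(
            SLACK_WEBHOOK_URL, data=payload,
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(request)
    else:
        with open("alerts.txt", "a") as f:  # the "review it after coffee" fallback
            f.write(text + "\n")

if __name__ == "__main__":
    for event in failed_steps():
        send_alert(f"{event['step']} failed at {event['at']}")
```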

Reconciliation

At some point somebody is going to tell you the information you are producing doesn’t match the data in the System of Capture.

And that will introduce the Reconciliation task to be done.

You will need to write code that compares the data you have produced with the System of Capture you collected it from.  You will start out using brute force patterns where you compare all the data in one go. Over time you will need to refine these patterns to compare and reconcile the data with patterns that put a smaller compute load on the System of Capture, run quicker or cost less to execute than your brute force patterns.  You will start using hashdiffs, row and column counts, and checksums.

You will also need to compare the multiple output data tables you have created which hold similar data; hopefully you have already done the testing task and can reuse that instead.
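As a rough sketch, the brute force version of this often starts as little more than a list of paired queries: a count or checksum on the System of Capture and the matching count or checksum on the data you produced.  The table names and the run_source/run_target callables are assumptions for the example.

```python
# Minimal reconciliation sketch: compare row counts and a simple checksum
# between the System of Capture and the tables you produced. The table names
# are made up; run_source and run_target stand in for whatever clients query
# each system.
CHECKS = [
    # (source sql, target sql, description)
    ("select count(*) from crm.customers",
     "select count(*) from analytics.customers",
     "customer row count"),
    ("select sum(order_total) from erp.orders",
     "select sum(order_total) from analytics.orders",
     "order total checksum"),
]

def reconcile(run_source, run_target) -> None:
    for source_sql, target_sql, description in CHECKS:
        source_value = run_source(source_sql)
        target_value = run_target(target_sql)
        status = "OK" if source_value == target_value else "MISMATCH"
        print(f"{description}: source={source_value} target={target_value} {status}")
```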

Data Collection

You started off by manually extracting the data you needed to get started from the System of Capture, or the person asking for the information may have sent you a one off csv file with the data they had manually extracted.  You wrote your code against that manual data as it was just a one off task.  Your stakeholder/customer got value from the information you provided and so asked you to rerun your code with up to date data from the System of Capture; again you either manually extract the data yourself or get the latest data sent to you.

After a couple of iterations of doing these manual data collection tasks you are asked to automate it as a regular process, and this includes the automated collection of the data from the System of Capture.

Or the person that extracts the file manually for you will leave the organisation and you will be asked to automate the collection of the data.

And that will introduce you to the Data Collection task to be done.

You will need to find a technology pattern that will allow you to connect and extract the data from the System of Capture, and that technology will be highly dependent on the technology being used by the System of Capture.  The technology you can use to automatically collect data from your SaaS based Shopify system may not be able to be used to collect data from your cloud hosted Oracle database that sits under your bespoke operational system.  And neither of those technologies may be able to collect data in near realtime from the message queues you use as part of your customer facing web app.

You will then need to make sure your chosen data collection technology can be integrated with the logging, monitoring and alerting tasks you have already completed.

The automated data collection pattern will also make you think about adding a new data layer in your data architecture to “stage” the collected data, if you haven’t added that data layer already.

If you can’t programmatically access the System of Capture to collect the data then you will have to create a file drop pattern that automates as much of the process to land and run your code for the manually collected files as you can.  This includes letting the consumer upload the file themselves, validating that the file is in the format and schema you expected, automating the running of the code when the file is uploaded, stopping duplicate files being processed, and ensuring all your logging, monitoring and alerting tasks work with manually uploaded files.
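A minimal sketch of the validation part of that file drop pattern is below: check the uploaded CSV has the columns you expect and has not already been processed before any load code runs.  The expected columns and the processed-file register are assumptions for the example.

```python
# Minimal file-drop validation sketch: reject an uploaded CSV if its columns
# are not what we expect or if an identical file has already been processed.
# The column list and the register file are assumptions for illustration.
import csv
import hashlib
from pathlib import Path

EXPECTED_COLUMNS = ["customer_id", "customer_name", "order_total"]
PROCESSED_REGISTER = Path("processed_files.txt")

def file_fingerprint(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def validate_upload(path: Path) -> bool:
    with open(path, newline="") as f:
        header = next(csv.reader(f), [])
    if header != EXPECTED_COLUMNS:
        print(f"rejected {path.name}: unexpected columns {header}")
        return False
    fingerprint = file_fingerprint(path)
    seen = PROCESSED_REGISTER.read_text().split() if PROCESSED_REGISTER.exists() else []
    if fingerprint in seen:
        print(f"rejected {path.name}: duplicate of a file already processed")
        return False
    with PROCESSED_REGISTER.open("a") as register:
        register.write(fingerprint + "\n")
    return True
```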

Data Mutation

Once you have automated the data collection, at some time in the near future, the group who maintains the System of Capture will make a breaking change.  It might be a group of internal software engineers who drop a column they don’t need any more, or an enterprise software vendor who upgrades your internal enterprise software which results in a migration of the data to a schema that bears no resemblance to the old version, or a SaaS application where a field or API that used to only return numeric values now returns strings.

Or somebody will change the values that can be stored in a column without making any schema or API changes, and so none of your data tasks will break.  However the information being consumed will now be wrong due to the logic you were using in your code suddenly becoming invalid.

And that will introduce you to the Data Mutation task to be done.

You will start off by focussing on the task of detecting schema mutation.  You will implement the schema validation pattern where you check, directly before or after you collect the data, that the schema is what you were expecting, and add failed checks to your automated alerts.  Each time it breaks you will spend the day working out what the impact of the break was and manually resolving it.
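A minimal sketch of that schema validation check is below: compare the columns that arrived against the columns you expect and treat any difference as a Data Mutation alert.  The expected schema and the shape of the current schema are assumptions for the example.

```python
# Minimal schema validation sketch: compare the schema that arrived against
# the schema we expect and report dropped columns, new columns and type
# changes. The expected schema is an assumption for illustration.
EXPECTED_SCHEMA = {
    "customer_id": "integer",
    "customer_name": "string",
    "customer_type": "string",
}

def check_schema(current_schema: dict) -> list:
    problems = []
    for column, expected_type in EXPECTED_SCHEMA.items():
        if column not in current_schema:
            problems.append(f"column dropped: {column}")
        elif current_schema[column] != expected_type:
            problems.append(
                f"type changed: {column} {expected_type} -> {current_schema[column]}")
    for column in current_schema:
        if column not in EXPECTED_SCHEMA:
            problems.append(f"new column appeared: {column}")
    return problems

# any problems found would be pushed through the alerting task
print(check_schema({"customer_id": "string", "customer_name": "string"}))
```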

You will create new tests to validate the results of the data after your code has run to ensure the logic in your code is still valid.

You will then try to find and implement patterns that enable automated self healing of Data Mutations as a result of changes in the System of Capture, so you don’t lose whole days manually rectifying them when they happen.

Change Data Capture and Historical Data Storage

You started out just storing the current view of the data from the System of Capture.  Most stakeholders/customers start out just wanting a view of the data “as at now”.  Over time, edge cases will appear that require you to store changed data or a historical view of that data.

The System of Capture may not store historical changes, and somebody will want to know what was the person’s last name “as at” February last year.  Last names don’t change often but they do change, for example when somebody gets married.

Or you may be in a highly regulated industry and you will need to store an auditable record of any data used to make decisions.

And that will introduce you to the Change Data Capture and Historical Data Storage tasks to be done.

You will need to find a technology pattern that helps you detect changes in the System of Capture and that technology will be highly dependent on the technology being used by the System of Capture.  The technology which automatically detects and gives you change records from your SaaS based Shopify system won’t be able to be used to detect and send changes from your cloud hosted Oracle database that sits under your bespoke operational system.  And neither of those technologies will be able to detect and send changes in near realtime from your message queues that you use as part of your customer facing web app.

Once you have found all the technology patterns you need to achieve the change data detection and capture task, next you will need to decide how you store these changes in your data platform.  This task will require you to pick a data modeling pattern: do you adopt the source specific change tables aka Permanent Staging Area (PSA), or Data Vault, or Anchor, or Third Normal Form (3NF), or Dimensional, or Activity Schema patterns?  And these data model patterns will be influenced by your data repository technology, as you will need to ensure that it can handle the inserts/upserts/windowing and join or no join requirements the data modeling pattern requires.

The change data capture pattern will also make you think about adding a new data layer in your data architecture to “stage” the change data, if you haven’t added that data layer already.

Lastly you will need to find patterns that make this historical change data consumable.  Do you maintain two sets of consumable data, one “as at now” and one “as at any point in time”?  Or do you use a windowing function to allow a single set of consumable data to be used, with the “as at” pointer then controlled in the Last Mile Information App?
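To make the “as at” idea concrete, here is a minimal sketch over a PSA-style change table: every change is stored with the time it was loaded, and a point-in-time lookup just picks the latest row at or before the requested date.  The customer records are made-up example data.

```python
# Minimal "as at" sketch: store every change with a loaded_at timestamp and
# pick the latest row at or before the requested point in time. The customer
# change records below are made-up example data.
from datetime import date

CUSTOMER_CHANGES = [
    {"customer_id": 1, "last_name": "Jones", "loaded_at": date(2021, 6, 1)},
    {"customer_id": 1, "last_name": "Smith", "loaded_at": date(2022, 3, 15)},
]

def as_at(customer_id: int, point_in_time: date):
    history = [
        row for row in CUSTOMER_CHANGES
        if row["customer_id"] == customer_id and row["loaded_at"] <= point_in_time
    ]
    return max(history, key=lambda row: row["loaded_at"], default=None)

print(as_at(1, date(2022, 2, 1)))  # last name "as at" February last year -> Jones
print(as_at(1, date.today()))      # last name "as at" now -> Smith
```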

Hopefully the patterns you implemented for the data collection tasks made this task easy and didn’t require you to redevelop a bunch of your previous patterns.

Fine grained security and Data Masking

As the information you have delivered gets used more and more and the audience who access it expands from a small group of users to the majority of people in your organisation (typically via a self service visualisation tool), you will be asked to hide certain data from certain groups of people.  

And that will introduce you to the Fine Grained Security or Data Masking tasks to be done.

You will first be asked to hide big chunks of data, from big chunks of the information consumers, for example Salary numbers from everybody outside the Human Resources group.  You will apply a pattern where you either hide the Salary column from a group of users, or you will create two consumable versions of the data, one with Salary data and one without.  Hopefully you have already integrated some sort of Single Sign-On in your Last Mile Dashboarding tool and this is integrated with your data repository  so you can identify the user who is accessing the Dashboard and automatically hide or show this data.

You will then be asked to add more people groups and the membership of each group will be smaller.  You will be asked to hide more sets of data and so will create data groups.  

You will start to think about mapping people to roles and using roles to grant access to the data.  You will also start seeing the people groups match your organisational hierarchy, with names like Human Resources, Finance and Sales.  You will start to see your data groups match the names of your Systems of Capture, with names like Human Resources data, Finance data and Customer data.

Then you will be asked to hide data from a specific list of people groups but not hide it from a different list of people groups.  You will also be asked for a person to be in multiple people groups, but not all people groups.  

You will know you are at this level of complexity when you end up with an Excel pivot table that you use to understand the large number of mappings of the people groups to the data groups you need to manage.  

You will then move on to reimplementing the fine grained security patterns using a policy based pattern rather than group/data mapping.

At some time during this process you will be asked to stop using a pattern which is based on “hiding data”.  It might be that you are hiding rows of data that are flagged as “sensitive”, for example sensitive claims.  The problem is that when the consumer looks at the count of claims those sensitive claims are missing.  Or you might be asked to let consumers know that the Customer Name field exists but just not show some of the consumers the name itself. You might even be asked to show Customer Names to all consumers unless it relates to a Sensitive Claim.

You will implement a data masking pattern where you replace the sensitive data with placeholder data, perhaps the word “SENSITIVE”, or mask the data itself, for example “#### #####” for John Smith.
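A minimal sketch of that masking pattern is below; the policy rules, group names and record layout are assumptions made up for the example.

```python
# Minimal data masking sketch: replace sensitive values with either the word
# "SENSITIVE" or a character mask, depending on a simple policy and who is
# looking. The group names and the policy itself are illustrative only.
import re

def mask_value(value: str) -> str:
    # "John Smith" -> "#### #####": keep the spaces, hide every other character
    return re.sub(r"\S", "#", value)

def apply_masking(record: dict, viewer_groups: set) -> dict:
    masked = dict(record)
    if record.get("sensitive_claim") and "claims_team" not in viewer_groups:
        masked["customer_name"] = "SENSITIVE"
    elif "human_resources" not in viewer_groups:
        masked["customer_name"] = mask_value(record["customer_name"])
    return masked

record = {"customer_name": "John Smith", "sensitive_claim": False}
print(apply_masking(record, viewer_groups={"finance"}))  # name becomes "#### #####"
```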

You might decide to combine the data masking pattern with the policy based pattern you implemented earlier.

When the data governance people see the beauty of your data masking pattern they will ask you to implement it in the Non Production data environments so the engineers can work with the Sensitive data but not see it.  This will force you to adopt a dynamic data masking approach which allows the testing and logging tasks to still run, all while not being able to see the actual data itself.

Data Modeling

Starting out small, creating code that quickly provides information to your stakeholder/customer which helps them realise some value is an agile way of working, and we highly recommend this way of working.  However over time this code stacks up and commonly causes a few problems; we liken it to building a Jenga block tower. Eventually the tower gets so tall it becomes unbalanced and crashes in a pile of blocks.

One of the areas we see this happen is when there is no data modelling undertaken.  Analysts (and to be fair even some engineers) just jump into writing code without planning the data structures they are going to create.

Over time you will find the Customer record lives in multiple different tables, nobody will know which one to trust or use.  When a System of Capture gets changed nobody will know which one has been updated and is still valid for use and which is now out of date. Often the column in the table won’t be named something logical like “Customer” so you will have to look at the data in the column to work out what it holds. 

Even if you really are planning on creating new disposable blobs of code and data every time, you will enter into conversations with the producers and the consumers of the data who will want to understand how you have modeled and stored the data. 

And that will introduce you to the Data Modeling task to be done.

You will start off using Microsoft Excel / PowerPoint / Visio to design the core business concepts (Customers, Employees, Suppliers, Products, Orders, Payments etc) in your organisation, or you will try to reverse engineer the data model from your System of Capture to determine these.

You might use a data modeling tool to create visual Entity Relationship Diagrams (ERDs) of the data models, or go the whole hog and try to model the entire enterprise in one go using a tool like EA Spark.

You might take it one step further and try to automate the creation of the physical data model in your data repository from the conceptual / logical models you have drawn in the ERD tool.
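A minimal sketch of that automation step might describe the core concepts as simple config and generate the physical tables from it.  The concepts, columns and types are examples only, and the generated SQL would need to match your data repository’s dialect.

```python
# Minimal sketch of generating the physical model from a logical one: core
# business concepts are described as config and CREATE TABLE statements are
# generated from that config. Names and types are illustrative only.
LOGICAL_MODEL = {
    "customer": {"customer_id": "int64", "customer_name": "string"},
    "order": {"order_id": "int64", "customer_id": "int64", "order_total": "numeric"},
}

def generate_ddl(model: dict) -> str:
    statements = []
    for concept, columns in model.items():
        column_sql = ",\n  ".join(f"{name} {col_type}" for name, col_type in columns.items())
        statements.append(f"create table if not exists {concept} (\n  {column_sql}\n);")
    return "\n\n".join(statements)

print(generate_ddl(LOGICAL_MODEL))
```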

Lineage

After you have spent a while “just writing a few lines of code to get the task done”, each blob of code has added value to your stakeholders/customers, and because it now has value you can’t delete it.  You have also reused some of your previously created data as reusing it just made sense.  You will now have a bunch of code and data that is reliant on each other.

You will get asked to make a simple change to the logic for a small part of that data and you will make a simple change to one of those blobs of code.  You will awake in the morning to discover your monitoring kicked in and you are alerted of a failure when your scheduled code ran last night.

You look at the results of your versioning task and review what you changed, and you check the results of your documentation task to see why you made that change.  But none of that tells you what was broken by the change you made.  Ideally your testing task would have had regression tests already defined in config that would have alerted you that this change was breaking a dependency elsewhere, but alas you haven’t completed the testing task yet or you haven’t managed to add regression test patterns to the testing config.

And that will introduce you to the Lineage task to be done.

You will craft some code that creates a visualisation of the dependencies across your code and data.

You might parse the execution logs to build out these dependencies, or you might parse your code to identify them.  If you have completed the versioning task using git this might help as your code is all in one accessible place.  You might put hints in your code and parse those to identify the dependencies.  
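As a rough sketch of the parse-your-code option, the snippet below scans a folder of SQL files and pulls out which tables each one reads from and writes to using naive regexes.  Real SQL is messier than this; the folder layout and naming are assumptions for the example.

```python
# Rough lineage sketch: use naive regexes to find the tables each SQL file
# reads from and writes to. This is illustration only; a real implementation
# needs proper SQL parsing.
import re
from pathlib import Path

READ_PATTERN = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)
WRITE_PATTERN = re.compile(r"\bcreate\s+(?:or\s+replace\s+)?table\s+([\w.]+)", re.IGNORECASE)

def build_lineage(sql_dir: str) -> dict:
    lineage = {}
    for sql_file in Path(sql_dir).glob("*.sql"):
        sql = sql_file.read_text()
        lineage[sql_file.name] = {
            "writes": sorted(set(WRITE_PATTERN.findall(sql))),
            "reads": sorted(set(READ_PATTERN.findall(sql))),
        }
    return lineage

for script, tables in build_lineage("sql").items():
    print(f"{script}: {tables['reads']} -> {tables['writes']}")
```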

You might rewrite your entire code development process to first create config that is then used to autogenerate the execution code.  That config will store the dependencies that let you easily see the lineage.  

You might use some open source code to generate the lineage map each time code is edited or you might reuse some D3 code to create a dynamic lineage map that you can use to explore the lineage.  

At some stage you will have created so much code and data that the lineage map will start to look like a mad person’s knitting, and you will need to add code that allows you to refine what is shown on the lineage map so you can focus on the problem at hand.

Cataloging

After delivering the first bit of information to your stakeholder/customer you will be asked for more.  It might be a variation of what you just delivered or something that requires a different set of data.  So you will bash out a new set of code and data.  

After a while you will have a pile of code and a pile of data and the consumers of this information will keep asking things like “which data holds the customer name again?”, “what data does the feature 2 column hold again?”, and “how did you calculate active customer again?”

Hopefully you have already completed the Documentation tasks so you can easily look up the answers to these questions.  Or you have got good at writing code to query your code or the data that you created.  But eventually you are going to get frustrated and want to help the consumer serve themselves.

And that will introduce you to the Data Catalog task to be done.

You will create code that parses your code and the data to create “metadata” and you will surface this via a self service data catalog.  

You will probably start off by producing a static data catalog that can be easily viewed, it might be a set of html pages you generate each time you add code or code is executed. You might push this content to a wiki, confluence or SharePoint site so you don’t have to build menus, sign-ons etc.  
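A minimal sketch of that static catalog generation is below.  The metadata is hard-coded here; in practice it would come from parsing your code, your platform’s information schema, or the output of your documentation task.

```python
# Minimal static catalog sketch: turn a dictionary of table metadata into a
# single HTML page that can be dropped onto a wiki or shared site. The
# metadata shown is made up for illustration.
TABLE_METADATA = {
    "customers": {
        "description": "One row per active customer.",
        "columns": {
            "customer_id": "Unique customer identifier",
            "customer_name": "Customer full name",
        },
    },
}

def build_catalog_html(metadata: dict) -> str:
    parts = ["<html><body><h1>Data catalog</h1>"]
    for table, info in metadata.items():
        parts.append(f"<h2>{table}</h2><p>{info['description']}</p><ul>")
        for column, description in info["columns"].items():
            parts.append(f"<li><b>{column}</b>: {description}</li>")
        parts.append("</ul>")
    parts.append("</body></html>")
    return "\n".join(parts)

with open("catalog.html", "w") as f:
    f.write(build_catalog_html(TABLE_METADATA))
```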

Eventually, as you get asked to update the “context” or add business descriptions to the catalog, you will be tempted to create code that allows the information consumer to update this information themselves.  Congratulations, you have just jumped from the world of being an analyst to the world of being a software engineer.

If you have already completed the lineage task you will embed the lineage maps into the catalog.  After all, a picture beats a thousand words.

Cost optimisation

After a while you will be a victim of your own success.  You will be providing valuable information on such a cadence that you are running a lot of queries that are using a lot of compute on your cloud analytics database of choice.  And this compute will result in a large bill from said cloud analytics database provider.  Although the cost is outweighed by the value added with this information there is a disconnect between the person using the information and the person paying the bill for the technology which creates it.

And that will introduce you to the Cost Optimisation task to be done.

The first thing you will need to understand is where the costs are being driven from: is it your code which creates and updates the data, or is it something else outside your control?  Perhaps another business group has bought a shiny self service reporting tool and connected it to the valuable data you create without telling you, and now you have hundreds of users slicing and dicing to their heart’s content, generating an expensive query cost with every click of their mouse or push of their finger on their iPad.

Or those cheeky data scientists are using your data as a source for their analytics feature factory and are doing full table scans every night as they recalculate their feature flags or retrain their machine learning models.

Ideally you have already completed the logging & monitoring task and so have a good source of this data to use, ideally with a historical volume of logging data from a reasonable period of time so you can determine trends.  Nothing worse than optimising for that big query that stands out like a dog’s whatsit and then realising it was a one off query that never happens again.

Maybe you only logged & monitored the success of your code and so you have no data to inform you how much executing each blob of code costs.  Or you only logged & monitored the code you created and you can’t see the cost for the consumers using your data.  If that’s the case you will need to change the logging and monitoring code to capture the data you need and wait a little while for the data to turn up.

You might start by manually looking at the logging data to see if you can spot the easy wins, or you might drink your own champagne so to speak and write some blobs of code that transform the logging data into data that you can easily consume in the future to understand your compute and storage costs (hint: if you’re using modern cloud platforms focus on the compute not the storage; storage costs are typically way cheaper than compute costs).  After all, you know you are going to be asked to review and reduce the costs on a regular basis.

Or you might get really excited and draft in the help of some of those data scientist bods (who you expect are the cost culprits) and use linear regression, clustering or even machine learning to identify the cost outliers.  Although the old SQL “group by” isn’t a bad way to get started, typically there will be some quick wins when you first start the process.
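A minimal sketch of the “group by” starting point is below: sum the compute cost in the query log by whichever blob of code (or consumer) ran it.  The log record layout is an assumption; most cloud analytics databases expose something similar in their query history.

```python
# Minimal cost analysis sketch: group the query log by the code blob that ran
# and rank by total compute cost. The log entries are made-up example data.
from collections import defaultdict

QUERY_LOG = [
    {"code_blob": "customer_orders.sql", "compute_cost": 4.20},
    {"code_blob": "customer_orders.sql", "compute_cost": 3.90},
    {"code_blob": "dashboard_adhoc", "compute_cost": 55.00},
]

def cost_by_blob(query_log: list) -> list:
    totals = defaultdict(float)
    for entry in query_log:
        totals[entry["code_blob"]] += entry["compute_cost"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

for blob, total_cost in cost_by_blob(QUERY_LOG):
    print(f"{blob}: {total_cost:.2f}")
```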

Once you have found the code that is causing the high compute costs, that’s just the beginning of the task.  Now you have to figure out how to optimise the code to actually reduce the compute cost.

Single List

At this stage you have created safe versioned code, you are testing the data as the code runs, you are logging and monitoring each time the code runs, you are alerting when there is an issue, and you have modeled the data so you can describe the difference between Customer, Supplier, Employee and Product records.  You might even have completed the Catalog and Lineage tasks so your consumers can self-serve and gain visibility of the output of all these tasks.

And when they do serve themselves they will notice that Customers are stored in multiple tables, and they will ask you where the single list of customers can be found.

And that will introduce you to the Single List task to be done.

Also known as creating a single view, creating a golden record or Master Data Management (MDM).

You will look at your customer data and notice that each table that holds a Customer record has a different identifier for that Customer.  The Customer records from your Order System of Capture use a numeric id for each unique Customer, the Customer record from your financial System of Capture uses the first 8 characters of the Customer’s name combined with a 4 digit numeric, and the Customer data from your Salesforce CRM System of Capture looks to use some kind of hash id, but also weirdly seems to create more than one type of id format for Customers.

First you will model the way you want to store this single list of records.  You will hopefully design it so you can store a single list of Suppliers, Employees, Products etc in the future as you will no doubt be asked to do so.

Then you will work on matching algorithms that allow you to compare lists from two or more different tables and suggest a recommended match.  You will start off with a simple brute force matching pattern, which looks for Customers with the same name, but over time you will extend it out to use other fields, for example address, date of birth (if your Customers are individuals) or supposedly unique identifiers like drivers license id or company number.  Hopefully you will have completed the Data Masking task as you will need to hide these unique identifiers from groups of people. 

You will start applying patterns which cleanse the data to make it easier to match, for example changing Bob to also match with Robert, and parsing addresses into their component parts so you can match each part of the address, for example City.

Then you will introduce a pattern where you can cascade through a set of matching rules to control the sensitivity of the matches, for example if the name and date of birth is an exact match pick that one, else try and add parsed address to the matching rules next.
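A minimal sketch of that cascading pattern is below: try the strictest rule first (name plus date of birth), then fall back to a looser rule (cleansed name plus city).  The cleansing, the rules and the records are heavily simplified for illustration.

```python
# Minimal cascading match sketch: apply match rules from strictest to loosest
# and return the first rule that finds a match. The nickname list, rules and
# records are simplified examples only.
NICKNAMES = {"bob": "robert"}

def cleanse_name(name: str) -> str:
    first, *rest = name.lower().split()
    return " ".join([NICKNAMES.get(first, first), *rest])

MATCH_RULES = [
    ("exact name and date of birth",
     lambda a, b: (a["name"].lower(), a["dob"]) == (b["name"].lower(), b["dob"])),
    ("cleansed name and city",
     lambda a, b: (cleanse_name(a["name"]), a["city"]) == (cleanse_name(b["name"]), b["city"])),
]

def match(candidate: dict, existing_customers: list):
    for rule_name, rule in MATCH_RULES:
        for existing in existing_customers:
            if rule(candidate, existing):
                return existing, rule_name
    return None, "no match"

crm_customers = [{"name": "Robert Smith", "dob": "1980-01-01", "city": "Wellington"}]
print(match({"name": "Bob Smith", "dob": "1980-02-02", "city": "Wellington"}, crm_customers))
```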

Single View

You will have modeled the data and you will have created single lists of things like Customers, Suppliers, Employees, Products etc.  

You have modeled the counts and amounts, the core metrics, the KPIs or the OKRs based on the System of Capture.  You have one Customer Dashboard that shows the number of orders from your Order System of Capture.  You have another Customer Dashboard that shows the total revenue, costs and margin for each Customer from your Financial System of Capture.  And finally you have a third Customer Dashboard that shows the number of interactions a Customer has had with you via your CRM System of Capture.

A stakeholder/customer then asks you how hard it would be to have all this information in a single Customer Dashboard.

And that will introduce you to the Single View task to be done.

Your last mile Dashboarding tool might let you do this quickly by placing each chunk of this information into separate widgets on the same Dashboard page, giving the veneer of a Single View.

If you have already completed the Single List task then depending on the data modeling pattern you used you might relink all the counts and amounts to also link to the Single List records you created.  This would allow you to create a single set of filters on the one Dashboard screen allowing the consumer to filter on a Customer record and see all the metrics being filtered by that record.

Getting the real data tasks done

As you can see, getting data tasks done involves a lot more than just bashing out a few lines of code to get the data into a format you can give to your stakeholder/customer.

Unless of course it really is a one off and then bash away.  But I bet if that information provides value to your stakeholder/customer then the stakeholder/customer will come back for more.  They always do.

Keep making data simply magical

AgileData.io provides both a Software as a Service product and a recommended AgileData Way of Working.  We believe you need both to deliver data in a simply magical way.

A Modern Data Stack underpins the AgileData.io solution.

AgileData.io