Tagging is essential to FinOps, but how can I persuade organisations (especially large and complex ones) to implement robust tagging? It is hard to do in practice.
I recommend focusing on the monetary benefits that tagging will bring.
For example, if tagging is used to define the environment (Dev/Prod), focus on the financial reporting and tax implications of Dev versus Prod. In many countries, some dev costs can be claimed back. For reporting, knowing what is prod lets you report costs correctly in the P&L.
You need to put yourself in the shoes of the executive who will spend the money on tagging, and explain the ROI of the initiative.
As you say, I’d definitely point out the benefits, and I’d add reduced remediation times when problems occur. If untagged resources experience service degradation or failure, you lose precious time looking for the contact details of stakeholders and workload owners.
Tagging is really beneficial, but we also need an efficient tagging strategy to maximise the ROI. Engineers often perceive tagging as a tedious task that hinders productivity. Also, excessive use of cost allocation tags will blow up your CUR file, with obvious negative impacts on performance and costs during analysis. I always aim to reduce the “required tagging effort” as much as possible:
- automatic tagging: you can auto-tag resources that belong to certain accounts or Organizational Units, so upon creation only the non-obvious tags need to be set.
- 3rd party tools: tools that automatically allocate costs, even for shared resources, can reduce the number of tags you need.
This way we can persuade managers as well as engineers to implement tagging policies.
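To illustrate the auto-tagging idea above, here is a minimal sketch, assuming tags are modelled as plain dictionaries. The account IDs, tag keys, and the `ACCOUNT_DEFAULT_TAGS` table are hypothetical; a real setup would take the defaults from your own account or OU metadata. It only shows the principle: defaults are inherited, so the engineer sets just the non-obvious tags.

```python
from typing import Dict, List

# Hypothetical defaults per account; not taken from any real organisation.
ACCOUNT_DEFAULT_TAGS: Dict[str, Dict[str, str]] = {
    "111111111111": {"environment": "prod", "cost-center": "cc-1001"},
    "222222222222": {"environment": "dev", "cost-center": "cc-2002"},
}

# Hypothetical list of tags the organisation requires on every resource.
REQUIRED_TAGS: List[str] = ["environment", "cost-center", "owner"]


def effective_tags(account_id: str, resource_tags: Dict[str, str]) -> Dict[str, str]:
    """Merge inherited account defaults with tags set on the resource."""
    merged = dict(ACCOUNT_DEFAULT_TAGS.get(account_id, {}))
    merged.update(resource_tags)  # explicitly set tags override the defaults
    return merged


def missing_tags(account_id: str, resource_tags: Dict[str, str]) -> List[str]:
    """List the required tags still missing after inheritance."""
    tags = effective_tags(account_id, resource_tags)
    return [key for key in REQUIRED_TAGS if key not in tags]


if __name__ == "__main__":
    # The engineer only supplies "owner"; the rest is inherited.
    print(effective_tags("222222222222", {"owner": "team-a"}))
    print(missing_tags("222222222222", {}))
```

With this approach, a resource in the dev account only needs an `owner` tag at creation time; everything else comes from the account.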
I agree that the manual application of tags cannot work.
Also, tags can be operational or financial. Operational tags can be used to perform technical actions on the right resources. For example, you can use tagging to define the frequency of backups or choose which servers to turn off at night.
Financial tags, which are the ones we usually refer to in FinOps, are an evolution of operational tags and help slice and dice the financial data.
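The operational example above (choosing which servers to turn off at night) could look roughly like this. It is a sketch under assumptions: the `night-shutdown` tag key, the `Server` type, and the 20:00 to 06:00 night window are all made up for illustration, and the actual stop call to the cloud provider is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Server:
    """Hypothetical stand-in for a cloud instance and its tags."""
    name: str
    tags: Dict[str, str] = field(default_factory=dict)


def servers_to_stop(servers: List[Server], hour: int) -> List[str]:
    """Return the names of servers tagged for shutdown, but only at night."""
    is_night = hour >= 20 or hour < 6  # assumed night window
    if not is_night:
        return []
    return [
        s.name
        for s in servers
        if s.tags.get("night-shutdown", "").lower() == "true"
    ]


if __name__ == "__main__":
    fleet = [
        Server("dev-1", {"environment": "dev", "night-shutdown": "true"}),
        Server("prod-1", {"environment": "prod"}),
    ]
    print(servers_to_stop(fleet, hour=22))
```

Untagged servers are never selected, so the default is the safe one: nothing is stopped unless someone opted in with the tag.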
I group tags into operational, financial, and administrative, where administrative tags specify ownership and purpose. There is some overlap: some tags will belong to more than one category, e.g., tags for ownership can specify who to contact in case of issues but also be used for cost allocation.
+1 to pointing out the monetary/cost benefits of tagging, especially when presenting the problem to engineers in a shared environment, where allocation becomes very difficult without tagging.
And in most cases, engineers inherit environments that weren’t tagged in the first place. Small environments are easily fixed, but when you have thousands of assets, multiple accounts, multiple clouds, etc., the task becomes difficult. Take tagging one step at a time and show the benefits of having proper tagging in place. Policies that prohibit anyone from deploying or using services that don’t have tagging in place also help.
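A policy that blocks untagged deployments boils down to a check like the one below. This is only a sketch: the required keys and allowed values in `REQUIRED` are hypothetical, and in practice the check would run in whatever enforcement point you use (a deployment pipeline or the cloud vendor's policy mechanism).

```python
from typing import Dict, List, Optional, Set

# Hypothetical policy: required tag keys, with an optional set of allowed values.
REQUIRED: Dict[str, Optional[Set[str]]] = {
    "environment": {"dev", "prod"},
    "owner": None,        # any non-empty value is accepted
    "cost-center": None,  # any non-empty value is accepted
}


def policy_violations(tags: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the deployment may proceed."""
    problems: List[str] = []
    for key, allowed in REQUIRED.items():
        value = tags.get(key, "").strip()
        if not value:
            problems.append(f"missing tag: {key}")
        elif allowed is not None and value.lower() not in allowed:
            problems.append(f"invalid value for {key}: {value}")
    return problems


if __name__ == "__main__":
    request = {"environment": "staging", "owner": "team-a"}
    issues = policy_violations(request)
    print("deploy allowed" if not issues else f"deploy blocked: {issues}")
```

Returning the full list of problems, rather than failing on the first one, lets engineers fix everything in a single pass.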
Interesting, I always assumed that good tagging could only be done when IaC is used.
It would be great if some of that was implemented as a Cloud Vendor service.
Actually, there is a tool by a 3rd party vendor where tagging for cost allocation has become very easy and quick.
Basically, you just tag the main resources, e.g., a VM or an EC2 instance, and the tool analyses the network traffic to quantify the usage of all involved resources. So, if you have three distinct resources for workloads A, B, and C, it will allocate the costs of all distinct and shared resources to those workloads.
The main idea is that nothing happens in the cloud without a triggering event such as an API request or scheduled event. From there you can follow the network transactions and interactions to figure out which resources are interacting with each other. Once you tag a distinct resource, the tool will do the rest.
It can also go much deeper than pure cost allocation, which is already super cool. It can match cost spikes to actual usage, e.g., a cost spike in RDS can be matched to the individual queries responsible for it.
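I don't know how the tool does this internally, but the basic idea of allocating a shared resource by observed traffic can be sketched as a proportional split. The function name, the input shape, and the even-split fallback are my own assumptions for illustration, not the vendor's method.

```python
from typing import Dict


def allocate_shared_cost(
    shared_cost: float, traffic_by_workload: Dict[str, float]
) -> Dict[str, float]:
    """Split the cost of a shared resource in proportion to each workload's traffic."""
    total = sum(traffic_by_workload.values())
    if total <= 0:
        # Assumed fallback: no traffic observed, so split evenly.
        count = len(traffic_by_workload)
        if count == 0:
            return {}
        return {w: shared_cost / count for w in traffic_by_workload}
    return {
        workload: shared_cost * traffic / total
        for workload, traffic in traffic_by_workload.items()
    }


if __name__ == "__main__":
    # Hypothetical traffic (e.g., requests) from workloads A, B, and C
    # to one shared resource that cost 100 in the period.
    print(allocate_shared_cost(100.0, {"A": 50, "B": 30, "C": 20}))
```

Only the workloads' main resources need tags here; the shared resource itself carries none, which is where the reduction in tagging effort comes from.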