Sunday, April 15, 2018

Azure Log Analytics Usage Caps

I am really happy with the state of the OMS solution in Azure.  It is constantly being updated with new technologies.  One thing that has bugged me a bit in the past is that there was no effective way to tell how much data you were using.  Sure, there were some tools available (i.e., using Log Analytics itself to determine usage), but there was no easy way to translate usage into cost, and no way to limit it.

Just recently, the Log Analytics team announced a new feature called the data volume cap.  The idea is that you can set a limit on how much data your workspace will ingest on a particular day.  This is a very important feature, especially after you turn on a service that checks a network connection in a tight loop!

When you navigate to Usage and estimated costs, you now see the following:







The display above breaks down the logs per day and by solution name.  As you can see, my Log Management solution is the main culprit.  This is probably due to all the performance data that I gather on my production databases.

Clicking on Data Volume Management now gives you the following options:



It allows you to set the retention in days (as always) and also to set a daily volume cap.  The cap lets you set a cutoff time, which presumably doesn't line up with how Log Analytics determines daily usage.

The other part of this is the Azure Monitor integration for billing.  You can now visit the Usage and Estimated Cost tab of the Monitor blade to understand estimated monthly costs.  There are currently only a few meters in here, but presumably this will expand.

Under the new pricing model, your Log Analytics usage shows up as Data Ingestion:



I can already see limitations to the current implementation.  It would be really nice to be able to limit particular solutions, or even particular resources within a solution.  I'd also like an easy way to create alerts when the data limit is being reached, and/or to define zones (i.e., a yellow range that is above my tolerance, but where I still want to collect).
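To make the zone idea concrete, here is a minimal Python sketch of what I have in mind.  This is not something the service offers today; the function name, the zone names, and the 80% warning threshold are all my own assumptions.

```python
def classify_usage(ingested_gb: float, cap_gb: float, warn_fraction: float = 0.8) -> str:
    """Classify today's ingestion against a daily cap.

    Hypothetical helper: the zones and the default warning threshold
    are illustrative assumptions, not part of Log Analytics.
    """
    if cap_gb <= 0:
        raise ValueError("cap_gb must be positive")
    if ingested_gb >= cap_gb:
        return "red"     # cap reached: collection would stop here
    if ingested_gb >= warn_fraction * cap_gb:
        return "yellow"  # above my tolerance, but still collecting
    return "green"       # comfortably under the cap
```

An alert rule could then fire on the first transition into yellow, well before the cap actually cuts off ingestion.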


One good example of the above would be to leave the ingestion of Security logs unlimited, but scale back the performance logs when I am approaching the limit.  I'll be sure to pass some of this feedback to the team.
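As a rough sketch of that behaviour, assuming a hypothetical per-solution budget (again, nothing the current cap supports), protected solutions would keep their full volume and everything else would share whatever is left of the daily cap.  The function and parameter names are mine.

```python
def allocate_cap(cap_gb: float, demand_gb: dict, protected: set) -> dict:
    """Split a daily cap across solutions.

    Hypothetical sketch: protected solutions always get their full
    demand; the remaining solutions are scaled down proportionally
    so that they fit in whatever is left of the cap.
    """
    # Protected solutions (e.g. Security) are never throttled.
    allocation = {name: gb for name, gb in demand_gb.items() if name in protected}
    remaining = max(cap_gb - sum(allocation.values()), 0.0)

    others = {name: gb for name, gb in demand_gb.items() if name not in protected}
    total_other = sum(others.values())

    # Only scale back when the unprotected demand exceeds what is left.
    scale = 1.0 if total_other <= remaining else remaining / total_other
    for name, gb in others.items():
        allocation[name] = gb * scale
    return allocation
```

With made-up numbers, a 10 GB cap, 4 GB of Security logs and 12 GB of Log Management data would leave Security untouched and cut Log Management back to 6 GB.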