A large IT network can generate terabytes of operational data every day – everything from application and system logs through to monitoring data and events. All of that information can help you to improve service quality, lower costs, and optimize capacity. It’s a treasure trove waiting to be unearthed.
However, there’s a problem. We’re talking about huge amounts of data here – more information than humans can comprehend, let alone process. How do you unlock the value of IT operational data, gleaning key insights from a seemingly chaotic ocean of disconnected information?
That’s where IT Operational Analytics (ITOA) comes in. It applies big data principles to IT operational data, extracting, indexing and analyzing information across multiple datasets. These can include wire data, self-reported machine data, synthetic transactions, data from software agents, and other similar sources. Often, the data comes from a myriad of monitoring tools, each with its own siloed view of the world. With ITOA, the goal is to break down these barriers – creating a holistic view of IT environment behavior.
What Problems Does ITOA Address?
Historically, many ITOA tools have focused on predictive analytics based on performance metrics. By profiling network performance, they create a model of normal IT behavior. Then, they look for deviations from this behavior – anomalies that indicate something is starting to go wrong, even if there are no other immediate symptoms. By detecting and reporting these anomalies, ITOA systems give early warning of IT issues, allowing IT operations to take proactive steps to prevent service outages and degradations.
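To make the idea of profiling normal behavior and flagging deviations concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular ITOA product works: the function name `find_anomalies`, the fixed-size baseline window, the three-standard-deviation threshold, and the sample latency values are all assumptions chosen for the example.

```python
from statistics import mean, stdev

def find_anomalies(samples, baseline_size=20, threshold=3.0):
    """Return the indices of samples that deviate from the baseline profile.

    The first `baseline_size` samples are treated as "normal" behavior.
    Later samples are flagged when they sit more than `threshold`
    standard deviations away from the baseline mean.
    """
    baseline = samples[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline)
    anomalies = []
    for i in range(baseline_size, len(samples)):
        # Skip the check if the baseline has no variation at all.
        if sigma > 0 and abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical response times in milliseconds: 20 normal readings,
# followed by four new readings, one of which is far out of range.
latencies = [100, 102, 98, 101, 99] * 4 + [100, 103, 250, 99]
print(find_anomalies(latencies))  # [22]
```

Real tools typically use much richer models (seasonality, multiple metrics, adaptive baselines), but the principle is the same: learn what normal looks like, then report what doesn't fit.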
Of course, this is only one instance of applying big data analytics to IT operational data. There are other use cases – for example, assessing the risk of changes, or looking for discrepancies between production and disaster recovery environments. However, performance-based predictive analytics remains a predominant use case, along with log analysis for IT security.
How Is ITOA Evolving?
While this type of predictive analytics does give warning of potential issues, it also has its limitations. It takes an outside-in approach, looking for performance anomalies that could lead to service issues. However, because it is primarily a historical analysis tool, services can already be impacted before the anomaly is discovered. It also doesn’t tell you what is wrong – just that something isn’t right. Uncovering the underlying issue still takes considerable time and effort, creating significant delay and risk.
What Is the Difference Between ITOA and AIOps?
The good news is that ITOA isn’t standing still. Most recently, we’ve seen momentum building around AIOps, which can be thought of as the next generation of ITOA. Like ITOA, AIOps analyzes IT operational data, but it does this by applying artificial intelligence, rather than simply relying on traditional big data analytics techniques. It also analyzes a broader set of data, including event data and ITSM information. And it does this in real time, resulting in immediately actionable insights.
What Are the Benefits?
First, because AIOps adds event data, it can now pinpoint the reason for service and infrastructure issues. Unlike earlier ITOA approaches, IT operations now gets immediate information about why a service is likely to experience – or is already experiencing – an issue. That dramatically reduces MTTR, since much of the manual investigation work is replaced by automated diagnostic processes.
Second, because AIOps applies machine learning to IT operational data, it can identify historical patterns that have led to service issues in the past. When it sees these patterns start to emerge again – for example, a sequence of minor events – it can predict with higher levels of confidence that the issue is about to happen again. Operational staff now have hours to act – so they can fix the problem before service and customers are affected. And AIOps discovers these patterns automatically using machine learning, so there’s no need to tell it what to look for – unlike with legacy ITOA approaches.
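As a rough illustration of learning which event sequences tend to precede an issue, the sketch below uses simple frequency counting as a stand-in for the machine learning a real AIOps platform would apply. Everything in it is hypothetical: the function names, the event types, the choice of consecutive event pairs as the "pattern", and the tiny labeled history.

```python
from collections import Counter

def bigrams(events):
    """Ordered pairs of consecutive event types in a window."""
    return list(zip(events, events[1:]))

def learn_precursors(history):
    """Learn how often each event pair was followed by an incident.

    `history` is a list of (event_window, led_to_incident) pairs.
    Returns {pair: fraction of windows containing it that led to an incident}.
    """
    seen = Counter()
    bad = Counter()
    for events, led_to_incident in history:
        # Count each pair once per window, however often it repeats.
        for pair in set(bigrams(events)):
            seen[pair] += 1
            if led_to_incident:
                bad[pair] += 1
    return {pair: bad[pair] / seen[pair] for pair in seen}

def incident_risk(events, precursors):
    """Highest learned risk among the pairs in a live window (0.0 if none are known)."""
    return max((precursors.get(p, 0.0) for p in bigrams(events)), default=0.0)

# Hypothetical history of event windows and whether an incident followed.
history = [
    (["disk_warn", "queue_growth", "timeout"], True),
    (["disk_warn", "queue_growth"], True),
    (["login", "disk_warn", "logout"], False),
    (["queue_growth", "login"], False),
]
precursors = learn_precursors(history)

print(incident_risk(["login", "disk_warn", "queue_growth"], precursors))  # 1.0
print(incident_risk(["login", "logout"], precursors))                     # 0.0
```

Note that nobody told the code to watch for `disk_warn` followed by `queue_growth`; that pairing emerged from the history. That is the point the paragraph above makes, shown here in miniature.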
Third, because AIOps also applies machine learning to ITSM information, it can enrich events with meaningful business and operational information. With access to the CMDB, it can correlate events more quickly and accurately. Using incident data, it can rank events by learning which ones typically result in major incidents – and even learn how these incidents were diagnosed and resolved in the past. This allows AIOps to more precisely identify the root cause of service issues, and to make remediation recommendations.
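The enrichment and ranking described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a real CMDB or ITSM integration: the `CMDB` and `INCIDENT_HISTORY` dictionaries, the configuration item names, the event types, and the counts are all made up for the example.

```python
# Hypothetical CMDB extract: configuration item -> business service it supports.
CMDB = {
    "db-01": "online-payments",
    "web-07": "online-payments",
    "print-03": "office-printing",
}

# Hypothetical incident history:
# event type -> (times seen, times it led to a major incident).
INCIDENT_HISTORY = {
    "disk_full": (10, 6),
    "high_latency": (20, 4),
    "paper_low": (50, 0),
}

def enrich(event):
    """Attach the business service and a learned major-incident rate to a raw event."""
    seen, major = INCIDENT_HISTORY.get(event["type"], (0, 0))
    return {
        **event,
        "service": CMDB.get(event["ci"], "unknown"),
        "major_incident_rate": major / seen if seen else 0.0,
    }

def rank(events):
    """Enrich events and sort them so the historically riskiest come first."""
    return sorted(
        (enrich(e) for e in events),
        key=lambda e: e["major_incident_rate"],
        reverse=True,
    )

events = [
    {"ci": "print-03", "type": "paper_low"},
    {"ci": "web-07", "type": "high_latency"},
    {"ci": "db-01", "type": "disk_full"},
]
for e in rank(events):
    print(e["service"], e["type"], e["major_incident_rate"])
```

In this toy data, the `disk_full` event on `db-01` rises to the top because that event type has most often led to a major incident, and the CMDB lookup shows that it and the `web-07` latency event affect the same business service.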
To discover more about ITOA and AIOps, and how they help you to deliver radically better service quality at a significantly lower cost, contact us for a demo.