When it comes to data analytics, most organisations have historically focused primarily on descriptive and diagnostic capabilities. Descriptive analytics explains what is happening in an IT system and uses analysis levers, including analysing trends, mining patterns, and detecting changes and anomalies. Diagnostic analysis encompasses functions including critical path analysis, bottleneck analysis, fault propagation models, and root-cause analysis to explain why something is happening in the system.
With an increased focus on instrumentation and observability, allied to significant advances in AI, enterprises are now looking beyond simply what happened and why and seeking to apply advanced intelligence to draw valuable predictive insights from data. IT leaders are looking for insights that can inform them about what is likely to happen in the future and how to prepare for it, for example:
- Whether they will meet business service-level agreements (SLAs) on a given day.
- What workloads are likely to look like over an upcoming holiday season.
- Whether their infrastructure is robust enough to meet those projected workloads.
- Which infrastructure elements are likely to be at risk in the near future.
What Is Predictive Analytics?
Predictive analytics is not a new discipline; it has been used in rudimentary form since governments and institutions first started using computers in the 1940s. In today’s IT environment, predictive analytics takes historical data (past trends, patterns, correlations, and associations) and applies statistical algorithms and machine learning techniques to estimate the likelihood of future outcomes. Techniques used for this analysis include regression models, classification trees, neural networks, and queuing models.
How Predictions Are Generated
Prediction algorithms vary widely in sophistication. Some techniques are univariate, observing only a single characteristic or attribute. They look at one time series in isolation, mining changes, trends, and patterns, and then use time series forecasting algorithms to forecast its future values. For example, the response time of a URL can be predicted by analysing the time series of past response time values and using the trends and patterns derived from it to predict future values.
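As a rough illustration of univariate forecasting, the sketch below applies Holt's linear trend method (double exponential smoothing) to a single series of response times. The technique is one common choice rather than one named in this article, and the function name, smoothing parameters, and sample values are all assumptions made for the example.

```python
def holt_forecast(series, horizon, alpha=0.5, beta=0.3):
    """Forecast `horizon` future values with Holt's linear trend method.

    alpha smooths the level and beta smooths the trend; both are
    illustrative defaults, not tuned values.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        # Blend the new observation with the previous one-step forecast.
        level = alpha * value + (1 - alpha) * (level + trend)
        # Update the trend estimate from the change in level.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + step * trend for step in range(1, horizon + 1)]


# Hypothetical response times (ms) for one URL, sampled at a fixed interval.
response_times_ms = [100.0, 110.0, 120.0, 130.0, 140.0]
forecast = holt_forecast(response_times_ms, horizon=3)
print([round(value, 1) for value in forecast])
```

On this perfectly linear toy series the forecast simply continues the upward trend. Real response-time data is noisier and often seasonal, so a production model would typically need seasonality handling and tuned parameters.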
More complex techniques analyse the system holistically. Here, instead of looking at one component or one time series alone, multiple components and their inter-dependencies are analysed to predict the behaviour of the system. In this instance, the response time of a URL can be predicted by modelling the underlying topology of the application server, database, compute, storage, and network components. Predictions are made by creating detailed multivariate models that consider metrics such as workload, application server heap, thread pool utilisation, database query response time, and CPU and memory utilisation, amongst other measures.
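A very small multivariate model can be sketched as an ordinary least-squares regression of response time on several metrics at once. This is a deliberately simplified stand-in for the detailed topology-aware models described above. The three metrics, the synthetic observations, and the helper names are assumptions for illustration only.

```python
def fit_linear_model(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    design = [[1.0] + list(row) for row in rows]  # prepend an intercept column
    n = len(design[0])
    # Build the augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(design, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("metrics are collinear; cannot fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(weights, row):
    """Intercept plus the weighted sum of the metric values."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))


# Synthetic observations: (workload req/s, heap utilisation %, DB query ms).
metrics = [(100, 40, 5), (150, 55, 8), (200, 50, 12),
           (120, 70, 6), (180, 60, 15), (90, 45, 4)]
# Synthetic URL response times (ms) generated from a known linear relationship.
response_ms = [120, 166, 194, 162, 200, 118]

weights = fit_linear_model(metrics, response_ms)
predicted_ms = predict(weights, (160, 50, 10))
print(round(predicted_ms, 1))
```

Because the sample targets were generated from a linear relationship, the fit recovers it almost exactly. Real systems are rarely this well behaved, which is one reason richer models such as neural networks or queuing models are also used.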
Leveraging Predictive Analytics with AIOps
With the evolution of artificial intelligence for IT operations (AIOps), predictive analytics can be leveraged to enable new and more targeted use cases, in addition to workload management, which has been the main focal point historically. Advanced applications include real-time system behaviour predictions, real-time metric predictions, real-time event predictions, and real-time transaction predictions.
Predicting System Behaviour
Predictive analysis paired with AIOps can be a powerful lever for predicting the behaviour of the various business and IT entities within an enterprise IT ecosystem. For instance, it can be used to predict the CPU and memory utilisation of a server, the behaviour of a disk, application, or backup, the request count and utilisation of databases, and the request count and response time of URLs. System behaviour analysis helps enterprises predict future anomalies and events such as node down, service down, filesystem full, network bottlenecks, and application delays with confidence. Informed predictions can give early warnings of these events, leaving enough time to take corrective action.
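To make the early-warning idea concrete, the sketch below estimates how long it will take a filesystem to fill by fitting a straight line to recent usage samples and extrapolating. It assumes evenly spaced hourly samples and roughly linear growth. The function name and the sample data are hypothetical.

```python
def hours_until_full(usage_percent, threshold=100.0):
    """Extrapolate a least-squares line through hourly usage samples.

    Returns the estimated hours from the latest sample until `threshold`
    is reached, or None when usage is flat or shrinking.
    """
    n = len(usage_percent)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = (n - 1) / 2
    mean_u = sum(usage_percent) / n
    slope = (
        sum((t - mean_t) * (u - mean_u) for t, u in enumerate(usage_percent))
        / sum((t - mean_t) ** 2 for t in range(n))
    )
    if slope <= 0:
        return None  # no growth, so no predicted "filesystem full" event
    intercept = mean_u - slope * mean_t
    time_at_threshold = (threshold - intercept) / slope
    # Measure from the most recent sample, never reporting a negative time.
    return max(0.0, time_at_threshold - (n - 1))


# Hypothetical hourly disk usage readings (% of capacity).
disk_usage = [70.0, 72.0, 74.0, 76.0, 78.0]
eta_hours = hours_until_full(disk_usage)
if eta_hours is not None and eta_hours < 24:
    print(f"Early warning: filesystem predicted full in about {eta_hours:.0f} hours")
```

A linear fit is only a first approximation. Usage that grows in bursts or follows a daily cycle would call for a model that captures those patterns.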
Predicting Transactions
Another interesting space for predictive analysis is transaction performance. When a transaction is initiated by a user, it typically must go through several “hops,” each serving a different function and each hosted on a tech stack of entities. Predicting the drops, delays, and jitters in these transactions involves capturing how data and control flow across the various paths, assessing critical paths and bottlenecks across these paths, and then predicting the time taken at each step to assess the total end-to-end transaction time and whether the transaction will succeed. This predictive analysis not only helps in predicting application performance but also points to potential bottlenecks that should be proactively acted upon to prevent delays, drops, or outages.
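One way to sketch the end-to-end calculation is to treat the hops as a dependency graph, take a predicted latency for each hop as given, and find the critical (slowest) path. The hop names, latencies, and SLA value below are hypothetical, and the per-hop predictions are assumed to come from some other model.

```python
from graphlib import TopologicalSorter


def critical_path(hop_latency_ms, depends_on):
    """Return (end_to_end_ms, path) for the slowest chain of hops."""
    finish = {}          # predicted completion time of each hop
    slowest_parent = {}  # upstream hop that determines each hop's start
    for hop in TopologicalSorter(depends_on).static_order():
        parents = depends_on.get(hop, ())
        parent = max(parents, key=finish.__getitem__, default=None)
        start = finish[parent] if parent is not None else 0.0
        finish[hop] = start + hop_latency_ms[hop]
        slowest_parent[hop] = parent
    # Walk back from the hop that finishes last to recover the path.
    end_hop = max(finish, key=finish.get)
    path, hop = [], end_hop
    while hop is not None:
        path.append(hop)
        hop = slowest_parent[hop]
    path.reverse()
    return finish[end_hop], path


# Hypothetical predicted latency (ms) per hop and the hops each one waits for.
latency = {"gateway": 5.0, "auth": 20.0, "inventory": 35.0,
           "order": 15.0, "database": 40.0}
waits_for = {"auth": ["gateway"], "inventory": ["gateway"],
             "order": ["auth", "inventory"], "database": ["order"]}

total_ms, path = critical_path(latency, waits_for)
bottleneck = max(path, key=latency.get)
within_sla = total_ms <= 100.0  # hypothetical 100 ms response-time SLA
print(total_ms, " -> ".join(path), "bottleneck:", bottleneck)
```

The hop with the largest latency on the critical path is the natural candidate for proactive action, since speeding up hops that are off the critical path does not shorten the transaction.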
Predicting Processes
Process prediction is another interesting use case for predictive analytics. Many industries rely on back-office processes to perform end-of-day operations that get the system ready for business before the next working day. These processes are usually defined by a set of batch jobs, precedence relationships, start-trigger conditions, and execution schedules. Predicting them requires predicting which jobs will run today, how long they will run for, and when they will start and end. These predictions can be made from the historical behaviour of batch jobs but also need to be adapted in real time to cater to everyday changes in workload or infrastructure utilisation. This predictive analysis helps assess the impact of any failures and delays, provides early warnings of potential SLA violations, and recommends corrective actions to ensure the timely completion of these processes.
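A minimal sketch of batch schedule prediction follows. It assumes each job's duration can be estimated as the median of its recent runs and that a job starts once its predecessors have finished and its start trigger time has passed. Job names, run histories, and the SLA deadline are hypothetical, and the question of which jobs will run today is not modelled.

```python
from graphlib import TopologicalSorter
from statistics import median


def predict_schedule(history_min, depends_on, earliest_start_min):
    """Predict (start, end), in minutes after midnight, for each batch job."""
    schedule = {}
    for job in TopologicalSorter(depends_on).static_order():
        # A job is ready once every predecessor is predicted to have ended.
        ready = max((schedule[p][1] for p in depends_on.get(job, ())), default=0)
        start = max(ready, earliest_start_min.get(job, 0))
        schedule[job] = (start, start + median(history_min[job]))
    return schedule


# Hypothetical run times (minutes) from previous nights.
history = {"extract": [30, 32, 28], "reconcile": [45, 50, 40], "report": [20, 22, 18]}
# Precedence relationships: each job lists the jobs it must wait for.
precedence = {"reconcile": ["extract"], "report": ["reconcile"]}
# Start trigger: the extract job may not begin before 01:00.
triggers = {"extract": 60}

schedule = predict_schedule(history, precedence, triggers)
sla_deadline_min = 150  # hypothetical deadline of 02:30
at_risk = schedule["report"][1] > sla_deadline_min
print(schedule, "SLA at risk:", at_risk)
```

Here the final job is predicted to finish five minutes after the deadline, which is the kind of early warning that leaves time for corrective action before the process starts to slip.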
Additional Considerations for Implementing Predictive Analytics
When conducting predictive analytics, there are several aspects to consider. Some of these considerations are outlined in the following sections.
Use an Ensemble of Algorithms
One prediction algorithm is often not enough to cater adequately to the variations found in real-world data. There is therefore a need for an ensemble of algorithms with a smart layer on top. This layer needs to select the most suitable algorithm for the data at hand. Because most algorithms work best when tuned with the right set of parameters, the layer should self-tune these parameters based on the data and the use case. It should also self-learn and adapt to changing data and system properties.
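The selection layer can be illustrated with a small sketch that backtests several simple forecasters on a series and picks the one with the lowest one-step-ahead error. The three candidate forecasters and the sample series are assumptions chosen for brevity. A real layer would include richer models and would also search over their parameters.

```python
def naive(history):
    """Repeat the last observed value."""
    return history[-1]


def moving_average(history, window=3):
    """Average of the most recent `window` values."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def drift(history):
    """Extend the straight line from the first to the last observation."""
    return history[-1] + (history[-1] - history[0]) / (len(history) - 1)


def backtest_mae(forecaster, series, min_train=3):
    """Mean absolute one-step-ahead error over a rolling origin."""
    errors = [abs(forecaster(series[:t]) - series[t])
              for t in range(min_train, len(series))]
    return sum(errors) / len(errors)


def select_forecaster(candidates, series):
    """Name of the candidate with the lowest backtest error on this series."""
    return min(candidates, key=lambda name: backtest_mae(candidates[name], series))


candidates = {"naive": naive, "moving_average": moving_average, "drift": drift}

# Hypothetical metrics: one steadily growing, one oscillating around a level.
growing = [10, 12, 14, 16, 18, 20, 22]
oscillating = [50, 54, 46, 50, 54, 46, 50, 54, 46]

best_for_growing = select_forecaster(candidates, growing)
best_for_oscillating = select_forecaster(candidates, oscillating)
print(best_for_growing, best_for_oscillating)
```

Different series favour different algorithms, which is the motivation for selecting per series instead of fixing one algorithm globally. Re-running the selection as new data arrives is one simple way to approximate the adaptive behaviour described above.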
Small Data Analysis
Many real-world situations do not occur frequently enough to generate substantial amounts of data. For example, month-end processes run once a month, and high-severity outages happen rarely in any given year. In these situations, there is too little data for big-data algorithms to work well, so an alternative small-data approach is needed. Techniques such as data augmentation and data inversion often turn out to be effective levers in such cases.
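As one possible reading of data augmentation for time series, the sketch below creates synthetic variants of a short series by rescaling it and adding small random noise. The article does not specify which augmentation method is meant, so the choice of jitter plus scaling, the parameter values, and the sample data are all assumptions.

```python
import random


def augment(series, copies, jitter_fraction=0.05, scale_range=(0.9, 1.1), seed=0):
    """Generate synthetic variants of `series` by scaling and jittering.

    jitter_fraction sets the noise level relative to the range of the data;
    a fixed seed keeps the output reproducible.
    """
    rng = random.Random(seed)
    spread = jitter_fraction * ((max(series) - min(series)) or 1.0)
    augmented = []
    for _ in range(copies):
        scale = rng.uniform(*scale_range)  # one magnitude change per copy
        augmented.append([scale * value + rng.gauss(0.0, spread)
                          for value in series])
    return augmented


# Hypothetical month-end job run times (minutes): only six observations exist.
month_end_runtimes_min = [182.0, 175.0, 190.0, 201.0, 178.0, 185.0]
synthetic = augment(month_end_runtimes_min, copies=5)
print(len(synthetic), "synthetic series of length", len(synthetic[0]))
```

Synthetic variants can give a model more examples to train on, but they only reflect variation that resembles the original observations, so they are no substitute for domain knowledge about rare events.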
Adapting Predictions in Real-Time
Predictions based on historical data alone are often good enough when coarse-grained predictions are the goal. For more granular predictions, however, it is important to tailor them in real time by adapting to any unexpected deviation from history. For example, consider a batch process that runs every day. Coarse-grained predictions will predict the start and end time of the process at the start of the day. Fine-grained predictions will update these estimates as the different tasks of the process progress, adjusting for any unexpected failures or delays.
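The difference between the two kinds of prediction can be sketched for a process made of sequential tasks. The function below starts from historical task durations and revises the predicted end time as actual durations arrive. Scaling the remaining tasks by the slowdown observed so far is a simplifying assumption, and the task durations are hypothetical.

```python
def predict_end(start_min, expected_min, actual_min=()):
    """Predict the end time (minutes after midnight) of a sequential process.

    expected_min: historical duration of each task, in execution order.
    actual_min: observed durations of the tasks completed so far.
    """
    done = len(actual_min)
    if done > len(expected_min):
        raise ValueError("more completed tasks than planned tasks")
    elapsed = sum(actual_min)
    remaining = sum(expected_min[done:])
    # Assumption: any slowdown seen so far carries over to the remaining tasks.
    slowdown = elapsed / sum(expected_min[:done]) if done else 1.0
    return start_min + elapsed + slowdown * remaining


expected = [30, 45, 20]  # hypothetical historical task durations (minutes)

coarse = predict_end(60, expected)                   # made at the start of the day
after_first = predict_end(60, expected, [45])        # first task ran 50% long
after_second = predict_end(60, expected, [45, 45])   # second task ran on time
print(coarse, after_first, after_second)
```

The coarse prediction is fixed at the start of the day, while the fine-grained prediction moves later after the first task overruns and partly recovers once the second task completes on time.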
Ensuring Explainability
Predictions have little business value if they are not correctly interpreted and understood. Therefore, it is particularly important to make predictions explainable to the end-user. Visual and textual evidence should be provided to explain why the underlying AI engine predicted what it predicted. This not only helps in gaining trust, demonstrating value, and increasing AI adoption, but it also helps to drive constructive feedback from the domain experts to adapt and fine-tune predictions based on their tacit knowledge.
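For a simple linear model, textual evidence can be produced by reporting how much each input pushed the prediction above or below its usual level. The sketch below does this with hypothetical weights, baseline values, and observations. It does not apply to more complex models, which generally need dedicated explanation techniques.

```python
def explain(weights, baseline, observed):
    """Rank each metric's contribution to the change from the baseline.

    For a linear model, a metric's contribution is its weight multiplied
    by how far the observed value is from the baseline value.
    """
    contributions = {name: weights[name] * (observed[name] - baseline[name])
                     for name in weights}
    return sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical model weights (ms of response time per unit of each metric).
weights = {"workload_rps": 0.5, "heap_pct": 1.0, "db_query_ms": 2.0}
# Typical values versus the values behind the current prediction.
baseline = {"workload_rps": 100, "heap_pct": 50, "db_query_ms": 5}
observed = {"workload_rps": 120, "heap_pct": 45, "db_query_ms": 15}

ranked = explain(weights, baseline, observed)
total_change_ms = sum(value for _, value in ranked)
print(f"Predicted response time is {total_change_ms:+.1f} ms versus baseline:")
for name, value in ranked:
    print(f"  {name}: {value:+.1f} ms")
```

Listing the largest contributors first gives domain experts something specific to confirm or challenge, which supports the feedback loop described above.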
The ROI of Predictive Analytics
In closing, predictive analytics helps enterprises prepare for the future, providing data-driven intelligence to analyse potential outcomes and make more informed decisions to either prevent or better prepare for potential challenges that lie ahead.
Combining AIOps with real-time predictive analytics in areas such as system behaviour prediction, process prediction, and transaction prediction significantly increases organisational intelligence and delivers tangible benefits. Looking ahead, we can expect organisational prediction policies and capabilities to be expanded from simple workload management processes to wider applications in areas such as enterprise resource planning operations (ERPOps) and business and monitoring (B&M) functions.