Within any given enterprise, automation workloads are relied upon for many business-critical services. Optimizing these workloads, and keeping them optimized, is vital. Without advanced analytics for workload automation, this task is challenging if not impossible.
Today, automation workloads run in environments spanning from the mainframe to microservices, from on-premises data centers to the cloud. As the number and locations of workloads continue to expand, so do the tools and interdependencies. While there are now more integrations and more tools, there’s less visibility. Many organizations have multiple workload automation platforms, ranging from on-premises schedulers to tools from cloud and ERP providers. The result is fragmented views that leave teams struggling with suboptimal workloads and supporting infrastructures.
One example is insurance companies with subsidiaries requiring customer policy management driven by complex back-end analysis. Rapidly changing events can impact claim volumes and drive multi-tiered, real-time processes that require responsive process coordination. Valuation reserve risk analysis and agency enablement require low-latency, dynamic workloads that span multiple lines of business and generate large amounts of data.
Without deep historical monitoring data and the ability to visualize the end-to-end automated process, it is hard to gain a clear picture of the environment and optimize it. Jobs are viewed independently, sometimes only at the object level, without insight into dependencies and business context. For the example above, business users need to view real-time, risk-to-quote process flows to help drive accuracy and responsiveness. These complex flows have often evolved in disparate sources or tools, so providing cross-silo visibility becomes a critical enabler.
In this new era of workload complexity, it is more important than ever to have the right tools in place, so teams have the insights and control they need to ensure workloads, and the environments they run in, remain optimized. The following sections cover the six key capabilities teams need to optimize workloads effectively.
Required Workload Optimization Capabilities
Today's automation solutions require intelligent analytics to provide visibility across workloads, regardless of whether they are running on premises, in the cloud, or across a mix of environments. These solutions need to address several key requirements.
1. Consistent monitoring and collection of data
Consistent monitoring is an essential building block of effective workload optimization. Without a solution that consistently monitors and calculates key performance metrics, teams will be operating in the dark, lacking the insights needed to inform optimization efforts.
2. Long-term historical data archive
Amassing historical workload data is vital. Only with a long-term repository of workload event data can operations groups accurately analyze workload performance, spot trends, and reduce risk. These groups need visibility into scheduler component issues as well as application-specific inefficiencies. Users also need a tool that provides an efficient, easy-to-use analysis and reporting engine.
3. Business process visualization and SLAs
It is essential to achieve end-to-end visibility of business processes. To do so, users must begin with a specific job, and then establish a chronological view of the entire business process, including all of that job’s associations and dependencies. In addition, job linkages and event information associated with SLAs should also be captured.
Once teams aggregate a composite history of multiple runs, they’ll be able to identify where specific jobs are impeding performance. By measuring SLA performance over time, teams will be well-positioned to spot problems and even identify potential or emerging issues and address them before SLAs are affected.
To achieve these objectives, users need solutions that can provide high-level SLA views, while also enabling them to drill down and pinpoint which jobs contributed to delays. Teams need access to intuitive dashboards and charts that offer visibility into how current performance compares with an SLA’s thresholds, along with its historic and forecasted performance.
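As a rough illustration of the drill-down described above, the following Python sketch compares job durations in runs that missed an SLA against runs that met it, and ranks jobs by how much extra time they took. The run history, the job names, the 45-minute threshold, and the assumption that jobs run one after another are all hypothetical; a real process with parallel branches would need a dependency-aware calculation.

```python
from statistics import mean

# Hypothetical run history: each run maps job name -> duration in minutes.
# Jobs are assumed to run sequentially, so a run's elapsed time is the sum.
runs = [
    {"extract": 10, "transform": 20, "load": 5},
    {"extract": 11, "transform": 35, "load": 5},
    {"extract": 10, "transform": 21, "load": 6},
    {"extract": 12, "transform": 40, "load": 5},
]
SLA_MINUTES = 45  # assumed end-to-end SLA threshold


def delay_contributors(runs, sla_minutes):
    """Rank jobs by their average extra duration in SLA-breaching runs."""
    on_time = [r for r in runs if sum(r.values()) <= sla_minutes]
    late = [r for r in runs if sum(r.values()) > sla_minutes]
    if not on_time or not late:
        return []  # nothing to compare against
    deltas = {
        job: mean(r[job] for r in late) - mean(r[job] for r in on_time)
        for job in runs[0]
    }
    # Largest positive delta first: these jobs contributed most to the delay.
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)


ranking = delay_contributors(runs, SLA_MINUTES)
for job, extra in ranking:
    print(f"{job}: {extra:+.1f} min in late runs")
```

With this sample data, the "transform" job stands out as the main contributor to late runs, which is the kind of signal a dashboard drill-down would surface.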
4. Capacity optimization
Infrastructure performance is directly related to workload performance. That’s why resource utilization represents such a key focus area for optimization. All too often, a poorly performing agent or network device can have an instant impact on performance. Further, those issues can spread to other services and jobs.
To guard against these types of problems, workload teams have to pay attention to component latencies, design and operational delays, and impingement on shorter, critical-path durations. With the right insights, they can remove inefficiencies and identify optimization opportunities.
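To make the critical-path idea concrete, here is a minimal Python sketch that finds the longest chain of dependent jobs in a small workflow. The job names, durations, and dependencies are invented for illustration, and the approach (a longest-path pass over a dependency graph) is one common technique rather than the method of any particular scheduler.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: durations in minutes, and each job's predecessors.
durations = {"extract": 10, "validate": 5, "transform": 20, "load": 5}
deps = {
    "extract": set(),
    "validate": {"extract"},
    "transform": {"extract"},
    "load": {"validate", "transform"},
}


def critical_path(durations, deps):
    """Return (jobs on the longest dependency chain, total duration)."""
    finish = {}  # earliest finish time of each job
    prev = {}    # predecessor that determines each job's start time
    for job in TopologicalSorter(deps).static_order():
        # A job starts when its slowest predecessor finishes.
        best = max(deps.get(job, ()), key=lambda d: finish[d], default=None)
        start = finish[best] if best is not None else 0
        finish[job] = start + durations[job]
        prev[job] = best
    # Walk back from the job that finishes last.
    node = max(finish, key=finish.get)
    path = []
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1], max(finish.values())


path, total = critical_path(durations, deps)
print(" -> ".join(path), f"({total} min)")
```

Jobs on the returned path are the ones where latency directly extends the end-to-end duration, while "validate" in this example has slack and is a lower optimization priority.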
5. Trending analytics
Workload teams need a platform that provides real-time alerting for SLA violations, significant variances in process executions, and system delays. Ideally, they should be able to leverage change modeling and simulation capabilities, so they can validate the impact of changes before executing them. When teams have these advanced capabilities, they can more quickly and efficiently identify performance and latency issues. Further, they can objectively measure the impact of any remedial actions they may apply.
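One simple way to define a "significant variance" is to compare the latest run against the historical mean and standard deviation. The Python sketch below does this with a z-score; the function name, the sample durations, and the default threshold of three standard deviations are assumptions for illustration, and production tooling may use more robust statistics.

```python
from statistics import mean, stdev


def is_significant_variance(history, latest, threshold=3.0):
    """Flag a run whose duration is far from the historical mean."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu  # any change from a constant history is notable
    return abs(latest - mu) / sigma > threshold


# Hypothetical durations (minutes) of previous runs of one process.
history = [30.0, 31.0, 29.0, 30.0, 32.0, 28.0]

print(is_significant_variance(history, 31.0))  # within normal variation
print(is_significant_variance(history, 40.0))  # well outside it
```

The same check can be run before and after a remedial change to measure objectively whether run times moved back into the normal range.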
6. System and workload performance analytics
The performance of workload engines can also have a major impact on automation-powered services. Workload engines may exhibit performance trends on an hourly, daily, or even quarterly basis. Being able to track these trends and spot emerging issues is key to ongoing optimization. This requires a repository of long-term workload data.
In addition, given the complex, interdependent nature of workload engines, teams need granular visibility. At the server level, a workload engine may be performing acceptably. However, workload performance may be affected by a connection to a slow server agent that is executing a job. If users can gain visibility into the load on specific agents, they’ll be able to correlate overall slowdowns with a given set of poorly performing servers. By establishing granular monitoring and reporting on scheduler component performance, teams can more quickly identify and address issues.
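The agent-level correlation described above can be sketched in a few lines of Python. Each execution record pairs a job with the agent that ran it, the actual run time, and an expected run time; the record layout, agent names, and the 1.5x threshold are hypothetical and would come from the scheduler's own historical data in practice.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (job, agent, actual_seconds, expected_seconds).
executions = [
    ("billing", "agent-a", 100, 100),
    ("billing", "agent-b", 180, 100),
    ("claims", "agent-a", 55, 50),
    ("claims", "agent-b", 95, 50),
    ("quotes", "agent-c", 20, 20),
]


def slow_agents(executions, ratio_threshold=1.5):
    """Return agents whose jobs run, on average, much slower than expected."""
    ratios = defaultdict(list)
    for _job, agent, actual, expected in executions:
        ratios[agent].append(actual / expected)
    return {
        agent: round(mean(values), 2)
        for agent, values in ratios.items()
        if mean(values) > ratio_threshold
    }


flagged = slow_agents(executions)
print(flagged)
```

Normalizing by expected run time lets different jobs be compared on the same scale, so a slowdown that follows the agent rather than the job becomes visible.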