As IT operations professionals increasingly turn to artificial intelligence (AI) and machine learning to streamline how they maintain system reliability, the field of AIOps is starting to heat up as a skill specialization. AIOps helps to automate the analysis of data streaming from IT monitoring tools and optimize the workflows in IT service management (ITSM) areas such as inquiry management and self-service capabilities.
For operations, ITSM, site reliability engineering (SRE), and AIOps engineering teams to get the most out of new AIOps capabilities, they are going to need to bolster their skills and processes to layer in best practices around new specializations such as data science and process design.
While AIOps teams may need fewer Tier 1 incident responders on hand, they will need more data science gurus to curate data and train algorithms. They'll need to team up with experienced troubleshooters and responders to help design workflows and runbooks, as well as to intervene with complex root-cause analysis. Additionally, teams will need thoughtful planners who can get the most out of AIOps to design resilient architectures that feature automated preventative maintenance and self-healing capabilities.
To keep up, operations professionals should start developing the skills for an AIOps future now. Because it is such a nascent field, there are no AIOps-specific professional organizations or major certifications yet. However, with a little creative digging into resources geared toward related subspecialties such as SRE, network automation, and application performance management, it's possible to find the contacts and learning needed to understand the AIOps tools and capabilities available today, and to build up AIOps skills and practices.
Here are 15 great AIOps resources that IT Ops pros should know about.
AIOps-specific events
SKILup Day: AIOps & MLOps
Date: October 15, 2020
Location: Virtual
Cost: Free
Hosted by the DevOps Institute, this free, one-day virtual summit offers beginners in AIOps a chance to bone up on the brief history of the niche and hear about use cases and benefits of AIOps practices. The virtual event sports an online networking lounge to share ideas with fellow travelers, plus a resource library and relevant videos to come up to speed on AIOps.
AIOps 2020: International Workshop on Artificial Intelligence for IT Operations
Date: December 14-17, 2020
Location: Dubai, United Arab Emirates
Research paper submission deadline: August 16, 2020
Cost: Collocated with ICSOC 2020; ICSOC registration cost €150 to €850 in 2019
Planned as a workshop collocated with the International Conference on Service-Oriented Computing (ICSOC 2020), AIOps 2020 is an academic- and researcher-oriented event focused on the cutting-edge advances made in AIOps around areas such as self-healing, early anomaly, fault and failure (AFF) detection, and root-cause analysis techniques.
AIOps Exchange
Date: The website notes that the event was postponed due to the pandemic, but as of mid-August it still said the hope was to convene in early summer.
Location: TBA
Cost: TBA
2019 was the inaugural year for AIOps Exchange, a one-day forum that features best-practices advice from a handful of enterprise IT Ops professionals. It leans heavily on roundtables to create an intimate, interactive setting for exchanging ideas on the newest techniques in refining SRE strategies using AIOps, supporting DevSecOps, and developing the right culture within teams to effectively use AIOps technology. It is one of the few vendor-neutral AIOps-specific events currently out there, so keep checking for a solid date.
AIOps Expo
Date: February 9-12, 2021
Location: Miami Beach, Florida
Cost: $599 to $3,599
AIOps Expo focuses on a range of areas relating to how AI and machine learning can be used for IT operations, including application performance, network performance, and security. The previous event featured an agenda that ran through in-depth discussions around topics such as AIOps maturity models, how to build teams for AIOps, and blending DevOps and AIOps.
Related events
Interop Digital
Date: October 5-8, 2020
Location: Virtual
Cost: $499+
Interop has gone to a completely virtual format for the fall, and there will be plenty of crossover into AIOps territory across the four days of its programming. Sessions and training are particularly strong in AIOps leadership content such as overcoming cultural hurdles to leverage AIOps and introductory sessions on how AI and automation will serve as the backbone of IT operations in the future.
ONUG Fall 2020
Date: October 14-15, 2020
Locations: New York and online
Cost: Free to $199
Started as the Open Networking User Group, ONUG has evolved into an enterprise leadership group focused on building out the digital enterprise through evolved practices in enterprise cloud, DevSecOps, and automation. The agenda for its fall conference fits right into the AIOps wheelhouse, with content planned around automating cloud governance, cloud-native DevOps, and automating multi-cloud observability, as well as specific AIOps topics.
SRECon20 Americas
Date: December 7-9, 2020
Location: Virtual
Cost: Free
A conference aimed squarely at the site reliability engineers tasked with practicing AIOps, SRECon20 Americas hasn't yet released its agenda, but chances are high that there will be plenty of good research on the use of AI in IT. Last year, the USENIX event featured many talks around automating management of cloud infrastructure, designing resilient data pipelines, using open AIOps tooling to improve observability, and many other topics of interest to AIOps professionals.
Training and courses
SRE Foundation
Cost: $1,595
Date: The next class begins September 22, 2020.
Location: Online
At many organizations today, site reliability engineers and AIOps engineers are one and the same, so it follows that a certification in SRE fundamentals will offer a foundation upon which ops professionals can thrive with AIOps. With accreditation run by the DevOps Institute, the SRE Foundation class by the ITSM Academy is a four-day course that dives deep into the principles of SRE, including teaching and labs around achieving service-level objectives (SLOs), monitoring service-level indicators (SLIs), and the tools and automation techniques for maintaining system reliability.
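For readers new to the terminology, here is a minimal sketch of how an SLI relates to an SLO. It is not drawn from the course materials; the function names, the event counts, and the 99.9% target are all illustrative assumptions.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """SLI: fraction of events (e.g., requests) that met the reliability criterion."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def error_budget_remaining(sli: float, slo: float) -> float:
    """Share of the error budget still unspent (1.0 = untouched, below 0 = overspent)."""
    allowed_failure = 1.0 - slo  # failure rate the SLO tolerates
    if allowed_failure <= 0:
        raise ValueError("slo must be below 1.0")
    actual_failure = 1.0 - sli  # failure rate actually observed
    return 1.0 - actual_failure / allowed_failure


# Hypothetical numbers: 1,000,000 requests, 500 of them failed, SLO target of 99.9%.
sli = availability_sli(good_events=999_500, total_events=1_000_000)
budget = error_budget_remaining(sli, slo=0.999)
print(f"SLI: {sli:.4%}, error budget remaining: {budget:.0%}")
```

In this made-up example the service has used about half of the failures its SLO allows, which is the kind of signal teams often use to decide whether to keep shipping changes or to prioritize reliability work.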
AIOps Essentials
Cost: Free with Linux Academy seven-day trial
Date: On demand
Location: Online
Designed for IT Ops professionals charged with the care and feeding of Kubernetes clusters, this class focuses squarely on bringing AIOps practices to cloud-native environments using the open-source Prometheus event monitoring and alerting platform. The goal is to get pros comfortable enough with the tooling to integrate Prometheus rules with the Kubernetes API server to start scaling nodes and effectively managing a hybrid cloud through machine learning.
AI for Everyone
Cost: $49 for 180 days of certificate eligibility
Date: Ongoing
Location: Online
To start leveraging AI for IT, it helps to truly understand the basics of AI and how it's being applied today. This is a relatively short (six hours) nontechnical course designed to provide executives and managers with a broad overview of AI use cases, applications, and techniques, as well as pointers on how to get started building AI projects and teams.
Python for Data Science and AI
Cost: Free
Date: Ongoing
Location: Online
IT Ops pros seeking to build out a foundation of learning for automation, data science, and AI would be well served to start brushing up on Python. This free beginner Coursera class teaches Python basics, data structures, and programming fundamentals over the course of 22 hours of learning. It can be taken on its own or used as a part of a broader IBM Data Science Professional Certificate.
Books, research, and reports
Cognitive Computing Recipes
Cost: $19.24+
While it isn't specifically an AIOps book, Cognitive Computing Recipes offers a solid survey of the use of deep learning and machine learning for developers and IT pros and features an entire chapter on AIOps as a part of its catalog of real-world use cases. The chapter provides practical information on improving on key reliability metrics such as mean time to detect and mean time to repair through the use of AI.
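As a rough illustration of the two metrics named above (not taken from the book), the sketch below computes mean time to detect and mean time to repair from a handful of made-up incident records. Definitions vary between teams; this sketch assumes detection time is measured from the start of the incident, and repair time from detection to resolution.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (started, detected, resolved)
incidents = [
    (datetime(2020, 8, 1, 10, 0), datetime(2020, 8, 1, 10, 12), datetime(2020, 8, 1, 11, 0)),
    (datetime(2020, 8, 3, 14, 0), datetime(2020, 8, 3, 14, 4), datetime(2020, 8, 3, 14, 34)),
    (datetime(2020, 8, 7, 9, 30), datetime(2020, 8, 7, 9, 38), datetime(2020, 8, 7, 10, 20)),
]


def mean_minutes(durations):
    """Average a sequence of timedelta objects and return the result in minutes."""
    return mean(d.total_seconds() for d in durations) / 60


# Assumption: detect = started -> detected, repair = detected -> resolved.
mttd = mean_minutes(detected - started for started, detected, _ in incidents)
mttr = mean_minutes(resolved - detected for _, detected, resolved in incidents)

print(f"MTTD: {mttd:.1f} min, MTTR: {mttr:.1f} min")
```

Tracking these two numbers before and after introducing AI-assisted detection or remediation is one simple way to check whether the tooling is actually helping.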
Practical Network Automation
Cost: $19.79+
A primer in the fundamentals of network automation, infrastructure as code, and analytics-driven IT ops, this tome provides practical advice in leveraging Python, Ansible, and other tooling to improve network performance, support DevOps practices, and apply continuous integration/continuous delivery principles. The book includes a chapter on the key pillars of AIOps, including information on collecting and managing data, and analyzing it using machine-learning principles.
ACM Digital Library
Cost: $5/article+
The Association for Computing Machinery (ACM) has published some valuable research on AIOps and data-driven performance management in the last year. Among the highlights are broad looks at the latest research innovations in AIOps, as well as in-depth research into topics such as predicting node failures in ultra-large-scale cloud computing platforms, tools for automated log parsing, and the latest techniques in time-series anomaly detection.
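To give a feel for what time-series anomaly detection means in its very simplest form, here is a toy rolling z-score check. It is a deliberately basic baseline, not a technique taken from any of the ACM papers; the function name, window size, threshold, and latency values are all illustrative assumptions.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A flat history gives no scale to compare against; skip in this sketch.
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical response-time samples in milliseconds, with one obvious spike.
latency_ms = [100, 102, 99, 101, 100, 103, 98, 250, 101, 100]
print(rolling_zscore_anomalies(latency_ms))
```

The research in this area generally goes well beyond this kind of fixed threshold, for example by accounting for seasonality and for anomalies that contaminate the history window, which this sketch does not handle.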
AIOps Exchange Report
Cost: Free
Last year's inaugural AIOps Exchange event put together a survey and this report, which offers some interesting statistics on the drivers for AIOps adoption, the barriers to getting the most out of AI for IT, and the value delivered by AIOps. It's a quick read but provides some talking points for professionals and executives taking the first steps toward exploring AIOps.