A major bank in Amsterdam has been investing for many years in the automation and harmonisation of its Continuous Delivery activities to ensure a smooth production process. The reliability of systems and applications has always been a central objective of those investments: customers must be able to manage their banking affairs at any time and from any place. Reliability is closely tied to good monitoring of the bank’s systems and applications, a responsibility shared by around 300 DevOps teams. Devoteam was tasked with setting up and professionalising that monitoring and with training staff in the Agile, Scrum, Kanban and DevOps working methods.
In a DevOps culture, teams are expected to enjoy a high degree of freedom and independence. That increases their innovative capability and promotes a smooth production process. The downside is a proliferation of solutions to problems that the teams should not have to deal with themselves, including monitoring. This was addressed partly by making integrated standard tools, backed by professional support, as attractive as possible to the teams. The teams’ freedom to innovate was not seen as a problem. Instead it was put to use by allowing multiple solutions, provided they were tested against the criteria of the IT architecture (including by means of certification). Those criteria included whether monitoring data would be shared with other teams and the Master Control Room.
Maturity model
A key challenge lay in the wide disparity in monitoring knowledge among the teams. Some were already far advanced, whereas others were at the start of their learning curve. Moreover, Devoteam wanted to involve all teams in the process to ensure broad support for the chosen approach. That knowledge therefore had to be shared and aligned first. Each application team had to learn how best to monitor its own applications and which tools could be used for that purpose within the overall parameters of the bank’s IT architecture. In a series of interactive workshops, Devoteam and the teams jointly developed a coherent Maturity Model for Monitoring. The model has five maturity levels, ranging from infrastructure monitoring to business value chain monitoring. Because the teams were so closely involved, the guidelines were developed from the bottom up rather than imposed from the top down. That generated broad support and rapid acceptance of the new working methods in the teams.
Awareness
Devoteam also deployed a variety of communication tools to raise awareness among all teams and stimulate knowledge-sharing. These included a glossy booklet and a ‘training primer’. The booklet, which is also available online, explains what monitoring is and how the tools can be used at each maturity level. The primer is an e-learning module that uses multiple-choice questions and practical cases to guide the bank’s DevOps engineers through the technical jargon used in monitoring. The ultimate aim is to enable the teams to set up their own monitoring with the right tools. Manuals were also written and bootcamps organised to involve the entire community of many hundreds of professionals in the processes surrounding monitoring. In 2015 the focus was on the first three maturity levels: infrastructure components, application components and the application itself.
In 2016 more attention will be devoted to level IV: monitoring of the entire IT value chain. Support will be provided through a large number of open social channels, keeping process minutiae to a minimum so that the self-reliant teams get effective help. For simple questions, the teams can turn to the online monitoring community, while more difficult questions can be raised during the weekly support sessions. Devoteam has also set up monitoring certifications, in which various key capabilities are tested and professionals are awarded M-Level certificates.
Further increase in reliability
The improved quality of application monitoring has further increased the reliability of the systems and the harmonisation of the processes and tools in the bank’s IT. What is more, outside office hours monitoring can now be handed over more easily from the teams to the Master Control Room, where the applications are monitored overnight. That means monitoring is assured 24/7 and problems can be solved before they cause actual damage.