Overview:
We are looking for a proactive, self-motivated and personable individual who will be responsible for the consistent and timely delivery of Incident and Problem Management in line with best practices. You will work with internal stakeholders, customers, third-party vendors and internal business units. You will own the end-to-end process to ensure all incidents and problems meet the agreed SLAs.
In this exciting role you will be the central point for conflict and issue escalation to senior management, and you will compile reports on incidents and problems. You will analyse and report on patterns and trends to improve future service delivery and reduce major incidents. You will then take this a step further by ensuring that appropriate action is taken to anticipate, investigate and resolve any problems in systems and services, and that these are fully documented. This can be done through regular audits, reviews and assessments.
The role is responsible for managing production incidents and outage events, as well as managing problems within the Group Technology division. You will provide leadership and coordination across infrastructure, application and partner teams to quickly remediate production issues and reduce mean time to resolution. You will also push for active problem records to be addressed and managed effectively, so that root causes are identified quickly and a plan to eliminate them is clearly defined as part of the problem management processes with the Technology Operation and Product teams. You will ensure that appropriate managerial relationships are established and maintained to build and strengthen trust in end-to-end enterprise incident management resolution and enterprise problem management, and you will serve as a focal point for escalation of issues to be resolved and problems to be addressed. You will also facilitate adherence to ITIL standards.
Responsibilities:
- Manage incidents and outages
- Manage the review, assignment and classification of incidents, outages and problem cases
- Actively engage with operations teams and engineers, and manage the involvement of application development and other areas in the change and problem management process
- Create and review incident and problem management reports and identify action plans to improve key performance indicators as necessary
- Introduce key ITIL disciplines and practical project management techniques to ensure effective end-to-end problem management
- Perform quality assurance on completed incident, outage and problem investigations and on change management records
- Conduct Root Cause Analysis (RCA), Post Mortem and Problem Management meetings
- Define reporting requirements needed in the management of the incident, outage and problem management processes
- Review incident, outage and problem processes, identify trends and recommend improvements
- Provide incident resolution status as requested
- Validate incident severity if required, or assist with correcting invalid incident severity
- Ensure the quality and accuracy of incident information, as appropriate
- Review the Incident/Problem Management processes, implement enhancements and document the updated processes
Experience and skills required:
- Strong ServiceNow experience
- Strong experience working with MSP and outsourced operations
- ITIL framework certification / ITIL v3 foundation certified
- Strong analytical and project management skills
- Ability to manage an incident/outage bridge with 50+ technical and business stakeholders
- Ability to guide and assist in technical troubleshooting during an incident/outage
- Ability to lead cross-functional teams effectively at all levels of the organization
- Coordination skills: managing (complex) IT technical investigations
- Advanced knowledge of incident, outage, problem and change management
- Experience managing 24/7 Application, Infrastructure and/or Operation teams preferred
- Experience supporting Application and Infrastructure in AWS preferred
- Adaptability to demanding circumstances that require timely and accurate responses
- Strong verbal and written communication skills with the ability to articulate complex ideas in easy-to-understand business terms to senior leaders