Mission:
For this project, you will be responsible for ensuring the required stability, availability, and performance of the applications within the team's scope. You will also resolve incidents in a timely manner, collaborating with internal teams (Development, Infrastructure) and external stakeholders (Service Providers) to drive long-term solutions.
Main Responsibilities:
Application Stability & Availability
• Monitor and maintain the applications in scope, ensuring high availability and performance.
• Actively collaborate in incident management (e.g., participate in Situation Rooms for P1/P2 incidents and in RCA) and problem resolution (identify incident trends and contribute to definitive solutions). Ensure the implementation of, and compliance with, ITIL governance within IT Production (SLAs).
• Execute change requests and deployments in line with ITIL and DevOps tools and processes.
• Proactively identify and resolve technical issues to ensure smooth business operations.
• Participate in on-call rotations and ensure 24/7 availability for critical applications.
Technical Support & Collaboration
• Serve as a point of contact for the Development team, troubleshooting issues and coordinating fixes.
• Work closely with Scrum teams to design, deploy, and enhance systems.
• Implement upgrades, patches, and new functionalities while ensuring minimal impact on users.
Documentation & Knowledge Sharing
• Maintain and update technical documentation for processes, configurations, and troubleshooting guides.
• Share best practices and knowledge with the global support team to improve efficiency.
Platform Monitoring
• Implement and optimize monitoring tools in the production environment (e.g., Dynatrace).
• Collaborate with the Development team and other relevant stakeholders (Centers of Expertise) to define effective observability and monitoring practices.
• Promote awareness of observability, with the aim of detecting and resolving potential issues early.