Contact Info.

  • [email protected]
  • Egypt Office: Egypt - 42 Lusaka Street, off Hassan Al Maamon, Nasr City, Cairo, Egypt.
  • UAE Office: Office 603, Al Muteena Technic Bldg. Salah Al Din Road Deira Dubai - UAE.

© 2025 WazifaMe (v2.45.2). All Rights Reserved.

Site Reliability Engineer

  • Full Time
  • First Shift (Day)
  • Experience: Fresh
  • Old Cairo, Egypt
  • 1 Vacancy

Job Summary

The Site Reliability Engineer is responsible for proactively supporting products to ensure high performance, with a continuous focus on improvement. The role involves identifying and resolving the root causes of operational incidents, implementing solutions to enhance stability, and preventing recurrence.

The Site Reliability Engineer also creates and maintains the event catalog used to trigger events, and develops both manual remediation procedures and automated workflows to address alerts. Additionally, they oversee the deployment of IT services and solutions, ensuring seamless integration with minimal disruption.

WHAT YOU’LL DO

  • Design, build, and maintain support systems to ensure high availability, scalability, and performance of critical infrastructure.
  • Lead incident response and root cause analysis for system failures, including problem investigations and coordination with relevant teams.
  • Implement and manage automation for system provisioning, deployment, self-healing, and performance monitoring to increase operational efficiency.
  • Establish and monitor SLIs/SLOs, proactively identify performance issues, and drive continuous improvements in service reliability.
  • Collaborate with development and operations teams to embed reliability best practices and evolve toward zero-downtime architecture.
  • Manage and optimize the event catalog, including event definitions, thresholds, remediation actions, and relevance across products.
  • Develop event response protocols, provide training, and ensure efficient handling of incidents across teams.