Requirements: English
Company: Crédit Agricole Italia
Region: Reggio Emilia, Emilia-Romagna
Following the creation of a new internal structure, we are looking for an experienced Site Reliability Engineer (SRE) to join our Infrastructure team.

Responsibilities:
- System Reliability: Ensuring the reliability and availability of our platforms and technological systems through robust monitoring, reporting, and incident response procedures.
- Infrastructure Automation: Automating the deployment, scaling, and management of services and infrastructure components for critical applications such as digital channels and branches.
- Resource Planning: Collaborating with cross-functional teams to forecast and plan future resource requirements for all infrastructure systems.
- Performance Optimization: Analyzing platform performance to improve efficiency, ensuring an optimal experience for users and end customers.
- Incident Management Support: Participating in troubleshooting sessions, supporting operational and application teams, analyzing monitoring data and root causes, and proposing solutions.
- Security: Supporting the implementation and maintenance of security best practices, and participating in vulnerability assessments and threat mitigation.
- Continuous Improvement: Improving system reliability through root cause analysis, incident reporting, and proactive maintenance and evolution of systems and platforms.

Required Experience:
- Excellent knowledge of Terraform and Ansible
- Understanding of containerization technologies (e.g., Docker, containerd)
- Expertise in Kubernetes management and components (e.g., ingresses, monitoring stacks, custom autoscalers)
- Strong troubleshooting skills
- Understanding of delivery systems (e.g., Helm, GitOps)
- Knowledge of at least one major cloud provider
- Scripting and programming skills (e.g., Bash, Python, Go)
- Understanding of networking
- Experience with databases such as Oracle DB, MongoDB, PostgreSQL

Nice to Have:
- Experience with GCP, AWS, Azure
- Experience with distributed systems such as caching systems (e.g., Redis), message brokers (e.g., RabbitMQ), log collection systems (e.g., ELK)

What We Offer:
- Autonomy and responsibility: freedom to choose, try, fail, and learn
- Career growth: evaluations every six months to guide your development
- Continuous training: access to courses and learning opportunities with industry experts

Location: Reggio Emilia, Italy