Site Reliability Engineer
Publication date: 13/04/2022
Listing expiry date: 30/06/2022
- Location: Luanda, Luanda
Listing details
Job offer
Fixed-term
Bachelor's degree
4-7 years
AKZ
Monthly
About the Job
• Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that services, both our internally critical and our externally visible systems, have reliability and uptime appropriate to users' needs and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on the capacity and performance of our systems. Much of our software development focuses on optimizing existing systems, building infrastructure, and eliminating work through automation.
• On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to us, while using your expertise in coding, algorithms, complexity analysis, and large-scale system design.
• SRE's culture of diversity, intellectual curiosity, problem-solving, and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences, and perspectives. We encourage them to collaborate, think big, and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.

Responsibilities
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
• Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
• Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
• Scale systems sustainably through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity.
• Lead sustainable incident response, blameless postmortems, and production improvements that result in direct business opportunities for the organization.
• Manage individual project priorities, deadlines, and deliverables.
• Provide guidance to other team members on managing end-to-end availability and performance of mission-critical services, on building automation to prevent problem recurrence, and on building automated responses for non-exceptional service conditions.

Minimum qualifications:
• Bachelor's degree in Computer Science, a related technical field involving software/systems engineering, or equivalent practical experience.
• Experience programming in at least one of the following languages: C, C++, Java, Python, or Go.
• Experience with algorithms and data structures.
• 3-5 years of experience in computing, distributed systems, storage, or networking.

Preferred qualifications:
• Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.
• Ability to debug, optimize code, and automate routine tasks.
• Systematic problem-solving approach, coupled with effective communication skills and a sense of drive.
• Experience with algorithms and data structures and/or Unix/Linux systems internals (e.g., filesystems, system calls) and administration.