Is hybrid: No
Is remote: No
Employer: Google
Minimum qualifications:
- Bachelor's degree in Computer Science, a related technical field, or equivalent practical experience.
- 5 years of experience with data structures and algorithms and software development in one or more programming languages.
- 3 years of experience in designing, analyzing, and troubleshooting large-scale distributed systems.
- 2 years of experience leading projects and providing technical leadership.
Preferred qualifications:
- Experience developing and supporting Google-scale production systems.
- Experience enhancing and supporting large production systems on compute infrastructure.
- Experience with software engineering and development in C++, Python, GCL, APIs and Go.
- Experience with networking, capacity and performance.
- Experience with large-scale system and architecture design and complex system integrations or migrations.
About the job
Vertex 1P GenAI Site Reliability Engineering (SRE) is looking for individuals passionate about shaping the future of artificial intelligence, Generative AI and machine learning platforms while driving production excellence, including reliability, scalability and performance, through sound SRE principles. You will have the opportunity to collaborate closely with a global team of SREs and developers to solve large and complex problems while building and supporting groundbreaking AI/ML tools that enable both internal and Google Cloud customers to thrive with game-changing artificial intelligence built on the rapidly growing Vertex GenAI platform.
Vertex AI is a key pillar of Cloud AI's mission to responsibly deliver AI that enables industries and organizations to transform and solve real-world problems through a single unified artificial intelligence platform for building, deploying, and scaling ML models faster with either pre-trained or custom tooling.
Responsibilities
- Improve and ensure reliability of Vertex AI products and services.
- Develop and influence scalable and sustainable system architecture and designs for products, services and enhancements.
- Help define strategy and set direction for Vertex AI services to increase reliability, efficiency and ultimately feature velocity.
- Lead efforts to adopt infrastructure and standards (P2020, Cloud Horizontals, etc.) for Vertex AI.
- Resolve outages or service disruptions and help design solutions to ensure systems are protected from similar classes of problems in the future.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google's EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by completing our Accommodations for Applicants form.