About the job
We are a global biopharmaceutical company focused on human health. Our purpose is to find treatments to fight pain and ease suffering. We combine breakthrough science and advanced technology to develop life-changing medicines and vaccines.
Digital & Data is at the heart of Sanofi: our ambition is to be the leading digital healthcare platform to develop and deliver medicine faster, enable healthcare professionals to improve treatments, and help patients improve their health. Our scale, strong connections within health ecosystems across the world, and ability to leverage Sanofi's capabilities make us the best place to push the boundaries of medicine through technology.
We are building Digital R&D products with outstanding capabilities that, by leveraging Data Analytics and Artificial Intelligence, will contribute alongside scientists to new drug discovery, trial efficiency, and cycle-time reduction.
Main responsibilities:
Work with business teams to understand requirements and translate them into technical needs
Gather and organize large, complex data assets, and perform relevant analysis
Ensure the quality of the data in coordination with Data Analysts and Data Scientists (peer validation)
Propose and implement relevant data models for each business case
Create data models and optimize query performance
Communicate results and findings in a structured way
Partner with Product Owner and Data Analysts to prioritize the pipeline implementation plan
Partner with Data Analysts and Data Scientists to design pipelines relevant to business requirements
Leverage existing or create new standard data pipelines within Sanofi to bring value through business use cases
Ensure best practices in data manipulation are enforced end-to-end
Actively contribute to the Data community
Remain up to date on company standards, industry practices, and emerging technologies
You are a dynamic Data Engineer interested in challenging the status quo to ensure the seamless creation and operation of the data pipelines needed for enterprise data and analytics initiatives, following industry-standard practices and tools, for the betterment of our global patients and customers.
You are a valued influencer and leader who has contributed to making key datasets available to data scientists, analysts, and consumers throughout the enterprise to meet vital business needs. You have a keen eye for improvement opportunities while remaining fully compliant with all data quality, security, and governance standards.
About you
Experience:
Strong experience in automation tools and methodologies
Experience working with data models, query tuning, and data architecture design
Experience working within compliance (e.g., quality and regulatory requirements such as data privacy, GxP, SOX) and cybersecurity requirements is a plus
Experience in the healthcare industry is a strong plus
Soft skills:
Excellent written and verbal communication skills
Experience working with multiple teams to drive alignment and results
Experience working with stakeholders
Product-oriented, flexible, positive team player
Self-motivated, takes initiative
Enthusiastic about exploring and adopting new technologies
Problem-solving and critical thinking
Technical skills:
Proficient with AWS cloud services (Azure & GCP a plus)
Proficient in SQL and relational database technologies and concepts
Proficient in data warehousing solutions (Snowflake a plus)
Proficient with Python for data processing and scripting (R a plus)
Proficient in data transformation tools (dbt a plus)
Proficient in using automation tools for CI/CD (e.g., GitHub Actions, ArgoCD, CircleCI)
Good knowledge of integration services (Informatica/IICS, Talend, or similar data integration services a plus)
Good knowledge of orchestration frameworks (e.g. Apache Airflow, Prefect, Dagster)
Good knowledge of logging and of monitoring tools such as Datadog and Grafana
Good knowledge of data warehouse performance optimization
Good knowledge of infrastructure as code (Terraform, CloudFormation a plus)
Familiarity with containerization (Docker) and container orchestration (Kubernetes/OpenShift a plus)