Requirements: English
Company: Datumo
Region: Wrocław, Lower Silesian Voivodeship
Datumo specializes in providing Data Engineering and Cloud Computing consulting services to clients from all over the world, primarily in Western Europe, Poland and the USA. Core industries we support include e-commerce, telecommunications and life science. Our team consists of exceptional people whose commitment allows us to conduct highly demanding projects.
Our team members tend to stick around for more than 3 years, and when a project wraps up, we don't let them go - we embark on a journey to discover exciting new challenges for them. It's not just a workplace; it's a community that grows together!
proven record with a selected cloud provider (GCP preferred, Azure or AWS)
good knowledge of JVM languages - Scala or Java or Kotlin
good knowledge of Python
good knowledge of SQL
BigQuery/Snowflake/Databricks or similar
in-depth understanding of big data aspects like data storage, modeling, processing, scheduling etc.
ensuring solution quality through automatic tests, CI/CD and code review
English proficiency at B2 level, communicative in Polish
another JVM (Java/Scala/Kotlin) programming language
experience in Machine Learning projects
familiarity with one of the BI tools: Power BI/Looker/Tableau
willingness to share knowledge (conferences, articles, open-source projects)
100% remote work, with workation opportunity
project switching possible after a certain period
Medicover private medical care, co-financing of the Medicover Sport card
opportunity to learn English with a native speaker
GCP, Azure, Snowflake)
Discover our exemplary project:
Cost optimization on Snowflake data platform
Datumo optimized a Snowflake-based platform for a pharmaceutical company, aiming to reduce costs and enhance ELT processes. Airflow orchestrated the platform, using Python scripts for data extraction, focusing on data snapshots with hundreds of millions of records.
Analytics engineering on Google Cloud Platform
The project entails creating and improving data pipelines on Google Cloud Platform (GCP) to aid analytics and data science teams. The objective is to optimize data workflows utilizing Cloud Composer (Apache Airflow), BigQuery, and Dataproc (Apache Spark) for scheduling, warehousing, and processing respectively. Key responsibilities encompass optimizing SQL queries for better performance, developing internal libraries to streamline tasks, and advocating for data processing best practices. Additionally, the project offers opportunities for progression into data science or MLOps.
Technical interview - 60 minutes
Find out more by visiting our website -