Data Engineer
This job posting is no longer active.
Description
PRIMARY PURPOSE: | Engineering graduate with 4 to 8 years of relevant experience who is passionate and eager to learn and contribute. The primary requirement is to design and build Python data pipelines. Experience writing complex pipelines using Spark and Pandas. Hands-on experience with CI/CD pipeline systems. A strong understanding of RDBMS is required. Familiarity with Agile development. Experience with large-scale data processing is preferred. |
DUTIES & RESPONSIBILITIES: | |
1 | Design data pipelines using Spark and Pandas/Polars. |
2 | Design data models and high-performance SQL queries to pull data from different sources. |
3 | Create, maintain, and optimize AWS Glue jobs and data pipelines. |
4 | Understand business requirements and deliver solutions as expected. |
5 | Support analysts in extracting data, reporting, and analysis to generate business insights. |
Knowledge | · Knowledge of manufacturing and production. · Knowledge of Python with Pandas/Polars/Spark, with the ability to understand and work with formulas and formats submitted by customers. · Knowledge of complex ETL pipelines. |
Technical Skills | · Python. · Pandas/Polars/Spark. · Data modelling/SQL design. · GitLab/GitHub (CI/CD). · ETL pipelines with a tool such as Apache NiFi, AWS Glue, or EMR. |
Preferred Skills | · Agile development. · SOLID principles. · RDBMS. · DevOps. |
PREFERRED: Previous experience developing data pipelines in Python and refactoring existing code to follow industry best practices. Experience maintaining existing data pipelines and AWS ETL jobs. Analytical skills to meet business requirements. Hands-on experience with CI/CD processes and database modelling.
“Koch is proud to be an equal opportunity workplace”