
Data Engineer

Description

Molex is a globally recognized provider of electronic solutions to a wide range of industries, including data communications, consumer electronics, industrial, automotive, commercial vehicle, and medical. It is a subsidiary of Koch Industries, one of the largest privately held companies in the world. Our vision is to be the leading global provider of innovative electronic solutions that create value for our customers and society.

What do we do? 

Within Molex, we are a small team of engineers and data scientists working on the building blocks of ‘Industry 4.0’. We are creating an IoT (Internet of Things) data platform that receives signals from over a million sensors across industries, processes the data, and provides real-time monitoring in a centralized environment. As a subsidiary of Koch Industries, we already have multiple clients signed up to use our products. We are at a very early stage of product development and are actively looking for smart, motivated people at all levels to join us on our journey to build the platform for next-generation industries. Contact us if you are interested in this opportunity.

What You Will Do In Your Role

  • Create and manage a scalable ETL pipeline, including its underlying infrastructure and abstractions, usable by anyone (engineers, data scientists) for a variety of purposes (analytics, dashboards, visualizations, machine learning).
  • Propose services, frameworks, and other capabilities that you anticipate will be needed in the future.
  • Build and maintain a robust data collection system that reaches across multiple sources and data marts.
  • Monitor and tune existing infrastructure in close collaboration with the ops team.
  • Build the cloud infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Kafka and Spark ‘big data’ technologies (a sketch follows this list).
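
To make the role concrete, here is a minimal sketch of the kind of Kafka-to-Spark ETL described above. It is illustrative only, not our production pipeline: the broker address, topic name, and payload schema are assumptions made for the example. The job reads JSON sensor readings from a Kafka topic with Spark Structured Streaming, decodes them, and maintains a per-device one-minute average.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._
    import org.apache.spark.sql.types._

    object SensorEtlSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("sensor-etl-sketch")
          .master("local[*]") // local run for the sketch; spark-submit sets this in practice
          .getOrCreate()

        // Assumed payload: {"deviceId":"pump-17","ts":"2024-01-01T00:00:00Z","value":42.0}
        val schema = new StructType()
          .add("deviceId", StringType)
          .add("ts", TimestampType)
          .add("value", DoubleType)

        // Extract: subscribe to a placeholder Kafka topic of raw sensor events.
        val raw = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
          .option("subscribe", "sensor-readings")              // placeholder topic
          .load()

        // Transform: Kafka delivers bytes, so decode the value column and apply the schema.
        val readings = raw
          .select(from_json(col("value").cast("string"), schema).as("r"))
          .select("r.*")

        // Aggregate: one-minute average per device, tolerating five minutes of late data.
        val perDevice = readings
          .withWatermark("ts", "5 minutes")
          .groupBy(window(col("ts"), "1 minute"), col("deviceId"))
          .agg(avg("value").as("avgValue"))

        // Load: the console sink stands in for a real warehouse or dashboard store.
        perDevice.writeStream
          .outputMode("update")
          .format("console")
          .start()
          .awaitTermination()
      }
    }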

The Experience You Will Bring

REQUIRED: 
  • Experience building Big Data platforms or products on distributed systems
  • Hands-on experience processing real-time data streams
PREFERRED: 
  • Previous work experience at internet-based product companies, preferably on B2B products
  • Built new products from scratch or worked at early-stage startups

The Skills & Abilities You Will Bring 

REQUIRED: 
  • 3+ years of hands-on design and coding experience in Scala using Spark or Flink
  • 3+ years of experience with a data streaming platform such as Kafka
  • Experience designing and scaling data infrastructure, models, and pipelines
  • Expertise in using RDDs, DataFrames, Datasets, Spark Streaming, and Spark SQL (see the sketch below)
PREFERRED: 
  • Experience with MongoDB or another NoSQL database
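
For candidates less familiar with the Spark APIs named above, the short sketch below shows one aggregation expressed through the Dataset, DataFrame, Spark SQL, and RDD APIs. The SensorReading case class and the sample values are illustrative assumptions, not part of our codebase.

    import org.apache.spark.sql.SparkSession

    case class SensorReading(deviceId: String, value: Double)

    object SparkApiSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("spark-api-sketch")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // Dataset: a typed view over the same engine that powers DataFrames.
        val ds = Seq(
          SensorReading("pump-17", 1.0),
          SensorReading("pump-17", 3.0),
          SensorReading("fan-02", 2.0)
        ).toDS()

        // DataFrame API: untyped, column-based aggregation.
        ds.groupBy("deviceId").avg("value").show()

        // Spark SQL: register a temporary view and query it with SQL.
        ds.createOrReplaceTempView("readings")
        spark.sql("SELECT deviceId, AVG(value) AS avg_value FROM readings GROUP BY deviceId").show()

        // RDD: the lower-level API underneath both of the above.
        ds.rdd
          .map(r => (r.deviceId, r.value))
          .reduceByKey((a, b) => math.max(a, b))
          .collect()
          .foreach(println)

        spark.stop()
      }
    }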
