Job Description

ML Data Engineer

Location:

Johnston, Rhode Island

Date Posted:

2/9/2023

Employment Type:

Consulting

Recruiter:

Clayton Minnich

Recruiter Email:

clayton.minnich@avidtr.com

Job ID:

JN -022023-15452

Overview:
We currently have an opening for a data engineer with experience supporting machine learning models that use computer vision/image processing. The ideal candidate will have strong experience with data pipelines, data cleansing, and aggregating datasets for training ML models. Previous experience with the Agile SDLC methodology is preferred.

As a Data Engineer, you bring:
  • Strong problem-solving skills 
  • Commitment to delivery 
  • Excellent communication skills and a desire to collaborate openly within a fast-moving team 
  • A deep desire to learn and apply technology in a pragmatic way to create client value 
  • Experience building enterprise data management systems that support large volumes of data 
Responsibilities:
  • Design and develop big data pipelines to manage imagery and label datasets for computer vision AI/ML model development and their associated outputs 
  • Support data quality and cleansing efforts as part of executing data pipeline construction work 
  • Manage labeling and annotation implementations for image processing
  • Create data pipelines for consumption by image operations/curation teams 
  • Design, implement, and improve upon metadata management systems, specifically supporting model reproducibility and extensibility 
  • Work directly with data scientists to provision data for model development and source requirements for upcoming work 
  • Interact with downstream customers and core development team to define integration points and support model output management 
  • Automate and execute all levels of testing (unit, integration, and regression) 
  • Champion engineering excellence and proactively recommend solutions 
Skills/Knowledge:
  • Understanding of core data engineering concepts, including ERD design, common data management tools such as Azure Data Factory, API documentation solutions like Swagger, data pipeline optimization, tradeoffs in design decisions, and data warehouses/data lakes
  • Experience with CI/CD practices, DevOps, and MLOps principles 
  • Experience with functional and system integration testing 
  • Proficiency in a scripting language such as Python
  • Proficiency in SQL 
  • Proficiency in big data management technologies such as Databricks, PySpark, and Scala
  • Prior experience with scaling using containerization in a cloud environment 
  • A Bachelor’s degree in a technical discipline such as Computer Science is preferred 
Additional Preferred Skills:
  • Experience working with Agile methodologies and frameworks 
  • Experience working in Azure cloud and supporting Azure Machine Learning deployments