Job Description
ML Data Engineer

Location: Johnston, Rhode Island
Date Posted: 2/9/2023
Employment Type: Consulting
Recruiter: Clayton Minnich
Recruiter Email: clayton.minnich@avidtr.com
Job ID: JN -022023-15452

Overview:
We currently have an opening for a data engineer with experience supporting machine learning models that use computer vision/image processing. The ideal candidate will have strong experience with data pipelines, data cleansing, and aggregating datasets for training ML models. Previous experience with the Agile SDLC methodology is preferred.

As a Data Engineer you bring:
- Strong problem-solving skills
- Commitment to delivery
- Excellent communication skills and a desire to collaborate openly within a fast-moving team
- A deep desire to learn and apply technology in a pragmatic way to create client value
- Experience building enterprise data management systems that support large volumes of data

Responsibilities:
- Design and develop big data pipelines to manage imagery and label datasets for computer vision AI/ML model development and their associated outputs
- Support data quality and cleansing efforts as part of executing data pipeline construction work
- Manage labelling and annotation implementations for image processing
- Create data pipelines for consumption by image operations/curation teams
- Design, implement, and improve upon metadata management systems, specifically supporting model reproducibility and extensibility
- Work directly with data scientists to provision data for model development and source requirements for upcoming work
- Interact with downstream customers and the core development team to define integration points and support model output management
- Automate and execute all levels of testing (unit, integration, and regression)
- Champion engineering excellence and proactively recommend solutions

Skills/Knowledge:
- Understanding of core data engineering concepts, including ERD design, common data management tools such as Azure Data Factory, API documentation solutions like Swagger, data pipeline optimization, tradeoffs in design decisions, and data warehouses/data lakes
- Experience with CI/CD practices, DevOps, and MLOps principles
- Experience with functional and system integration testing
- Proficiency in a scripting language such as Python
- Proficiency in SQL
- Proficiency in big data management technologies such as Databricks, PySpark, and Scala
- Prior experience with scaling using containerization in a cloud environment
- A Bachelor’s degree in a technical discipline such as Computer Science is preferred

Additional Preferred Skills:
- Experience working with Agile methodologies and frameworks
- Experience working in Azure cloud and supporting Azure Machine Learning deployments