Job Summary:
We are seeking a highly skilled and motivated Machine Learning Engineer with a strong foundation in programming and machine learning, hands-on experience with AWS Machine Learning services (especially SageMaker), and a solid understanding of Data Engineering and MLOps practices. You will be responsible for designing, developing, deploying, and maintaining scalable ML solutions in a cloud-native environment.
Key Responsibilities:
- Design and implement machine learning models and pipelines using Amazon SageMaker and related AWS services.
- Develop and maintain robust data pipelines for training and inference workflows.
- Collaborate with data scientists, engineers, and product teams to translate business requirements into ML solutions.
- Implement MLOps best practices including CI/CD for ML, model versioning, monitoring, and retraining strategies.
- Optimize model performance and ensure scalability and reliability in production environments.
- Monitor deployed models for drift, performance degradation, and anomalies.
- Document processes, architectures, and workflows for reproducibility and compliance.
Required Skills & Qualifications:
- Strong programming skills in Python and familiarity with ML libraries (e.g., scikit-learn, TensorFlow, PyTorch).
- Solid understanding of machine learning algorithms, model evaluation, and tuning.
- Hands-on experience with AWS services used in ML workloads, especially SageMaker, along with S3, Lambda, Step Functions, and CloudWatch.
- Experience with data engineering and workflow orchestration tools (e.g., Apache Airflow, Apache Spark, AWS Glue).
- Proficiency in MLOps tools and practices (e.g., MLflow, Kubeflow, CI/CD pipelines, Docker, Kubernetes).
- Familiarity with monitoring tools and logging frameworks for ML systems.
- Excellent problem-solving and communication skills.
Preferred Qualifications:
- AWS Certification (e.g., AWS Certified Machine Learning – Specialty).
- Experience with real-time inference and streaming data.
- Knowledge of data governance, security, and compliance in ML systems.