Machine Learning Operations Engineer - Remote

Job Location

Re - Italy

Monthly Salary

Not Disclosed

Job Description

NAVA Software Solutions is looking for a Machine Learning Operations Engineer.

Details:
Machine Learning Operations (MLOps) Engineer - AWS (with LLM Focus)
Location: Remote work
Duration: 12 months

Responsibilities:

  • LLM-Optimized MLOps Infrastructure: Design and implement MLOps infrastructure on AWS tailored for LLMs, leveraging services like SageMaker, EC2 (with GPU instances), S3, ECS/EKS, Lambda, and more.
  • LLM Deployment Pipelines: Build and manage CI/CD pipelines specifically for LLM deployment, addressing unique challenges like model size, inference optimization, and versioning.
  • LLMOps Practices: Implement LLMOps best practices for monitoring model performance, drift detection, prompt management, and feedback loops for continuous improvement.
  • RESTful API Development: Design and develop RESTful APIs to expose LLM capabilities to other applications and services, ensuring scalability, security, and optimal performance.
  • Model Optimization: Apply techniques like quantization, distillation, and pruning to optimize LLM models for efficient inference on AWS infrastructure.
  • Monitoring and Observability: Establish comprehensive monitoring and alerting mechanisms to track LLM performance, latency, resource utilization, and potential biases.
  • Prompt Engineering and Management: Develop strategies for prompt engineering and management to enhance LLM outputs and ensure consistency and safety.
  • Collaboration: Work closely with data scientists, researchers, and software engineers to integrate LLM models into production systems effectively.
  • Cost Optimization: Continuously optimize LLMOps processes and infrastructure for cost-efficiency while maintaining high performance and reliability.

Qualifications:

  • Experience: 3 years of experience in MLOps or a related field, with hands-on experience in deploying and managing LLMs.
  • AWS Expertise: Strong proficiency in AWS services relevant to MLOps and LLMs, including SageMaker, EC2 (with GPU instances), S3, ECS/EKS, Lambda, and API Gateway.
  • LLM Knowledge: Deep understanding of LLM architectures (e.g. Transformers), training techniques, and inference optimization strategies.
  • Programming Skills: Proficiency in Python and experience with infrastructure-as-code tools (e.g. Terraform, CloudFormation), REST API frameworks (e.g. Flask, FastAPI), and LLM libraries (e.g. Hugging Face Transformers).
  • Monitoring: Familiarity with monitoring and logging tools for LLMs, such as Prometheus, Grafana, and CloudWatch.
  • Containerization: Experience with Docker and container orchestration (e.g. Kubernetes, ECS) for LLM deployment.
  • Problem Solving: Excellent problem-solving and troubleshooting skills in the context of LLMs and MLOps.
  • Communication: Strong communication and collaboration skills to work effectively with cross-functional teams.

Employment Type

Full Time
