Machine Learning Engineer | Python | PyTorch | Distributed Training | Optimisation | GPU | Hybrid, San Jose, CA Job at Enigma, San Jose, CA

  • Enigma
  • San Jose, CA

Job Description

Title: Machine Learning Engineer

Location: San Jose, CA

Responsibilities:

  • Productize and optimize models from Research into reliable, performant, and cost-efficient services with clear SLOs (latency, availability, cost).
  • Scale training across nodes/GPUs (DDP/FSDP/ZeRO, pipeline/tensor parallelism) and own throughput/time-to-train using profiling and optimization.
  • Implement model-efficiency techniques (quantization, distillation, pruning, KV-cache, Flash Attention) for training and inference without materially degrading quality.
  • Build and maintain model-serving systems (vLLM/Triton/TGI/ONNX/TensorRT/AITemplate) with batching, streaming, caching, and memory management.
  • Integrate with vector/feature stores and data pipelines (FAISS/Milvus/Pinecone/pgvector; Parquet/Delta) as needed for production.
  • Define and track performance and cost KPIs; run continuous improvement loops and capacity planning.
  • Partner with ML Ops on CI/CD, telemetry/observability, model registries; partner with Scientists on reproducible handoffs and evaluations.

Educational Qualifications:

  • Bachelor's in Computer Science, Electrical/Computer Engineering, or a related field required; Master's preferred (or equivalent industry experience).
  • Strong systems/ML engineering background with exposure to distributed training and inference optimization.

Industry Experience:

  • 3–5 years in ML/AI engineering roles owning training and/or serving in production at scale.
  • Demonstrated success delivering high-throughput, low-latency ML services with reliability and cost improvements.
  • Experience collaborating across Research, Platform/Infra, Data, and Product functions.

Technical Skills:

  • Familiarity with deep learning frameworks: PyTorch (primary), TensorFlow.
  • Exposure to large model training techniques (DDP, FSDP, ZeRO, pipeline/tensor parallelism); distributed training experience a plus.
  • Optimization: experience profiling and optimizing code execution and model inference; quantization (PTQ/QAT/AWQ/GPTQ), pruning, distillation, KV-cache optimization, Flash Attention.
  • Scalable serving: autoscaling, load balancing, streaming, batching, caching; collaboration with platform engineers.
  • Data & storage: SQL/NoSQL, vector stores (FAISS/Milvus/Pinecone/pgvector), Parquet/Delta, object stores.
  • Ability to write performant, maintainable code.
  • Understanding of the full ML lifecycle: data collection, model training, deployment, inference, optimization, and evaluation.
