ML Engineer

Company

Orcrist builds the Orcrist Intelligence Platform (OIP), a secure, Kubernetes-native data intelligence system deployed as SaaS or self-hosted/on-prem (including air-gapped environments). We fuse data processing, ML, and intuitive UX for defense, law-enforcement, and enterprise teams.

Role

Productionize the NLP, audio, and document models that power OIP's insight features. You'll own model packaging, deployment, monitoring, and evaluation, partnering with Research and product squads to deliver trustworthy enrichment worldwide.

What you’ll do

  • Package and deploy models (ASR, translation, OCR, NER, summarization) using Triton/KServe on Kubernetes.

  • Build evaluation pipelines (WER, BLEU, F1, latency, cost) and automate release gating.

  • Operate streaming + batch inference via Kafka, Temporal, and backfill tooling.

  • Monitor drift/quality with Prometheus, Grafana, Evidently; optimize inference cost and performance.

  • Collaborate with TypeScript teams on payload schemas, contracts, and human-in-the-loop feedback loops.

About you

  • 4–8+ years of ML engineering/MLOps experience, shipping models to production.

  • Strong Python, PyTorch/Transformers, and experience with Triton/KServe or similar.

  • Comfortable with Kubernetes, GitOps, CI/CD, and GPU workload operations.

  • Knowledge of evaluation metrics, monitoring, and annotation workflows.

  • Eligible to work in Germany; export-control screening required for certain programs.

Nice-to-haves

  • Temporal, Beam/Flink, or Ray Serve experience; ONNX/TensorRT optimization.

  • German language skills (B1+) and familiarity with defense or public-safety datasets.

  • WhisperX, DeepStream/GStreamer, or vector search integrations.

What we offer

  • Modern MLOps stack: Triton, Temporal, Kafka, MLflow/Weights & Biases, Evidently, Kubernetes.

  • Remote-first in Germany with regular Berlin meetups, 30 days of vacation, and an equipment & learning budget.
