Lead AI Systems Architect (Inference)

Reference: SJE16
Location: San Francisco, CA, USA
Salary: $225,000 - $300,000
Contract Type: Permanent
Work Arrangement: In-Office (Full-Time)
Skill Requirements: Software Engineering

Lead AI Inference Systems Engineer

Join an early-stage start-up building the future of AI hardware. Working alongside world-class engineers and leading AI labs, you'll help design workload-specialised accelerators that push the boundaries of AI inference performance.

This is a rare opportunity to work across the full hardware-software stack, shaping how frontier AI models execute on next-generation silicon.

About the Role

You'll work at the intersection of AI models, compiler and runtime systems, kernel optimisation and computer architecture. From analysing real-world AI workloads to influencing silicon design, you'll play a key role in building accelerators designed around the demands of modern foundation models.

What You'll Do

  • Analyse AI inference workloads to identify performance bottlenecks across compute, memory and data movement
  • Optimise models through quantisation, kernel fusion, graph optimisation and low-precision execution
  • Build performance models and workload analysis tools to guide hardware design
  • Collaborate with architecture, compiler and RTL teams to define next-generation accelerator features
  • Develop kernels, profiling tools and reference implementations to evaluate performance
  • Use AI agents and automation to accelerate architecture exploration and optimisation

What You'll Bring

  • Experience optimising AI inference systems or ML infrastructure
  • Strong understanding of AI models, compiler/runtime systems and computer architecture
  • Experience with performance analysis, memory optimisation and workload profiling
  • Strong Python and C++ development skills
  • Experience with PyTorch or modern ML frameworks
  • Comfortable working across software and hardware in a fast-moving start-up

Nice to Have

  • AI accelerator architecture or hardware-software co-design experience
  • Compiler, runtime or kernel optimisation expertise
  • Quantisation, graph lowering or memory planning experience
  • Experience building performance models or simulation tools

Apply Now

Please fill in the form below to apply for this job.

Get in touch
Sebastian Eyre