The runtime engine (RTE) team is building a distributed inference engine that takes compiled neural networks (generated by our compiler) and runs them on an HPC infrastructure. The goal of the RTE is to accelerate inference as much as possible, lowering latency and increasing throughput. To achieve this, we must research and implement efficient techniques for scheduling and distributing data and computation, using both intra- and inter-node optimizations. The product fulfills Zama’s mission by making privacy-preserving AI practical: by distributing heavy computations efficiently, it brings cost and latency down to levels that are viable for end users.
Your team (and thus you) will be responsible for:
Our process is described in detail here: https://zama.ai/2020/04/28/how-we-hire-at-zama/