BatvoiceAI is looking for a Junior Machine Learning Engineer to join our DATA team to help handle the RUN-time maintenance and expansion of our 500+ models running in production simultaneously.
You will collaborate with:
The rest of the Data team, as they create new ways of organizing our many models and push the state of the art, with your involvement.
The Operations team to ensure smooth model availability in our Kubernetes environment.
The Application team (Backend focus) as you ensure the optimal integration of our models in our ETL pipeline and server-side code.
Put differently, this position is for you if you’re a Data/ML-focused junior with some notions of Software Engineering and Operations principles who wants to stay on top of all those subjects.
Access : at this scale, even something as simple as making sure the model is accessible to the runtime environment (an Operations concern) is a challenge. Imagine if it were done the typical Data Science way, where you download the model, load it, run the task, and then destroy the worker. With a small model of 100 MB and merely 1,000 calls handled per day per client, we would end up with 100 GB of traffic per client per day, before any work has been done.
Data : we have many datasets of different types. When training a new model — even if it’s something we’ve partly automated for common cases — you must be able to figure out what datasets will be applicable, and how to use them optimally. This isn’t always obvious!
Integration : our models run in a variety of contexts, many of which require custom attention. A model that runs as part of a real-time analysis on-premises will need to be treated differently from a model that runs as part of an asynchronous background pipeline. And we sure have a lot of them!
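To make the Access figure above concrete, here is a minimal back-of-the-envelope sketch of that arithmetic. The function name and the comparison with long-lived workers (including the figure of 4 worker starts per day) are illustrative assumptions, not a description of how BatvoiceAI actually serves its models.

```python
# Decimal units, matching the 100 MB x 1,000 calls = 100 GB figure in the text.
MB = 10**6
GB = 10**9


def daily_model_traffic(model_size_bytes: int, downloads_per_day: int) -> int:
    """Bytes transferred per client per day just to fetch the model."""
    return model_size_bytes * downloads_per_day


# "Typical Data Science way": every call downloads the model, runs, then
# destroys the worker, so downloads per day equals calls per day.
naive_bytes = daily_model_traffic(100 * MB, downloads_per_day=1000)

# Hypothetical alternative: long-lived workers keep the model loaded and only
# download it when they start. The 4 starts per day is an assumed number.
cached_bytes = daily_model_traffic(100 * MB, downloads_per_day=4)

print(f"download per call:   {naive_bytes / GB:.1f} GB/day")
print(f"download per worker: {cached_bytes / MB:.1f} MB/day")
```

The point is only that traffic scales with the number of downloads rather than the number of calls, so anything that decouples the two changes the picture by orders of magnitude.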
We are a fully remote company, though we do make physical offices available if you prefer those.
We do not measure seniority by years of experience, but in terms of results and ability, leading to rapid advancement opportunities. As part of your career path within BatvoiceAI, this growth could later lead to joining the BUILD team as a DevOps or backend engineer.
We offer comprehensive healthcare coverage through Alan for you and your family (50% participation).
From their second year onward, team members who demonstrate strong commitment and motivation may be granted stock options, reflecting their contribution to BatvoiceAI’s long-term success and journey.
Research has shown that strict quantitative requirements discourage applicants who would otherwise have performed well in the position from applying. As such, instead of experience, we define skills you should have. Important: even though this offer mentions some tools, having a skill does not mean knowing a specific tool. Combined with our public definition of Junior, as well as our company Values, we encourage you to apply if you believe you are a good fit for our needs, provided you can justify it in some way. We document how we make hiring decisions in our public Interview Process document, which should similarly give you an idea as to whether you are likely to be selected.
Data Science : to create and refine models you need to be comfortable with data science concepts and statistical modeling, and at least be aware of the general ecosystem. This includes TF-IDF, accuracy, precision, F1 and Jaccard on the theoretical side, and scikit-learn, pandas, NumPy and MLflow on the practical side. Don’t worry if you’re not an expert in all of these, but you should at least know what they are. If you’ve got opinions on this subject, this is a great time to make them known to stand out.
Software Engineering : You’ll be participating in the integration of the models, making sure they’re being used correctly, and overseeing the pipeline. To be able to do this, you need to be at ease in Python and shouldn’t be afraid to dig into SQL queries (we use PostgreSQL), or even debug odd edge-cases we discover (we’ve already discovered a couple!), so experience with debugging is also appreciated. If you prefer LLDB over GDB, this is the time to explain why!
Operations : fetching data, pushing models, making them available to the code: none of these things happen in isolation; they happen on actual infrastructure. You’ll be cooperating with the Operations team to make this happen, which means you need to keep actual infrastructure needs in mind and be able to interact with that infrastructure. We’re currently on AWS, running Kubernetes on EKS and scheduling with Argo Workflows. Most of these are not set in stone (for example, we’re looking into deployments on OVH and on-premises), but none of these words should scare you if you’re interested in this position.
Teamwork: our Engineers should not silo themselves or work silently, which matters even more in a fully remote company. This position requires a mix of autonomy, discipline, cooperation, and communication.
Human languages: English is our working language, but our clients speak French — you should know both to at least some degree.
The entire recruitment process (including the steps we take internally) is documented in our Public Handbook. The short version looks like this:
1. Your CV and Cover Letter are reviewed by Engineers. This will take 1-2 business days at most.
2. Bidirectional technical interview. If you’re selected for this step, you will be contacted to schedule this. It can happen as early as the day you’re selected, or within the next couple of days.
3. HR interview and Culture Fit with Maxime, our CEO. This step depends on his availability, but should not take more than a couple of days to organize.
We expect to finish the selection process by October 17th, at least for the first batch of candidates. Also note that while the start date is marked as November 3rd, if you’re a good fit, we’re willing to wait (significantly) longer for you.
Meet Maxime, CEO
Meet Julie, Project Manager