Senior Data Engineer
This position was filled!
Who are they?
GitGuardian is a global post-series B cybersecurity startup; by the end of 2021, we had raised $44M from American and European investors, including top-tier VC firms.
More than ever in 2023, we have a very solid business model with a fast-growing ARR, multi-year contracts and great customer retention rates.
Among the early investors who saw our market value proposition are GitHub co-founder Scott Chacon and Docker co-founder / CTO Solomon Hykes 👀
We develop code security solutions for the DevOps generation and are a leader in the market of secrets detection & remediation.
Our solutions are already used by hundreds of thousands of developers in all industries, and GitGuardian Internal monitoring is the No. 1 security app on the GitHub marketplace 🔥
We work with some of the largest IT outsourcing companies, publicly listed companies like Talend, and tech companies like Datadog.
More than 85% of our customers are in the United States.
Meet Edouard, VP Product
Job description
Context
Our products are a set of tools that scan GitHub public activity and git private repositories.
They are used by different teams: Software Development and Ops, Application Security, and Threat Response. The buying decision comes from CISOs / CTOs / Directors of Security.
By design, GitGuardian is a data-driven company. Both co-founders are former Data Scientists, and GitGuardian's first product is real-time processing of all new GitHub events. Our secret detection engine has been battle-tested against huge amounts of data.
That’s why building data products that provide useful insights into the business is a key responsibility within our organization. Your work will matter and will be taken seriously!
Missions
Design, build and maintain the company’s central Data Warehouse: infrastructure deployment, source integration, pipeline development and optimisation, data documentation, data quality monitoring
Enrich the Enterprise Data Model by modeling business entities and events, designed to enable and support the highest levels of accuracy and quality for reporting and analytics
Stay up-to-date with the latest industry trends, technologies, and best practices in Data Engineering and contribute to the overall Data strategy and roadmap
Provide technical leadership, mentorship, and guidance to junior data engineers, including code reviews, best practices, and knowledge sharing
Implement data security and privacy best practices, including data encryption, data masking, and access controls, to protect sensitive data
Advantages
You will create data features that bring high value to the business
You will be working with cutting-edge Cloud Data Warehouse technology
The data ecosystem is very diverse (Amazon RDS for PostgreSQL, Elasticsearch, MongoDB, various SaaS providers)
The Data Team builds and maintains its own infrastructure with high standards in terms of automation and IaC thanks to a close collaboration with the DevOps team
You will be part of a scale-up adventure with a strong engineering culture
Our technical stack
Snowflake
PostgreSQL, Elasticsearch, MongoDB
Airbyte
Metabase, Tableau
GitLab
AWS, Terraform, Docker, Kubernetes
Preferred experience
_If you think you match only 70% to 80% of these criteria, please send us your resume!
And if you still have questions before applying, you can write to us directly at :_
Hard skills
5+ years of hands-on experience in designing, developing, and implementing complex data pipelines and ETL processes
Strong programming skills in one or more programming languages focused on data processing (Python, Scala …) along with skills in application best practices (code modularity, unit tests, documentation, etc.)
Strong knowledge of data structures, Data Warehouses, data modeling and data structuring
Experience with cloud-based data platforms, and proficiency with a Cloud Data Warehouse such as Snowflake or BigQuery
Strong database and SQL skills, including experience with relational databases such as PostgreSQL, and familiarity with NoSQL databases, such as MongoDB or Elasticsearch
Affinity with Ops topics (CI/CD, monitoring, infrastructure, FinOps) and tools (Terraform, Ansible)
Experience with Data Visualization tools, such as Metabase or Tableau, is a plus
Familiarity with Machine Learning and Data Science concepts is a plus
Soft skills
You like to analyze data and extract useful insights
You are above average in terms of rigor and autonomy, and you always check your results against expectations
You like to write high quality and re-usable code
You are autonomous, proactive and curious
You are a team player with strong communication skills
You are able to work in a fast-paced and dynamic environment, and adapt to changing requirements
You speak fluent French and English
Bonus points
You don’t embed API keys in your code ;)
Deep understanding of startup dynamics and challenges
Experience of strong team growth at a previous company
Recruitment process
1 video call with a recruiter
To learn about your career goals, introduce the team to you, and evaluate whether there could be a mutual match
1 technical interview with Alexis, the Lead Data Engineer
To evaluate your hard skills for the position and help you picture yourself in the role
1 technical test depending on your seniority
To see how you handle hands-on coding
1 final interview with the COO or the CEO
To share our company’s vision and ambitions for the next couple of years, and make sure you are up for the position