Senior Data Engineer/Architect

Job summary
Permanent contract
Brno
Salary: Not specified
A few days at home
Experience: > 5 years
Skills & expertise
Innovation
Communication skills
Adaptability
Collaboration and teamwork
Performance analysis

ThreatMark

The position

Job description

The Mission

As a Data Lake Engineer at ThreatMark, your primary mission will be to develop and maintain the Data Lake environment, with the goal of enabling easier and faster data analysis and machine learning work. You will collaborate closely with our data analysts, data scientists, and engineering teams to ensure our data is precise, accessible, and meaningful, ultimately enhancing the quality of our products. Your work will enable ThreatMark to achieve its business objectives with confidence, backed by reliable and insightful data analysis.

General

  • Seniority: Medior (3+ years of experience)

  • Employment Type: Full-time, Employee or Contractor

  • Place of work: Offices in Brno, Bratislava or Prague; Full Remote Possible

Responsibilities

In this role, you will:

  • Data Lake Development:

    • Build and maintain infrastructure for storing structured and semi-structured multi-tenant data in the Data Lake.

    • Maintain and develop configuration of AWS infrastructure and IAM policies.

    • Develop, automate, and orchestrate data ingestion, ETL processes, and maintenance jobs.

    • Create a layer of consolidated data to be used for data analysis and reporting.

    • Enable and standardize usage of AWS services for publishing reports and interactive dashboards.

  • Data Quality and Integrity:

    • Ensure high levels of data quality and integrity across all data sources and pipelines.

    • Implement monitoring and alerting mechanisms to detect and address data issues promptly.

  • Performance Optimization:

    • Define data storage policies to reduce storage costs.

    • Optimize data processing workflows for performance and efficiency.

    • Address bottlenecks and ensure data pipelines can scale with increasing data volumes.

  • Data Security and Compliance:

    • Ensure compliance with relevant data protection regulations and standards.

    • Implement data security measures to safeguard sensitive information.

  • Collaboration and Support:

    • Work closely with data analysts and data scientists to understand their data needs and ensure data availability.

    • Provide support and guidance on best practices for ETL and data engineering tasks within the AWS environment and the technologies in use.

    • Aid in MLOps processes.

    • Automate data reports.

    • Participation in data analysis and research is welcome.

  • Technological environment setup:

    • Enable and support data analysts and ML engineers in using a unified Data Lake environment for all their needs and tasks (MLOps, MLflow, remote code execution, GitLab integration, …)

Preferred experience

Qualifications

  • Proven experience developing data storage and manipulation solutions (3+ years).

  • Strong proficiency in building and maintaining ETL and data pipelines.

  • Ability to communicate effectively in English.

  • Absolute Must-haves:

    • SQL, Python, Git

  • Need to know, have experience with, or be able to quickly adopt, as you will be working with:

    • PySpark, Terraform, Airflow

    • AWS services: S3, IAM, EC2/EMR, Lambda, Glue, QuickSight (experience with their equivalents in Azure or GCP is also relevant)

    • Databricks

  • Additional points for:

    • Iceberg, Docker, Kubernetes, Helm, GitLab CI/CD

What We Value

  • Ownership: A strong ability to take ownership and move towards shared goals without supervision.

  • Collaboration: A positive, can-do attitude with a no-excuses startup mindset; clear, honest, and timely communication.

  • Innovation: A fervent passion to learn new skills and technologies, seeking improvement, being open to new ideas, and making data-driven decisions.

  • Adaptability: Thriving in a fast-paced and evolving environment, being flexible and ready to take on new challenges.

  • Practicality and efficiency: Applying the “80/20 rule” (Pareto principle) when solving tasks.

At ThreatMark, we value diversity and are committed to creating an inclusive environment for all employees. If you are passionate about data analysis and eager to contribute to a team that is making a significant impact in the cybersecurity landscape, we encourage you to apply. Please submit your resume and a brief cover letter explaining your interest in the role and how your skills align with our mission.
