[VO2 Data] - Data Engineer GCP

Job summary
Full-time
Paris
Salary: Not specified
A few days working from home
Skills and expertise
Data storage
Programming languages
Composer
Java
NoSQL

VO2 GROUP


Job description

Data Collection: Extraction of data from various sources, whether it be databases, files, or real-time data streams.

Data Cleaning and Transformation: Cleaning, filtering, enriching, and transforming data to prepare it for analysis. This may include handling missing data, normalization, format conversion, etc.
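As a small illustration of the kind of cleaning step described above, here is a plain-Python sketch (hypothetical field names; in practice this logic would typically live inside a Dataflow/Beam transform):

```python
from datetime import datetime

def clean_record(raw: dict) -> dict:
    """Normalize one raw event: fill missing fields, coerce types, unify formats."""
    return {
        # Fill a missing country code with a sentinel rather than dropping the row.
        "country": (raw.get("country") or "unknown").strip().lower(),
        # Coerce the amount to float; treat empty strings as 0.0.
        "amount": float(raw.get("amount") or 0.0),
        # Convert "DD/MM/YYYY" dates to ISO 8601 for downstream consistency.
        "date": datetime.strptime(raw["date"], "%d/%m/%Y").date().isoformat(),
    }

raw_events = [
    {"country": " FR ", "amount": "12.5", "date": "03/01/2024"},
    {"country": None, "amount": "", "date": "04/01/2024"},
]
cleaned = [clean_record(r) for r in raw_events]
```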

Data Pipeline Design: Creation of data pipelines to automate data flow, including managing dependencies between different pipeline stages.
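Managing dependencies between pipeline stages can be sketched with a topological ordering over a stage graph (hypothetical stage names; an orchestrator such as Cloud Composer does this at production scale):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline stages mapped to their prerequisites.
stages = {
    "extract": set(),
    "clean": {"extract"},
    "enrich": {"extract"},
    "load": {"clean", "enrich"},
}

# static_order() yields an execution order that respects every dependency:
# "extract" always runs first, "load" always last.
order = list(TopologicalSorter(stages).static_order())
```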

Data Storage: Selection of appropriate storage solutions, whether it be Google Cloud Storage, Bigtable, BigQuery, or other GCP services.

Data Integration: Integrating data into data warehouses, columnar data stores, NoSQL databases, or data lakes.

Data Quality Management: Implementation of data quality controls to ensure data integrity and quality.

Data Security: Implementation of security measures to protect sensitive data, including data access, identity and access management, encryption, etc.

Performance Optimization: Monitoring and optimizing the performance of data pipelines to ensure quick response to queries and efficient resource utilization.

Documentation: Documenting data pipelines, data schemas, and processes to facilitate understanding and collaboration.

Automation: Automating ETL (Extract, Transform, Load) processes to minimize manual intervention.
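A minimal sketch of the Extract-Transform-Load chain as composable functions (illustrative data and a JSON string standing in for the real target store):

```python
import csv
import io
import json

def extract(csv_text: str) -> list[dict]:
    """Extract: parse rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows: list[dict]) -> list[dict]:
    """Transform: keep only rows with a numeric visit count and cast types."""
    return [{"user": r["user"], "visits": int(r["visits"])}
            for r in rows if r["visits"].isdigit()]

def load(rows: list[dict]) -> str:
    """Load: serialize for the target store (JSON here as a stand-in)."""
    return json.dumps(rows)

source = "user,visits\nalice,3\nbob,x\ncarol,7\n"
result = load(transform(extract(source)))
```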

Collaboration: Collaborating with data scientists, analysts, and other team members to understand their needs and ensure data readiness for analysis.

Monitoring: Constant monitoring of data pipelines to detect and resolve potential issues.

Scalability: Designing scalable data pipelines capable of handling growing data volumes.

This list of tasks is not exhaustive and is subject to change.

Profile sought

GCP Mastery: A deep understanding of GCP services and tools is essential for designing and implementing data engineering solutions.

Real-time Data Processing: Ability to design and implement real-time data pipelines using services like Dataflow or Pub/Sub.

Batch Data Processing: Competence in creating batch data processing workflows with tools like Dataprep and BigQuery.

Programming Languages: Proficiency in programming languages such as Python, Java, or Go for script and application development.

Databases: Knowledge of both NoSQL databases (Cloud Bigtable, Firestore) and SQL databases (BigQuery, Cloud SQL) for data storage and retrieval.
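To make the SQL side concrete, here is a store-and-aggregate sketch using in-memory SQLite as a local stand-in for a managed service such as Cloud SQL (table and column names are invented for illustration):

```python
import sqlite3

# In-memory SQLite standing in for a managed SQL store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (country TEXT, amount REAL)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [("fr", 12.5), ("de", 7.5), ("fr", 2.0)])

# Aggregate per country, as one would with GROUP BY in BigQuery SQL.
totals = dict(conn.execute(
    "SELECT country, SUM(amount) FROM events GROUP BY country"))
conn.close()
```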

Data Security: Understanding of data security best practices, including authorization management, encryption, and compliance.

Orchestration Tools: Ability to use orchestration tools such as Cloud Composer or Cloud Dataflow to manage data pipelines.

Problem-solving: Aptitude to solve complex problems related to data collection, processing, and storage, as well as optimizing the performance of data pipelines.
