Cybrient Technologies SA

Data Engineer

Job Description - Data Engineer

Position: Data Engineer
Engagement: Full-time, 1-year contract (extendable)
Location: Houston, United States of America (USA), on-site
Languages: English (working proficiency)
Expectations: On-site, 5 days a week

Role Overview

We are seeking a Data Engineer (Data Ingestion & Cloud Modernization) to design, build, and maintain scalable data ingestion pipelines across modern cloud platforms. The role focuses on implementing reliable batch and near-real-time data pipelines, modernizing legacy ETL processes, and enabling data platforms using Azure and Snowflake technologies.

This position is well suited to mid-level data engineering professionals with hands-on experience in data ingestion, cloud-based data platforms, and Python-based pipeline development. The role requires close collaboration with data platform, analytics, and application teams to ensure reliable, secure, and high-quality data delivery across the organization.

Key Responsibilities

Data Ingestion & Pipeline Engineering
- Design, build, and maintain robust data ingestion pipelines (batch and near-real-time) from diverse sources such as databases, APIs, files, and event streams.
- Migrate legacy ETL/ELT processes to modern Azure and Snowflake architectures through re-platforming and refactoring initiatives.
- Implement incremental data loads, CDC (Change Data Capture) patterns, schema evolution handling, and backfill/reprocessing strategies.
- Standardize ingestion workflows by developing reusable frameworks, templates, and best practices.

Azure & Snowflake Modernization
- Develop cloud-native ingestion solutions using Azure services such as:
  - Azure Data Factory / Synapse Pipelines for orchestration.
  - Azure Databricks and/or Spark for transformations.
  - Azure Storage / ADLS Gen2 for landing and staging layers.
  - Event-driven services (e.g., Event Hubs) where applicable.
- Build ingestion and loading patterns into Snowflake using:
  - Snowflake stages, file formats, and COPY INTO commands.
  - Snowflake Streams and Tasks where appropriate.
  - Data modelling foundations for raw-to-curated data layers using dbt.

Real-Time Data Ingestion
- Build components to capture streaming data sources.
- Develop real-time transformation pipelines and ensure timely delivery of data to downstream consumer services.

Platform Enablement & Reusable Components
- Develop shared service components, Python libraries, and integration templates to accelerate delivery across Data Engineering and Application teams.
- Follow integration best practices and ensure consistency across digital services and data pipelines.

Data Quality, Reliability & Observability
- Implement data validation and quality checks (completeness, freshness, duplicates, schema drift).
- Ensure pipelines are reliable and recoverable through idempotency, retry logic, re-run capabilities, and alerting mechanisms.
- Implement observability through logging, metrics, lineage metadata, and pipeline health dashboards.

Security, Governance & Ways of Working
- Apply security best practices, including least-privilege access, secrets management, encryption, and secure connectivity.
- Follow data governance standards such as naming conventions, data retention policies, classification, and documentation.
- Collaborate within agile delivery processes, including code reviews, CI/CD pipelines, iterative release planning, and cross-team coordination.

Qualifications

Education: Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field (or equivalent practical experience).
Experience: 3-6 years of hands-on data engineering experience with a strong focus on data ingestion and pipeline development.
Skills:
- Experience building production pipelines using Azure Data Factory, Databricks, or Synapse.
- Strong SQL skills and experience working with modern cloud data warehouses, ideally Snowflake.
- Proficiency in Python for data processing, automation, and pipeline utilities.
- Experience with data ingestion patterns such as batch processing, CDC, and streaming ingestion.
- Familiarity with cloud data architecture concepts and modern ELT practices.

Soft Skills
- Strong analytical and problem-solving abilities.
- Collaborative mindset with the ability to work across data, engineering, and application teams.
- Attention to detail with a focus on data reliability and quality.
- Proactive approach to improving data platform capabilities and automation.
- Effective communication and documentation skills.
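For candidates gauging the expected level, the data validation and quality checks named under "Data Quality, Reliability & Observability" (completeness, freshness, duplicates) are representative of day-to-day work in this role. A minimal sketch in plain Python follows; the function and field names are illustrative only and are not part of the team's actual codebase:

```python
from datetime import datetime, timedelta, timezone

def validate_batch(rows, key_field, ts_field, required_fields,
                   max_age_hours=24, now=None):
    """Run basic completeness, freshness, and duplicate checks on a batch.

    `rows` is a list of dicts; returns a dict mapping check name to a
    list of human-readable issues (empty lists mean the check passed).
    """
    now = now or datetime.now(timezone.utc)
    issues = {"completeness": [], "freshness": [], "duplicates": []}

    # Completeness: every required field must be present and non-null.
    for i, row in enumerate(rows):
        missing = [f for f in required_fields if row.get(f) is None]
        if missing:
            issues["completeness"].append(f"row {i}: missing {missing}")

    # Freshness: the newest record must fall within the allowed age window.
    timestamps = [row[ts_field] for row in rows if row.get(ts_field) is not None]
    if not timestamps or now - max(timestamps) > timedelta(hours=max_age_hours):
        issues["freshness"].append(f"no records newer than {max_age_hours}h")

    # Duplicates: the business key must be unique within the batch.
    seen = set()
    for row in rows:
        key = row.get(key_field)
        if key in seen:
            issues["duplicates"].append(f"duplicate key {key!r}")
        seen.add(key)

    return issues
```

In production these checks would typically run inside an orchestrated pipeline step (e.g., an Azure Data Factory or Databricks task) and feed the alerting and pipeline-health dashboards mentioned above, rather than being called ad hoc.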