One of our clients is looking for an experienced, self-driven Senior Data Engineer to design, implement, and optimize the data pipelines on their cloud-based big data platform.
- Design, build, optimize, and maintain data pipelines, from ingestion and processing through to generating the metrics and data required by the business operations team, product managers, and data scientists.
- Crystallize BI requirements with the business operations team and product managers, and identify the required data sources in collaboration with our development team.
- Model our data warehouse based on the source systems and business requirements, and develop business reports.
- Optimize our Hadoop/Spark-based processing framework and batch jobs for increased processing throughput, reliability, and accuracy.
Desired qualifications and skills:
- Strong experience in ETL, including simplifying and optimizing complex SQL.
- Hands-on experience with Hadoop systems, in particular Spark and Hive.
- Experience in system and performance optimization.
- Experience building analytics reports and dashboards for business stakeholders.
- Strong in SQL and familiar with the Java and Scala programming languages.
- Hands-on experience with Linux/Unix environments and the Java VM.
- Master's or bachelor's degree in computer science, information systems, or equivalent.
What we consider an advantage:
- Experience with any BI reporting tool.
- Experience in building real-time data pipelines is a big plus.
- Knowledge of data administration frameworks and methodologies (such as metadata, data quality, etc.).
- Knowledge of or experience with large-scale batch data processing, streaming, and machine learning.