Job seekers, please send resumes to email@example.com
- Hive / ETL tool (Talend, Pentaho, Informatica, or similar) / Impala / MapReduce / Spark
- Scala / Python / Java
- ETL / Data warehousing
- RDBMS / Data modelling
- Design and build distributed, scalable, and reliable data pipelines that ingest and process data at scale and in real time.
- Data analysis and data exploration
- Explore new data sources and data from new domains
- Productionalize real-time and batch ML models on Python/Spark
- Evaluate big data technologies and prototype solutions to improve our data processing architecture.
- 8+ years of hands-on programming experience, including 4+ years on the Hadoop platform
- Proficiency in data analysis and strong SQL skills.
- Knowledge of the various components of the Hadoop ecosystem (Hive, Impala, Spark with Scala, MapReduce) and experience applying them to practical problems.
- Proficiency with shell scripting and Python
- Experience in data warehousing, ETL tools, and MPP database systems
- Experience working with Hive and Impala, including creating custom UDFs and custom input/output formats and SerDes
- Ability to acquire, compute, store, and process various types of datasets on the Hadoop platform
- Understanding of visualization platforms such as Tableau
- Experience with Scala.
- Excellent written and verbal communication skills
Minimum years of experience: 8
Interview process (is face-to-face required?): No
Does this position require visa-independent candidates only? Yes