Job ID:
Company: Internal Postings
Location: New York, NY
Type:
Duration: 12 Months+
Salary:
Status:
Openings:
Posted: 24 May 2019
Job Seekers, please send resumes to

Required Skills:

  1. SQL
  2. Hive/Impala/MapReduce/Spark (Talend, Pentaho, Informatica, or similar ETL)
  3. Unix
  4. Scala/Python/Java
  5. ETL/Data Warehousing
  6. RDBMS/Data Modelling
  7. Tableau/QlikView

Job Responsibilities:

  • Design and build distributed, scalable, and reliable data pipelines that ingest and process data at scale and in real time.
  • Perform data analysis and data exploration.
  • Explore new data sources and data from new domains.
  • Productionize real-time/batch ML models in Python/Spark.
  • Evaluate big data technologies and prototype solutions to improve our data processing architecture.

Candidate Profile:

  • 8+ years of hands-on programming experience, with 4+ years on the Hadoop platform
  • Proficiency in data analysis and strong SQL skills
  • Knowledge of the various components of the Hadoop ecosystem and experience applying them to practical problems: Hive/Impala/Spark (Scala)/MapReduce
  • Proficiency with shell scripting and Python
  • Experience with data warehousing, ETL tools, and MPP database systems
  • Experience working with Hive and Impala, including creating custom UDFs and custom input/output formats/SerDes
  • Ability to acquire, compute, store, and process various types of datasets on the Hadoop platform
  • Understanding of visualization platforms such as Tableau
  • Experience with Scala
  • Excellent written and verbal communication skills

Minimum years of experience: 8

Interview Process (Is face-to-face required?): No

Does this position require visa-independent candidates only? Yes