
Hadoop Developer Resume

SUMMARY:

  • 9+ years of IT experience with extensive knowledge of the software development life cycle (SDLC), spanning requirements gathering, design, architecture, analysis, development, maintenance, and implementation.
  • 4+ years of dedicated experience in Hadoop and its ecosystem components, including HDFS, MapReduce, Hive, Apache Pig, HCatalog, Kafka, Sqoop, HBase, Flume, ZooKeeper, Spark, and Oozie.
  • Proficient in data architecture, data warehousing, big data, data integration, data governance, and metadata management using custom, open-source, or off-the-shelf tools.
  • Experience implementing the data lake concept.
  • Involved in project bidding for Hadoop platform engagements.
  • Experience with distributed message brokers such as Apache Kafka.
  • Working experience creating complex data ingestion pipelines, data transformations, data management, and data governance in a centralized enterprise data hub.
  • Experience with distributed processing frameworks such as MapReduce, Spark, and Tez.
  • Extensive experience developing Pig Latin scripts and using Hive Query Language (HiveQL) for data analytics.
  • Architected, designed, and modeled data integration platforms using Sqoop, Flume, Kafka, and Spark.
  • Hands-on experience using Sqoop to import data from various RDBMSs into HDFS and vice versa.
  • Good knowledge of Kafka for streaming real-time feeds from external REST applications into Kafka topics (see the producer sketch after this list).
  • Experience working with the Cloudera, Hortonworks, and Confidential Big Insights distributions.
  • Experience using the NumPy and pandas packages in Python.
  • Experience in indexing log files using Elasticsearch.
  • Able to assess business rules, collaborate with stakeholders and perform source-to-target data mapping, design and review.
  • Experience with data warehouse concepts and implementation of large-scale data warehouses.
  • Knowledge of implementing the ELK stack (Elasticsearch, Logstash, and Kibana).
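As an illustration of the Kafka feed-ingestion pattern noted above, here is a minimal Python sketch, assuming the kafka-python client; the endpoint URL, topic name, and broker address are hypothetical placeholders rather than details from any project below.

```python
import json

import requests
from kafka import KafkaProducer

# Hypothetical endpoint, topic, and broker address for illustration only.
FEED_URL = "https://example.com/api/events"
TOPIC = "external-feed-events"

producer = KafkaProducer(
    bootstrap_servers=["localhost:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)


def stream_feed_once():
    """Pull one batch of records from the REST feed and publish each to Kafka."""
    response = requests.get(FEED_URL, timeout=30)
    response.raise_for_status()
    for record in response.json():
        producer.send(TOPIC, value=record)
    producer.flush()


if __name__ == "__main__":
    stream_feed_once()
```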

TECHNICAL SKILLS:

Technologies: Hadoop, Hive, Pig, HBase, Sqoop, AWS, HCatalog, Flume, ZooKeeper, Kafka, Spark, Elasticsearch, Ambari, Zeppelin, Hue, Oozie, Logstash, Kibana, Maven, GitHub, Jenkins.

ETL Tool: DataStage 7.5x2 (PX) and 8.5

Database: Oracle 9i, DB2, Teradata.

Languages: C, SQL, Scala, Python.

Tools & Utilities: Control-M, SQL Developer, WebSphere MQ, Eclipse, IntelliJ, PyCharm

PROFESSIONAL EXPERIENCE:

Confidential

Hadoop Developer

Environment: Hortonworks 6.2, Sqoop, Vertabelo, Hive, GitHub, SourceTree.

  • Analyzed the various bank applications and their data structures prior to data modeling, and identified the data transformations required before loading.
  • Designed data models for all lines of business (LOBs) of the bank per the risk requirements.
  • Created and managed conceptual, logical, and physical models for the risk data store.
  • Developed data ingestion jobs into the Hadoop platform based on business requirements (see the ingestion sketch after this list).
  • Prepared high-level design documents that translated elicited requirements into technical specifications for the different modules.
  • Developed scripts for managing and scheduling data ingestion jobs on the Hadoop cluster.
  • Reviewed project and task statuses and issues with the business and ensured on-time completion of the project.
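Per the environment list, the ingestion jobs above were built around Sqoop and Hive; the sketch below shows the same RDBMS-to-Hive ingestion pattern expressed in PySpark instead, as a minimal illustration. The connection URL, credentials, and table names are hypothetical.

```python
from pyspark.sql import SparkSession

# Hypothetical connection details and table names for illustration only.
JDBC_URL = "jdbc:oracle:thin:@//dbhost:1521/RISKDB"
SOURCE_TABLE = "LOANS"
TARGET_TABLE = "risk_store.loans"

spark = (
    SparkSession.builder
    .appName("rdbms-to-hive-ingestion")
    .enableHiveSupport()
    .getOrCreate()
)

# Read the source table over JDBC (the JDBC driver jar must be on the
# classpath), then land it as a managed Hive table.
loans = (
    spark.read.format("jdbc")
    .option("url", JDBC_URL)
    .option("dbtable", SOURCE_TABLE)
    .option("user", "etl_user")
    .option("password", "***")
    .option("driver", "oracle.jdbc.OracleDriver")
    .load()
)

loans.write.mode("overwrite").saveAsTable(TARGET_TABLE)
```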

Confidential

Hadoop Admin & Developer

Environment: CentOS 6.7, Cloudera 5.7, Solr, Sqoop, Kafka, Spark, Hive, R, GitHub, Jenkins, Maven, D3.js.

Responsibilities:

  • Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
  • Handled importing of data from various data sources, performed transformations, loaded data into HDFS.
  • Worked on custom Pig loader and storage classes to handle a variety of data formats such as JSON and CSV.
  • Developed Spark applications to perform the data transformations required for data coming from multiple sources.
  • Created Spark applications to perform data cleansing, validation, transformation, and summarization activities according to the requirements (a sketch follows this list).
  • Explored MLlib algorithms in Spark to understand the machine learning functionality applicable to our use case.
  • Used both the Hive context and the SQL context of Spark for initial testing of Spark jobs.
  • Executed Hive queries through Spark SQL within the Spark environment.
  • Worked on designing D3.js dashboards and mentored junior team members.
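A minimal PySpark sketch of the kind of cleansing, validation, and summarization job described above; the table names and columns are hypothetical placeholders, not details from the actual engagement.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("customer-cleansing")
    .enableHiveSupport()
    .getOrCreate()
)

# Hypothetical source table and columns for illustration.
raw = spark.table("staging.customers")

# Cleansing and validation: trim strings, drop rows missing the key,
# keep plausible ages, and remove duplicate customers.
clean = (
    raw.withColumn("customer_name", F.trim(F.col("customer_name")))
       .dropna(subset=["customer_id"])
       .filter((F.col("age") >= 18) & (F.col("age") <= 120))
       .dropDuplicates(["customer_id"])
)

# Summarization: customer count and average spend by region.
summary = clean.groupBy("region").agg(
    F.count("customer_id").alias("customer_count"),
    F.avg("total_spend").alias("avg_spend"),
)

summary.write.mode("overwrite").saveAsTable("analytics.customer_summary")
```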

Confidential

Hadoop Developer

Environment: CentOS 6.7, Cloudera 5.5, Hive, Sqoop, Kafka, Pig, Flume, HCatalog, Hue, Oozie.

Responsibilities:

  • Optimized Hive queries using partitioning and bucketing techniques to control data distribution.
  • Wrote Sqoop scripts to ingest data from different RDBMS data sources.
  • Used Flume and Sqoop extensively to gather and move data files from application servers to the Hadoop Distributed File System (HDFS).
  • Developed HiveQL scripts to de-normalize and aggregate the data.
  • Implemented Hive partitions, joins, and bucketing.
  • Converted Hive/SQL queries into Spark transformations using Spark DataFrames (see the batch sketch after this list).
  • Analyzed the Spark SQL scripts and designed the solution for implementation.
  • Used Spark SQL to process large volumes of structured data.
  • Set up Zeppelin and developed Spark jobs for data analysts.
  • Implemented a near-real-time data pipeline based on Kafka and Spark (a streaming sketch also follows this list).
  • Wrote Oozie workflows to schedule jobs running Pig scripts and HiveQL.
  • Built and tuned real-world systems under scale and performance constraints.
  • Prioritized daily workflow and balanced demands on quality, time, and resources.
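The batch sketch below illustrates the Hive-to-Spark conversion and partitioning work described above: a HiveQL aggregation re-expressed with the DataFrame API and written to a partitioned Hive table. Table and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("hive-to-spark")
    .enableHiveSupport()
    .getOrCreate()
)

# Original HiveQL (for reference):
#   SELECT txn_date, account_id, SUM(amount) AS total_amount
#   FROM raw.transactions GROUP BY txn_date, account_id;

# Equivalent DataFrame transformation.
transactions = spark.table("raw.transactions")
daily_totals = (
    transactions.groupBy("txn_date", "account_id")
                .agg(F.sum("amount").alias("total_amount"))
)

# Write to a Hive table partitioned by transaction date to control
# data distribution and prune partitions at query time.
(
    daily_totals.write
    .mode("overwrite")
    .partitionBy("txn_date")
    .saveAsTable("curated.daily_account_totals")
)
```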
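For the near-real-time Kafka and Spark pipeline, the following is a hedged sketch using Spark Structured Streaming, which is one way to express such a pipeline in a newer Spark version (it requires the spark-sql-kafka package); the original Cloudera 5.5-era pipeline likely used a different Spark streaming API, and the topic, paths, and broker address are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-near-real-time").getOrCreate()

# Subscribe to a Kafka topic (hypothetical broker and topic names).
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "transactions")
    .load()
)

# Kafka delivers key/value as binary; cast the value to string before parsing.
parsed = events.select(F.col("value").cast("string").alias("json_value"))

# Continuously append the parsed records to HDFS as Parquet files.
query = (
    parsed.writeStream
    .format("parquet")
    .option("path", "/data/streaming/transactions")
    .option("checkpointLocation", "/data/streaming/_checkpoints/transactions")
    .outputMode("append")
    .start()
)

query.awaitTermination()
```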

Confidential

Developer

Environment: CentOS 6.7, Kafka, Elasticsearch, Logstash, Kibana

Responsibilities:

  • Designed and implemented data processing pipelines for different kinds of data sources, formats, and content.
  • Responsible for the setup and intake of logs using Apache Kafka.
  • Wrote Logstash configurations to process logs from Kafka into Elasticsearch.
  • Developed Elasticsearch scripts for indexing and searching log data (a Python sketch follows this list).
  • Developed visualization dashboards using Kibana.
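The indexing and search scripts mentioned above are illustrated by the minimal Python sketch below, assuming the elasticsearch-py 7.x client; the cluster address, index name, and document fields are hypothetical (in the production pipeline, indexing itself was handled by Logstash).

```python
from datetime import datetime

from elasticsearch import Elasticsearch

# Hypothetical cluster address and index name.
es = Elasticsearch(["http://localhost:9200"])
INDEX = "app-logs"

# Index a single log document.
doc = {
    "timestamp": datetime.utcnow().isoformat(),
    "level": "ERROR",
    "service": "payments",
    "message": "connection timed out",
}
es.index(index=INDEX, body=doc)

# Search for error-level entries from the same service.
results = es.search(
    index=INDEX,
    body={
        "query": {
            "bool": {
                "must": [
                    {"match": {"level": "ERROR"}},
                    {"match": {"service": "payments"}},
                ]
            }
        }
    },
)
for hit in results["hits"]["hits"]:
    print(hit["_source"]["message"])
```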

Confidential

Sr Developer

Environment: Unix, Teradata, and DataStage 8.5

Responsibilities:

  • Interacted with end user community to understand the business requirements and in identifying data sources.
  • Prepared and maintained the HDD document related to MBI.
  • Designed and developed BTEQ scripts in Teradata based on business requirements.
  • Designed and developed ELT modules using DataStage.
  • Worked on the Control-M scheduling tool to automate the ETL process.
  • Documented ETL test plans and test cases based on design specifications for unit and system testing, and prepared test data.

Confidential

Sr Developer

Environment: Unix, Teradata, and DataStage 8.5

Responsibilities:

  • Worked and coordinated with subject matter experts and business data quality teams on data requirement documents and technical documents.
  • Prepared and maintained the AAA document.
  • Designed and developed BTEQ scripts in Teradata based on business requirements.
  • Designed and developed ELT modules using DataStage.
  • Prepared Visio design documents for the ETL job flow.
  • Worked on the Control-M scheduling tool to automate the ETL process.
  • Involved in unit testing, integration testing, and UAT.

Confidential

Sr Developer

Environment: Unix, DB2, and DataStage 7.5

Responsibilities:

  • Worked and coordinated with subject matter experts and business data quality teams on data requirement documents and technical documents.
  • Prepared and maintained the AAA document.
  • Designed and developed ELT modules using DataStage.
  • Prepared Visio design documents for the ETL job flow.
  • Worked on the Control-M scheduling tool to automate the ETL process.
  • Involved in unit testing, integration testing, and UAT.

Confidential

Responsibilities:

  • Responsible for the creation, execution, and maintenance of DB2 databases that tracked employee access, and assisted in report creation.
  • Conducted and participated in sessions with project managers, the business analysis team, finance, and development to gather, analyze, and document the business and reporting requirements.
  • Generated daily, weekly, and monthly inventory reports utilizing Cognos.
  • Provided technical assistance in identifying, evaluating, and developing cost effective systems and procedures that met business requirements.
  • Created ad hoc reports for management to support key decision-making.
  • Involved in documenting the project steps and presenting to the team members.
  • Maintained weekly department reporting utilizing Cognos and Excel.
