Hadoop Developer Resume
SUMMARY:
- 9+ years of IT experience with extensive knowledge of the software development life cycle (SDLC), covering requirements gathering, design, architecture, analysis, development, maintenance, and implementation.
- 4+ years of exclusive experience in Hadoop and its components, including HDFS, MapReduce, Hive, Apache Pig, HCatalog, Kafka, Sqoop, HBase, Flume, ZooKeeper, Spark, and Oozie.
- Proficient in data architecture, data warehousing, big data, data integration, data governance, and metadata management using custom, open-source, or off-the-shelf tools.
- Experience implementing the data lake concept.
- Involved in project bidding on the Hadoop platform.
- Experience with distributed message brokers such as Apache Kafka.
- Working experience in creating complex data ingestion pipelines, data transformations, data management, and data governance in a centralized enterprise data hub.
- Experience with distributed processing frameworks such as MapReduce, Spark, and Tez.
- Extensive experience in developing Pig Latin scripts and using Hive Query Language (HiveQL) for data analytics.
- Experience architecting, solutioning, and modeling data integration platforms using Sqoop, Flume, Kafka, and Spark.
- Hands-on experience using Sqoop to import data from various RDBMS sources into HDFS and vice versa.
- Good knowledge of Kafka for streaming real-time feeds from external REST applications into Kafka topics (a minimal sketch follows this summary).
- Experience working with Cloudera, Hortonworks, and Confidential Big Insights distributions.
- Experience using the NumPy and pandas packages in Python.
- Experience in indexing log files using Elasticsearch.
- Able to assess business rules, collaborate with stakeholders, and perform source-to-target data mapping, design, and review.
- Experience with data warehouse concepts and implementation of large-scale data warehouses.
- Knowledge of implementing the ELK stack (Elasticsearch, Logstash, and Kibana).
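The Kafka streaming experience above can be illustrated with a minimal sketch: a Python loop that polls an external REST endpoint with requests and publishes each record to a topic with the kafka-python client. The endpoint URL, topic name, and broker address are hypothetical placeholders, not values from any particular engagement.

```python
# Minimal sketch: poll an external REST endpoint and publish each record to a
# Kafka topic. The endpoint URL, topic name, and broker address below are
# hypothetical placeholders.
import json
import time

import requests
from kafka import KafkaProducer  # kafka-python client

producer = KafkaProducer(
    bootstrap_servers=["localhost:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

while True:
    resp = requests.get("https://example.com/api/events", timeout=30)
    resp.raise_for_status()
    for event in resp.json():
        # Key-less send; Kafka round-robins the record across partitions.
        producer.send("events-topic", value=event)
    producer.flush()
    time.sleep(60)  # poll interval
```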
TECHNICAL SKILLS:
Technologies: Hadoop, Hive, Pig, HBase, Sqoop, AWS, HCatalog, Flume, ZooKeeper, Kafka, Spark, Elasticsearch, Ambari, Zeppelin, Hue, Oozie, Logstash, Kibana, Maven, GitHub, Jenkins.
ETL Tools: DataStage 7.5x2 (PX) and 8.5
Database: Oracle 9i, DB2, Teradata.
Languages: C, SQL, Scala, Python.
Tools & Utilities: Control-M, SQL Developer, WebSphere MQ, Eclipse, IntelliJ, PyCharm
PROFESSIONAL EXPERIENCE:
Confidential
Hadoop Developer
Modules Environment: Hortonworks 6.2, Sqoop, Vertabelo, Hive, GitHub, SourceTree.
- Analyze the various bank applications and the structure of their data prior to data modeling, and determine the data transformations required before loading.
- Perform design for all lines of business (LOBs) of the bank as per the risk requirements.
- Create and manage conceptual, logical, and physical models for the risk data store.
- Develop data ingestion jobs into the Hadoop platform based on business requirements.
- Prepare high-level design documents, translating elicited requirements into technical specifications for the different modules.
- Develop scripts for managing and scheduling data ingestion jobs into the Hadoop cluster (a minimal sketch follows this list).
- Review project and task statuses and issues with the business, ensuring on-time completion of the project.
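A minimal sketch of the kind of ingestion script referenced above, assuming Sqoop-based ingestion driven from a Python wrapper; the JDBC URL, credentials path, table list, and HDFS target directories are hypothetical placeholders.

```python
# Minimal sketch of an ingestion driver: builds and runs a `sqoop import` for
# each source table, writing to a per-table HDFS directory. Connection string,
# credentials, and table names are hypothetical placeholders.
import subprocess

JDBC_URL = "jdbc:oracle:thin:@//dbhost:1521/ORCL"   # placeholder
TABLES = ["CUSTOMERS", "ACCOUNTS", "TRANSACTIONS"]  # placeholder table list

def ingest(table: str) -> None:
    cmd = [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", "etl_user",
        "--password-file", "/user/etl/.db_password",
        "--table", table,
        "--target-dir", f"/data/raw/{table.lower()}",
        "--num-mappers", "4",
        "--as-parquetfile",
    ]
    subprocess.run(cmd, check=True)  # fail fast so the scheduler can retry

if __name__ == "__main__":
    for table in TABLES:
        ingest(table)
```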
Confidential
Hadoop Admin & Developer
Modules Environment: CentOS 6.7, Cloudera 5.7, Solr, Sqoop, Kafka, Spark, Hive, R, GitHub, Jenkins, Maven, D3.
Responsibilities:
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Handled importing of data from various data sources, performed transformations, loaded data into HDFS.
- Worked on custom Pig loader and storage classes to handle a variety of data formats such as JSON, CSV, etc.
- Developed Spark applications to perform data transformations on data coming from multiple sources.
- Created Spark applications to perform data cleansing, validation, transformation, and summarization activities according to the requirements (see the sketch after this list).
- Explored MLlib algorithms in Spark to understand the possible Machine Learning functionalities that can be used for our use case.
- Used both the Hive context and the SQL context of Spark for initial testing of Spark jobs.
- Executed Hive queries through Spark SQL within the integrated Spark environment.
- Worked on designing D3 dashboards and mentored junior team members.
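A minimal PySpark sketch of the cleansing, validation, transformation, and summarization pattern described above, shown with the newer SparkSession entry point rather than the HiveContext/SQLContext APIs mentioned in the bullets; the input path, column names, and view name are hypothetical placeholders.

```python
# Minimal PySpark sketch of the cleansing / validation / summarization pattern.
# Paths, column names, and the temporary view name are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("customer-summary")
    .enableHiveSupport()          # lets spark.sql() also read Hive tables
    .getOrCreate()
)

# Cleansing: drop malformed rows and normalize a few fields.
raw = spark.read.json("/data/raw/customers")
clean = (
    raw.dropna(subset=["customer_id", "event_ts"])
       .withColumn("event_date", F.to_date("event_ts"))
       .filter(F.col("amount") >= 0)       # simple validation rule
)

# Summarization: daily aggregates per customer.
summary = (
    clean.groupBy("customer_id", "event_date")
         .agg(F.count("*").alias("events"),
              F.sum("amount").alias("total_amount"))
)

# Expose the result to analysts via SQL.
summary.createOrReplaceTempView("daily_summary")
spark.sql("SELECT * FROM daily_summary ORDER BY total_amount DESC LIMIT 10").show()
```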
Confidential
Hadoop Developer
Modules Environment: CentOS 6.7, Cloudera 5.5, Hive, Sqoop, Kafka, Pig, Flume, HCatalog, Hue, Oozie
Responsibilities:
- Optimized Hive queries using partitioning and bucketing techniques to control data distribution.
- Wrote Sqoop scripts to ingest data from different RDBMS data sources.
- Used Flume and Sqoop extensively to gather and move data files from application servers to the Hadoop Distributed File System (HDFS).
- Developed Hive scripts in HiveQL to de-normalize and aggregate the data.
- Implemented Hive partitioning, joins, and bucketing.
- Converted Hive/SQL queries into Spark transformations using Spark DataFrames (see the first sketch after this list).
- Analyzed the Spark SQL scripts and designed the solution for implementation.
- Used Spark SQL to process large volumes of structured data.
- Set up Zeppelin and developed Spark jobs for data analysts.
- Implemented a near-real-time data pipeline using a framework based on Kafka and Spark (see the second sketch after this list).
- Wrote Oozie workflows to schedule jobs, including Pig scripts and HiveQL.
- Experience in building and tuning real-world systems under scale and performance constraints.
- Prioritized daily workflow and balanced demands on quality, time, and resources.
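First sketch: converting a HiveQL aggregation into a Spark DataFrame job and writing the result to a partitioned Hive table, illustrating the partitioning and Hive-to-DataFrame bullets above; database, table, and column names are hypothetical placeholders.

```python
# Convert a HiveQL aggregation into a Spark DataFrame job and write the result
# to a partitioned Hive table. Table and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("hive-to-dataframe").enableHiveSupport().getOrCreate()

# Equivalent of:
#   SELECT txn_date, branch_id, SUM(amount) AS total
#   FROM raw_db.transactions GROUP BY txn_date, branch_id;
txns = spark.table("raw_db.transactions")
daily = (
    txns.groupBy("txn_date", "branch_id")
        .agg(F.sum("amount").alias("total"))
)

# Partition the output by date so downstream queries can prune partitions.
(
    daily.write
         .mode("overwrite")
         .partitionBy("txn_date")
         .saveAsTable("curated_db.daily_branch_totals")
)
```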
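Second sketch: a near-real-time Kafka-to-HDFS pipeline, shown here with Spark Structured Streaming (older Spark releases would use the DStream KafkaUtils API instead); the broker address, topic name, and output paths are hypothetical placeholders.

```python
# Near-real-time pipeline: read JSON events from Kafka, parse a couple of
# fields, and append them to HDFS as Parquet. Broker, topic, and paths are
# hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-nrt-pipeline").getOrCreate()

events = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")
         .option("subscribe", "events-topic")
         .load()
)

# Kafka delivers key/value as binary; cast the value to string and parse fields.
parsed = (
    events.selectExpr("CAST(value AS STRING) AS json")
          .select(F.get_json_object("json", "$.customer_id").alias("customer_id"),
                  F.get_json_object("json", "$.amount").cast("double").alias("amount"))
)

query = (
    parsed.writeStream
          .format("parquet")
          .option("path", "/data/stream/events")
          .option("checkpointLocation", "/data/stream/_checkpoints")
          .outputMode("append")
          .start()
)
query.awaitTermination()
```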
Confidential
Developer
Modules Environment: CentOS 6.7, Kafka, Elasticsearch, Logstash, Kibana
Responsibilities:
- Designed and implemented data processing pipelines for different kinds of data sources, formats, and content.
- Responsible for the setup and intake of logs using Apache Kafka.
- Wrote Logstash configurations to process logs from Kafka into Elasticsearch.
- Developed Elasticsearch scripts for indexing and searching log data (see the sketch after this list).
- Developed visualization dashboards using Kibana.
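A minimal sketch of indexing and searching log documents, using the current elasticsearch-py client API (older client versions pass a body= argument instead); the index name, fields, and host are hypothetical placeholders.

```python
# Index a log document and query it back with the Python Elasticsearch client.
# Index name, fields, and host are hypothetical placeholders.
from datetime import datetime

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Index a single log document.
doc = {
    "service": "payments-api",
    "level": "ERROR",
    "message": "timeout calling downstream service",
    "@timestamp": datetime.utcnow().isoformat(),
}
es.index(index="app-logs", document=doc)

# Search: recent ERROR-level entries for one service.
resp = es.search(
    index="app-logs",
    query={
        "bool": {
            "must": [
                {"match": {"service": "payments-api"}},
                {"match": {"level": "ERROR"}},
            ]
        }
    },
    size=10,
)
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["message"])
```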
Confidential
Sr Developer
Modules Environment: Unix, Teradata and DataStage 8.5
Responsibilities:
- Interacted with the end-user community to understand business requirements and identify data sources.
- Prepared and maintained the HDD document related to MBI.
- Designed and developed BTEQ scripts in Teradata based on business requirements.
- Designed and developed the ELT module using DataStage.
- Worked on the Control-M scheduling tool for automating the ETL process.
- Documented ETL test plans and test cases based on design specifications for unit and system testing, and prepared test data.
Confidential
Sr Developer
Modules Environment: Unix, Teradata and DataStage 8.5
Responsibilities:
- Worked and coordinated with subject matter experts and business data quality teams on data requirement and technical documents.
- Prepared and maintained the AAA document.
- Designed and developed BTEQ scripts in Teradata based on business requirements.
- Designed and developed the ELT module using DataStage.
- Prepared a Visio design document for the ETL job flow.
- Worked on the Control-M scheduling tool for automating the ETL process.
- Involved in unit testing, integration testing, and UAT.
Confidential
Sr Developer
Modules Environment: Unix, DB2, and DataStage 7.5
Responsibilities:
- Worked and coordinated with subject matter experts and business data quality teams on data requirement and technical documents.
- Prepared and maintained the AAA document.
- Designed and developed the ELT module using DataStage.
- Prepared a Visio design document for the ETL job flow.
- Worked on the Control-M scheduling tool for automating the ETL process.
- Involved in unit testing, integration testing, and UAT.
Confidential
Responsibilities:
- Responsible for the creation, execution, and maintenance of DB2 databases that tracked employee access, and assisted in report creation.
- Conducted and participated in sessions with project managers, the business analysis team, finance, and development teams to gather, analyze, and document business and reporting requirements.
- Generated daily, weekly, and monthly inventory reports utilizing Cognos.
- Provided technical assistance in identifying, evaluating, and developing cost effective systems and procedures that met business requirements.
- Created ad hoc reports for management to support key decision-making.
- Involved in documenting the project steps and presenting to the team members.
- Maintained weekly department reporting utilizing Cognos and Excel.