
Senior Hadoop Developer Resume

Charlotte, NC

SUMMARY:

  • Over 8 years of strong experience in the IT industry, including 4 years as a Hadoop and Spark developer in domains such as financial services and healthcare. Maintained positive communication and working relationships at all levels. An enthusiastic, goal-oriented team player with excellent communication and interpersonal skills and a strong work ethic.
  • Expertise in Hadoop ecosystem components (HDFS, MapReduce, YARN, HBase, Pig, Sqoop, Flume, and Hive) for scalability, distributed computing, and high-performance computing.
  • Experience in using Hive Query Language (HiveQL) and Spark for data analytics (see the sketch after this list).
  • Experienced in installing, configuring, and maintaining Hadoop clusters.
  • Strong knowledge of creating and monitoring Hadoop clusters on VMs: Hortonworks Data Platform (HDP) 2.1 and 2.2, and CDH5 with Cloudera Manager, on Linux distributions such as Ubuntu.
  • Capable of processing large sets of structured, semi-structured, and unstructured data and supporting the systems application architecture.
  • Good working knowledge of the MapReduce framework.
  • Strong knowledge of NoSQL column-oriented databases such as HBase and their integration with Hadoop clusters.
  • Use build tools such as Maven to build projects.
  • Good knowledge of Kafka, ActiveMQ, and Spark Streaming for handling streaming data.
  • Experienced with job workflow scheduling and coordination tools such as Oozie and ZooKeeper.
  • Analyze data, interpret results, and convey findings in a concise and professional manner.
  • Partner with the Data Infrastructure team and business owners to implement new data sources and ensure consistent definitions are used in reporting and analytics.
  • Promote a full-cycle approach: request analysis, dataset creation/extraction, report creation and implementation, and final analysis delivered to the requestor.
  • Good exposure to data modeling, data profiling, data analysis, validation, and metadata management.
  • Flexible across Unix/Linux and Windows environments, working with operating systems such as CentOS 5/6 and Ubuntu 13/14.
  • Work with various file formats, including JSON, XML, CSV, and XLS.
  • Use Amazon AWS EMR and EC2 for big data processing in the cloud.
  • Experience with version control tools such as Git/GitHub.
  • Sound knowledge of designing ETL applications using tools such as Talend.
  • Experience working with job schedulers such as Oozie.
  • Strong in databases such as MySQL, Teradata, Oracle, and MS SQL Server.
  • Strong understanding of Agile and Waterfall SDLC methodologies.
  • Strong communication, collaboration, and team-building skills, with proficiency at grasping new technical concepts quickly and applying them productively.
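
As an illustration of the Spark-with-Hive analytics mentioned above, here is a minimal sketch; the database, table, and column names (healthcare.claims, member_id, claim_amount) are hypothetical placeholders, not from any actual engagement.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

object ClaimsAnalytics {
  def main(args: Array[String]): Unit = {
    // SparkSession with Hive support, so HiveQL tables are queryable directly.
    val spark = SparkSession.builder()
      .appName("ClaimsAnalytics")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical Hive table; any warehouse table with these columns works.
    val claims = spark.sql(
      "SELECT member_id, claim_amount FROM healthcare.claims")

    // The kind of aggregation typically run for reporting and analytics.
    claims.groupBy("member_id")
      .agg(sum("claim_amount").alias("total_claim_amount"))
      .show(20)

    spark.stop()
  }
}
```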

TECHNICAL SKILLS:

Big Data Technologies: Apache Spark, HDFS, YARN, Hive, MapReduce, Pig, Sqoop, Flume, Oozie, Kafka, Tez

Programming Languages: Scala, Java

RDBMS: MySQL

NoSQL Databases: HBase

Operating Systems: Linux (CentOS and SUSE), Windows

Special Tools: Maven, Autosys

Version Control: Git

Hadoop Distributions: Cloudera, Hortonworks

Cloud: AWS

ETL Tools: Talend

Visualization Tools: Tableau

PROFESSIONAL EXPERIENCE:

Confidential, Charleston, SC

Hadoop and Spark Lead Developer

Responsibilities:

  • Understood the business needs and objectives of the system; interacted with end clients/users to gather requirements for the integrated system.
  • Worked as a source system analyst, learning the various source systems by interacting with each system's SMEs.
  • Worked with file formats such as JSON, CSV, and XML using Spark SQL.
  • Involved in various project activities, including information gathering, analyzing the information, and documenting the functional and non-functional requirements.
  • Used Talend to run the ETL processes in place of hand-written Hive queries.
  • Implemented an incremental load approach in Spark for very large tables (see the sketch after this list).
  • Used Amazon Web Services (AWS) for storage and processing of data in the cloud.
  • Created the incremental eligibility document and developed code for the initial load process.
  • Extracted data from the Teradata database and loaded it into the data warehouse using Spark JDBC.
  • Performed Spark transformations and actions, tuning them for performance.
  • Loaded the transformed data into Hive using Spark's saveAsTable.
  • Used build tools such as Maven to build projects.
  • Used Impala for query processing.
  • Performed extensive unit testing by creating test cases.
  • Used Kafka and Spark Streaming for stream processing (see the streaming sketch below).
  • Worked within Agile and Waterfall development methodologies.
  • Used code repositories such as GitHub.
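
A simplified sketch of the incremental load pattern described above, combining the Spark JDBC pull from Teradata with a Hive saveAsTable write. The control table, JDBC URL, table names, and credential handling are placeholders, and the timestamp formatting is simplified; the production job's filtering logic would differ.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object IncrementalLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("IncrementalLoad")
      .enableHiveSupport()
      .getOrCreate()

    // High-water mark from the last successful run (hypothetical control table).
    val lastLoaded = spark.sql(
      "SELECT MAX(last_update_ts) FROM etl_control.load_status")
      .first().getTimestamp(0)

    // Pull only rows changed since the last run from Teradata over JDBC.
    val delta = spark.read
      .format("jdbc")
      .option("url", "jdbc:teradata://td-host/DATABASE=edw") // placeholder host
      .option("driver", "com.teradata.jdbc.TeraDriver")
      .option("dbtable",
        s"(SELECT * FROM edw.transactions WHERE last_update_ts > '$lastLoaded') src")
      .option("user", sys.env("TD_USER"))     // credentials via environment,
      .option("password", sys.env("TD_PASS")) // for the sketch only
      .load()

    // Append only the delta into the Hive warehouse table.
    delta.write.mode(SaveMode.Append).saveAsTable("warehouse.transactions")

    spark.stop()
  }
}
```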

Environment: Apache Spark, Scala, Eclipse, HBase, Talend, Hortonworks, Spark SQL, Hive, Teradata, Hue, Spark Core, Linux, GitHub, AWS, JSON.
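
One way the Kafka-to-Spark streaming path above can look, sketched here with Structured Streaming (the DStream-based Spark Streaming API is an equally valid reading of the bullet). The broker address, topic name, and HDFS paths are placeholders.

```scala
import org.apache.spark.sql.SparkSession

object KafkaStreamIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("KafkaStreamIngest")
      .getOrCreate()

    // Read a Kafka topic as a streaming DataFrame (topic name is a placeholder).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "claims-events")
      .load()
      .selectExpr("CAST(value AS STRING) AS json")

    // Land the raw events in HDFS; the checkpoint dir makes the job restartable.
    val query = events.writeStream
      .format("parquet")
      .option("path", "/data/raw/claims-events")
      .option("checkpointLocation", "/checkpoints/claims-events")
      .start()

    query.awaitTermination()
  }
}
```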

Confidential, Charlotte, NC

Senior Hadoop Developer

Responsibilities:

  • Imported and exported data between relational databases and HDFS using Sqoop.
  • Created a data lake as the data management platform for Hadoop.
  • Used Amazon Web Services (AWS) for storage and processing of data in the cloud.
  • Used Talend and DMX-h to extract data from source systems into HDFS and transform it.
  • Created workflows to run multiple Hive and Pig jobs, each triggered independently by time and data availability.
  • Used Apache Kafka for stream ingestion.
  • Developed shell scripts and automated end-to-end data management and integration work.
  • Developed a predictive analytics product using Apache Spark and SQL/HiveQL.
  • Moved data in and out of HDFS using Talend Big Data components.
  • Developed MapReduce programs for parsing data and loading it into HDFS.
  • Built reusable Hive UDF libraries for business requirements, enabling users to call these UDFs from Hive queries (see the UDF sketch after this list).
  • Automated and scheduled Sqoop jobs using Unix shell scripts.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several job types, including Java MapReduce, Hive, and Pig.
  • Worked with JSON and XML file formats.
  • Used HBase (NoSQL) to store the majority of the data, partitioned by region (see the HBase sketch below).
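
A minimal sketch of the kind of reusable Hive UDF referred to above, written in Scala against the classic org.apache.hadoop.hive.ql.exec.UDF API; the normalization rule shown is a hypothetical example, not the actual business logic.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hypothetical reusable UDF: normalizes free-text member IDs before joins
// (trim whitespace, uppercase, strip dashes).
class NormalizeId extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) return null
    new Text(input.toString.trim.toUpperCase.replace("-", ""))
  }
}
```

Once packaged into a jar, Hive registers it with ADD JAR and CREATE TEMPORARY FUNCTION normalize_id AS 'NormalizeId', after which it can be called like any built-in function in queries.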

Environment: Hadoop, MapReduce, Talend, HiveQL, Oracle, Cloudera, HDFS, Hive, HBase, Java, Tableau, Pig, Sqoop, UNIX, Spark, Scala, JSON, AWS.
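
A sketch of the region-keyed HBase write pattern mentioned above, using the standard HBase client API; the table name, column family, and row-key scheme are placeholder assumptions.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes

object RegionKeyedWrite {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()
    val connection = ConnectionFactory.createConnection(conf)
    val table = connection.getTable(TableName.valueOf("member_data"))

    try {
      // Prefixing the row key with the region code keeps each region's rows
      // together on disk, which is the partitioning idea described above.
      val rowKey = Bytes.toBytes("SE|member-00042") // region|member id (placeholders)
      val put = new Put(rowKey)
      put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("plan"), Bytes.toBytes("GOLD"))
      table.put(put)
    } finally {
      table.close()
      connection.close()
    }
  }
}
```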

Confidential, Louisville, Kentucky

Hadoop Developer

Responsibilities:

  • Involved in requirements gathering and business analysis; translated business requirements into technical designs for Hadoop and big data.
  • Imported and exported data between relational databases and HDFS using Sqoop.
  • Developed shell scripts and automated end-to-end data management and integration work.
  • Developed MapReduce programs for parsing data and loading it into HDFS.
  • Built reusable Hive UDF libraries for business requirements, enabling users to call these UDFs from Hive queries.
  • Automated and scheduled Sqoop jobs using Unix shell scripts.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several job types, including Java MapReduce, Hive, Pig, and Sqoop.
  • Used HBase to store the majority of the data, partitioned by region.
  • Developed MapReduce programs for data analysis and cleansing (see the mapper sketch after this list).
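
A minimal sketch of the data-cleansing style of MapReduce job mentioned above, as a map-only Scala mapper; the delimiter, expected column count, and validity rule are hypothetical. A driver would set this mapper class and job.setNumReduceTasks(0) to skip the reduce phase.

```scala
import org.apache.hadoop.io.{LongWritable, NullWritable, Text}
import org.apache.hadoop.mapreduce.Mapper

// Map-only cleansing job: drop malformed records and trim fields before load.
class CleanseMapper extends Mapper[LongWritable, Text, NullWritable, Text] {
  private val out = new Text()

  override def map(key: LongWritable, value: Text,
                   context: Mapper[LongWritable, Text, NullWritable, Text]#Context): Unit = {
    val fields = value.toString.split(",", -1).map(_.trim)
    // Keep only rows with the expected column count and a non-empty id.
    if (fields.length == 5 && fields(0).nonEmpty) {
      out.set(fields.mkString(","))
      context.write(NullWritable.get(), out)
    }
  }
}
```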

Environment: Hadoop, MapReduce, HiveQL, Oracle, HDFS, Hive, Java, Pig, Sqoop, Cloudera Hadoop Distribution.

Confidential

Java Developer

Responsibilities:

  • Analyzed the business requirements and system specifications to understand the application.
  • Involved in preparing high-level and low-level design documents.
  • Involved in coding and unit testing new code.
  • Prepared test plans and test data.
  • Deployed web modules on the Tomcat web server.
  • Tested code changes at the functional and system levels.
  • Conducted quality reviews of design documents, code, and test plans.
  • Ensured availability of documents and code for review.
  • Conducted quality reviews of testing.
  • Fixed problems discovered within existing system functionality (corrective maintenance).
  • Modified code to prevent problems from recurring (preventive maintenance).
  • Presented project inductions to new joiners.

Environment: Java, Maven, UNIX, Eclipse, SoapUI, WinSCP, Tomcat, JSP, Quality Center
