Hadoop Developer Resume

White Plains, NY

SUMMARY:

  • Around 7 years of development experience, including extensive work with Big Data/Hadoop technologies and Java/J2EE technologies.
  • Comprehensive working knowledge of Hadoop framework, Hadoop ecosystem, MapReduce, NoSQL databases in Financial and Health Care domains.
  • Worked with various development methodologies like SDLC (Waterfall Model), Agile (Scrum process) and Iterative Software development.
  • Hands-on experience in Big Data tools and technologies including Hadoop, HDFS, MapReduce, YARN, Hive, Pig, HBase, Sqoop, Flume, Kafka, Spark, Impala, Oozie, UC4, Zookeeper.
  • Experience in writing HiveQL & Pig Latin to load/analyze data in Hadoop HDFS.
  • Experience in using Sqoop to migrate data between HDFS and RDBMS and using Flume to import log data.
  • Experience in column-oriented NoSQL databases like HBase and Cassandra and their integration with Hadoop clusters.
  • Data analysis with partitioning and bucketing concepts using Hive.
  • Hands-on experience with messaging systems such as Kafka 0.8+.
  • Hands-on experience with Spark SQL and Spark Streaming using Scala and Python.
  • Worked with efficient storage formats like Parquet, Avro and ORC, integrating them with Hadoop and its ecosystem (Hive, Impala and Spark); also used compression techniques like Snappy and GZip.
  • Understanding of Amazon Web Services stack and hands-on experience in using S3, EMR, Redshift, DynamoDB and hosting clusters on EC2.
  • Proficient in writing SQL queries to work with relational databases such as Oracle, MySQL, MS SQL Server.
  • Prior working experience in J2EE-based technologies such as Core Java, JSP and JDBC.
  • Working knowledge with Java MVC Frameworks including Struts, Spring, Hibernate.
  • Working experience in web technologies including HTML5, CSS3 and JavaScript, and in web services including REST and SOAP with the Spring Framework.
  • Hands-on experience with testing frameworks such as JUnit and version control tools such as Git.
  • Oracle Certified Associate, Java SE 8 Programmer
  • Excellent interpersonal and communication skills, creative, research-minded, technically competent and result-oriented.
TECHNICAL SKILLS:

Hadoop Ecosystem: Hadoop, Spark 1.3+, MapReduce, Hive 0.12+, Pig 0.11+, Flume 1.3+, HBase 0.98+, Sqoop 1.4.6, Oozie 3.3+, HDFS, Kafka 0.8.1+, Zookeeper 3.4+, Automic

Distributions: Cloudera, Hortonworks, MapR, Amazon Web Services - EC2, S3, EMR, DynamoDB

Databases: Oracle 9i/11g, MySQL 5.0+, MS SQL Server

Methodologies: Agile Scrum, Waterfall

NoSQL: Cassandra, MongoDB

Languages: Java 6/7/8, Scala, Python, SQL, HiveQL, Pig Latin, JavaScript, Shell Scripting

Web Technologies: Servlets 3.0, JSP, JDBC, HTML5, REST, SOAP, JSON, XML, CSS

Other Tools: Eclipse, Maven, MVC, JUnit, Testing Whiz, Tableau, Git

Operating Systems: Linux, UNIX, Windows

PROFESSIONAL EXPERIENCE:

Confidential, White Plains, NY

Hadoop Developer

Responsibilities:

  • Extensively worked on writing shell scripts to implement dataflow logic for automated ingestion; scripts incorporated logging, email alerts, retry logic and parameterized inputs.
  • Built internal and external tables using Hive, with good exposure to Hive DDL for creating, altering and dropping tables/views/partitions.
  • Performed joins, dynamic partitioning and bucketing on Hive tables, utilizing Hive SerDes such as CSV, RegEx, JSON and Avro.
  • Worked with compression techniques such as Snappy, GZip and LZO to save space and optimize data transfer over the network.
  • Developed a script to run multiple Spark jobs in parallel: acquire a Spark session, submit the job with the right configs and end the session upon completion (a minimal sketch follows this list).
  • Widely used Unix commands with PuTTY/Cygwin to access remote servers.
  • Wrote SQL queries via Impala for accessing and analyzing the processed data.
  • Involved in writing job plans in Automic (UC4) to schedule and automate end-to-end processes.
  • Created a process to replicate the data to Dev/QA clusters daily.
  • Designed and scheduled a workflow for a downstream system that uses the ingested data to calculate KPI metrics.
  • Actively supported the production process by monitoring the jobs and diagnosing/fixing the issues to meet the SLA on time.
  • Gained experience in managing and reviewing Hadoop log files.
  • Maintained environment profiles specific to roles/users and scheduled cron jobs for ad hoc needs.
  • Created and maintained technical documentation and runbooks on the client's Confluence page, covering access to Hadoop clusters in different environments and the logistics of jobs.
  • Worked closely with business units to define development estimates according to Agile methodology.
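
A minimal Java sketch of the Spark session lifecycle described above: acquire the session with explicit configs, run the job, and release the session in a finally block so the next queued job can start. The table names and config values are illustrative placeholders, not the production job.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class IngestJob {
    public static void main(String[] args) {
        // Acquire a session with explicit resource configs for this job.
        SparkSession spark = SparkSession.builder()
                .appName("ingest-job")
                .config("spark.executor.memory", "4g")          // illustrative values
                .config("spark.sql.shuffle.partitions", "200")
                .enableHiveSupport()
                .getOrCreate();
        try {
            // Placeholder tables: aggregate the staged data and publish it.
            Dataset<Row> counts = spark.sql(
                    "SELECT trade_date, COUNT(*) AS cnt FROM staging.trades GROUP BY trade_date");
            counts.write().mode("overwrite").saveAsTable("reporting.trade_counts");
        } finally {
            spark.stop();  // end the session so a parallel/queued job can reuse the resources
        }
    }
}
```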

Environment: Hadoop, HDFS, MapReduce, Apache Hive, Apache Pig, KornShell, Spark SQL, Automic UC4, Impala, Kerberos, Hortonworks, Python, Unix, PuTTY, MySQL, S3, Agile/Scrum, Git Bash

Confidential, Roseland, NJ

Big Data Developer

Responsibilities:

  • Worked extensively with Sqoop to ingest secondary data (CRM, ODS, marketing spend) from relational databases into HDFS.
  • Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing (a minimal sketch follows this list).
  • Used Flume to ingest raw data in text format into HDFS; also used Flume interceptors to filter the data before ingestion.
  • Developed MapReduce logic to perform sanitization to remove invalid/incomplete log files.
  • Developed Hive scripts for implementing deduplication.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Worked with Spark RDDs and DataFrames for sessionization and other transformations.
  • Wrote SQL queries via Impala for accessing and analyzing the processed data.
  • Involved in writing workflows in Oozie to orchestrate multiple steps.
  • Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries and Pig Scripts.
  • Collaborated with teams using integration and defect-tracking tools like Jenkins and JIRA.
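
A minimal Java sketch of the data-cleansing MapReduce work described above: a map-only job that keeps well-formed log records and drops invalid/incomplete ones. The expected field count and tab delimiter are assumptions for illustration.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LogSanitizer {
    // Map-only job: emit records with the expected field count, count the rest as dropped.
    public static class SanitizeMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        private static final int EXPECTED_FIELDS = 9;  // assumption: 9-column tab-delimited logs

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t", -1);
            if (fields.length == EXPECTED_FIELDS && !fields[0].isEmpty()) {
                context.write(value, NullWritable.get());
            } else {
                context.getCounter("sanitize", "dropped").increment(1);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log-sanitizer");
        job.setJarByClass(LogSanitizer.class);
        job.setMapperClass(SanitizeMapper.class);
        job.setNumReduceTasks(0);  // cleansing needs no aggregation, so skip the reduce phase
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```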

Environment: Cloudera, Hadoop, Sqoop, Flume, Avro, Hive, Snappy compression, Spark, Impala, HBase, Oozie workflows

Confidential, Lincoln Harbor, NJ

Hadoop Developer

Responsibilities:

  • Developed MapReduce jobs in Java for data transformations.
  • Extensively worked on performance tuning of Hive scripts.
  • Developed Hive Internal and External tables, with operations to create, alter and drop tables/views. 
  • Proficient with static and dynamic partitioning and bucketing on Hive tables (see the DDL sketch after this list).
  • Wrote Sqoop scripts for inbound and outbound data transfers to/from HDFS, validating the data before loading to check for duplicates.
  • Developed Spark code using Scala and Spark SQL for faster testing and processing of data.
  • Experience in using Zookeeper and Oozie for coordinating the cluster and scheduling workflows. 
  • Involved in writing the shell scripts for exporting log files to Hadoop cluster through automated process.
  • Worked with compression techniques such as LZO and Snappy to save space and optimize data transfer over the network.
  • Assisted in upgrading, configuring and maintaining various Hadoop ecosystem components like Pig, Hive and HBase.
  • Worked with GitHub repositories: branching, merging, etc.
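
A minimal Java sketch of the kind of Hive table work described above, issued over JDBC to HiveServer2: an external table with partitions, loaded via a dynamic-partition insert. The connection URL, credentials, table names and columns are placeholder assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveTableSetup {
    public static void main(String[] args) throws Exception {
        // HiveServer2 connection; host, port and credentials are placeholders.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hiveserver:10000/default", "etl_user", "");
             Statement stmt = conn.createStatement()) {
            // External table: dropping it removes only metadata, not the HDFS files.
            stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS web_logs ("
                    + "ip STRING, url STRING, status INT) "
                    + "PARTITIONED BY (log_date STRING) "
                    + "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' "
                    + "LOCATION '/data/raw/web_logs'");
            // Dynamic partitioning: Hive routes each row to its partition by log_date.
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
            stmt.execute("INSERT OVERWRITE TABLE web_logs PARTITION (log_date) "
                    + "SELECT ip, url, status, log_date FROM staging_logs");
        }
    }
}
```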

Environment: Hadoop, HDFS, MapReduce, MapR, Hive, Pig, Sqoop, HBase, Oozie, Zookeeper, shell scripting, HiveQL, NoSQL (HBase), RDBMS, Eclipse, Oracle 11g, Tableau

Confidential, Melville, NY

Hadoop Developer

Responsibilities:

  • Worked with the Data Science team to gather requirements for various data mining projects.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Wrote MapReduce jobs using the Java API.
  • Imported/exported data between RDBMS and HDFS using Sqoop.
  • Created Hive tables and wrote Hive queries for data analysis to meet the business requirements.
  • Used Impala to read, write and query the data in HDFS.
  • Experienced in migrating HiveQL to Impala to minimize query response time.
  • Configured Hive metastore, which stores the metadata for Hive tables and partitions in a relational database.
  • Worked on Flume for efficiently collecting, aggregating and moving large amounts of log data.
  • Worked on configuring security for the Hadoop cluster (Kerberos, Active Directory); a minimal keytab-login sketch follows this list.
  • Installed and configured Zookeeper for the Hadoop cluster; set up high availability and designed automatic failover using Zookeeper.
  • Tuned MapReduce programs running on the Hadoop cluster.
  • Worked with application teams to install Hadoop updates, patches, version upgrades and operating-system updates as required.
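
A minimal Java sketch of a Kerberos keytab login against a secured cluster, as described above; the principal, keytab path and HDFS path are placeholder assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class SecureHdfsClient {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);
        // Placeholder principal and keytab; in practice these come from the cluster's AD/KDC setup.
        UserGroupInformation.loginUserFromKeytab(
                "etl@EXAMPLE.COM", "/etc/security/keytabs/etl.keytab");
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus status : fs.listStatus(new Path("/data/landing"))) {
            System.out.println(status.getPath());  // verify the authenticated session works
        }
    }
}
```
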
Environment: Hadoop, HDFS, MapReduce, Sqoop, Spark, Hive, Flume, Elasticsearch, Oozie, Zookeeper, Kerberos, Cloudera, MySQL, PuTTY, Eclipse

Confidential

Java/J2EE Developer

Responsibilities:

  • Participated in different phases of the Software Development Life Cycle (SDLC) of the application, including requirements gathering, analysis, design, development and deployment.
  • Developed Action Forms and Controllers in the Struts 2.0 framework.
  • Designed, developed and maintained the data layer using Hibernate.
  • Implemented and developed the application using Struts 2, Servlets, JSP, JSTL and the Collections API.
  • Used SOAP web services for communication between applications.
  • Configured the JDBC connection for the database layer.
  • Involved in UI design using HTML, CSS, JavaScript, AJAX and jQuery.
  • Developed JavaScript validations on order submission forms.
  • Used JUnit for unit testing the application (a minimal action-plus-test sketch follows this list).
  • Used Apache Ant to compile Java classes and package them into JAR archives.
  • Involved in tracking and resolving defects arising in QA and production environments.
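
A minimal sketch of the Struts 2 and JUnit work described above: a hypothetical order-submission action with its unit test. The class names, fields and messages are illustrative, not the original application code.

```java
import static org.junit.Assert.assertEquals;

import com.opensymphony.xwork2.ActionSupport;
import org.junit.Test;

// Hypothetical order-submission action: validates input and reports the outcome.
class OrderAction extends ActionSupport {
    private String orderId;

    @Override
    public String execute() {
        if (orderId == null || orderId.trim().isEmpty()) {
            addActionError("Order id is required");
            return INPUT;   // redisplay the form with the error message
        }
        return SUCCESS;     // proceed to the confirmation view
    }

    public void setOrderId(String orderId) { this.orderId = orderId; }
}

public class OrderActionTest {
    @Test
    public void emptyOrderIdIsRejected() {
        OrderAction action = new OrderAction();
        action.setOrderId("");
        assertEquals(ActionSupport.INPUT, action.execute());
    }

    @Test
    public void validOrderIdSucceeds() {
        OrderAction action = new OrderAction();
        action.setOrderId("ORD-1001");
        assertEquals(ActionSupport.SUCCESS, action.execute());
    }
}
```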

Environment: Java, J2EE, JSP, Servlets, Struts 2.0/1.2, Hibernate, HTML, CSS, JavaScript, JUnit, Apache Tomcat, PL/SQL, Eclipse

Confidential

Java Developer

Responsibilities:

  • Analyzed the requirements and communicated them to both the development and testing teams.
  • Developed and implemented business logic using Java, JSP, Servlets, the JavaMail API and XML.
  • Wrote SQL queries for complex operations.
  • Implemented client-side validation using AJAX and JavaScript.
  • Designed interactive web pages using HTML, CSS, JavaScript and jQuery.
  • Used Oracle as the backend database.
  • Used Log4j with external configuration files for logging and debugging (a minimal sketch follows this list).
  • Performed code reviews and unit testing with the help of JUnit.
  • Prepared user documentation for developers on the middleware and client teams.
  • Used Eclipse/WebLogic Workshop as the IDE.
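
A minimal sketch of the Log4j usage described above; the class name and messages are illustrative, and levels/appenders are assumed to come from an external log4j.properties file rather than code.

```java
import org.apache.log4j.Logger;

public class MailDispatcher {
    // Logger named after the class; levels and appenders come from the external log4j.properties.
    private static final Logger LOG = Logger.getLogger(MailDispatcher.class);

    public void dispatch(String recipient) {
        LOG.debug("Preparing message for " + recipient);
        try {
            // ... send the message via the JavaMail API (omitted) ...
            LOG.info("Message sent to " + recipient);
        } catch (RuntimeException e) {
            // Log the full stack trace at ERROR so failures show up in the external log files.
            LOG.error("Failed to send message to " + recipient, e);
        }
    }
}
```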

Environment: J2EE, Java, JSP, JDBC, JavaScript, HTML, XML, JMS, Eclipse IDE, PL/SQL, Oracle, JUnit, Windows
