Hadoop Developer Resume
White Plains, NY
SUMMARY:
- Around 7 years of experience as a developer, including extensive work with Big Data Hadoop technologies and Java/J2EE technologies.
- Comprehensive working knowledge of the Hadoop framework and ecosystem, MapReduce, and NoSQL databases in the Financial and Health Care domains.
- Worked with various development methodologies like SDLC (Waterfall Model), Agile (Scrum process) and Iterative Software development.
- Hands-on experience with Big Data tools and technologies including Hadoop, HDFS, MapReduce, YARN, Hive, Pig, HBase, Sqoop, Flume, Kafka, Spark, Impala, Oozie, UC4, and Zookeeper.
- Experience in writing HiveQL & Pig Latin to load/analyze data in Hadoop HDFS.
- Experience in using Sqoop to migrate data between HDFS and RDBMS and using Flume to import log data.
- Experience with NoSQL column-oriented databases such as HBase and Cassandra and their integration with Hadoop clusters.
- Performed data analysis using Hive partitioning and bucketing.
- Hands-on experience with messaging systems such as Kafka 0.8+.
- Hands-on experience with Spark SQL and Spark Streaming using Scala and Python.
- Worked with efficient storage formats such as Parquet, Avro, and ORC and integrated them with the Hadoop ecosystem (Hive, Impala, and Spark); also used compression codecs such as Snappy and GZip.
- Understanding of Amazon Web Services stack and hands-on experience in using S3, EMR, Redshift, DynamoDB and hosting clusters on EC2.
- Proficient in writing SQL queries to work with relational databases such as Oracle, MySQL, MS SQL Server.
- Previous working experience in J2EE based technologies such as Core Java, JSP, JDBC.
- Working knowledge with Java MVC Frameworks including Struts, Spring, Hibernate.
- Working experience in web technologies including HTML5, CSS3, JavaScript, Web Services including REST, SOAP and Spring Framework.
- Hands-on experience with testing frameworks such as JUnit and version control systems such as Git.
- Oracle Certified Associate, Java SE 8 Programmer
- Excellent interpersonal and communication skills, creative, research-minded, technically competent and result-oriented.
TECHNICAL SKILLS:
Hadoop Ecosystem: Hadoop, HDFS, Spark 1.3+, MapReduce, Hive 0.12+, Pig 0.11+, Flume 1.3+, HBase 0.98+, Sqoop 1.4.6, Oozie 3.3+, Kafka 0.8.1+, Zookeeper 3.4+, Automic
Distributions: Cloudera, Hortonworks, MapR, Amazon Web Services - EC2, S3, EMR, DynamoDB
Databases: Oracle 9i/11g, MySQL 5.0+, MS SQL Server
NoSQL: Cassandra, MongoDB
Methodologies: Agile Scrum, Waterfall
Languages: Java 6/7/8, Scala, Python, SQL, HiveQL, Pig Latin, JavaScript, Shell Scripting
Web Technologies: Servlets 3.0, JSP, JDBC, HTML5, CSS, REST, SOAP, JSON, XML
Other: Eclipse, Maven, MVC, JUnit, Testing Whiz, Tableau, Git
Systems: Linux, UNIX, Windows
PROFESSIONAL EXPERIENCE:
Confidential, White Plains, NY
Hadoop Developer
Responsibilities:
- Extensively worked on writing shell scripts to implement dataflow logic for automated ingestion; scripts incorporated logging, email alerts, retry logic, and parameterized inputs.
- Built internal and external tables in Hive, with strong exposure to Hive DDL for creating, altering, and dropping tables, views, and partitions.
- Performed joins, dynamic partitioning, and bucketing on Hive tables using Hive SerDes such as CSV, RegEx, JSON, and Avro (see the sketch following this role).
- Worked with compression codecs such as Snappy, Gzip, and LZO to save space and optimize data transfer over the network.
- Developed a script to run multiple Spark jobs in parallel by acquiring a Spark session, submitting each job with the right configuration, and ending the session upon completion.
- Widely used Unix commands with PuTTY/Cygwin to access remote servers.
- Wrote SQL queries via Impala for accessing and analyzing the processed data.
- Involved in writing job plans in Automic (UC4) to schedule and automate end-to-end processes.
- Created a process to replicate the data to Dev/QA clusters daily.
- Designed and scheduled a workflow for a downstream system that uses the ingested data to calculate KPI metrics.
- Actively supported the production process by monitoring the jobs and diagnosing/fixing the issues to meet the SLA on time.
- Gained experience in managing and reviewing Hadoop log files.
- Maintained environment profiles specific to roles/users and scheduled cron jobs for ad hoc needs.
- Created and maintained technical documentation and runbooks for accessing the Hadoop clusters in different environments and the logistics of jobs on the client's Confluence page.
- Worked closely with business units to define development estimates according to the Agile methodology.
Environment: Hadoop, HDFS, MapReduce, Apache Hive, Apache Pig, KornShell, Spark-SQL, Automic UC4, Impala, Kerberos, Hortonworks, Python, Unix, PuTTY, MySQL, S3, Agile/Scrum, GitBash
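A minimal Scala/Spark-SQL sketch of the Hive DDL and dynamic-partition load pattern described above; the table names (logs_stg, web_logs), columns, and HDFS paths are illustrative placeholders, not the client's actual schema.

```scala
import org.apache.spark.sql.SparkSession

object HiveIngestSketch {
  def main(args: Array[String]): Unit = {
    // Hive support lets spark.sql() run HiveQL against the shared metastore
    val spark = SparkSession.builder()
      .appName("hive-ingest-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // External staging table over raw CSV files already landed in HDFS
    // (OpenCSVSerde reads every column as STRING, so casts happen on insert)
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS logs_stg (
        |  user_id STRING, url STRING, ts STRING, event_date STRING)
        |ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
        |LOCATION '/data/raw/web_logs'""".stripMargin)

    // Curated table, partitioned by date and stored as ORC
    spark.sql(
      """CREATE TABLE IF NOT EXISTS web_logs (
        |  user_id STRING, url STRING, ts TIMESTAMP)
        |PARTITIONED BY (event_date STRING)
        |STORED AS ORC""".stripMargin)

    // Dynamic partitioning: the target partition is derived from the event_date column
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql(
      """INSERT OVERWRITE TABLE web_logs PARTITION (event_date)
        |SELECT user_id, url, CAST(ts AS TIMESTAMP), event_date FROM logs_stg""".stripMargin)

    spark.stop()
  }
}
```

In practice a run like this would be wrapped in the parameterized shell scripts noted above, which supply the paths and handle logging, retries, and alerts.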
Confidential, Roseland, NJ
Big Data Developer
Responsibilities:
- Worked extensively with Sqoop to ingest secondary data (CRM, ODS, marketing spends) from relational databases into HDFS.
- Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing.
- Used Flume to ingest raw data in text format into HDFS, and used Flume interceptors to filter the data before ingestion.
- Developed MapReduce logic to perform sanitization to remove invalid/incomplete log files.
- Developed Hive scripts for implementing deduplication.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Worked with Spark RDDs and DataFrames for sessionization and other transformations (see the sketch following this role).
- Wrote SQL queries via Impala for accessing and analyzing the processed data.
- Involved in writing workflows in Oozie to orchestrate multiple steps.
- Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries and Pig Scripts.
- Collaborated with other teams using integration and defect-tracking tools such as Jenkins and JIRA.
Environment: Cloudera, Hadoop, Sqoop, Flume, Avro, Hive, Snappy compression, Spark, Impala, HBase, Oozie workflows
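A minimal Scala sketch of the DataFrame-based sessionization referenced above, using a window function and a 30-minute inactivity gap; the input/output paths, column names (user_id, event_ts), and the gap threshold are illustrative assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

object SessionizeSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("sessionize-sketch").getOrCreate()

    // Cleansed click events stored as Parquet (illustrative path and columns)
    val events = spark.read.parquet("/data/curated/clicks")

    val byUser = Window.partitionBy("user_id").orderBy("event_ts")

    // Seconds elapsed since the same user's previous event
    val withGap = events.withColumn(
      "gap_sec",
      unix_timestamp(col("event_ts")) - unix_timestamp(lag("event_ts", 1).over(byUser)))

    // A new session starts when there is no previous event or the gap exceeds 30 minutes;
    // a running sum of the start flags yields a per-user session id
    val sessions = withGap
      .withColumn("new_session",
        when(col("gap_sec").isNull || col("gap_sec") > 1800, 1).otherwise(0))
      .withColumn("session_id", sum("new_session").over(byUser))

    sessions.write.mode("overwrite").parquet("/data/curated/sessions")
    spark.stop()
  }
}
```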
Confidential, Lincoln Harbor, NJ
Hadoop Developer
Responsibilities:
- Developed Map/Reduce jobs using Java for data transformations.
- Extensively worked on performance tuning of Hive scripts.
- Developed Hive Internal and External tables, with operations to create, alter and drop tables/views.
- Applied static and dynamic partitioning and bucketing on Hive tables.
- Wrote Sqoop scripts to move data into and out of HDFS and validated the data before loading to catch duplicates.
- Developed Spark code using Scala and Spark-SQL for faster testing and processing of data (see the sketch following this role).
- Experience in using Zookeeper and Oozie for coordinating the cluster and scheduling workflows.
- Involved in writing the shell scripts for exporting log files to Hadoop cluster through automated process.
- Worked with compression codecs such as LZO and Snappy to save space and optimize data transfer over the network.
- Assisted in upgrading, configuring, and maintaining Hadoop ecosystem components such as Pig, Hive, and HBase.
- Worked with GitHub repositories, including branching and merging.
Environment: Hadoop, HDFS, MapReduce, MapR, Hive, Pig, Sqoop, HBase, Oozie, Zookeeper, Shell scripting, HiveQL, NoSQL (HBase), RDBMS, Eclipse, Oracle 11g, Tableau
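A minimal Scala sketch combining the Spark-SQL processing and Snappy-compressed output mentioned above; the source table (orders), columns, and output path are illustrative.

```scala
import org.apache.spark.sql.SparkSession

object CompressedExportSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("compressed-export-sketch")
      .enableHiveSupport() // read the Hive table registered in the metastore
      .getOrCreate()

    // Aggregate a Hive table with Spark-SQL (illustrative schema)
    val daily = spark.sql(
      """SELECT order_date, COUNT(*) AS orders, SUM(amount) AS revenue
        |FROM orders
        |GROUP BY order_date""".stripMargin)

    // Snappy trades a little compression ratio for fast (de)compression,
    // which keeps network transfer and downstream reads cheap
    daily.write
      .mode("overwrite")
      .option("compression", "snappy")
      .orc("/data/marts/daily_orders")

    spark.stop()
  }
}
```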
Confidential, Melville, NY
Hadoop Developer
Responsibilities:
- Worked with the Data Science team to gather requirements for various data mining projects.
- Loaded and transformed large sets of structured and semi-structured data.
- Wrote MapReduce jobs using the Java API.
- Imported/exported data between RDBMS and HDFS using Sqoop.
- Created Hive tables and wrote Hive queries for data analysis to meet the business requirements.
- Used Impala to read, write, and query the data in HDFS.
- Experienced in migrating HiveQL to Impala to minimize query response time (see the sketch following this role).
- Configured Hive metastore, which stores the metadata for Hive tables and partitions in a relational database.
- Worked on Flume for efficiently collecting, aggregating and moving large amounts of log data.
- Worked on configuring security for the Hadoop cluster (Kerberos, Active Directory).
- Installed and configured Zookeeper for the Hadoop cluster; set up high availability and designed automatic failover using Zookeeper.
- Tuned MapReduce programs running on the Hadoop cluster.
- Worked with application teams to install Hadoop updates, patches, version upgrades, and operating system upgrades as required.
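A minimal Scala sketch of the shared-metastore pattern behind the Hive-to-Impala migration mentioned above: Spark persists a Parquet-backed table in the Hive metastore, and Impala can then query the same table after an INVALIDATE METADATA/REFRESH; the table and column names are illustrative.

```scala
import org.apache.spark.sql.SparkSession

object ParquetForImpalaSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("parquet-for-impala-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Summarize an existing Hive table (illustrative schema) ...
    val summary = spark.sql(
      """SELECT event_date, COUNT(*) AS page_views, COUNT(DISTINCT user_id) AS visitors
        |FROM page_views_raw
        |GROUP BY event_date""".stripMargin)

    // ... and persist it as a Parquet-backed table in the Hive metastore.
    // Impala shares that metastore, so after `INVALIDATE METADATA daily_traffic;`
    // (or `REFRESH daily_traffic;` for new files) it can serve low-latency queries.
    summary.write.mode("overwrite").format("parquet").saveAsTable("daily_traffic")

    spark.stop()
  }
}
```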
Confidential
Java/J2EE Developer
Responsibilities:
- Participated in different phases of the Software Development Lifecycle (SDLC) of the application, including requirements gathering, analysis, design, development, and deployment.
- Developed Action Forms and Controllers in Struts 2.0 framework.
- Designed, developed and maintained the data layer using Hibernate.
- Implemented and developed the application using Struts2, Servlets, JSP, JSTL, Collection API.
- Used SOAP web services for communication between applications.
- Configured the JDBC connection with Database layer.
- Involved in UI design using HTML, CSS, JavaScript, AJAX, and jQuery.
- Developed JavaScript validations on order submission forms.
- Used JUnit for unit testing of the application.
- Used Apache Ant to compile Java classes and package them into a JAR archive.
- Involved in tracking and resolving defects, which arise in QA & production environments.
Environment: Java, J2EE, JSP, Servlets, Struts 2.0/1.2, Hibernate, HTML, CSS, JavaScript, JUnit, Apache Tomcat, PL/SQL, Eclipse
Confidential
Java Developer
Responsibilities:
- Analyzed requirements and communicated them to both the development and testing teams.
- Developed and implemented business logic using Java, JSP, Servlets, Java Mail API, XML.
- Wrote SQL queries for complex operations.
- Implemented client-side validation using AJAX and JavaScript.
- Designed interactive web pages using HTML, CSS, JavaScript, and jQuery.
- Used Oracle as the backend database.
- Used Log4j with external configuration files for logging and debugging.
- Performed code reviews and unit testing with JUnit.
- Prepared user documentation for the middleware and client development teams.
- Used Eclipse/WebLogic Workshop as the IDE.
Environment: J2EE, Java, JSP, JDBC, JavaScript, HTML, XML, JMS, Eclipse IDE, PL/SQL, Oracle, JUnit, Windows