Sr. Hadoop Developer Resume

Denver, Colorado

SUMMARY

  • 9+ years of overall IT experience, including 4 years of comprehensive experience as an Apache Hadoop developer.
  • Expertise in writing Hadoop jobs for analyzing structured and unstructured data using HDFS, Hive, HBase, Pig, Spark, Kafka, Scala, Oozie and Talend ETL.
  • Good knowledge of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN and MapReduce concepts.
  • Experience working with different kinds of MapReduce programs on Hadoop for Big Data analysis.
  • Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
  • Experience in importing/exporting data using Sqoop into HDFS from relational database systems and vice versa.
  • Working experience in designing and implementing complete end-to-end Hadoop infrastructure including Pig, Hive, Sqoop, Oozie, Flume and ZooKeeper.
  • Experience in providing support to data analysts in running Pig and Hive queries.
  • Experience in writing shell scripts to dump the shared data from MySQL servers to HDFS.
  • Experience in designing both time-driven and data-driven automated workflows using Oozie.
  • Experience in performance tuning the Hadoop cluster by gathering and analyzing the existing infrastructure.
  • Experience in automating Hadoop installation, configuration and cluster maintenance using tools such as Puppet.
  • Experience in working with Flume to load log data from multiple sources directly into HDFS.
  • Strong debugging and problem-solving skills with excellent understanding of system development methodologies, techniques and tools.
  • Worked across the complete Software Development Life Cycle (analysis, design, development, testing, implementation and support) in different application domains, with technologies ranging from object-oriented technology to Internet programming on Windows NT, Linux and UNIX/Solaris platforms, following RUP methodologies.
  • Familiar with RDBMS concepts and worked on Oracle 8i/9i, SQL Server 7.0 and DB2 8.x/7.x.
  • Wrote shell scripts and Ant scripts on Unix for application deployments to the production region.
  • Very good POC and development experience with Apache Flume, Kafka, Spark, Storm and Scala.
  • Exceptional ability to quickly master new concepts; capable of working in a group as well as independently, with excellent communication skills.
  • Good working knowledge of Hue and the Hadoop ecosystem.
  • Good knowledge of evaluating big data analytics libraries and using Spark SQL for data exploration.

TECHNICAL SKILLS

Big Data Technologies: Hadoop, MapReduce, HDFS, Hive, Pig, ZooKeeper, Sqoop, Oozie, Flume, Impala, HBase, Kafka, Storm.

Big Data Frameworks: HDFS, YARN, Spark.

Hadoop Distributions: Cloudera (CDH3, CDH4, CDH5), Hortonworks, Amazon EMR, EC2.

Programming Languages: Java, shell scripting, Scala.

Databases: RDBMS, MySQL, Oracle, Microsoft SQL Server, Teradata, DB2, PL/SQL, Cassandra, MongoDB.

IDE and Tools: Eclipse, NetBeans, Tableau.

Operating System: Windows, Linux/Unix.

Frameworks: Spring, Hibernate, JSF, EJB, JMS.

Scripting Languages: JSP & Servlets, JavaScript, XML, HTML, Python.

Application Servers: Apache Tomcat, WebSphere, WebLogic, JBoss.

Methodologies: Agile, SDLC, Waterfall.

Web Services: RESTful, SOAP.

ETL Tools: Talend, Informatica.

Others: Solr, Elasticsearch.

PROFESSIONAL EXPERIENCE

Confidential, Denver, Colorado

Sr. Hadoop Developer

Responsibilities:

  • Worked on analyzing the Hadoop cluster using different big data analytic tools including Flume, Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Spark and Kafka.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Experience in refactoring the existing Spark batch processes for different logs, written in Scala.
  • Explored Spark for improving the performance and optimization of existing algorithms in Hadoop, using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
  • Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
  • Experience migrating MapReduce programs into Spark transformations using Spark and Scala.
  • Configured Sqoop and developed scripts to extract data from MySQL into HDFS.
  • Hands-on experience with productionizing Hadoop applications, viz. development, configuration management, monitoring, debugging and performance tuning.
  • Translated functional and technical requirements into detailed programs running on Hadoop MapReduce and Spark.
  • Wrote programs in Scala that run on Spark and worked with the Hue interface for querying the data.
  • Streamed data in real time using Spark with Kafka.
  • Created HBase tables to store various data formats of PII data coming from different portfolios.
  • Managed cluster coordination services through ZooKeeper.
  • Used Spark Streaming to collect this data from Kafka in near real time and perform the necessary processing.
  • Installed and configured Hive and wrote Hive UDFs in Java and Python.
  • Helped with the sizing and performance tuning of the HBase cluster.
  • Involved in the process of HBase data modeling and building efficient data structures.
  • Trained and mentored the analyst and test teams on the Hadoop framework, HDFS, MapReduce concepts and the Hadoop ecosystem.
  • Responsible for architecting Hadoop clusters.
  • Wrote shell scripts and Python scripts for job automation.
  • Assisted with the addition of Hadoop processing to the IT infrastructure.
  • Performed data analysis using Hive and Pig.
  • Upgraded the Hadoop cluster from CDH3 to CDH4, set up a high-availability cluster and integrated Hive with existing applications.

Environment: Hadoop, HDFS, Spark, MapReduce, Pig, Hive, Sqoop, Kafka, HBase, Oozie, Flume, Scala, Python, Java, SQL Scripting and Linux Shell Scripting, Cloudera, Cloudera Manager, EC2, EMR, S3, AWS

Confidential, Uniondale, NY

Sr. Hadoop Developer

Responsibilities:

  • Involved in review of functional and non-functional requirements.
  • Facilitated knowledge transfer sessions.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Developed MapReduce jobs for the users. Maintained, updated and scheduled the periodic jobs, which ranged from updates to periodic MapReduce jobs to the creation of ad-hoc jobs for business users.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experienced in defining job flows.
  • Experienced in managing and reviewing Hadoop log files.
  • Extracted files from CouchDB through Sqoop, placed them in HDFS and processed them.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Responsible for managing data coming from different sources.
  • Gained good experience with NoSQL databases.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.
  • Developed a custom File System plugin for Hadoop so it can access files on Data Platform.
  • This plugin allows Hadoop MapReduce programs, HBase, Pig and Hive to work unmodified and access files directly.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
  • Set up and benchmarked Hadoop/HBase clusters for internal use.
  • Gained very good business knowledge on health insurance, claim processing, fraud suspect identification, appeals process etc.

Environment: Java 6 (JDK 1.6), Eclipse, Subversion, Hadoop (Hortonworks and Cloudera distributions), Hive, HBase, Linux, MapReduce, HDFS, DataStax, IBM DataStage 8.1, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX Shell Scripting.

Confidential

Hadoop Developer

Responsibilities:

  • Extracted data from flat files and RDBMS databases into the staging area and populated it into the data warehouse.
  • Worked on Cassandra for user behavior analysis and lightning-fast execution.
  • Developed mapping parameters and variables to support SQL override.
  • Used existing ETL standards to develop these mappings.
  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Used UDFs to implement business logic in Hadoop.
  • Extracted files from Oracle and DB2 through Sqoop, placed them in HDFS and processed them.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Responsible for managing data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.
  • Worked on JVM performance tuning to improve MapReduce job performance.

Environment: Hadoop, MapReduce, HDFS, Hive, Oracle 11g, Java, Struts, Servlets, HTML, XML, SQL, J2EE, JUnit, Tomcat 6.

Confidential

Sr. Java Developer

Responsibilities:

  • Involved in designing Class and Sequence diagrams with UML and Data flow diagrams.
  • Developed Use Cases, Class Diagrams, Sequence Diagrams and Data Models using Microsoft Visio.
  • Worked on server-side implementation using the Struts MVC framework.
  • Developed JSPs with Struts custom tags and implemented JavaScript validation of data.
  • Developed programs for accessing the database using the JDBC thin driver to execute queries, prepared statements and stored procedures, and to manipulate data in the database.
  • Used JavaScript for validating the front-end web pages.
  • Wrote SQL code blocks using cursors for shifting records from various tables based on checks.
  • Wrote procedures and triggers for validating the consistency of metadata/ETL.
  • Used AJAX to make the RESTful web service calls.
  • Developed Message Driven Beans for asynchronous processing of alerts.
  • Used IBM ClearCase for source code control and JUnit for unit testing.
  • Used Log4j as the logging framework.
  • Managed application versions with SVN.
  • Followed coding and documentation standards.

Environment: Java, JSP, Struts MVC, Oracle 10g, SQL, PL/SQL, JBoss, JUnit, SQL Developer, Ajax, Maven, Eclipse, SVN, Log4j, REST.

Confidential

Java Developer

Responsibilities:

  • Used message-driven beans for asynchronous processing of alerts to the customer.
  • Used Struts framework to generate Forms and actions for validating the user request data.
  • Developed server-side validation checks using Struts validators and JavaScript validations.
  • Developed and implemented data validations with JSPs and Struts custom tags.
  • Developed applications that access the database with JDBC to execute queries, prepared statements and procedures.
  • Developed programs to manipulate the data and perform CRUD operations on request to the database.
  • Worked on developing Use Cases, Class Diagrams, Sequence diagrams, and Data Models.
  • Developed and deployed SOAP-based web services on the Tomcat server.
  • Coded SQL, PL/SQL and views using IBM DB2 for the database.
  • Worked on issues while converting Java to AJAX.
  • Supported development of the business tier using stateless session beans.
  • Extensively used JDBC to access the database objects.
  • Used ClearCase for source code control and the JUnit testing tool for unit testing.
  • Reviewed code and performed integrated module testing.

Environment: Java 5, J2EE 1.4, AJAX, Struts 1.0, Web Services, SOAP, HTML, XML, JSP, JDBC, ANT, IBM, Tomcat, JUnit, DB2, Rational Rose, Eclipse Helios, CVS.
