Sr. Hadoop/Lead Developer Resume
Warren, NJ
SUMMARY
- Highly analytical, results-oriented software developer with 8+ years of experience in software design and analysis, coupled with a broad range of computer expertise.
- Successful history of effectively implementing systems and directing key initiatives.
- Deep-rooted interest in designing and crafting efficient modern software.
- Skilled troubleshooter with a proven ability to deliver creative, effective solutions to complex problems.
- A quick learner with a proclivity for new technology and tools.
- Hadoop Developer and Administrator: Experience in installing, configuring, maintaining, and monitoring Hadoop clusters on Apache, Cloudera, and Hortonworks Sandbox distributions.
- Hadoop Distributions: Hortonworks, Cloudera CDH4, CDH5, and Apache Hadoop.
- Hadoop Ecosystem: Hands-on experience with the Hadoop ecosystem, including HDFS, HBase, Sqoop, MapReduce, YARN, Pig, Hive, Impala, ZooKeeper, and Oozie.
- Cassandra Developer and Modeling: Configured and set up Cassandra clusters. Expertise in Cassandra data modeling and analysis and in the Cassandra Query Language (CQL).
- HBase: Ingested data into HBase from RDBMS sources, Pig, and Hive. Custom-coded MapReduce jobs for HBase and used the client API as requirements dictated (see the client-API sketch at the end of this summary).
- Data Ingestion: Designed Flume flows and configured their individual components. Efficiently transferred bulk data to and from traditional databases with Sqoop.
- Data Storage: Experience in maintaining distributed storage (HDFS) and columnar storage (HBase).
- Data Processing: Processed data using MapReduce and YARN. Worked on Kafka as a proof of concept for log processing.
- Data Analysis: Expertise in analyzing data using Pig scripts, Hive queries, Spark (Scala), and Impala.
- Management and Monitoring: Maintained the ZooKeeper coordination service and designed and monitored Oozie workflows. Used the Azkaban batch scheduler to control job workflows.
- Messaging System: Used Kafka as proof of concept to achieve faster message transfer across systems.
- Scripting: Expertise in Hive, Pig, Impala, shell scripting, Perl, and Python.
- Cloud Platforms: Configured Hadoop clusters in OpenStack and Amazon Web Services (AWS).
- Visualization Integration: Integrated Tableau, Google Charts, D3.js, and R with the Hadoop cluster, and MS Excel with Hive via the ODBC connector.
- Java/J2EE: Expertise in Spring Web MVC and Hibernate; proficient in HQL (Hibernate Query Language).
- Project Management: Experience in Agile and Scrum project management.
- Web Interface Design: HTML, CSS, JavaScript, and Bootstrap.
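As a concrete illustration of the HBase client-API work mentioned above, here is a minimal sketch using the HBase 1.x client API. The table name ("events"), column family ("d"), and row key are hypothetical placeholders, and the table is assumed to already exist.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Assumes an existing "events" table with column family "d".
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("events"))) {
            // Write one row keyed by a hypothetical event id.
            Put put = new Put(Bytes.toBytes("event#1001"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("url"), Bytes.toBytes("/home"));
            table.put(put);

            // Read the same cell back.
            Result result = table.get(new Get(Bytes.toBytes("event#1001")));
            byte[] url = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("url"));
            System.out.println(Bytes.toString(url));
        }
    }
}
```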
TECHNICAL SKILLS
Big Data Technologies: HDFS, MapReduce, Hive, Pig, Tez, HBase, Sqoop, Cloudera CDH3/CDH4/CDH5, Hadoop Streaming, ZooKeeper, Oozie, Flume, HUE, Impala, Spark, Azkaban.
NoSQL: HBase, Cassandra, MongoDB
Programming Languages & Frameworks: Java, C, C++, Python, Perl, HQL, Pig Latin, Spring MVC, Hibernate
Scripting Languages: Shell, Python, Perl, Pig-Latin
Cloud Computing: OpenStack, Amazon AWS
Web Technologies: HTML, CSS, XML, JavaScript, Servlets, JSP, Bootstrap
Database Platforms: MySQL, Oracle 11g/10g/9i, SQL Server 2012/2008
Operating Systems: Windows, Red Hat, Ubuntu, Mac OS X.
IDE: Eclipse, NetBeans.
Servers: Apache Tomcat, IIS, WebLogic.
Software Applications: JUnit, TOAD SQL Client, MySQL Workbench, WinSCP, PuTTY, MS Office, Norton Utilities, Adobe Photoshop.
PROFESSIONAL EXPERIENCE
Confidential, Warren, NJ
Sr. Hadoop/Lead Developer
Responsibilities:
- Involved in building scalable distributed data solutions using DataStax Cassandra.
- Coordinated with administrators in setting up, configuring, initializing, and troubleshooting a 40-node cluster with 320 GB of RAM and 100 TB of disk.
- Ingested data with Sqoop from the traditional MySQL database into the Hadoop distributed file system for analysis.
- Extensively involved in writing Oozie and Azkaban workflows for scheduling jobs.
- Analyzed data in Cassandra using Pig and Hive by importing the data into the Hadoop cluster.
- Involved in writing Hive queries to aggregate data and extract useful information.
- Wrote MapReduce jobs to analyze data in the distributed file system.
- Worked closely with the business team to gather requirements and add new features.
- Designed the Cassandra data model for incoming data, using partitioning and bucketing for faster access (see the sketch after this list).
- Heavily involved in writing both DDL and DML operations in CQL against DataStax Cassandra.
- Analyzed clickstream data from web-server logs collected by Flume.
- Used Kafka for processing logs on the cluster as a proof of concept.
- Exported the analyzed data back into MySQL with Sqoop for visualization and report generation in business intelligence tools such as Google Charts and MS Excel.
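As a sketch of the partitioned, bucketed Cassandra model referenced above, the snippet below uses the DataStax Java driver's 3.x-era Cluster/Session API and assumes Cassandra 2.2+ (for toTimestamp(now())). The keyspace, table, and column names are illustrative, and the clicks keyspace is assumed to already exist.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class CassandraModelSketch {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1") // placeholder contact point
                .build();
             Session session = cluster.connect()) {

            // DDL: a composite partition key (user_id, day) buckets each
            // user's events by day so partitions stay bounded; clustering
            // by event_ts DESC makes recent-first reads cheap.
            session.execute(
                "CREATE TABLE IF NOT EXISTS clicks.events_by_user ("
              + " user_id text, day text, event_ts timestamp, url text,"
              + " PRIMARY KEY ((user_id, day), event_ts)"
              + ") WITH CLUSTERING ORDER BY (event_ts DESC)");

            // DML: insert one event and read the bucket back.
            session.execute(
                "INSERT INTO clicks.events_by_user (user_id, day, event_ts, url)"
              + " VALUES ('u42', '2015-06-01', toTimestamp(now()), '/home')");
            ResultSet rs = session.execute(
                "SELECT url FROM clicks.events_by_user"
              + " WHERE user_id = 'u42' AND day = '2015-06-01' LIMIT 10");
            for (Row row : rs) {
                System.out.println(row.getString("url"));
            }
        }
    }
}
```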
Environment: Cassandra, DataStax Enterprise, Cloudera distribution CDH5, Hadoop/YARN, Linux, REST services, Spark (Scala), and Google Maps API.
Confidential, NYC, NY
Sr. Hadoop Developer
Responsibilities:
- Planned, installed and configured the distributed Hadoop Clusters.
- Used Sqoop to load data from MySQL and other sources into HDFS on a regular basis.
- Configured Hadoop tools such as Hive, Pig, HBase, ZooKeeper, Flume, Impala, and Sqoop.
- Built relational view of data using HCatalog.
- Ingested data into HBase tables from MySQL (via Sqoop) and from Pig and Hive.
- Used Tez for faster execution of MapReduce jobs.
- Wrote batch operations across multiple rows for DDL (Data Definition Language) and DML (Data Manipulation Language) using client API calls to improve performance.
- Integrated MapReduce with HBase, with HBase serving as both data source and data sink (see the sketch after this list).
- Grouped and filtered data using Hive queries (HQL) and Pig Latin scripts.
- Queried both Managed and External tables created by Hive using Impala.
- Implemented partitioning and bucketing in Hive for more efficient querying of data.
- Created workflows in Oozie, managing and coordinating the jobs and combining multiple jobs sequentially into one unit of work.
- Designed and created both managed and external Hive tables depending on the requirement.
- Expertise in writing custom UDFs in Hive.
- Used Piggybank, Pig's SVN repository of user-contributed functions.
- Integrated Hive tables with visualization tools like Tableau and Microsoft Excel.
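A sketch of the HBase-as-source-and-sink integration noted in the list above, using org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil (HBase 1.x API). The "events" and "url_counts" tables and the "d" column family are hypothetical stand-ins.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class HBaseSourceSinkJob {

    // Source: scan rows of a hypothetical "events" table; emit (url, 1) per row.
    static class UrlMapper extends TableMapper<Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(ImmutableBytesWritable key, Result row, Context ctx)
                throws IOException, InterruptedException {
            byte[] url = row.getValue(Bytes.toBytes("d"), Bytes.toBytes("url"));
            if (url != null) ctx.write(new Text(url), ONE);
        }
    }

    // Sink: write per-url counts back into a hypothetical "url_counts" table.
    static class CountReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {
        @Override
        protected void reduce(Text url, Iterable<IntWritable> ones, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable one : ones) sum += one.get();
            Put put = new Put(Bytes.toBytes(url.toString()));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("n"), Bytes.toBytes(sum));
            ctx.write(null, put);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "url-counts");
        job.setJarByClass(HBaseSourceSinkJob.class);
        Scan scan = new Scan();
        scan.setCaching(500);       // batch rows per RPC
        scan.setCacheBlocks(false); // don't pollute the block cache from MR
        TableMapReduceUtil.initTableMapperJob("events", scan,
                UrlMapper.class, Text.class, IntWritable.class, job);
        TableMapReduceUtil.initTableReducerJob("url_counts", CountReducer.class, job);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```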
Environment: Cloudera distribution CDH4, Hadoop/YARN, Linux, Hive, Pig, Impala, Sqoop, ZooKeeper, and Spark (Scala).
Confidential - Chandler, AZ
Java/Hadoop Developer
Responsibilities:
- Configured the Hadoop Cluster in Local (Standalone), Pseudo Distributed and Fully Distributed mode.
- Installed and configured HBase, Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
- Imported data from MySQL to HDFS using Sqoop for data processing and analysis.
- Extensively involved in writing Oozie workflows to run executable commands, MapReduce jobs, and multiple Hive and Pig jobs.
- Optimized scripts of Hive and Pig Latin to increase efficiency.
- Wrote custom MapReduce jobs for data analysis and data cleansing (see the sketch after this list).
- Wrote simple to complex Hive and Pig jobs that execute as MapReduce.
- Analyzed web server logs using Pig.
- Loaded text data in various formats into Pig.
- Used Pig to analyze large amounts of data by representing them as data flows.
- Wrote Hive queries to process data for visualization and reporting.
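A sketch of the custom data-cleansing MapReduce work mentioned in the list above: a map-only job that keeps well-formed Apache common-log-format lines and drops the rest. The regex, counter names, and paths are illustrative.

```java
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LogCleanseJob {

    // Map-only cleansing pass: keep lines that look like Apache common log
    // format and re-emit ip, timestamp, and request as tab-separated fields.
    static class CleanseMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        private static final Pattern LOG = Pattern.compile(
                "^(\\S+) \\S+ \\S+ \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+)");

        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            Matcher m = LOG.matcher(line.toString());
            if (!m.lookingAt()) {
                ctx.getCounter("cleanse", "malformed").increment(1);
                return; // drop unparseable lines
            }
            ctx.write(new Text(m.group(1) + "\t" + m.group(2) + "\t" + m.group(3)),
                    NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log-cleanse");
        job.setJarByClass(LogCleanseJob.class);
        job.setMapperClass(CleanseMapper.class);
        job.setNumReduceTasks(0); // map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```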
Environment: Apache Hadoop, Cloudera Manager CDH3, HDFS, Java, MapReduce, Hive, Pig, Sqoop, Red Hat, Eclipse Indigo, SQL, JUnit.
Confidential
Java Developer
Responsibilities:
- Used the Spring Framework for dependency injection with the help of Spring configuration files.
- Developed the presentation layer using JSP, HTML, and CSS, with client-side validations in JavaScript.
- Used HTML, CSS, jQuery, and JSP to create the user interface.
- Involved in the installation and configuration of Tomcat Server.
- Involved in dynamic form generation, auto completion of forms and user validation functionalities using AJAX.
- Designed, developed, and maintained the data layer using Hibernate, and configured the Struts application framework.
- Created stored procedures using PL/SQL for data access layer.
- Worked on tuning of back-end Oracle stored procedures using TOAD.
- Developed and maintained ANT Scripts.
- Developed unit test cases for the classes using JUnit in Eclipse (see the sketch after this list).
- Developed stored procedures to extract data from Oracle database.
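An illustrative JUnit 4 test in the style described above. OrderCalculator is a hypothetical stand-in for a project class, included here so the sketch is self-contained.

```java
import static org.junit.Assert.assertEquals;
import org.junit.Before;
import org.junit.Test;

public class OrderCalculatorTest {

    private OrderCalculator calc;

    @Before
    public void setUp() {
        calc = new OrderCalculator(0.07); // 7% tax rate
    }

    @Test
    public void totalIncludesTax() {
        assertEquals(107.0, calc.total(100.0), 0.0001);
    }

    @Test(expected = IllegalArgumentException.class)
    public void negativeAmountIsRejected() {
        calc.total(-1.0);
    }
}

// Minimal class under test, included only to keep the sketch compilable.
class OrderCalculator {
    private final double taxRate;

    OrderCalculator(double taxRate) {
        this.taxRate = taxRate;
    }

    double total(double amount) {
        if (amount < 0) {
            throw new IllegalArgumentException("amount must be non-negative");
        }
        return amount * (1 + taxRate);
    }
}
```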
Environment: Java, JEE, JSP, Servlets, Spring Framework, Hibernate, SQL/PLSQL, Web Services, WSDL, JUnit, Tomcat, Oracle 9i, and Windows.
Confidential
Java Developer
Responsibilities:
- Involved in Requirements gathering, Requirements analysis, Design, Development, Integration and Deployment.
- Used JavaScript to perform client-side checks and validations.
- Extensively used the Spring MVC framework to develop the application's web layer; configured the DispatcherServlet in web.xml (see the controller sketch after this list).
- Designed and developed the DAO layer using Spring and Hibernate, including the Criteria API.
- Created and generated Hibernate classes and XML configuration, and managed CRUD operations (create, read, update, and delete).
- Involved in writing HQL and SQL Queries for Oracle 10g database.
- Used log4j for logging messages.
- Developed the classes for Unit Testing by using JUnit.
- Developed Business components using Spring Framework and database connections using JDBC.
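A sketch of an annotation-driven Spring MVC controller of the kind the DispatcherServlet configured in web.xml would dispatch to (Spring 3.x-era API). The /accounts mapping and AccountService are hypothetical.

```java
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;

@Controller
@RequestMapping("/accounts")
public class AccountController {

    private final AccountService accountService;

    @Autowired
    public AccountController(AccountService accountService) {
        this.accountService = accountService;
    }

    // Handles GET /accounts?owner=...; the DispatcherServlet routes the
    // request here and a ViewResolver maps the returned logical view
    // name to a JSP.
    @RequestMapping(method = RequestMethod.GET)
    public String list(@RequestParam("owner") String owner, Model model) {
        model.addAttribute("accounts", accountService.findByOwner(owner));
        return "accounts/list";
    }
}

// Hypothetical service interface, shown only to keep the sketch compilable.
interface AccountService {
    List<String> findByOwner(String owner);
}
```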
Environment: Spring Framework, Spring MVC, Hibernate, HQL, Eclipse, JavaScript, AJAX, XML, Log4j, Oracle 9i, WebLogic, TOAD.
Confidential
Jr. Java Developer
Responsibilities:
- Gathered business requirements, wrote functional and technical specifications.
- Participated in database schema design and development, and coded DDL and DML statements and functions.
- Coded in Java and created HTML web pages using Servlets and JSPs.
- Developed JSP pages and Servlets to provide dynamic content to HTML pages (see the sketch after this list).
- Developed forms using HTML and performed client-side validations using JavaScript.
- Coded, tested, and documented various packages, procedures, and functions for libraries and stored procedures commonly used by different modules.
- Developed custom exception classes for proper exception handling.
- Created functions, subqueries, and stored procedures using PL/SQL.
- Used SQL queries to retrieve data from the Oracle database.
- Developed and tested the applications using Eclipse.
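A minimal Servlet sketch of the dynamic-content pattern described in the list above; the servlet name, request parameter, and JSP path are illustrative.

```java
import java.io.IOException;
import javax.servlet.RequestDispatcher;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class GreetingServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // Read a request parameter, stash the dynamic content in request
        // scope, then forward to a JSP that renders the HTML.
        String name = req.getParameter("name");
        req.setAttribute("greeting",
                "Hello, " + (name == null || name.isEmpty() ? "guest" : name));
        RequestDispatcher view = req.getRequestDispatcher("/WEB-INF/greeting.jsp");
        view.forward(req, resp);
    }
}
```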
Environment: Java, JDBC, HTML, Apache Tomcat, JSP, Servlets, Struts Framework, Oracle 8i, and Windows.