Sr. Hadoop Developer/Admin Resume
Cleveland, OH
SUMMARY
- 7+ years of experience in software development, including 3+ years developing large-scale applications using Hadoop and other big data tools.
- Experienced in the Hadoop ecosystem components like Hadoop Map Reduce, Cloudera, Hortonworks, HBase, Oozie, Hive, Sqoop, Pig, Flume, and Cassandra.
- Experience in developing solutions to analyze large data sets efficiently.
- In-depth understanding of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce concepts.
- Extensive hands-on experience in writing complex MapReduce jobs, Pig scripts and Hive data modeling.
- Excellent understanding/knowledge of Hadoop Distributed system architecture and design principles.
- Experience in converting MapReduce applications to Spark.
- Good working experience using Sqoop to import data into HDFS from RDBMS and vice versa.
- Good knowledge in using job scheduling and workflow designing tools like Oozie.
- Experience working with BI teams to translate big data requirements into Hadoop-centric solutions.
- Experience in performance tuning Hadoop clusters by gathering and analyzing information about the existing infrastructure.
- Experience in Hadoop administration activities such as installation and configuration of clusters using Cloudera Manager and Apache Ambari.
- Good experience creating real-time data streaming solutions using Apache Spark/Spark Streaming, Apache Storm, Kafka and Flume.
- Experience extending Hive and Pig core functionality by writing custom UDFs.
- Good understanding of Data Mining and Machine Learning techniques.
- Experience in handling messaging services using Apache Kafka.
- Experience in fine-tuning MapReduce jobs for better scalability and performance.
- Developed various Map Reduce applications to perform ETL workloads on terabytes of data.
- Experienced in developing and implementing web applications using Java, J2EE, JSP, Servlets, JSF, HTML, DHTML, EJB, JavaScript, AJAX, JSON, JQuery, CSS, XML, JDBC and JNDI.
- Experience in writing SQL and PL/SQL queries and stored procedures for accessing and managing databases such as Oracle, SQL Server 2014/2012, MySQL, and IBM DB2.
- Working experience in Development, Production and QA Environments.
- Involved in all phases of Software Development Life Cycle (SDLC) in large scale enterprise software using Object Oriented Analysis and Design.
- Working experience with version control tools like SVN, CVS, ClearCase and PVCS.
TECHNICAL SKILLS
Languages: Java (Core Java, Networking, Threads, Swing), XML, XSD, XSL, JavaScript
J2EE Technologies: J2EE, Java Mail API
Web servers: Apache Tomcat Server, IBM WebSphere Application Server 5.0/6.0, WebLogic Application Server, JBoss 4.x
Server Side: JSP, Servlets, EJB, JDBC.
Frameworks/ Components: Spring, Spring Batch, Struts, Hibernate.
Big Data: Hadoop, Map Reduce, Hive, Pig, Storm, Sqoop, Oozie, MRUnit, HDFS, HBase, Mahout, Falcon, Kafka, Accumulo, ZooKeeper.
Databases: SQL, MySQL, SQL Server, Oracle, DB2, Confidential Access
Unit Testing: JUnit, Rational
Methodologies: OOAD, RUP, UML, Design Patterns.
OS: Windows 2000/XP/Vista, Windows NT 4.0, Windows 03, Linux, UNIX.
Markup Languages: HTML, XML, DHTML
Open Source API: Apache Commons-io/file upload/net.
PROFESSIONAL EXPERIENCE
Confidential, Cleveland, OH
Sr. Hadoop Developer/Admin
Responsibilities:
- Installed and configured Apache Hadoop clusters for application development, along with Hadoop tools like Hive, Pig, HBase, ZooKeeper and Sqoop.
- Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing.
- Worked with the team to expand the cluster from 28 nodes to 42 nodes; the additional DataNodes were configured through the Hadoop commissioning process.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Managed and scheduled Jobs on a Hadoop cluster.
- Involved in defining job flows, managing and reviewing log files.
- Installed Oozie workflow engine to run multiple Map Reduce, Hive HQL and Pig jobs.
- Collected the log data from web servers and integrated into HDFS using Flume.
- Cassandra developer: set up, configured and optimized the Cassandra cluster. Developed a real-time Java-based application to work with the Cassandra database.
- Responsible for managing data coming from different sources.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Participated in requirements gathering from the experts and business partners and converted the requirements into technical specifications.
- Constructed system components and developed the server-side part using Java, EJB, and Spring Framework. Involved in designing the data model for the system.
- Used J2EE design patterns like DAO, MODEL, Service Locator, MVC and Business Delegate.
- Defined Interface Mapping between JDBC Layer and Oracle Stored Procedures.
- Experience in managing and reviewing Hadoop log files.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop. Worked on tuning the performance of Pig queries.
- Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
- Implemented best income logic using Pig scripts and UDFs.
- Implemented test scripts to support test driven development and continuous integration.
Environment: Hadoop, Map Reduce, Spark, Shark, Kafka, HDFS, ZooKeeper, Hive, Pig, Oozie, Core Java, Eclipse, HBase, Sqoop, Flume, Oracle 11g, Cassandra, UNIX Shell Scripting.
Confidential, Houston, TX
Hadoop Developer
Responsibilities:
- Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
- Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest behavioral data into HDFS for analysis.
- Responsible for importing log files from various sources into HDFS using Flume.
- Used Sqoop to import data from MySQL into HDFS on a regular basis.
- Extracted files from MongoDB through Sqoop and placed them in HDFS for processing.
- Created a customized BI tool for the management team that performs query analytics using HiveQL.
- Created partitions and buckets based on state for further processing using bucket-based Hive joins.
- Estimated the hardware requirements for the NameNode and DataNodes and planned the cluster.
- Created Hive generic UDFs, UDAFs and UDTFs in Python to process business logic that varies based on policy.
- Moved relational database data using Sqoop into Hive dynamic partition tables via staging tables.
- Optimized Hive queries using partitioning and bucketing techniques to control data distribution.
- Worked with Kafka on a proof of concept for log processing on a distributed system. Worked with the NoSQL database HBase to create tables and store data.
- Worked on custom Pig loaders and storage classes to work with a variety of data formats such as JSON and XML.
- Involved in Cassandra Data Modeling and Analysis and CQL (Cassandra Query Language).
- Experience in upgrading Apache Ambari, CDH and HDP clusters.
- Configured and maintained different topologies in the Storm cluster and deployed them on a regular basis.
- Experienced with different kinds of compression techniques such as LZO, Gzip, and Snappy.
- Used Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.
- Created Data Pipeline of Map Reduce programs using Chained Mappers.
- Implemented optimized joins across different data sets to get top claims by state using MapReduce.
- Implemented MapReduce programs in Java to perform map-side joins using Distributed Cache. Developed unit test cases using the JUnit, EasyMock and MRUnit testing frameworks.
- Experience in upgrading the Hadoop cluster, including HBase and ZooKeeper, from CDH3 to CDH4.
- Created a complete processing engine, based on Cloudera's distribution, tuned for performance.
- Experienced in monitoring the cluster using Cloudera Manager.
Environment: Hadoop, HDFS, HBase, MapReduce, Java, JDK 1.5, J2EE 1.4, Struts 1.3, Hive, Pig, Sqoop, Flume, Kafka, Oozie, Hue, Hortonworks, Storm, Zookeeper, AVRO Files, SQL, ETL, Cloudera Manager, MySQL, MongoDB.
Confidential, Basking Ridge, NJ
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
- Analyzed data using Hadoop components Hive and Pig.
- Responsible for running Hadoop streaming jobs to process terabytes of XML data.
- Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts.
- Analyzed web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration, and the most purchased product on the website.
- Exported the analyzed data to relational databases using Sqoop for visualization and report generation by the BI team.
- Responsible for managing data coming from different sources.
- Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive, and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
- Involved in loading data from UNIX file system to HDFS.
- Responsible for creating Hive tables, loading data and writing Hive queries.
- Handled importing data from various data sources, performed transformations using Hive, Map Reduce, and loaded data into HDFS.
- Extracted data from Teradata into HDFS using Sqoop.
- Exported the patterns analyzed back to Teradata using Sqoop.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs that run independently based on time and data availability.
Environment: Hadoop Cluster, HDFS, Hive, Pig, Sqoop, Hadoop MapReduce, HBase, Linux, UNIX Shell Scripting and Big Data.
Confidential, Charlotte, NC
Sr. Java / J2EE Developer
Responsibilities:
- Involved in designing and implementing the MVC pattern.
- Extensively used XML to store process details in the database and retrieved the stored XML whenever needed.
- Served on the core team that developed the process engine.
- Developed Action classes and validation using the Struts framework.
- Created project-related documentation such as role-based user guides.
- Implemented modules like Client Management, Vendor Management.
- Implemented Access Control Mechanism to provide various access levels to the user.
- Designed and developed the application using J2EE, JSP, XML, Struts, Hibernate, Spring technologies.
- Coded DAO and Hibernate implementation classes for data access.
- Coded Spring service classes and transfer objects to pass data between layers.
- Designed the Database for the Jeevica in MS-SQL server 2010.
- Implemented Web Services using Axis.
- Used different features of Struts like MVC, Validation framework and tag library.
- Created detailed design documents, use cases, and class diagrams using UML.
- Wrote Ant scripts to build JAR, WAR and EAR files.
- Developed a standalone Java component that interacts with Crystal Reports on Crystal Enterprise Server to view and schedule reports, store data as XML, and send data to consumers using SOAP.
- Deployed and tested the application on WebSphere Application Servers.
- Developed JavaScript for client side validations in JSP.
- Developed JSPs with Struts taglibs for the presentation layer.
- Coordinated with the onsite, offshore and QA teams to facilitate quality delivery from offshore on schedule.
Environment: Java 1.5, Spring, Spring WebService, JSP, JavaScript, Hibernate, SOAP, CSS, Struts, WebSphere, MQ Series, JUnit, Apache, Windows XP and Linux.
Confidential
Java Developer
Responsibilities:
- Designed a system and developed a framework using J2EE technologies based on MVC architecture.
- Involved in the iterative/incremental development of project application. Participated in the requirement analysis and design meetings.
- Designed and developed UIs using JSP, following MVC architecture.
- Designed and developed Presentation Tier using Struts framework, JSP, Servlets, TagLibs, HTML and JavaScript.
- Designed the control, including class diagrams and sequence diagrams, using Visio.
- Used the Struts framework in the application. Programmed the views as JSP pages with the Struts tag library; the model is a combination of EJBs and Java classes, and the web controllers are Servlets.
- Generated XML pages with templates using XSL. Used JSP and Servlets, EJBs on server side.
- Developed and maintained a complete external build process using Ant.
- Implemented Home Interface, Remote Interface, and Bean Implementation class.
- Implemented business logic at server side using Session Bean.
- Made extensive use of XML for application configuration, navigation, and task-based configuration.
- Designed and developed unit and integration test cases using JUnit.
- Used EJB features effectively: local interfaces to improve performance, abstract persistence schema, and CMRs.
- Used Struts web application framework implementation to build the presentation tier.
- Wrote PL/SQL queries to access data from the Oracle database.
- Set up WebSphere Application Server and used Ant to build the application and deploy it to WebSphere.
- Prepared test plans and wrote test cases.
- Implemented JMS for making asynchronous requests.
Environment: Java, J2EE, Struts, Hibernate, JSP, Servlets, HTML, CSS, UML, jQuery, Log4J, XML Schema, JUnit, Tomcat, JavaScript, Oracle 9i, Unix, Eclipse IDE.
Confidential
Java Developer
Responsibilities:
- Understood and analyzed the requirements.
- Implemented server-side programs using Servlets and JSP.
- Designed, developed and validated the user interface using HTML, JavaScript, XML and CSS.
- Implemented MVC using Struts Framework.
- Handled database access by implementing a Controller Servlet.
- Implemented PL/SQL stored procedures and triggers.
- Used JDBC prepared statements called from Servlets for database access.
- Designed and documented the stored procedures.
- Widely used HTML for web based design.
- Involved in Unit testing for various components.
- Worked on the database interaction layer for insert, update and retrieval operations on the Oracle database by writing stored procedures.
- Used Spring Framework for Dependency Injection and integrated with Hibernate.
- Involved in writing JUnit Test Cases.
- Used Log4J to log errors in the application.
Environment: Java, J2EE, JSP, Servlets, HTML, DHTML, XML, JavaScript, Struts, Eclipse, WebLogic, PL/SQL and Oracle.
