Hadoop Developer Resume

Orlando, FL

SUMMARY

  • Over 8 years of IT experience, including close to 5 years of work experience in Big Data and Hadoop ecosystem technologies.
  • Experienced in Agile SCRUM, RUP (Rational Unified Process) and TDD (Test Driven Development) software development methodologies.
  • Excellent understanding of the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, HBase, Oozie, ZooKeeper, Flume and Sqoop-based Big Data platforms.
  • Expertise in the design and implementation of Big Data solutions in the Banking, Insurance, Telecommunication, Retail and E-commerce domains.
  • Good knowledge of Hadoop MRv1 and MRv2 (YARN) architectures.
  • Implemented data quality and price-gap rules in the Talend ETL tool; extensive prior experience with Informatica PowerCenter and Designer.
  • Extensive experience with OLTP/OLAP systems and E-R modeling, developing database schemas such as star and snowflake schemas used in relational, dimensional and multidimensional modeling.
  • Experience in data processing: collecting, aggregating and moving data from various sources using Apache Flume and Kafka.
  • Good exposure to performance tuning of Hive queries, MapReduce jobs and Spark jobs.
  • Comprehensive experience in building web-based applications using J2EE frameworks like Spring, Hibernate, EJB, Struts and JMS.
  • Excellent ability to use analytical tools to mine data and evaluate the underlying patterns.
  • Assisted in Cluster maintenance, Cluster Monitoring and Troubleshooting, Managing and Reviewing data backups and log files.
  • Good working experience in client-side development with HTML, XHTML, CSS, JavaScript, jQuery, JSON and AJAX.
  • Good understanding of application servers WebLogic, WebSphere and XML methodologies (XML, XSL, XSD) including Web Services like SOAP and REST.
  • Hands on experience in NOSQL databases like HBase, Cassandra, and MongoDB.
  • Worked on various Hadoop distributions (Cloudera, Hortonworks, Amazon AWS) to implement Big Data solutions.
  • Hands on experience in developing MapReduce programs using Apache Hadoop for analyzing the Big Data.
  • Expertise in optimizing traffic across the network using Combiners, joining datasets with multiple schemas using Joins, and organizing data using Partitioners and Buckets (see the MapReduce sketch after this list).
  • Experience in writing custom Counters for analyzing data and testing with the MRUnit framework.
  • Experienced in writing complex MapReduce programs that work with different file formats, such as Text, Sequence, XML and Avro.
  • Expertise in composing MapReduce Pipelines with many user-defined functions using Apache Crunch.
  • Expertise in writing ad-hoc MapReduce programs using Pig Scripts.
  • Used Pig as ETL tool to do transformations, event joins, filter and some pre-aggregations.
  • Implemented business logic by writing Pig Latin UDFs in Java and used various UDFs from Piggybank and other sources.
  • Expertise in Hive Query Language (HiveQL), Hive Security and debugging Hive issues.
  • Responsible for performing extensive data validation using Hive dynamic partitioning and bucketing.
  • Experience in developing custom UDFs for Pig and Hive to incorporate methods and functionality of Python/Java into Pig Latin and HiveQL (see the Hive UDF sketch after this list).
  • Analyzed data by performing Hive queries and used Hive UDFs for complex querying.
  • Expert database engineer with NoSQL and relational data modeling experience.
  • Expertise in HBase Cluster Setup, Configurations, HBase Implementation and HBase Client API.
  • Worked on importing data into HBase using HBase Shell and HBase Client API.
  • Expertise in several J2EE technologies like JDBC, Servlets, JSP, Struts, Spring, Hibernate, JPA, JSF, EJB, JMS, JAX-WS, SOAP, jQuery, AJAX, XML, JSON, HTML5/HTML, XHTML, Maven, and Ant.
  • Expert knowledge over J2EE Design Patterns like MVC Architecture, Front Controller, Session Facade, Business Delegate and Data Access Object for building J2EE Applications.
  • Extensive experience in developing Internet and intranet applications using J2EE, Servlets, JSP, JBoss, WebLogic, Tomcat, and the Struts framework.
  • Extensive experience with database DB2 (Database Design, and SQL Queries).
  • Good experience in SQL, PL/SQL, Perl Scripting, Shell Scripting, Partitioning, Data modeling, OLAP, Logical and Physical Database Design, Backup and Recovery procedures.
  • Experienced with the build tools Maven and Ant and continuous integration tools like Jenkins.
  • Experience in administering, installing, configuring, troubleshooting, securing, backing up, performance monitoring and fine-tuning Red Hat Linux.
  • Developed Unit test cases using JUnit testing framework.
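
For illustration, a minimal sketch of the MapReduce pattern referenced in the Combiners/Partitioners bullet above. It is not code from any of the engagements below; the class names and the first-letter routing rule are hypothetical, and the standard org.apache.hadoop.mapreduce API is assumed.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Partitioner;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountJob {

        public static class TokenMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(Object key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        ctx.write(word, ONE);
                    }
                }
            }
        }

        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                ctx.write(key, new IntWritable(sum));
            }
        }

        // Hypothetical partitioner: route keys by first letter so related keys
        // land on the same reducer.
        public static class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
            @Override
            public int getPartition(Text key, IntWritable value, int numPartitions) {
                return Character.toLowerCase(key.toString().charAt(0)) % numPartitions;
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCountJob.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class); // combiner cuts shuffle traffic
            job.setPartitionerClass(FirstLetterPartitioner.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Reusing the reducer as the combiner is safe here only because the sum aggregation is associative and commutative.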
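
Likewise, a minimal sketch of a custom Hive UDF of the kind described in the Pig/Hive UDF bullet above, assuming the old-style org.apache.hadoop.hive.ql.exec.UDF API; the class name and upper-casing logic are illustrative only.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: upper-cases a string column; Hive calls evaluate() once per row.
    public final class UpperCaseUDF extends UDF {
        public Text evaluate(final Text input) {
            if (input == null) {
                return null; // preserve SQL null semantics
            }
            return new Text(input.toString().toUpperCase());
        }
    }

Such a UDF would typically be packaged in a jar and registered in HiveQL with ADD JAR and CREATE TEMPORARY FUNCTION before use.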

TECHNICAL SKILLS

Big Data Technology: Hadoop, Teradata, MapReduce, Spark, HDFS, HBase, Pig, Hive, Sqoop, Oozie, Storm, Kafka and Flume

Spark Streaming Technologies: Spark Streaming, Storm

Java/J2EE Technology: JSP, JSF, Servlets, EJB, JDBC, Struts, Spring, Spring MVC, Spring Portlet, Spring Web Flow, Hibernate, iBATIS, JMS, MQ, JCA, JNDI, Java Beans, JAX-RPC, JAX-WS, RMI, RMI-IIOP, EAD4J, Axis, Castor, SOAP, WSDL, UDDI, JiBX, JAXB, DOM, SAX, MyFaces (Tomahawk), Facelets, JPA, Portal, Portlet, JSR 168/286, LifeRay, WebLogic Portal, LDAP, JUnit

Hadoop Distribution: Cloudera, Hortonworks, IBM Big Insights

Cloud Computing Service: AWS (Amazon Web Services)

Scripting Languages: Python, Bash, JavaScript, HTML5, CSS3

Programming Languages: Java (1.4/5/6), C/C++, Swing, SQL, PL/SQL, HTML, CSS, i18n, l10n, DHTML, XML, XSD, XHTML, XSL, XSLT, XPath, XQuery, UML, JavaScript, AJAX (DWR), jQuery, Dojo, ExtJS, Shell Scripts, Perl

Development Framework/IDE: RAD 8.x/7.x/6.0, IBM WebSphere Integration Developer 6.1, WSAD 5.x, Eclipse Galileo/Europa/3.x/2.x, MyEclipse 3.x/2.x, NetBeans 7.x/6.x, IntelliJ 7.x, Workshop 8.1/6.1, Adobe Photoshop, Adobe Dreamweaver, Adobe Flash, Ant, Maven, Rational Rose, RSA, MS Visio, OpenMake Meister

Web/Application Servers: WebSphere Application Server 8.x/7.0/6.1/5.1/5.0, WebSphere Portal Server 7.0/6.1, WebSphere Process Server 6.1, WebLogic Application Server 8.1/6.1, JBoss 5.x/3.x, Apache 2.x, Tomcat 7.x/6.x/5.x/4.x, MS IIS, IBM HTTP Server

Databases: NoSQL, Oracle 11g/10g/9i/8i, DB2 9.x/8.x, MS SQL Server 2008/2005/2000, MySQL

NoSQL: HBase, Cassandra, MongoDB

ETL Tools: Talend, Informatica

Reporting/Analysis Tools: Tableau, SAS

Operating Systems: Windows XP, 2K, MS-DOS, Linux (Red Hat), Unix (Solaris), HP UX, IBM AIX

Version Control: CVS, SourceSafe, ClearCase, Subversion

Monitoring Tools: Embarcadero J Optimizer 2009, TPTP, IBM Heap Analyzer, Wily Introscope, JMeter

Other: JBoss Drools 4.x, REST, IBM Lotus WCM, MS ISA, CA SiteMinder, BMC WAM, Mingle

PROFESSIONAL EXPERIENCE

Confidential - Orlando, FL

Hadoop Developer

Responsibilities:

  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
  • Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
  • Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
  • Managed and reviewed Hadoop log files.
  • Tested raw data and executed performance scripts.
  • Shared responsibility for administration of Hadoop, Hive and Pig.
  • Built wrapper shell scripts to launch the Oozie workflow.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Involved in creating Hadoop streaming jobs using Python.
  • Provided ad-hoc queries and data metrics to the Business Users using Hive, Pig.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Used Pig as ETL tool to do transformations, event joins and some pre-aggregations before storing the data onto HDFS.
  • Worked on MapReduce joins to query multiple semi-structured datasets as analytic needs required.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Created many Java UDFs and UDAFs in Hive for functions not available out of the box, such as rank and cumulative sum.
  • Used Hive and created Hive tables and involved in data loading and writing Hive UDFs.
  • Developed a POC for Apache Kafka (see the producer sketch after this list).
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop, using Spark Context, Spark SQL, DataFrames, pair RDDs and Spark on YARN (see the Spark sketch after this list).
  • Gained knowledge on building Apache Spark applications using Scala.
  • Performed various optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
  • Stored and loaded data between HDFS and Amazon S3, and backed up namespace data to NFS filers.
  • Enabled concurrent access to Hive tables with shared and exclusive locking, supported by the ZooKeeper deployment in the cluster.
  • Created and implemented business, validation, coverage and price-gap rules on Hive using the Talend tool.
  • Involved in development of Talend components to validate the data quality across different data sources.
  • Involved in analysis of business validation rules and finding options for the implementation of the rules in Talend.
  • Automated and scheduled the rules on a weekly and monthly basis in the Talend Administration Center (TAC).
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
  • Familiarity with NoSQL databases including HBase, MongoDB.
  • Wrote and automated shell scripts for rolling day-to-day processes.
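
For illustration, a minimal sketch of the kind of producer the Kafka POC above might have started from; the broker address, topic name and message are assumptions, and the standard org.apache.kafka.clients producer API is assumed.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class KafkaPocProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            // Topic name and message are illustrative only.
            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("events", "key-1", "hello from the POC"));
            }
        }
    }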
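
And a minimal Spark sketch in the same spirit, assuming the Spark 2.x Java API; the word count stands in for the actual algorithms, which are not detailed above, and the input/output paths are assumptions.

    import java.util.Arrays;
    import scala.Tuple2;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SparkWordCount {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("spark-wordcount");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // args[0]/args[1] are assumed HDFS input and output paths.
            JavaRDD<String> lines = sc.textFile(args[0]);
            JavaPairRDD<String, Integer> counts = lines
                    .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                    .mapToPair(word -> new Tuple2<>(word, 1))
                    .reduceByKey((a, b) -> a + b); // shuffle-side aggregation, as a combiner does in MapReduce

            counts.saveAsTextFile(args[1]);
            sc.stop();
        }
    }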

Environment: Hadoop, MapReduce, YARN, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, Talend, CDH3, MongoDB, Cassandra, Oracle, NoSQL, Unix/Linux, Spark, Kafka, Amazon Web Services.

Confidential - Kenilworth, NJ

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Imported and exported data between HDFS and an Oracle 10.2 database using Sqoop.
  • Experienced in defining and coordination of job flows.
  • Gained experience in reviewing and managing Hadoop log files.
  • Extracted files from the NoSQL databases MongoDB and HBase using Sqoop and placed them in HDFS for processing.
  • Involved in writing data-refinement Pig scripts and Hive queries.
  • Good knowledge in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Coordinated cluster services using ZooKeeper.
  • Used XML Technologies like DOM for transferring data.
  • Implemented object-relational mapping and persistence using Hibernate ORM.
  • Developed custom validators in Struts and implemented server-side validations using annotations.
  • Created the struts-config.xml file so the ActionServlet could extract data from the specified ActionForm and send it to the specified Action class instance.
  • Used Oracle for the database and WebLogic as the application server.
  • Involved in coding DAO objects using JDBC, following the DAO pattern (see the DAO sketch after this list).
  • Used Flume to transport logs to HDFS.
  • Experienced in moving data from Hive tables into Cassandra for real-time analytics.
  • Organized documents into more usable clusters using Mahout.
  • Configured connection between HDFS and Tableau using Impala for Tableau developer team.
  • Responsible for managing data coming from different sources.
  • Gained good experience with various NoSQL databases.
  • Experienced in handling administration activities using Cloudera Manager.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.
  • Worked on Talend ETL tool, developed and scheduled jobs in Talend integration suite.
  • Modified reports and Talend ETL jobs based on the feedback from QA testers and Users in development and staging environments.
  • Worked with the visualization tool Tableau to analyze data visually.
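
For illustration, a minimal sketch of the JDBC DAO pattern referenced above; the class, table and column names are hypothetical.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // Hypothetical DAO: one query method, parameterized to avoid SQL injection.
    public class CustomerDao {
        private final DataSource dataSource;

        public CustomerDao(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        public String findNameById(long id) throws SQLException {
            String sql = "SELECT name FROM customers WHERE id = ?";
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement(sql)) {
                ps.setLong(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString("name") : null;
                }
            }
        }
    }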

Environment: Apache Hadoop, Java, JDK 1.6, J2EE, JDBC, Servlets, JSP, Linux, XML, WebLogic, SOAP, WSDL, HBase, Hive, Pig, Sqoop, ZooKeeper, NoSQL, R, Mahout, MapReduce, Cloudera, HDFS, Flume, Impala, Tableau, Talend, MySQL, HTML5, CSS, MongoDB

Confidential, Pittsburgh, PA

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
  • Implemented MapReduce programs to analyze large datasets in the warehouse for business intelligence purposes.
  • Used default MapReduce Input and Output Formats.
  • Developed HQL queries to implement select, insert, update and delete operations against the database by creating named HQL queries.
  • Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Developed simple to complex Map/Reduce jobs using Java, and scripts using Hive and Pig.
  • Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) for data ingestion and egress.
  • Implemented business logic by writing UDFs in Java and used various UDFs from other sources (see the Pig UDF sketch after this list).
  • Experienced in loading and transforming large sets of structured and semi-structured data.
  • Managed and reviewed Hadoop log files; deployed and maintained the Hadoop cluster.
  • Exported filtered data into HBase for fast querying.
  • Involved in creating Hive tables, loading with data and writing Hive queries.
  • Created data-models for customer data using the Cassandra Query Language.
  • Ran many performance tests using the Cassandra-stress tool in order to measure and improve the read and write performance of the cluster.
  • Involved in developing Shell scripts to orchestrate execution of all other scripts (Pig, Hive, and MapReduce) and move the data files within and outside of HDFS.
  • Queried and analyzed data from Datastax Cassandra for quick searching, sorting and grouping.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
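
For illustration, a minimal sketch of a Java UDF of the kind referenced above, here as a Pig EvalFunc; the class name and trimming logic are illustrative only.

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical UDF: trims whitespace from a chararray field; Pig calls exec() per tuple.
    public class TrimUDF extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return ((String) input.get(0)).trim();
        }
    }

After REGISTERing the containing jar in a Pig script, the function can be called from Pig Latin like any built-in.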

Environment: Apache Hadoop (Cloudera), HBase, Hive, Pig, MapReduce, Sqoop, Oozie, Eclipse, Java

Confidential - New York, NY

Java Developer/Oracle Developer

Responsibilities:

  • Estimation, design, and development of various modules.
  • Implemented MVC architecture.
  • Responsible for developing use case, class diagrams and sequence diagrams for the modules using UML.
  • Responsible for re-engineering Confidential legal eCommerce Java/J2EE/JEE-based portal applications.
  • Designed, developed and tested Java/J2EE/JEE/Portal applications using Spring, Spring IoC, Spring MVC, Spring Portlet, Hibernate, and WebSphere Portal.
  • Designed, developed and modified UI components that used JSP, JSF, JavaScript, jQuery, DWR (AJAX), CSS, HTML, XHTML, XML, and Velocity.
  • Created batch print component that converted MS Word documents to PDF and sent the merged document Stream to client side for printing using Aspose.Words for Java and iText.
  • Configured Spring and Hibernate components.
  • Designed and developed business and persistence layer components using Spring, Spring IoC and Hibernate.
  • Wrote complex SQL queries to interact with backend Oracle 11g/10g databases.
  • Created test cases and performed Unit and Integration testing using Spring Test API.
  • Built, deployed and tested developed components on WebSphere Portal Server 6.1.
  • Worked in an Agile software development environment.
  • Involved in development of user interface modules using HTML, CSS, JSP.
  • Designed the applications using MVC framework for easy maintainability.
  • Involved in writing many JSP scriptlets where needed to meet requirements.
  • Developed notification and customer classes.
  • Involved in writing SQL queries.
  • Used technologies like JDBC for accessing related data from database.
  • Handled backend data by creating optimal stored procedures in Oracle database.
  • Migrated employees' manual Excel-sheet work to the newly developed automated system.
  • Improved project efficiency by eliminating the 4 hours/day spent on manual Excel sheet maintenance.
  • Developed servlets as controllers to perform the requisite functions (see the servlet sketch after this list).
  • Worked with and utilized Core Java, MySQL and HTML daily.
  • Regularly fixed and troubleshot bugs and issues in modules.
  • Designed and developed a vendor portal application to track shipping information for requested orders.
  • Worked on enhancements to the file processors that send orders to drop shippers and update order shipment status (Released/Scheduled) from drop shippers, based on requirements.
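
For illustration, a minimal sketch of a controller servlet in the style described above; the request parameter, status logic and JSP path are hypothetical.

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical controller: reads a request parameter, sets a result attribute,
    // and forwards to a JSP view.
    public class OrderStatusServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            String orderId = req.getParameter("orderId");
            // A real controller would delegate to a service/DAO layer here.
            req.setAttribute("status", orderId == null ? "UNKNOWN" : "SHIPPED");
            req.getRequestDispatcher("/WEB-INF/jsp/orderStatus.jsp").forward(req, resp);
        }
    }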

Environment: HTML, CSS, AJAX, jQuery, JavaScript, Flash, Core Java, J2EE, Struts 2.0, Servlets, JSP, JSTL, XML, MyEclipse 9.0, JBoss 4.0, Oracle

Confidential

Java Developer/(DW/BI) Developer

Responsibilities:

  • Involved in Architecture/Designing the State Portal Application.
  • Involved in Functional and Detailed Designs.
  • Involved in Presentation Development using Struts Framework.
  • Involved in the analysis, design, development and testing phases of the Software Development Lifecycle (SDLC) using the Agile development methodology.
  • Involved in business requirement gathering and technical specifications.
  • Implemented J2EE standards, MVC2 architecture using Struts Framework.
  • Implemented Servlets, JSP and Ajax to design the user interface.
  • Presentation Tier is built using the Struts framework.
  • Implemented and configured various Action classes for handling the client requests using Struts 2 framework.
  • Used EJBs (Stateless Session beans) to implement the business logic, MDBs (JMS) for asynchronous communication internal and external to the system.
  • All the Business logic in all the modules is written in core Java.
  • Workflow (Order Flow) is built using JMS technology.
  • Developed web services using SOAP for sending data to and receiving data from the external interface.
  • Used Source Integrity tool to build and deploy the application.
  • Used Design patterns such as Business delegate, Service locator, Model View Controller, Session façade, DAO.
  • Involved in implementing JMS (Java Message Service) for asynchronous communication.
  • Involved in using JMS queues and JMS topics for one-to-one and one-to-many communication in the application (see the JMS sketch after this list).
  • Backend application layer is implemented using EJB (Enterprise Java Bean) in WebSphere Application Server environment.
  • Created stored procedures using PL/SQL for data modification (DML: insert, update, delete) in Oracle.
  • Interaction with Oracle database is implemented using Hibernate.
  • Designed the business intelligence module of the project.
  • Developed stored procedures in PostgreSQL to support analytical reports.
  • Integrated business intelligence module in the existing Reporting Framework (Java Based).
  • Used SAS Visual Analytics for report generation and SAS Data Integrator for transforming data from OLTP to OLAP environment.
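
For illustration, a minimal sketch of point-to-point JMS messaging as referenced above, using the standard javax.jms API; the JNDI names and payload are hypothetical and depend on the application server configuration.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.naming.InitialContext;

    // Hypothetical point-to-point sender for an order-flow queue.
    public class OrderSender {
        public void send(String payload) throws Exception {
            InitialContext ctx = new InitialContext();
            ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
            Queue queue = (Queue) ctx.lookup("jms/OrderQueue");

            Connection con = cf.createConnection();
            try {
                Session session = con.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(queue);
                producer.send(session.createTextMessage(payload));
            } finally {
                con.close(); // closing the connection closes its sessions and producers
            }
        }
    }

A topic with a MessageConsumer and MessageListener would serve the one-to-many case mentioned above.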

Environment: J2EE, EJB, Web Services, XML, XSD, RUP, Microsoft Visio, ClearCase, Source Integrity, Oracle 10g, WebSphere 10.3, JMS, SOA, LDAP, RAD, Log4j, Servlets, JSP, Unix, Struts 2.0, Hibernate, Informatica, SAS.
