Hadoop Developer Resume
Orlando, FL
SUMMARY
- Over 8 years of IT experience, including close to 5 years of work experience in Big Data and Hadoop ecosystem technologies.
- Experienced in Agile Scrum, RUP (Rational Unified Process) and TDD (Test-Driven Development) software development methodologies.
- Excellent understanding of the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, HBase, Oozie, ZooKeeper, Flume and Sqoop-based Big Data platforms.
- Expertise in design and implementation of Big Data solutions in the Banking, Insurance, Telecommunication, Retail and E-commerce domains.
- Good knowledge of Hadoop MRv1 and MRv2 (YARN) architecture.
- Implemented data quality and price-gap rules in the Talend ETL tool; extensive prior experience with Informatica PowerCenter and Designer.
- Extensive experience with OLTP/OLAP systems and E-R modeling, developing database schemas such as Star and Snowflake schemas used in relational, dimensional and multidimensional modeling.
- Experience in data processing - collecting, aggregating and moving data from various sources - using Apache Flume and Kafka.
- Good exposure to performance tuning of Hive queries, MapReduce jobs and Spark jobs.
- Comprehensive experience in building web-based applications using J2EE frameworks such as Spring, Hibernate, EJB, Struts and JMS.
- Excellent ability to use analytical tools to mine data and evaluate the underlying patterns.
- Assisted in Cluster maintenance, Cluster Monitoring and Troubleshooting, Managing and Reviewing data backups and log files.
- Good working experience in client-side development with HTML, XHTML, CSS, JavaScript, jQuery, JSON and AJAX.
- Good understanding of the WebLogic and WebSphere application servers and XML technologies (XML, XSL, XSD), including Web Services such as SOAP and REST.
- Hands-on experience with NoSQL databases such as HBase, Cassandra and MongoDB.
- Worked with various Hadoop distributions (Cloudera, Hortonworks, Amazon AWS).
- Hands on experience in developing MapReduce programs using Apache Hadoop for analyzing the Big Data.
- Expertise in optimizing traffic across network using Combiners, joining multiple schema datasets using Joins and organizing data using Partitioners and Buckets.
- Experience in writing Custom Counters for analyzing the data and testing using the MRUnit framework.
- Experienced in writing complex MapReduce programs that work with different file formats like Text, Sequence, Xml and Avro.
- Expertise in composing MapReduce Pipelines with many user-defined functions using Apache Crunch.
- Expertise in writing ad-hoc MapReduce programs using Pig Scripts.
- Used Pig as an ETL tool to perform transformations, event joins, filtering and some pre-aggregations.
- Implemented business logic by writing Pig Latin UDFs in Java and used various UDFs from Piggybank and other sources.
- Expertise in Hive Query Language (HiveQL), Hive Security and debugging Hive issues.
- Responsible for performing extensive data validation using Hive dynamic partitioning and bucketing.
- Experience in developing custom UDFs for Pig and Hive to incorporate methods and functionality of Python/Java into Pig Latin and HiveQL (a minimal UDF sketch follows this summary).
- Analyzed data by performing Hive queries and used Hive UDFs for complex querying.
- Expert database engineer with NoSQL and relational data-modeling experience.
- Expertise in HBase Cluster Setup, Configurations, HBase Implementation and HBase Client API.
- Worked on importing data into HBase using HBase Shell and HBase Client API.
- Expertise in several J2EE technologies like JDBC, Servlets, JSP, Struts, Spring, Hibernate, JPA, JSF, EJB, JMS, JAX-WS, SOAP, jQuery, AJAX, XML, JSON, HTML5/HTML, XHTML, Maven, and Ant.
- Expert knowledge over J2EE Design Patterns like MVC Architecture, Front Controller, Session Facade, Business Delegate and Data Access Object for building J2EE Applications.
- Extensive experience in developing Internet and intranet applications using J2EE, Servlets, JSP, JBoss, WebLogic, Tomcat, and the Struts framework.
- Extensive experience with the DB2 database (database design and SQL queries).
- Good experience in SQL, PL/SQL, Perl Scripting, Shell Scripting, Partitioning, Data modeling, OLAP, Logical and Physical Database Design, Backup and Recovery procedures.
- Experienced with the build tools Maven and Ant and with continuous integration tools such as Jenkins.
- Experience in administering, installing, configuring, troubleshooting, securing, backing up, performance monitoring and fine-tuning Red Hat Linux.
- Developed Unit test cases using JUnit testing framework.
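For illustration, a minimal sketch of the kind of custom Hive UDF mentioned above, written in Java against the classic org.apache.hadoop.hive.ql.exec.UDF API. The package and class names are hypothetical, not taken from any project listed here:

    package com.example.hive.udf; // hypothetical package

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Simple Hive UDF: trims and upper-cases a string column.
    public final class NormalizeString extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null; // pass NULLs through unchanged
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Such a class is packaged into a JAR, added to the session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before it can be called from HiveQL.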
TECHNICAL SKILLS
Big Data Technology: Hadoop, Teradata, MapReduce, Spark, HDFS, HBase, Pig, Hive, Sqoop, Oozie, Storm, Kafka and Flume
Spark Streaming Technologies: Spark Streaming, Storm
Java/J2EE Technology: JSP, JSF, Servlets, EJB, JDBC, Struts, Spring, Spring MVC, Spring Portlet, Spring Web Flow, Hibernate, iBATIS, JMS, MQ, JCA, JNDI, Java Beans, JAX-RPC, JAX-WS, RMI, RMI-IIOP, EAD4J, Axis, Castor, SOAP, WSDL, UDDI, JiBX, JAXB, DOM, SAX, MyFaces (Tomahawk), Facelets, JPA, Portal, Portlet, JSR 168/286, Liferay, WebLogic Portal, LDAP, JUnit
Hadoop Distribution: Cloudera, Hortonworks, IBM Big Insights
Cloud Computing Service: AWS (Amazon Web Services)
Scripting Languages: Python, Bash, JavaScript, HTML5, CSS3
Programming Languages: Java (1.4/5/6), C/C++, Swing, SQL, HTML, CSS, i18n, l10n, DHTML, XML, XSD, XHTML, XSL, XSLT, XPath, XQuery, PL/SQL, UML, JavaScript, AJAX (DWR), jQuery, Dojo, ExtJS, Shell Scripts, Perl
Development Framework/IDE: RAD 8.x/7.x/6.0, IBM WebSphere Integration Developer 6.1, WSAD 5.x, Eclipse Galileo/Europa/3.x/2.x, MyEclipse 3.x/2.x, NetBeans 7.x/6.x, IntelliJ 7.x, Workshop 8.1/6.1, Adobe Photoshop, Adobe Dreamweaver, Adobe Flash, Ant, Maven, Rational Rose, RSA, MS Visio, OpenMake Meister
Web/Application Servers: WebSphere Application Server 8.x/7.0/6.1/5.1/5.0, WebSphere Portal Server 7.0/6.1, WebSphere Process Server 6.1, WebLogic Application Server 8.1/6.1, JBoss 5.x/3.x, Apache 2.x, Tomcat 7.x/6.x/5.x/4.x, MS IIS, IBM HTTP Server
Databases: NoSQL, Oracle 11g/10g/9i/8i, DB2 9.x/8.x, MS SQL Server 2008/2005/2000, MySQL
NoSQL: HBase, Cassandra, MongoDB
ETL Tools: Talend, Informatica
Reporting/Analysis Tools: Tableau, SAS
Operating Systems: Windows XP, 2K, MS-DOS, Linux (Red Hat), Unix (Solaris), HP UX, IBM AIX
Version Control: CVS, SourceSafe, ClearCase, Subversion
Monitoring Tools: Embarcadero J Optimizer 2009, TPTP, IBM Heap Analyzer, Wily Introscope, JMeter
Other: JBoss Drools 4.x, REST, IBM Lotus WCM, MS ISA, CA SiteMinder, BMC WAM, Mingle
PROFESSIONAL EXPERIENCE
Confidential - Orlando, FL
Hadoop Developer
Responsibilities:
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Enabled speedy reviews and first-mover advantage by using Oozie to automate data loading into HDFS and Pig to pre-process the data.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
- Managed and reviewed Hadoop log files.
- Tested raw data and executed performance scripts.
- Shared responsibility for administration of Hadoop, Hive and Pig.
- Built wrapper shell scripts to drive the Oozie workflows.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Involved in creating Hadoop streaming jobs using Python.
- Provided ad-hoc queries and data metrics to business users using Hive and Pig.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Used Pig as an ETL tool to perform transformations, event joins and some pre-aggregations before storing the data in HDFS.
- Worked on MapReduce joins to query multiple semi-structured datasets as per analytic needs.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Created many Java UDFs and UDAFs in Hive for functions not built into Hive, such as rank and cumulative sum (Csum).
- Used Hive and created Hive tables and involved in data loading and writing Hive UDFs.
- Developed a proof of concept for Apache Kafka (see the producer sketch after this section).
- Explored Spark for improving performance and optimizing existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
- Gained knowledge on building Apache Spark applications using Scala.
- Performed various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
- Stored and loaded data between HDFS and Amazon S3, and backed up namespace data to NFS filers.
- Enabled concurrent access to Hive tables with shared and exclusive locking, backed by the ZooKeeper ensemble in the cluster.
- Created and implemented business, validation, coverage and price-gap rules on Hive using Talend.
- Involved in development of Talend components to validate the data quality across different data sources.
- Involved in analysis of business validation rules and finding options for the implementation of the rules in Talend.
- Automated and scheduled the rules on a weekly and monthly basis in TAC (Talend Administration Center).
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions.
- Familiarity with NoSQL databases including HBase and MongoDB.
- Wrote shell scripts to automate rolling day-to-day processes.
Environment: Hadoop, MapReduce, YARN, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera CDH3, Flume, HBase, ZooKeeper, Talend, MongoDB, Cassandra, Oracle, NoSQL, Unix/Linux, Spark, Kafka, Amazon Web Services.
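An illustrative sketch of the Apache Kafka proof of concept referenced above: a minimal Java producer that publishes one log event. The broker address, topic name and payload are assumptions for illustration, not project specifics:

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class LogEventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // assumed broker address
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            // Publish a single log event to a hypothetical "weblogs" topic.
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("weblogs", "host01", "GET /index.html 200"));
            }
        }
    }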
Confidential - Kenilworth, NJ
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this section).
- Imported and exported data between HDFS and an Oracle 10.2 database using Sqoop.
- Experienced in defining and coordinating job flows.
- Gained experience in reviewing and managing Hadoop log files.
- Extracted files from NoSQL databases (MongoDB, HBase) through Sqoop and placed them in HDFS for processing.
- Involved in writing data refinement Pig scripts and Hive queries.
- Gained good knowledge of running Hadoop streaming jobs to process terabytes of XML data.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Coordinated cluster services using ZooKeeper.
- Used XML Technologies like DOM for transferring data.
- Implemented object-relational mapping and persistence using Hibernate ORM.
- Developed custom validators in Struts and implemented server-side validations using annotations.
- Created the struts-config.xml file for the ActionServlet to extract data from the specified ActionForm and pass it to the specified Action class instance.
- Used Oracle for the database and WebLogic as the application server.
- Involved in coding for DAO Objects using JDBC (using DAO pattern).
- Used Flume to transport logs to HDFS.
- Moved data from Hive tables into Cassandra for real-time analytics.
- Organized documents into more usable clusters using Mahout.
- Configured connection between HDFS and Tableau using Impala for Tableau developer team.
- Responsible for managing data coming from different sources.
- Gained good experience with various NoSQL databases.
- Handled administration activities using Cloudera Manager.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from the UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Worked on Talend ETL tool, developed and scheduled jobs in Talend integration suite.
- Modified reports and Talend ETL jobs based on the feedback from QA testers and Users in development and staging environments.
- Worked on the visualization tool Tableau for visually analyzing the data.
Environment: Apache Hadoop, Java, JDK 1.6, J2EE, JDBC, Servlets, JSP, Linux, XML, WebLogic, SOAP, WSDL, HBase, Hive, Pig, Sqoop, ZooKeeper, NoSQL, R, Mahout, MapReduce, Cloudera, HDFS, Flume, Impala, Tableau, Talend, MySQL, HTML5, CSS, MongoDB
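A minimal sketch of the kind of Java MapReduce data-cleaning job described above: a map-only job that keeps well-formed records and silently drops the rest. The expected field count and class names are assumptions:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CleanRecordsJob {

        // Keep tab-delimited records with the expected number of fields; drop the rest.
        public static class CleanMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t");
                if (fields.length == 7) { // assumed record width
                    context.write(NullWritable.get(), value);
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "clean-records");
            job.setJarByClass(CleanRecordsJob.class);
            job.setMapperClass(CleanMapper.class);
            job.setNumReduceTasks(0); // map-only job
            job.setOutputKeyClass(NullWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }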
Confidential - Pittsburgh, PA
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Implemented MapReduce programs to analyze large datasets in the warehouse for business intelligence purposes.
- Used default MapReduce Input and Output Formats.
- Developed HQL queries to implement select, insert and update operations against the database by creating HQL named queries.
- Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Developed simple to complex Map/Reduce jobs using Java, and scripts using Hive and Pig.
- Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) for data ingestion and egress.
- Implemented business logic by writing UDFs in Java and used various UDFs from other sources.
- Loaded and transformed large sets of structured and semi-structured data.
- Managed and reviewed Hadoop log files; deployed and maintained the Hadoop cluster.
- Exported filtered data into HBase for fast querying (see the client API sketch after this section).
- Involved in creating Hive tables, loading with data and writing Hive queries.
- Created data-models for customer data using the Cassandra Query Language.
- Ran many performance tests using the Cassandra-stress tool in order to measure and improve the read and write performance of the cluster.
- Involved in developing Shell scripts to orchestrate execution of all other scripts (Pig, Hive, and MapReduce) and move the data files within and outside of HDFS.
- Queried and analyzed data from DataStax Cassandra for quick searching, sorting and grouping.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
Environment: Apache Hadoop (Cloudera), HBase, Hive, Pig, MapReduce, Sqoop, Oozie, Eclipse, Java
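A minimal sketch of exporting a filtered record into HBase through the client API (HBase 1.x style), as referenced above. The table name, column family, qualifier and row key are assumptions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseExporter {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("filtered_events"))) {
                // One cell per filtered record: row key = customer id, m:total_spend = metric.
                Put put = new Put(Bytes.toBytes("cust-0001"));
                put.addColumn(Bytes.toBytes("m"), Bytes.toBytes("total_spend"),
                        Bytes.toBytes("123.45"));
                table.put(put);
            }
        }
    }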
Confidential - New York, NY
Java Developer/Oracle Developer
Responsibilities:
- Estimation, design, and development of various modules.
- Implemented MVC architecture.
- Responsible for developing use case, class diagrams and sequence diagrams for the modules using UML.
- Responsible for re-engineering Confidential legal eCommerce Java/J2EE/JEE based Portal applications.
- Designed, developed and tested Java/J2EE/JEE/Portal applications using Spring, Spring IoC, Spring MVC, Spring Portlet, Hibernate, and WebSphere Portal.
- Designed, developed and modified UI components using JSP, JSF, JavaScript, jQuery, DWR (AJAX), CSS, HTML, XHTML, XML, and Velocity.
- Created batch print component that converted MS Word documents to PDF and sent the merged document Stream to client side for printing using Aspose.Words for Java and iText.
- Configured Spring and Hibernate components.
- Designed and developed business and persistence layer components using Spring, Spring IoC and Hibernate.
- Wrote complex SQL queries to interact with backend Oracle 11g/10g databases.
- Created test cases and performed Unit and Integration testing using Spring Test API.
- Built, deployed and tested developed components on WebSphere Portal Server 6.1.
- Worked in an Agile software development environment.
- Involved in development of user interface modules using HTML, CSS, JSP.
- Designed the applications using MVC framework for easy maintainability.
- Wrote many JSP scriptlets where required to meet requirements.
- Developed notification and customer classes.
- Involved in writing SQL queries.
- Used JDBC to access relational data from the database (a minimal DAO sketch follows this section).
- Handled backend data by creating optimal stored procedures in Oracle database.
- Migrated employees' manual Excel-sheet work to the newly developed automated system.
- Improved project efficiency by cutting 4 hours/day spent on manual Excel-sheet maintenance.
- Developed Servlets as controllers to perform requisite functions.
- Worked daily with Core Java, MySQL and HTML.
- Regularly fixed and troubleshot bugs and issues in modules.
- Designed and developed a vendor portal application to track shipping information for requested orders.
- Enhanced the file processors that send orders to drop shippers and update order shipment status (Released/Scheduled) from drop shippers, based on requirements.
Environment: HTML, CSS, AJAX, jQuery, JavaScript, Flash, Core Java, J2EE, Struts 2.0, Servlets, JSP, JSTL, XML, MyEclipse 9.0, JBoss 4.0, Oracle
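A minimal sketch of the JDBC DAO pattern referenced above, assuming a container-managed DataSource and a hypothetical ORDERS table; all names are illustrative only:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    import javax.sql.DataSource;

    // DAO that encapsulates all JDBC access for order shipment status.
    public class OrderDao {
        private final DataSource dataSource; // e.g. looked up from the app server's JNDI tree

        public OrderDao(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        public String findShipmentStatus(long orderId) throws SQLException {
            String sql = "SELECT SHIPMENT_STATUS FROM ORDERS WHERE ORDER_ID = ?";
            try (Connection conn = dataSource.getConnection();
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setLong(1, orderId);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString("SHIPMENT_STATUS") : null;
                }
            }
        }
    }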
Confidential
Java Developer/(DW/BI) Developer
Responsibilities:
- Involved in Architecture/Designing the State Portal Application.
- Involved in Functional and Detailed Designs.
- Involved in Presentation Development using Struts Framework.
- Involved in the analysis, design, development and testing phases of the Software Development Lifecycle (SDLC) using the Agile development methodology.
- Involved in business requirement gathering and technical specifications.
- Implemented J2EE standards, MVC2 architecture using Struts Framework.
- Implemented Servlets, JSP and Ajax to design the user interface.
- The presentation tier was built using the Struts framework.
- Implemented and configured various Action classes for handling the client requests using Struts 2 framework.
- Used EJBs (Stateless Session beans) to implement the business logic, MDBs (JMS) for asynchronous communication internal and external to the system.
- All business logic in all modules was written in core Java.
- The workflow (order flow) was built using JMS technology.
- Developed Web Services using SOAP to send data to and receive data from the external interface.
- Used Source Integrity tool to build and deploy the application.
- Used Design patterns such as Business delegate, Service locator, Model View Controller, Session façade, DAO.
- Involved in implementing JMS (Java Message Service) for asynchronous communication (see the sketch after this section).
- Used JMS queues and JMS topics for one-to-one and one-to-many communication in the application.
- The backend application layer was implemented using EJB (Enterprise JavaBeans) in a WebSphere Application Server environment.
- Created stored procedures using PL/SQL for data modification (DML insert, update, delete) in Oracle.
- Interaction with Oracle database is implemented using Hibernate.
- Designed the business intelligence module of the project.
- Developed stored procedures in PostgreSQL to support analytical reports.
- Integrated business intelligence module in the existing Reporting Framework (Java Based).
- Used SAS Visual Analytics for report generation and SAS Data Integrator for transforming data from OLTP to OLAP environment.
Environment: J2EE, EJB, WebServices, XML, XSD, RUP, Microsoft Visio, Clear Case, Source Integrity, Oracle 10g, WebSphere 10.3, JMS, SOA, LDAP, RAD, LOG4j, Servlets, JSP, Unix, Struts 2.0, Hibernate, Informatica, SAS.
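A minimal sketch of the JMS-based asynchronous messaging referenced above, using the javax.jms 1.1 API. The JNDI names and message payload are assumptions:

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.naming.InitialContext;

    public class OrderFlowSender {
        public static void main(String[] args) throws Exception {
            InitialContext ctx = new InitialContext(); // assumes jndi.properties on the classpath
            ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory"); // assumed name
            Queue queue = (Queue) ctx.lookup("jms/OrderQueue"); // assumed name

            Connection connection = cf.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(queue);
                // One-to-one, asynchronous hand-off of an order event to the workflow.
                producer.send(session.createTextMessage("ORDER-1001:CREATED"));
            } finally {
                connection.close();
            }
        }
    }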
