Hadoop Developer Resume
Englewood Cliffs, NJ
SUMMARY
- 7+ years of IT experience in software development and support, with experience in developing strategic methods for deploying big data technologies, specifically Hadoop, to efficiently solve Big Data processing requirements.
- 3 years of hands-on experience with the Hadoop framework and its ecosystem, including but not limited to Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, Cassandra, Sqoop, Pig, Flume, Avro, Thrift, etc.
- Worked extensively in the Insurance, Communication, Healthcare, and Telecom industry domains.
- Highly skilled in planning, designing, developing, and deploying large-scale projects.
- Familiarity with productionizing Hadoop applications (e.g. administration, configuration management, monitoring, debugging, and performance tuning).
- Hands-on experience using Sqoop to import data from RDBMS into HDFS and vice versa.
- Well experienced in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java, and in using custom UDFs to extend Hive and Pig core functionality (see the sketch after this list).
- Experience in application development using Java, RDBMS, Linux/Unix shell scripting, and Linux internals.
- Excellent knowledge of Hadoop architecture and ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce.
- Used different Hive SerDes such as RegexSerDe and the HBase SerDe.
- Extensive knowledge of data serialization techniques such as Avro, sequence files, JSON, and SerDes.
- Excellent understanding and knowledge of NoSQL databases like MongoDB, HBase, and Cassandra.
- Experience in managing Hadoop clusters using Cloudera Manager tool.
- Good practical understanding of technologies such as MapR, Solr, and Elasticsearch.
- Experienced in integrating various data sources such as RDBMS, spreadsheets, and text files using Java and shell scripting.
- Experience in designing both time-driven and data-driven automated workflows using Oozie to run Hadoop MapReduce and Pig jobs.
- Worked with debugging tools such as DTrace, truss, and top.
- Experienced in setting up SSH, SCP, SFTP connectivity between UNIX hosts.
- Development experience with Java/J2EE applications including JSP, EJB, Servlets, JDBC, JavaBeans, HTML, JavaScript, XML, DHTML, CSS, complex SQL queries, Web Services, SOAP, and data analysis.
- Familiarity with popular frameworks such as Hibernate, Spring, and MVC.
- Well experienced in using application servers such as WebLogic and WebSphere, and Java tools in client-server environments.
- Work experience with cloud infrastructure such as Amazon Web Services (AWS).
- Followed test-driven development under Agile, Waterfall, and RUP methodologies to produce high-quality software.
- Strong oral and written communication, initiative, interpersonal, learning, and organizing skills, matched with the ability to manage time and people effectively.
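As an illustration of the custom UDF work referenced above, a minimal Hive UDF written in Java might look roughly like the sketch below. This is illustrative only: the package, class name, and column semantics are hypothetical and are not taken from any specific project listed in this resume.

    // Minimal Hive UDF sketch: Hive resolves the evaluate() method by reflection.
    package com.example.hive.udf;

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public class NormalizeCode extends UDF {
        // Trims and upper-cases a code column; returns null for null input.
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

After packaging the class into a JAR, a function like this is typically registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries.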
TECHNICAL SKILLS
Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Avro and Oozie
NoSQL Databases: HBase, Cassandra, MongoDB
Java & J2EE/Frameworks: Java Servlets, JUnit, Java Database Connectivity (JDBC), J2EE, JSP, Spring, Hibernate, AJAX.
IDE Tools: Eclipse, Cygwin, PuTTY
Languages: C, C++, Java, Python, Linux shell scripts
Databases: Oracle 11g/10g/9i, MySQL, PL/SQL, DB2, MS SQL Server, Teradata
Operating Systems: Windows, Macintosh, Ubuntu (Linux), RedHat
Web Technologies: HTML, XML, JavaScript, JSP, JDBC
Testing: Hive testing, Hadoop testing, Quality Center (QC), MRUnit testing, JUnit testing
ETL Tools: Informatica, Pentaho
PROFESSIONAL EXPERIENCE
Confidential - Englewood Cliffs, NJ
Hadoop Developer
Responsibilities:
- Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Worked on importing and exporting data between Oracle/DB2 and HDFS/Hive using Sqoop for analysis, visualization, and report generation.
- Developed multiple MapReduce jobs in Java for data cleaning.
- Developed Hive UDFs to parse the staged raw data and get the hit times of the claims from a specific branch for a particular insurance type code.
- Scheduled these jobs with the Oozie workflow engine; actions can be performed both sequentially and in parallel using Oozie.
- Built wrapper shell scripts to launch these Oozie workflows.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Involved in creating Hadoop streaming jobs using Python.
- Provided ad-hoc queries and data metrics to the business users using Hive and Pig.
- Developed Pig Latin scripts to extract the data from the web server output files and load it into HDFS.
- Used Pig as an ETL tool to do transformations, event joins, and some pre-aggregations before storing the data in HDFS.
- Worked on MapReduce joins for querying multiple semi-structured data sets as per analytic needs.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Created many Java UDFs and UDAFs in Hive for functions that were not preexisting in Hive, such as rank, cumulative sum, etc.
- Created Hive tables and was involved in data loading and writing Hive UDFs.
- Developed POC for Apache Kafka.
- Performed various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and doing map-side joins (see the sketch after this list).
- Stored and loaded the data from HDFS to Amazon S3 and backed up the namespace data onto NFS filers.
- Enabled concurrent access to Hive tables with shared and exclusive locking, which can be turned on in Hive with the help of the ZooKeeper implementation in the cluster.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Familiarity with NoSQL databases including HBase and MongoDB.
- Wrote shell scripts to automate rolling day-to-day processes.
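The map-side join mentioned above can be sketched roughly as below. This is an illustrative sketch only: the class names, field positions, and file layouts are hypothetical, and it uses the Hadoop 2.x MapReduce cache API (job.addCacheFile / context.getCacheFiles); a CDH3-era job would have used the older DistributedCache calls instead.

    // Map-only join: a small lookup file is cached on every node and joined
    // against the large input inside the mapper, avoiding a reduce phase.
    package com.example.mr;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.URI;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MapSideJoin {

        public static class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {
            private final Map<String, String> lookup = new HashMap<String, String>();

            @Override
            protected void setup(Context context) throws IOException, InterruptedException {
                // Load the small dataset from the distributed cache into memory.
                URI[] cacheFiles = context.getCacheFiles();
                if (cacheFiles != null && cacheFiles.length > 0) {
                    FileSystem fs = FileSystem.get(cacheFiles[0], context.getConfiguration());
                    BufferedReader reader =
                            new BufferedReader(new InputStreamReader(fs.open(new Path(cacheFiles[0]))));
                    String line;
                    while ((line = reader.readLine()) != null) {
                        String[] parts = line.split(",");
                        if (parts.length >= 2) {
                            lookup.put(parts[0], parts[1]);   // join key -> small-side value
                        }
                    }
                    reader.close();
                }
            }

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(",");
                String joined = lookup.get(fields[0]);        // join on the first field
                if (fields.length >= 2 && joined != null) {
                    context.write(new Text(fields[0]), new Text(fields[1] + "," + joined));
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "map-side join");
            job.setJarByClass(MapSideJoin.class);
            job.setMapperClass(JoinMapper.class);
            job.setNumReduceTasks(0);                         // map-only job
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            job.addCacheFile(new URI(args[0]));               // small lookup file in HDFS
            FileInputFormat.addInputPath(job, new Path(args[1]));
            FileOutputFormat.setOutputPath(job, new Path(args[2]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }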
Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH3, MongoDB, Cassandra, Oracle, NoSQL, Unix/Linux, Kafka, Amazon Web Services.
Confidential - Madison, WI
Hadoop Developer
Responsibilities:
- Developed data pipelines using Sqoop, Pig, and Java MapReduce to collect viewer pattern data and historical watching-pattern data into HDFS for analysis.
- Collected the log data from web servers and integrated it into HDFS using Flume.
- Set up and managed the N-node Hadoop cluster, including putting an effective monitoring and alerting architecture in place using Ganglia and Nagios.
- Worked extensively in creating MapReduce jobs to power data for search and aggregation.
- Experienced in analyzing data with Hive Query Language and Pig Latin scripts.
- Installed and configured Hive, wrote Hive UDFs to analyze the data, and was involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Developed MapReduce programs to store data in GPFS and to perform data cleaning and transformation.
- Developed and optimized Pig and Hive UDFs (User-Defined Functions) to implement the functionality of external languages as and when required.
- Followed Pig and Hive best practices for tuning.
- Involved in cluster coordination services through ZooKeeper and in adding new nodes to an existing cluster.
- Supported MapReduce programs running on the cluster and developed Java UDFs for operational assistance.
- Tested the scripts in local mode before running them against the cluster.
- Created HBase tables to store variable data formats of PII data coming from different portfolios.
- Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Worked on different file formats such as text files, sequence files, Avro, and record columnar (RC) files.
- Developed several shell scripts that act as wrappers to start these Hadoop jobs and set the configuration parameters.
- Implemented the Fair Scheduler on the JobTracker to share the resources of the cluster across the MapReduce jobs submitted by users.
- Developed workflows in Oozie to automate the tasks of loading the data into HDFS and pre-processing it with Pig.
- Moved data from Hadoop to Cassandra using the bulk output format class.
- Implemented Change Data Capture (CDC) in Hive.
- Used compression techniques (Snappy) with file formats to leverage the storage in HDFS.
- Wrote Java UDFs for Hive and Pig for functions not present in the Hadoop stack (see the sketch after this list).
- Worked on custom Pig loader and storage classes to work with a variety of data formats such as JSON, compressed CSV, etc.
- Automated the workflow using shell scripts.
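A minimal Pig UDF of the kind referenced above might look roughly like the sketch below. This is illustrative only: the class name and null handling are hypothetical and not taken from any specific project here.

    // Minimal Pig EvalFunc sketch: called once per input tuple.
    package com.example.pig.udf;

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    public class ToUpper extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            // Defensive null/empty handling so bad records do not fail the job.
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return input.get(0).toString().toUpperCase();
        }
    }

Once packaged into a JAR, such a function is registered in a Pig script with REGISTER and then invoked like any built-in function.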
Environment: MapReduce, HDFS, Sqoop, Flume, Linux, Oozie, Hadoop, Pig, Hive, HBase, Cassandra, Hadoop cluster, Amazon Web Services
Confidential - Cambridge, MA
Java/Hadoop Developer
Responsibilities:
- Exported data from DB2 to HDFS using Sqoop.
- Developed MapReduce jobs using Java API.
- Installed and configured Pig and also wrote Pig Latin scripts.
- Wrote MapReduce jobs using Pig Latin.
- Developed workflow using Oozie for running MapReduce jobs and Hive Queries.
- Worked on Cluster coordination services through Zookeeper.
- Worked on loading log data directly into HDFS using Flume.
- Involved in loading data from LINUX file system to HDFS.
- Responsible for managing data from multiple sources.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Implemented JMS for asynchronous auditing purposes.
- Created and maintained technical documentation for launching Cloudera Hadoop clusters and for executing Hive queries and Pig scripts.
- Experience in defining, designing, and developing Java applications, especially using Hadoop MapReduce by leveraging frameworks such as Cascading and Hive.
- Developed monitoring and performance metrics for Hadoop clusters.
- Documented designs and procedures for building and managing Hadoop clusters.
- Strong experience in troubleshooting the operating system, handling cluster issues, and fixing Java-related bugs.
- Successfully loaded files into Hive and HDFS from MongoDB and Solr.
- Automated deployment, management, and self-serve troubleshooting of applications.
- Defined and evolved the existing architecture to scale with growth in data volume, users, and usage.
- Designed and developed a Java API (Commerce API) that provides functionality to connect to Cassandra through Java services (see the sketch after this list).
- Installed and configured Hive and also wrote Hive UDFs.
- Experience in managing CVS and migrating to Subversion.
- Experience in managing development time, bug tracking, project releases, development speed, release forecasting, scheduling, and more.
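As an illustration of the kind of Java-to-Cassandra access mentioned above, the sketch below uses the DataStax Java driver (2.x/3.x API). The driver choice is an assumption made purely for illustration, since the original Commerce API and its client library are not described here, and the contact point, keyspace, table, and column names are hypothetical.

    // Minimal Cassandra read over the DataStax Java driver (assumed client library).
    package com.example.commerce;

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class ProductClient {

        public static void main(String[] args) {
            // Connect to the cluster and a hypothetical keyspace.
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            try {
                Session session = cluster.connect("commerce");

                // Read a few rows from a hypothetical products table.
                ResultSet rs = session.execute("SELECT name, price FROM products LIMIT 10");
                for (Row row : rs) {
                    System.out.println(row.getString("name") + " -> " + row.getDecimal("price"));
                }
            } finally {
                cluster.close();   // also closes sessions created from this cluster
            }
        }
    }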
Environment: Hadoop, HDFS, Hive, Flume, Sqoop, HBase, Pig, Eclipse, MySQL, Ubuntu, ZooKeeper, Java (JDK 1.6)
Confidential - Atlanta, GA
Java/J2EE Developer
Responsibilities:
- Responsible for gathering business and functional requirements for the development and support of in-house and vendor-developed applications
- Gathered and analyzed information for developing, supporting, and modifying existing web applications based on prioritized business needs
- Played a key role in the design and development of a new application using J2EE, Servlets, and Spring technologies/frameworks under a Service Oriented Architecture (SOA)
- Wrote Action classes, Request Processor, Business Delegate, Business Objects, Service classes and JSP pages
- Played a key role in designing the presentation tier components by customizing the Spring framework components, which includes configuring web modules, request processors, error handling components, etc.
- Implemented the Web Services functionality in the application to allow external applications to access data
- Used Apache Axis as the Web Service framework for creating and deploying Web Service clients using SOAP and WSDL
- Worked on Spring to develop different modules to assist the product in handling different requirements
- Developed validation using Spring's Validator interface (see the sketch after this list) and used Spring Core and MVC to develop the applications and access data
- Implemented Spring beans using IoC and transaction management features to handle the transactions and business logic
- Designed and developed different PL/SQL blocks and stored procedures in the DB2 database
- Involved in writing the DAO layer using Hibernate to access the database
- Involved in deploying and testing the application using WebSphere Application Server
- Developed and implemented several test cases using the JUnit framework
- Involved in troubleshooting technical issues, conducting code reviews, and enforcing best practices.
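A minimal example of validation through Spring's Validator interface, as mentioned above, is sketched below. The Applicant form-backing bean, its fields, and the error codes are hypothetical and are used only for illustration.

    // Minimal Spring Validator sketch; error codes resolve to messages in the
    // application's resource bundles.
    package com.example.web.validation;

    import org.springframework.validation.Errors;
    import org.springframework.validation.ValidationUtils;
    import org.springframework.validation.Validator;

    // Hypothetical form-backing bean.
    class Applicant {
        private String firstName;
        private String lastName;
        private Integer age;

        public String getFirstName() { return firstName; }
        public void setFirstName(String firstName) { this.firstName = firstName; }
        public String getLastName() { return lastName; }
        public void setLastName(String lastName) { this.lastName = lastName; }
        public Integer getAge() { return age; }
        public void setAge(Integer age) { this.age = age; }
    }

    public class ApplicantValidator implements Validator {

        public boolean supports(Class<?> clazz) {
            return Applicant.class.isAssignableFrom(clazz);
        }

        public void validate(Object target, Errors errors) {
            // Reject empty mandatory fields.
            ValidationUtils.rejectIfEmptyOrWhitespace(errors, "firstName", "field.required");
            ValidationUtils.rejectIfEmptyOrWhitespace(errors, "lastName", "field.required");

            Applicant applicant = (Applicant) target;
            if (applicant.getAge() != null && applicant.getAge() < 18) {
                errors.rejectValue("age", "applicant.age.tooYoung");
            }
        }
    }

A validator like this is typically wired into a Spring MVC controller and run against the form-backing object before the business logic executes.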
Environment: Java SE 6, J2EE 6, JSP 2.1, Servlets 2.5, JavaScript, IBM WebSphere 7, DB2, HTML, XML, Spring 3, Hibernate 3, JUnit, Windows 7, Eclipse 3.5, AJAX, CSS.
Confidential - Madison, WI
Java/J2EE Developer
Responsibilities:
- Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development, and unit testing.
- Developed and deployed UI-layer logic for sites using JSP, XML, JavaScript, HTML/DHTML, and Ajax.
- Designed specifications for application development that include front-end and back-end components using design patterns.
- Developed prototype test screens in HTML and JavaScript.
- Involved in developing JSPs for client data presentation and data validation on the client side within the forms.
- Developed the application by using the Spring MVC framework.
- Used the Collections framework to transfer objects between the different layers of the application.
- Developed data mapping to create a communication bridge between various application interfaces using XML and XSL.
- Used Spring IoC to inject the parameter values for the dynamic parameters.
- Developed a JUnit testing framework for unit-level testing.
- Actively involved in code reviews and bug fixing to improve performance.
- Documented application for its functionality and its enhanced features.
- Created connections through JDBC and used JDBC statements to call stored procedures (see the sketch after this list).
- Created UML diagrams like use cases, class diagrams, interaction diagrams, and activity diagrams.
- Extensively worked on the user interface for a few modules using JSPs, JavaScript, and Ajax.
- Created business logic using Servlets and POJOs and deployed them on the WebLogic server.
- Wrote complex SQL queries and stored procedures.
- Developed the XML Schema and Web services for the data maintenance and structures.
- Implemented the Web Service client for login authentication, credit reports, and applicant information using Apache Axis 2.
- Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for the Oracle 10g database.
- Used the Hibernate ORM framework with the Spring framework for data persistence and transaction management.
- Used the Struts validation framework for form-level validation.
- Wrote test cases in JUnit for unit testing of classes.
- Involved in creating templates and screens in HTML and JavaScript.
- Involved in integrating Web Services using SOAP.
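The JDBC stored-procedure calls mentioned above can be sketched roughly as below. The procedure name, parameters, and connection details are hypothetical, and the sketch uses Java 7's try-with-resources; on Java 6 the connection and statement would be closed in a finally block instead.

    // Minimal JDBC CallableStatement sketch for invoking a stored procedure.
    package com.example.dao;

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Types;

    public class CreditScoreDao {

        // Calls a hypothetical procedure that takes an applicant id and returns a score.
        public int fetchCreditScore(String jdbcUrl, String user, String password, long applicantId)
                throws Exception {
            try (Connection conn = DriverManager.getConnection(jdbcUrl, user, password);
                 CallableStatement stmt = conn.prepareCall("{call GET_CREDIT_SCORE(?, ?)}")) {
                stmt.setLong(1, applicantId);                  // IN parameter
                stmt.registerOutParameter(2, Types.INTEGER);   // OUT parameter
                stmt.execute();
                return stmt.getInt(2);
            }
        }
    }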
Environment: Spring MVC, Oracle 11g, J2EE, Java, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, MS SQL Server 2008.
Confidential
Java developer
Responsibilities:
- Developed the application under the JEE architecture; designed and developed dynamic, browser-compatible user interfaces using JSP, custom tags, HTML, CSS, and JavaScript.
- Deployed and maintained the JSP and Servlet components on WebLogic 8.0.
- Developed the application server persistence layer using JDBC and SQL.
- Used JDBC to connect the web applications to databases.
- Implemented a test-first unit testing approach driven by JUnit.
- Developed and utilized J2EE services and JMS components for messaging communication in WebLogic (see the sketch after this list).
- Configured the development environment using the WebLogic application server for developers' integration testing.
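The JMS messaging mentioned above can be sketched roughly as below. The JNDI names and the audit-message use case are hypothetical; the lookups assume a connection factory and queue configured in the WebLogic server's JNDI tree, with the code running inside the container so the default InitialContext resolves them.

    // Minimal JMS producer sketch (JMS 1.1 API).
    package com.example.jms;

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    public class AuditMessageSender {

        public void send(String payload) throws Exception {
            InitialContext ctx = new InitialContext();
            ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/AuditConnectionFactory");
            Queue queue = (Queue) ctx.lookup("jms/AuditQueue");

            Connection connection = factory.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(queue);
                TextMessage message = session.createTextMessage(payload);
                producer.send(message);   // fire-and-forget; consumers process asynchronously
            } finally {
                connection.close();       // closes the session and producer as well
            }
        }
    }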
Environment: Spring MVC, Oracle 11g, J2EE, Java, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, MS SQL Server 2008, WebLogic 8.0, AJAX.