- Oracle Certified Java Programmer with over 9 years of IT experience developing and delivering software using a wide variety of technologies across all phases of the development life cycle. Expertise in Java and Big Data technologies as an engineer, with proven project leadership, teamwork, and communication skills.
- Strong knowledge of object-oriented concepts, with complete software development life cycle experience: requirements gathering, conceptual design, analysis, detailed design, development, mentoring, and system and user acceptance testing.
- Hands-on development and implementation experience on a Big Data Management Platform (BMP) using HDFS, MapReduce, Spark, Hive, Pig, Oozie, Apache Kite, and other Hadoop ecosystem components for data storage and retrieval.
- In-depth knowledge of the Spark Core, SQL, Streaming, and MLlib APIs.
- Experience in transferring data from structured data stores to HDFS using Sqoop.
- Experience in writing MapReduce programs for data processing and analysis.
- Experience in analyzing data with Hive and Pig.
- Experience in using Oozie for managing Hadoop jobs.
- Experience in cluster coordination using ZooKeeper.
- Experience in loading logs from multiple sources directly into HDFS using Flume.
- Developed batch processing jobs using Java MapReduce, Pig, and Hive.
- Good Knowledge and experience in Hadoop Administration.
- In-depth knowledge of the MapReduce API.
- Experience in installation, configuration, testing, backup, recovery, customization, and maintenance.
- Experience in using Flume to load log files into HDFS.
- Expertise in using Oozie for configuring job flows.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experience in designing both time-driven and data-driven automated workflows using Oozie.
- Experience with Hadoop security and access controls (Kerberos, Active Directory).
- Experience integrating Hadoop clusters with Nagios and Ganglia for monitoring.
- Experience in installing, upgrading, and configuring Red Hat Linux 3.x, 4.x, and 5.x using Kickstart servers and interactive installation.
- Strong experience in RDBMS technologies such as MySQL, Oracle, and Teradata.
- Expertise in designing and implementing disaster recovery plans for Hadoop clusters.
- Experience in scripting for automation and monitoring using Shell and Perl scripts.
- Experience with Puppet and Chef.
- Sound understanding of IT Infrastructure Administration with project management skills.
- Good understanding of server hardware and hardware Architecture.
- Team player with good management, analytical, and interpersonal skills.
- Technical professional with excellent business understanding.
- Good understanding of distributed systems and parallel processing architectures.
- Excellent verbal and written communication skills.
- Hands on experience in Agile and Scrum methodologies.
- Extensive development experience with IDEs such as Eclipse, NetBeans, Forte, and STS.
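For illustration, the Sqoop transfers between relational stores and HDFS mentioned above typically take the form of an import command like the one below. This is a hypothetical sketch: the JDBC URL, credentials, table name, and target directory are placeholders, not values from any actual project, and it assumes a configured Hadoop/Sqoop installation.

```shell
# Hypothetical Sqoop import from Oracle into HDFS (all names are placeholders)
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMER_TXN \
  --target-dir /data/raw/customer_txn \
  --num-mappers 4 \
  --fields-terminated-by '\t'
# Note: a table without a primary key would also need --split-by <column>
```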
Languages: Java, Scala, C, C++, SQL, PL/SQL
Big Data Framework: Hadoop, Spark, MapReduce, YARN, HDFS, Hive, Pig, Flume, Impala, Oozie, Kafka and Sqoop
Databases: Oracle 8i/9i/10g, MySQL
IDE Tools: Eclipse 3.3, NetBeans 6, STS 2.0
Version Control Tools: CVS, SVN
Operating Systems: Windows XP/2000/NT, Linux, UNIX
Tools: Ant, Maven, WinSCP, PuTTY
Sr Hadoop Developer
Confidential, Charlotte, NC
- Developed end-to-end generic components (Data Sourcing and Data Integrity for delimited and external DB sources, Adjustments, Change Data Capture, Data Quality, Sequence Generator) in Spark to serve the requirements of various risk application teams in the bank.
- Developed a generic Hadoop Logger API (Java, Hive, and Pig UDFs) that writes logs as Avro files exposed through an external Hive table, rather than relying on the Job Tracker; it is used widely across the bank and plays a key role during audits.
- Developed an end-to-end Operational Control Module (OCM) using Impala to track data flow in the risk application.
- Playing a key role in setting up the data lake, the Credit Risk Platform (CRP), a common platform for both retail and wholesale risk data.
- Involved in migrating the existing distribution point from Netezza to Hadoop.
- Developed sourcing framework to source various kinds of flat files.
- Moving data from Netezza to HDFS and vice versa using Sqoop.
- Part of the Enterprise Credit Risk Core Systems Engineering team, which plays a key role in developing generic components for risk applications.
- Developed Hive queries and UDFs to analyze and transform the data in HDFS.
- Analyzed the existing Talend source code and ported the application functionality to Hadoop.
- Developed Autosys JIL scripts to launch dependent jobs.
- Developed wrapper scripts in Unix shell for kicking off the main jobs.
Environment: Java 8, Scala, CDH 5.x, Spark, Hive, Pig, Kafka, MapReduce, Impala, Oozie, Oracle, Sqoop, Flume, Talend, Autosys.
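The logger pattern described above (UDF log records written as Avro files, surfaced through an external Hive table instead of the Job Tracker UI) can be sketched with DDL along these lines. The table name, columns, and HDFS path are illustrative assumptions only, and the `STORED AS AVRO` shorthand assumes Hive 0.14 or later (older Hive needs the explicit AvroSerDe clauses).

```sql
-- Hypothetical external table over Avro log files written by the logger UDFs
CREATE EXTERNAL TABLE IF NOT EXISTS app_logs (
  log_ts    BIGINT,
  job_name  STRING,
  log_level STRING,
  message   STRING
)
STORED AS AVRO
LOCATION '/data/logs/hadoop_logger';

-- Query recent errors without digging through the Job Tracker
SELECT job_name, message
FROM app_logs
WHERE log_level = 'ERROR';
```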
Confidential, San Mateo, CA
- Involved in extracting customers' big data from various data sources into Hadoop HDFS, including data from mainframes, databases, and log data from servers.
- Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers.
- Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources, making it suitable for ingestion into the Hive schema for analysis.
- Created Hive tables as internal or external per requirements, defined with appropriate static and dynamic partitions for efficiency.
- Implemented partitioning, bucketing in Hive for better organization of the data.
- Developed UDFs in Pig and Hive.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of jobs, such as Java MapReduce, Hive, and Sqoop, as well as system-specific jobs.
- Worked with BI teams to generate reports in Tableau.
- Installed and configured various components of Hadoop ecosystem and maintained their integrity.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Upgraded Hadoop versions using automation tools.
- Deployed high availability on the Hadoop cluster using quorum journal nodes.
- Implemented automatic NameNode failover with ZooKeeper and the ZooKeeper Failover Controller.
- Managed and supported infrastructure.
- Monitored and debugged Hadoop jobs and applications running in production.
- Provided user and application support on the Hadoop infrastructure.
- Reviewed ETL application use cases before onboarding them to Hadoop.
- Evaluated and compared different tools for test data management with Hadoop.
- Helped and directed the testing team to get up to speed on Hadoop application testing.
- Installed a 20-node UAT Hadoop cluster.
Environment: Java 7, CDH, Hive, Pig, MapReduce, Oozie, Sqoop, Flume, Tableau, Eclipse, PuTTY, WinSCP
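The time-driven Oozie automation described above is typically expressed as a coordinator that triggers a workflow on a schedule. The sketch below is illustrative only; the application name, dates, frequency, and workflow path are hypothetical placeholders:

```xml
<!-- Hypothetical Oozie coordinator triggering a workflow once a day -->
<coordinator-app name="daily-ingest" frequency="${coord:days(1)}"
                 start="2015-01-01T00:00Z" end="2016-01-01T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <action>
    <workflow>
      <!-- Points at the workflow definition on HDFS -->
      <app-path>${nameNode}/apps/ingest/workflow.xml</app-path>
    </workflow>
  </action>
</coordinator-app>
```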
Confidential, Beaverton, OR
- Developed MapReduce jobs in Java for data cleansing and preprocessing.
- Moved data from Oracle to HDFS and vice versa using Sqoop.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Worked with different file formats and compression techniques to determine standards.
- Developed Hive queries and UDFs to analyze/transform the data in HDFS.
- Developed Hive scripts for implementing control tables logic in HDFS.
- Designed and implemented partitioning (static and dynamic) and bucketing in Hive.
- Developed Pig scripts and UDFs per the business logic.
- Analyzed and transformed data with Hive and Pig.
- Developed Oozie workflows, scheduled through a scheduler on a monthly basis.
- Designed and developed read lock capability in HDFS.
- Implemented a Hadoop float equivalent to the Oracle decimal type.
- Involved in End to End implementation of ETL logic.
- Effective coordination with offshore team and managed project deliverable on time.
- Worked on QA support activities, test data creation and Unit testing activities.
Environment: Java 6, CDH, Hive, Pig, MapReduce, Oozie, Oracle, Sqoop, Flume.
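The static/dynamic partitioning and bucketing work noted above can be sketched in HiveQL as follows. The table, columns, bucket count, and staging source are hypothetical examples, not the project's actual schema; the insert assumes dynamic partitioning is enabled (on by default in recent Hive) and satisfies strict mode by keeping one partition key static.

```sql
-- Hypothetical table showing partitioning plus bucketing
CREATE TABLE sales_data (
  order_id BIGINT,
  amount   DOUBLE
)
PARTITIONED BY (region STRING, load_date STRING)
CLUSTERED BY (order_id) INTO 32 BUCKETS
STORED AS ORC;

-- Mixed static/dynamic load: region is static, load_date resolved per row,
-- so each distinct load_date in the staging data becomes its own partition
INSERT OVERWRITE TABLE sales_data PARTITION (region = 'US', load_date)
SELECT order_id, amount, load_date
FROM staging_sales;
```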
- Provided search functionality based on employee responsibility.
- Integrated antivirus scanning for documents uploaded into HVM.
- Implemented weekly and daily reports for employees based on their roles.
- Provided timely support to the client by fixing bugs and resolving performance issues.
- Provided assistance for third-party integration.
- Involved in the complete software development life cycle (SDLC) of the application, from requirement analysis to testing.
- Developed modules based on the Struts MVC architecture.
- Created business logic using Servlets and Session Beans and deployed them on the WebLogic server.
- Used the Struts MVC framework for application design.
- Created complex SQL Queries, PL/SQL Stored procedures, Functions for back end.
- Prepared the Functional, Design and Test case specifications.
- Involved in writing stored procedures in Oracle to perform database-side validations.
- Performed unit testing, system testing, and integration testing.
- Developed unit test cases; used JUnit for unit testing of the application.
- Provided technical support for production environments by resolving issues, analyzing defects, and implementing solutions; resolved higher-priority defects per the schedule.