Hadoop Developer Resume
PA
SUMMARY:
- 7+ years of professional IT experience, including 2+ years in Big Data ecosystem technologies such as Hadoop, Pig, Hive, Sqoop, Spark, and Scala, and 5 years in Java and Oracle PL/SQL development.
- In-depth knowledge of Hadoop architecture and its components, including HDFS, NameNode, DataNode, JobTracker, Application Master, Resource Manager, TaskTracker, and the MapReduce programming paradigm.
- Experience in cluster planning, designing, deploying, performance tuning, administering and monitoring Hadoop ecosystem.
- Strong knowledge and experience importing and exporting data between HDFS and relational database systems (RDBMS) using Sqoop.
- Experience developing MapReduce jobs to process large data sets.
- Experience working with datasets from multiple disparate data sources and customizing Spark on the Hadoop platform.
- Experience writing interactive queries on large datasets.
- Efficient in developing Spark jobs to process large data sets for better decision-making.
- Good understanding of functional programming.
- Applied partitioning concepts in Spark code.
- Wrote pattern-matching code in Scala and applied transformations in Scala code.
- Good knowledge of caching, broadcast variables, accumulators, and join concepts in Spark (see the sketch after this list).
- Good understanding of Spark and Scala concepts.
- Used Spark SQL for ad hoc querying of data.
- Developed and executed shell scripts to automate jobs.
- Good understanding of cloud configuration in Amazon Web Services (AWS).
- Involved in analysis, design, and testing phases; responsible for documenting technical specifications.
- Experience in database design; used PL/SQL to write stored procedures, functions, and triggers, with strong experience writing complex queries for Oracle 10g/11g/12c.
- Proficient in writing SQL and PL/SQL stored procedures, functions, constraints, packages, and triggers.
- Good experience in Hive table design and loading data into Hive tables.
- Good knowledge of Hadoop cluster architecture and monitoring the cluster.
- Experience with Hadoop shell commands, writing MapReduce programs, and verifying Hadoop log files.
- Exposure to the query programming model of Hadoop.
- Experience in system study, analysis of business requirements, preparation of technical designs, unit test plans and unit test cases (UTP/UTC), coding, unit testing, integration testing, system testing, and implementation.
- Experience in Object Oriented Analysis and Design (OOAD) and development of software using UML methodology.
- Hands on experience with Core Java with Multithreading, Concurrency, Exception Handling, File handling, IO, Generics and Java Collections.
- Experience working with different SDLC methodologies, such as Waterfall and Agile (Scrum).
- Extensive experience working with Oracle database.
- Expertise in writing Packages, Stored Procedures, Functions, Views and Database Triggers using SQL and PL/SQL in Oracle.
- Hands-on experience in the design and development of end-user screens and reports using Oracle Developer/2000 (Forms, Reports), Forms and Reports 9i, Oracle Developer Suite 10g, and other front-end tools.
- Worked with query tools like Toad, SQL*Plus, SQL Developer.
- Manipulated Stored Procedures, Triggers, Views, Functions and Packages using TOAD.
- Experience in Performance Tuning & Optimization of SQL statements.
- Good interpersonal and communication skills; an excellent team player and self-starter.
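A minimal Spark/Scala sketch of the concepts claimed above (broadcast variables, accumulators, caching, and pattern matching). Illustrative only: the paths, field layout, and names are hypothetical, and the accumulator API shown is Spark 2.x style.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Try

object SummarySketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SummarySketch"))

    // Small lookup table shipped once to every executor as a broadcast variable.
    val stateNames = sc.broadcast(Map("PA" -> "Pennsylvania", "CA" -> "California"))

    // Accumulator for counting malformed records without a separate pass.
    val badRecords = sc.longAccumulator("badRecords")

    // The parsed records are reused, so cache them in memory.
    val events = sc.textFile("hdfs:///data/events").map(_.split(",")).cache()

    // Scala pattern matching on the parsed fields: (id, state, amount).
    val totalsByState = events.flatMap {
      case Array(_, state, amount) if Try(amount.toDouble).isSuccess =>
        Some(stateNames.value.getOrElse(state, state) -> amount.toDouble)
      case _ =>
        badRecords.add(1); None
    }.reduceByKey(_ + _) // aggregate per state

    totalsByState.saveAsTextFile("hdfs:///out/totals_by_state")
  }
}
```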
TECHNICAL SKILLS:
Programming: Hadoop, MapReduce, Scala, Spark, HDFS, Hive, Pig, Java, Cassandra, SQL, PL/SQL, Oracle Forms.
Middleware: Apache Tomcat, Maven, Hortonworks Data Platform.
Databases: Oracle 12c/11g/10g, HBase, Cassandra.
Querying/Reporting: PL/SQL, SQL, Forms
Oracle Tools: ANSI SQL, Oracle PL/SQL, SQL*Plus, TOAD, iSQL*Plus, SQL*Loader, Oracle Procedure Builder
Development Tools: Developer 11g/10g/9i, Forms 9i/10g
Operating Systems: UNIX (Sun Solaris/HP-UX/IBM AIX), Red Hat Linux, Oracle Enterprise Linux and Windows
Tools: SQL TRACE, EXPLAIN PLAN, Eclipse, Toad, Forms D2K, HP ALM, CVS.
PROFESSIONAL EXPERIENCE:
Confidential, PA
Hadoop Developer
Responsibilities:
- Used Agile methodology in developing the application, including iterative application development, weekly sprints, stand-up meetings, and customer backlog reporting.
- Worked on a live 450-node Hadoop cluster running on the Hortonworks Data Platform.
- Worked with about 2 PB of data on the cluster (stored with a replication factor of 3).
- Ingested data from Oracle and Teradata into HDFS using Sqoop.
- Created and maintained Sqoop jobs with incremental load to populate Hive tables.
- Developed aggregations on large flat-file data using MapReduce code.
- Wrote extensive Pig scripts to transform raw data from several data sources into baseline data.
- Developed Hive scripts for end-user/analyst requirements to perform ad hoc analysis.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
- Solved performance issues in MapReduce jobs.
- Developed UDFs in Java as needed for use in Pig and Hive queries.
- Very good experience with both MapReduce 1 (Job Tracker) and MapReduce 2 (YARN).
- Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate to MapReduce jobs.
- Monitored and supported scheduled Sqoop jobs on the workflow scheduler.
- Used GitHub to store the code.
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Used ORC and Parquet file formats in Hive.
- Handled importing of data from various data sources, performed transformations using Hive, MapReduce, and Spark, and loaded the data into HDFS.
- Developed Spark code in Scala for faster testing and processing of data.
- Migrated MapReduce jobs to Spark RDD transformations using Scala.
- Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Scala.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala (see the sketch after this list).
- Used the DataFrame API in Scala to work with distributed collections of data organized into named columns.
- Used Spark to improve performance and optimize existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
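A minimal sketch of the Hive-to-Spark conversion referenced above, using the Spark 1.x HiveContext and DataFrame API that match this stack; the table and column names are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.functions.desc

object ClaimsAgg {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ClaimsAgg"))
    val hiveContext = new HiveContext(sc)

    // DataFrame equivalent of:
    //   SELECT state, COUNT(*) FROM claims WHERE status = 'APPROVED'
    //   GROUP BY state ORDER BY COUNT(*) DESC
    val claims = hiveContext.table("claims") // hypothetical Hive table
    val topByState = claims
      .filter(claims("status") === "APPROVED")
      .groupBy("state")
      .count()
      .orderBy(desc("count"))

    // Persist the result back to Hive for downstream reporting.
    topByState.write.mode("overwrite").saveAsTable("top_claims_by_state")
  }
}
```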
Environment: RHEL, HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Teradata, Oracle SQL, SQL Server, UC4, GitHub, Hortonworks Data Platform distribution, Spark, Scala.
Confidential, Oakland, CA
Hadoop Developer
Responsibilities:
- Involved in requirements gathering and business analysis, and translated business requirements into technical designs in Hadoop and Big Data.
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest behavioral data into HDFS for analysis.
- Imported and exported data between HDFS and databases using Sqoop.
- Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
- Set up, monitored, and managed the Hadoop cluster using the Hortonworks Data Platform.
- Developed workflows in Control-M to automate loading data into HDFS and preprocessing it with Pig.
- Provided cluster coordination services through ZooKeeper.
- Used Maven extensively to build JAR files of MapReduce programs and deployed them to the cluster.
- Created a customized BI tool for the manager team to perform query analytics using HiveQL.
- Created partitions and buckets based on state for further processing using bucket-based Hive joins.
- Developed a suite of unit test cases for Mapper, Reducer, and Driver classes using the MRUnit testing library.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, Pig, and Sqoop.
- Designed and implemented a Cassandra NoSQL-based database that persists high-volume user profile data.
- Migrated high-volume OLTP transactions from Oracle to Cassandra.
- Created a data pipeline of MapReduce programs using chained Mappers.
- Implemented optimized joins across different data sets to compute top claims by state using MapReduce.
- Implemented MapReduce programs to perform map-side joins using the Distributed Cache in Java (see the sketch after this list).
- Modeled Hive partitions extensively for data separation and faster data processing, and followed Pig and Hive best practices for tuning.
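A minimal sketch of the map-side join noted above: a small lookup file is shipped to every node through the distributed cache and loaded once in setup(), so the join avoids a reduce-side shuffle. Shown in Scala for consistency with the other sketches (the original work was in Java); file names and record layouts are hypothetical.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.Mapper
import scala.collection.mutable
import scala.io.Source

// Driver side (not shown) would register the lookup file with a symlink:
//   job.addCacheFile(new java.net.URI("/lookup/states.txt#states.txt"))
class ClaimJoinMapper extends Mapper[LongWritable, Text, Text, Text] {
  private val stateLookup = mutable.Map.empty[String, String]

  // Load the small side of the join once per task from the cached file,
  // which appears in the task's working directory as "states.txt".
  override def setup(context: Mapper[LongWritable, Text, Text, Text]#Context): Unit = {
    for (line <- Source.fromFile("states.txt").getLines()) {
      line.split("\t", 2) match {
        case Array(code, name) => stateLookup(code) = name
        case _                 => // skip malformed lookup lines
      }
    }
  }

  // Join each claim record against the in-memory lookup on the map side.
  override def map(key: LongWritable, value: Text,
                   context: Mapper[LongWritable, Text, Text, Text]#Context): Unit = {
    val fields = value.toString.split(",")
    if (fields.length > 1) { // assume the state code is the second column
      stateLookup.get(fields(1)).foreach { stateName =>
        context.write(new Text(stateName), value)
      }
    }
  }
}
```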
Environment: RHEL, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Mahout, HBase, Hortonworks Data Platform distribution, Cassandra, UC4.
Confidential, NYC, NY
Hadoop Developer
Responsibilities:
- Responsible for building scalable, distributed data solutions using Hadoop.
- Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Analyzed data by performing Hive queries and running Pig scripts to study customer behavior.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources (see the sketch after this list).
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer data and purchase histories into HDFS for analysis.
- Implemented optimization and performance tuning in Hive and Pig.
- Developed job flows in Oozie to automate the workflow for extraction of data from warehouses and weblogs.
- Used Pig as ETL tool to do transformations, event joins, filters and some pre-aggregations before storing the data onto HDFS.
- Optimized MapReduce code and Pig scripts; performed user interface analysis, performance tuning, and analysis.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
- Loaded the aggregated data onto DB2 for reporting on the dashboard.
- Implemented Partitioning and bucketing in Hive.
- Experience in managing and reviewing Hadoop log files.
- Extensively used Pig for data cleansing.
- Configured Flume to extract the data from the web server output files to load into HDFS.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Loaded business accounts/customer details into HDFS from RDBMS Database using Sqoop.
- Created a POC on the existing HDFS data using Apache Spark.
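A minimal sketch of the kind of Hive UDF described above, written in Scala here for consistency with the other sketches (the UDFs in this role were in Java); the class name and logic are hypothetical.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Simple scalar UDF: normalize free-form state codes before analysis.
// Registered in Hive after ADD JAR with:
//   CREATE TEMPORARY FUNCTION normalize_state AS 'NormalizeState';
//   SELECT normalize_state(state) FROM customers;  -- hypothetical table
class NormalizeState extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.trim.toUpperCase)
  }
}
```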
Environment: Hadoop, HDFS, Hive, Pig, Sqoop, HBase, Spark, Scala, Hue, Linux, MapReduce, Cloudera Hadoop distribution (CDH3), and Flume.
Confidential
Oracle PL/SQL Developer
Responsibilities:
- Involved in design and development phases of Software Development Life Cycle (SDLC) using Scrum methodology.
- Generated server-side PL/SQL scripts for data manipulation and validation.
- Worked with the business to automate many daily reports using UNIX shell scripts.
- Used bulk collections for better performance and easier data retrieval by reducing context switching between the SQL and PL/SQL engines.
- Coded many generic routines (as functions) that could be called from other procedures.
- Worked with change requests on the user interface.
- Developed various reports for end users per their requirements and created many reports to suit the company's preprinted formats.
- Created user-defined exceptions for exception handling.
- Wrote stored procedures, functions, and database triggers using PL/SQL.
- Designed, developed, and maintained data extraction and transformation processes, ensuring that data is properly loaded and extracted in and out of our systems.
- Identified and implemented programming enhancements.
- Wrote unit test cases for the modules to verify functionality against the requirements.
Environment: Oracle 11g, SQL, PL/SQL, .NET Framework, Visual Studio, Toad, HP ALM
Confidential
Oracle PL/SQL Developer
Responsibilities:
- Involved in the SDLC, gathering requirements from end users.
- Developed views to facilitate easy interface implementation and enforce security on critical customer information.
- Involved in GUI design using Oracle Developer 10g (Forms 10g and Reports 10g).
- Developed stored procedures and triggers to facilitate consistent data entry into the database.
- Wrote stored procedures and functions in PL/SQL for common utilities.
- Participated in system analysis and data modeling, which included creating tables, views, indexes, synonyms, triggers, functions, procedures, cursors and packages.
- Created programming code using advanced concepts of Records, Collections, and Dynamic SQL.
- Developed Database Triggers for audit and validation purpose.
- Used PL/SQL to validate data and to populate billing tables.
- Developed Installation scripts for all the deliverables.
- Performed functional testing for different Oracle Forms application functionalities.
- Created and manipulated stored procedures, functions, packages and triggers using TOAD.
- Wrote complex stored procedures using dynamic SQL to populate temp tables from fact and dimension tables.
- Involved in migrating the database from Oracle 9i to 10g.
- Involved in developing screens and generating reports.
- Developed Forms and Reports.
Environment: Oracle 9i/10g, SQL, PL/SQL, Forms 9i, SQL*Loader, SQL Navigator, Toad, HP ALM.
Confidential
Oracle PL/SQL Developer
Responsibilities:
- Created and maintained PL/SQL scripts and stored procedures.
- Coded many generic routines (as functions) that could be called from other procedures.
- Developed user interface screens, master-detail relations, and report screens.
- Developed various reports for end users per their requirements and created many reports to suit the company's preprinted formats.
- Created user-defined exceptions for exception handling.
- Wrote stored procedures, functions, and database triggers using PL/SQL.
- Conducted training sessions for the end-users.
- Developed the presentation layer using JSP and Servlets.
- Used JavaScript for client side validations.
- Created XML-based configuration and property files for the application.
- Used JDBC to connect to the database.
- Designed database tables.
- Designed, developed, and maintained data extraction and transformation processes, ensuring that data is properly loaded and extracted in and out of our systems.
- Identified and implemented programming enhancements.
Environment: Oracle 9i/10g, SQL*Plus, Forms D2K, SharePoint, CVS, Core Java, Eclipse, Apache Tomcat.