Hadoop Developer Resume
PA
SUMMARY:
- 7+ years of professional IT experience, including 2+ years in Big Data ecosystem technologies such as Hadoop, Pig, Hive, Sqoop, Spark, and Scala, and 5 years in Java and Oracle PL/SQL development.
- In-depth knowledge of Hadoop architecture and its components, including HDFS, NameNode, DataNode, JobTracker, Application Master, Resource Manager, TaskTracker, and the MapReduce programming paradigm.
- Experience in cluster planning, designing, deploying, performance tuning, administering and monitoring Hadoop ecosystem.
- Strong knowledge and experience importing and exporting data between HDFS and relational database systems (RDBMS) using Sqoop.
- Experience developing MapReduce jobs to process large data sets.
- Experience working with datasets from multiple disparate data sources and customizing Spark on the Hadoop platform.
- Experience writing interactive queries on large datasets.
- Efficient in developing Spark jobs to process large data sets for better decision-making.
- Good understanding of functional programming.
- Applied partitioning concepts in Spark code.
- Wrote pattern-matching code in Scala and applied transformations in Scala code.
- Good knowledge of caching, broadcast variables, accumulators, and join concepts in Spark (see the sketch after this list).
- Good understanding of Spark and Scala concepts.
- Used Spark SQL for ad hoc querying of data.
- Developed and executed shell scripts to automate jobs.
- Good understanding of cloud configuration in Amazon Web Services (AWS).
- Involved in analysis, design, and testing phases; responsible for documenting technical specifications.
- Experience in database design; used PL/SQL to write stored procedures, functions, and triggers, with strong experience writing complex queries for Oracle 10g/11g/12c.
- Proficient in writing SQL and PL/SQL stored procedures, functions, constraints, packages, and triggers.
- Good experience in Hive table design and loading data into Hive tables.
- Good knowledge of Hadoop cluster architecture and monitoring the cluster.
- Experience with Hadoop shell commands, writing MapReduce programs, and verifying Hadoop log files.
- Exposure to the query programming model of Hadoop.
- Experience in system study, analysis of business requirements, preparation of technical designs, unit test plans and unit test cases (UTP/UTC), coding, unit testing, integration testing, system testing, and implementation.
- Experience in Object Oriented Analysis and Design (OOAD) and development of software using UML methodology.
- Hands on experience with Core Java with Multithreading, Concurrency, Exception Handling, File handling, IO, Generics and Java Collections.
- Experience working with different SDLC methodologies, such as Waterfall and Agile (Scrum).
- Extensive experience working with Oracle database.
- Expertise in writing Packages, Stored Procedures, Functions, Views and Database Triggers using SQL and PL/SQL in Oracle.
- Hands-on experience in the design and development of end-user screens and reports using Oracle Developer/2000 (Forms, Reports), Forms and Reports 9i, Oracle Developer Suite 10g, and other front-end tools.
- Worked with query tools like Toad, SQL*Plus, SQL Developer.
- Manipulated Stored Procedures, Triggers, Views, Functions and Packages using TOAD.
- Experience in Performance Tuning & Optimization of SQL statements.
- Good interpersonal and communication skills; an excellent team player and self-starter.
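A minimal Spark/Scala sketch of the concepts claimed above (broadcast variables, accumulators, caching, and pattern matching). Illustrative only: the paths, field layout, and names are hypothetical, and the accumulator API shown is Spark 2.x style.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Try

object SummarySketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SummarySketch"))

    // Small lookup table shipped once to every executor as a broadcast variable.
    val stateNames = sc.broadcast(Map("PA" -> "Pennsylvania", "CA" -> "California"))

    // Accumulator for counting malformed records without a separate pass.
    val badRecords = sc.longAccumulator("badRecords")

    // The parsed records are reused, so cache them in memory.
    val events = sc.textFile("hdfs:///data/events").map(_.split(",")).cache()

    // Scala pattern matching on the parsed fields: (id, state, amount).
    val totalsByState = events.flatMap {
      case Array(_, state, amount) if Try(amount.toDouble).isSuccess =>
        Some(stateNames.value.getOrElse(state, state) -> amount.toDouble)
      case _ =>
        badRecords.add(1); None
    }.reduceByKey(_ + _) // aggregate per state

    totalsByState.saveAsTextFile("hdfs:///out/totals_by_state")
  }
}
```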
TECHNICAL SKILLS:
Programming: Hadoop, MapReduce, Scala, Spark, HDFS, Hive, Pig, Java, Cassandra, SQL, PL/SQL, Oracle Forms.
Middleware: Apache Tomcat, Maven, Hortonworks Data Platform.
Databases: Oracle 12c/11g/10g, HBase, Cassandra.
Querying/Reporting: PL/SQL, SQL, Forms
Oracle Tools: ANSI SQL, Oracle PL/SQL, SQL*Plus, TOAD, iSQL*Plus, SQL*Loader, Oracle Procedure Builder
Development Tools: Developer 11g/10g/9i, Forms 9i/10g
Operating Systems: UNIX (Sun Solaris/HP-UX/IBM AIX), Red Hat Linux, Oracle Enterprise Linux and Windows
Tools: SQL TRACE, EXPLAIN PLAN, Eclipse, Toad, Forms D2K, HP ALM, CVS.
PROFESSIONAL EXPERIENCE:
Confidential, PA
Hadoop Developer
Responsibilities:
- Used Agile methodology in developing the application, including iterative application development, weekly sprints, stand-up meetings, and customer backlog reporting.
- Worked on a live 450-node Hadoop cluster running on the Hortonworks Data Platform.
- Worked with about 2 PB of data on the cluster (stored with a replication factor of 3).
- Ingested data from Oracle and Teradata into HDFS using Sqoop.
- Created and maintained Sqoop jobs with incremental load to populate Hive tables.
- Developed aggregations on large flat-file data using MapReduce code.
- Wrote extensive Pig scripts to transform raw data from several data sources into baseline data.
- Developed Hive scripts for end-user/analyst requirements to perform ad hoc analysis.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
- Solved performance issues in MapReduce jobs.
- Developed UDFs in Java as needed for use in Pig and Hive queries.
- Very good experience with both MapReduce 1 (Job Tracker) and MapReduce 2 (YARN).
- Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate to MapReduce jobs.
- Monitored and supported scheduled Sqoop jobs on the workflow scheduler.
- Used GitHub to store the code.
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Used ORC and Parquet file formats in Hive.
- Handled importing of data from various data sources, performed transformations using Hive, MapReduce, and Spark, and loaded the data into HDFS.
- Developed Spark code in Scala for faster testing and processing of data.
- Migrated MapReduce jobs to Spark RDD transformations using Scala.
- Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Scala.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala (see the sketch after this list).
- Used the DataFrame API in Scala to work with distributed collections of data organized into named columns.
- Used Spark to improve performance and optimize existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
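A minimal sketch of the Hive-to-Spark conversion referenced above, using the Spark 1.x HiveContext and DataFrame API that match this stack; the table and column names are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.functions.desc

object ClaimsAgg {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ClaimsAgg"))
    val hiveContext = new HiveContext(sc)

    // DataFrame equivalent of:
    //   SELECT state, COUNT(*) FROM claims WHERE status = 'APPROVED'
    //   GROUP BY state ORDER BY COUNT(*) DESC
    val claims = hiveContext.table("claims") // hypothetical Hive table
    val topByState = claims
      .filter(claims("status") === "APPROVED")
      .groupBy("state")
      .count()
      .orderBy(desc("count"))

    // Persist the result back to Hive for downstream reporting.
    topByState.write.mode("overwrite").saveAsTable("top_claims_by_state")
  }
}
```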
Environment: RHEL, HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Teradata, Oracle SQL, SQL Server, UC4, GitHub, Hortonworks Data Platform distribution, Spark, Scala.
Confidential, Oakland, CA
Hadoop Developer
Responsibilities:
- Involved in requirements gathering and business analysis, and translated business requirements into technical designs in Hadoop and Big Data.
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest behavioral data into HDFS for analysis.
- Imported and exported data between HDFS and databases using Sqoop.
- Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
- Set up, monitored, and managed the Hadoop cluster using the Hortonworks Data Platform.
- Developed workflows in Control-M to automate loading data into HDFS and preprocessing it with Pig.
- Provided cluster coordination services through ZooKeeper.
- Used Maven extensively to build JAR files of MapReduce programs and deployed them to the cluster.
- Created a customized BI tool for the manager team to perform query analytics using HiveQL.
- Created partitions and buckets based on state for further processing using bucket-based Hive joins.
- Developed a suite of unit test cases for Mapper, Reducer, and Driver classes using the MRUnit testing library.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, Pig, and Sqoop.
- Designed and implemented a Cassandra NoSQL-based database that persists high-volume user profile data.
- Migrated high-volume OLTP transactions from Oracle to Cassandra.
- Created a data pipeline of MapReduce programs using chained Mappers.
- Implemented optimized joins across different data sets to compute top claims by state using MapReduce.
- Implemented MapReduce programs to perform map-side joins using the Distributed Cache in Java (see the sketch after this list).
- Modeled Hive partitions extensively for data separation and faster data processing, and followed Pig and Hive best practices for tuning.
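A minimal sketch of the map-side join noted above: a small lookup file is shipped to every node through the distributed cache and loaded once in setup(), so the join avoids a reduce-side shuffle. Shown in Scala for consistency with the other sketches (the original work was in Java); file names and record layouts are hypothetical.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.Mapper
import scala.collection.mutable
import scala.io.Source

// Driver side (not shown) would register the lookup file with a symlink:
//   job.addCacheFile(new java.net.URI("/lookup/states.txt#states.txt"))
class ClaimJoinMapper extends Mapper[LongWritable, Text, Text, Text] {
  private val stateLookup = mutable.Map.empty[String, String]

  // Load the small side of the join once per task from the cached file,
  // which appears in the task's working directory as "states.txt".
  override def setup(context: Mapper[LongWritable, Text, Text, Text]#Context): Unit = {
    for (line <- Source.fromFile("states.txt").getLines()) {
      line.split("\t", 2) match {
        case Array(code, name) => stateLookup(code) = name
        case _                 => // skip malformed lookup lines
      }
    }
  }

  // Join each claim record against the in-memory lookup on the map side.
  override def map(key: LongWritable, value: Text,
                   context: Mapper[LongWritable, Text, Text, Text]#Context): Unit = {
    val fields = value.toString.split(",")
    if (fields.length > 1) { // assume the state code is the second column
      stateLookup.get(fields(1)).foreach { stateName =>
        context.write(new Text(stateName), value)
      }
    }
  }
}
```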
Environment: RHEL, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Mahout, HBase, Hortonworks Data Platform distribution, Cassandra, UC4.
Confidential, NYC, NY
Hadoop Developer
Responsibilities:
- Responsible for building scalable, distributed data solutions using Hadoop.
- Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Analyzed data by performing Hive queries and running Pig scripts to study customer behavior.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources (see the sketch after this list).
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer data and purchase histories into HDFS for analysis.
- Implemented optimization and performance tuning in Hive and Pig.
- Developed job flows in Oozie to automate the workflow for extraction of data from warehouses and weblogs.
- Used Pig as ETL tool to do transformations, event joins, filters and some pre-aggregations before storing the data onto HDFS.
- Optimized MapReduce code and Pig scripts; performed user interface analysis, performance tuning, and analysis.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
- Loaded the aggregated data onto DB2 for reporting on the dashboard.
- Implemented Partitioning and bucketing in Hive.
- Experience in managing and reviewing Hadoop log files.
- Extensively used Pig for data cleansing.
- Configured Flume to extract the data from the web server output files to load into HDFS.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Loaded business accounts/customer details into HDFS from RDBMS Database using Sqoop.
- Created a POC on the existing HDFS data using Apache Spark.
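A minimal sketch of the kind of Hive UDF described above, written in Scala here for consistency with the other sketches (the UDFs in this role were in Java); the class name and logic are hypothetical.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Simple scalar UDF: normalize free-form state codes before analysis.
// Registered in Hive after ADD JAR with:
//   CREATE TEMPORARY FUNCTION normalize_state AS 'NormalizeState';
//   SELECT normalize_state(state) FROM customers;  -- hypothetical table
class NormalizeState extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.trim.toUpperCase)
  }
}
```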
Environment: Hadoop, HDFS, Hive, Pig, Sqoop, HBase, Spark, Scala, Hue, Linux, MapReduce, Cloudera Hadoop distribution (CDH3), and Flume.
Confidential
Oracle PL/SQL Developer
Responsibilities:
- Involved in design and development phases of Software Development Life Cycle (SDLC) using Scrum methodology.
- Generated server-side PL/SQL scripts for data manipulation and validation.
- Worked with the business to automate many daily reports using UNIX shell scripts.
- Used bulk collections for better performance and easier data retrieval by reducing context switching between the SQL and PL/SQL engines.
- Coded many generic routines (as functions) that could be called from other procedures.
- Worked with change requests on the user interface.
- Developed various reports for end users per their requirements and created many reports to suit the company's preprinted formats.
- Created user-defined exceptions for exception handling.
- Wrote stored procedures, functions, and database triggers using PL/SQL.
- Designed, developed, and maintained data extraction and transformation processes, ensuring that data is properly loaded and extracted in and out of our systems.
- Identified and implemented programming enhancements.
- Wrote unit test cases for the modules to verify functionality against the requirements.
Environment: Oracle 11g, SQL, PL/SQL, .NET Framework, Visual Studio, Toad, HP ALM
Confidential
Oracle PL/SQL Developer
Responsibilities:
- Involved in the SDLC, gathering requirements from end users.
- Developed views to facilitate easy interface implementation and enforce security on critical customer information.
- Involved in GUI design using Oracle Developer 10g (Forms 10g and Reports 10g).
- Developed stored procedures and triggers to facilitate consistent data entry into the database.
- Wrote stored procedures and functions in PL/SQL for common utilities.
- Participated in system analysis and data modeling, which included creating tables, views, indexes, synonyms, triggers, functions, procedures, cursors and packages.
- Created programming code using advanced concepts of Records, Collections, and Dynamic SQL.
- Developed Database Triggers for audit and validation purpose.
- Used PL/SQL to validate data and to populate billing tables.
- Developed Installation scripts for all the deliverables.
- Performed functional testing for different Oracle Forms application functionalities.
- Created and manipulated stored procedures, functions, packages and triggers using TOAD.
- Wrote complex stored procedures using dynamic SQL to populate temp tables from fact and dimension tables.
- Involved in migrating the database from Oracle 9i to 10g.
- Involved in developing screens and generating reports.
- Developed Forms and Reports.
Environment: Oracle 9i/10g, SQL, PL/SQL, Forms 9i, SQL*Loader, SQL Navigator, Toad, HP ALM.
Confidential
Oracle PL/SQL Developer
Responsibilities:
- Created and maintained PL/SQL scripts and stored procedures.
- Coded many generic routines (as functions) that could be called from other procedures.
- Developed user interface screens, master-detail relations, and report screens.
- Developed various reports for end users per their requirements and created many reports to suit the company's preprinted formats.
- Created user-defined exceptions for exception handling.
- Wrote stored procedures, functions, and database triggers using PL/SQL.
- Conducted training sessions for the end-users.
- Developed the presentation layer using JSP and Servlets.
- Used JavaScript for client side validations.
- Created XML-based configuration and property files for the application.
- Used JDBC to connect to the database.
- Designed database tables.
- Designed, developed, and maintained data extraction and transformation processes, ensuring that data is properly loaded and extracted in and out of our systems.
- Identified and implemented programming enhancements.
Environment: Oracle 9i/10g, SQL*Plus, Forms D2K, SharePoint, CVS, Core Java, Eclipse, Apache Tomcat.