Technology Lead/Scrum Architect Resume
SUMMARY
- 9 years of experience in the IT industry, including 36 months of development experience using Big Data and Hadoop ecosystem tools.
- Extensive domain expertise in Healthcare Insurance.
- Excellent understanding of HDFS architecture and strong knowledge of the MapReduce2/YARN, MapR, Tez and Spark frameworks.
- Experience in installing and configuring Cloudera distributions on Linux clusters.
- Experience writing Hive and Pig scripts and extending their core functionality with UDFs written in Java.
- Imported and exported data to and from HDFS and Hive using Sqoop.
- Experience working with HBase.
- Strong understanding of Hadoop data storage formats such as SequenceFile, ORC and Avro, and of compression codecs.
- Experience in working with Cloudera and Hortonworks distribution.
- Hortonworks HDP Certified Developer (HDPCD).
- Excellent experience in code development using Java.
- Good experience in designing and developing database objects such as tables, stored procedures, triggers and cursors using PL/SQL.
- Good experience in data modeling and a solid understanding of business data flow and data relationships.
- Proficient in analyzing and translating business requirements to functional/technical requirements and application architecture.
- Experience in requirement gathering, analysis, planning, designing, coding and unit testing.
- Strong problem-solving, communication and interpersonal skills; a good team player.
- Experience working with Agile methodology.
TECHNICAL SKILLS
Languages: Java, PL/SQL, Scala and Python.
Big Data Eco System: HDFS, MapReduce 1.x, 2.x, Hive, Pig, Sqoop, HBase, Oozie, Flume, Kafka and Apache Spark.
Scripting Languages: VBScript, Shell Script
Operating system: Windows, Linux and AIX
DBMS / RDBMS: Oracle 11g, SQL Server
Tools: MS SQL Server, SSIS/DTS, Splunk, BMC Control-M, Informatica, Talend
Version Control: Subversion, CVS
PROFESSIONAL EXPERIENCE
Confidential
Technology Lead/Scrum Architect
Environment: Linux, MapR, Talend, Hadoop, SQOOP, PIG, Hive, HBase, Spark, SQL Server, Splunk
Responsibilities:
- Designed, developed and unit tested the provision framework to transfer data from HDFS to downstream systems such as SQL Server and analytical tools.
- Created Sqoop scripts for the provision framework.
- Created Spark jobs in Scala for the provision framework to handle different record delimiters.
- Designed and developed the framework for dimension and fact table processing.
- Designed HBase tables and created a process to bulk load the dimension data into HBase.
- Created Spark jobs to integrate HBase dimension data with HDFS/Hive fact data and perform the lookup process (see the lookup sketch after this list).
- Created Hive scripts to join multiple source files and generate input files for the Sqoop scripts.
- Created Pig scripts to transform the data according to business rules.
- Worked with solution architects to design the Project level architecture.
- Provided technical leadership to three Agile Scrum teams.
- Used Subversion as version control.
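A minimal sketch of the single-row dimension lookup pattern behind the HBase/Hive integration described above, using the standard HBase client API in Java. The table, column family and column names (dim_member, cf, member_name) are hypothetical placeholders, not the actual project schema; in the real jobs the lookups ran inside the Spark integration rather than as a standalone program.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DimensionLookup {
    public static void main(String[] args) throws Exception {
        // Hypothetical table and column names; the real dimension schema is not shown here
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table dimTable = connection.getTable(TableName.valueOf("dim_member"))) {

            // Look up a single dimension row by its key (e.g. a member id from a fact record)
            Get get = new Get(Bytes.toBytes(args[0]));
            Result result = dimTable.get(get);

            byte[] name = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("member_name"));
            System.out.println(name == null ? "not found" : Bytes.toString(name));
        }
    }
}
```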
Confidential
Sr Hadoop Developer
Environment: Linux, Cloudera, HDFS, SQOOP, PIG, Hive, SQL Server, DB2, Splunk
Responsibilities:
- Analyzed business requirements and developed the framework.
- Installed and configured the Cloudera distribution on Linux clusters.
- Developed MapReduce jobs in Java for data cleansing and preprocessing.
- Set up a monitoring solution for Hadoop services on the clusters.
- Managed and reviewed Hadoop log files.
- Managed Hadoop clusters, including adding and removing cluster nodes for maintenance and capacity needs.
- Extensively used Apache Sqoop to efficiently transfer bulk data between Apache Hadoop and relational databases (SQL Server, mainframe, DB2).
- Designed and developed Pig scripts to process data in HDFS and load it into Hive tables.
- Analyzed different datasets and created HiveQL queries.
- Extended Hive and Pig core functionality by writing UDFs (a minimal UDF sketch follows this list).
- Designed and developed Oozie workflows for MapReduce, Hive and Sqoop jobs.
- Developed business applications and was involved in automating the process.
- Implemented Splunk for real-time searches and developed dashboards.
- Used Subversion as version control.
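As an illustration of the UDF work mentioned above, here is a minimal Hive UDF sketch in Java. The class name and the normalization rule are hypothetical examples, not the production UDFs.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical example: normalize free-text identifiers before they are used in joins
public final class NormalizeId extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // propagate NULLs, as Hive expects
        }
        String cleaned = input.toString()
                .replaceAll("[^A-Za-z0-9]", "")  // strip punctuation and whitespace
                .toUpperCase();
        return new Text(cleaned);
    }
}
```

Once packaged in a JAR, a UDF like this is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION, then called like any built-in function in HiveQL.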
Confidential
Hadoop Developer
Environment: Linux, HDFS, SQOOP, PIG, Hive, SQL Server, DB2
Responsibilities:
- Implemented a two-node Hadoop cluster as a POC for data processing.
- Installed and configured the ecosystem tools Pig, Hive and Sqoop.
- Worked as a data modeler to understand existing SQL Server data objects and map them to Hive data types.
- Created Hive tables and set up Hive as the primary metadata source.
- Designed and developed Sqoop jobs to import the claim and member data into HDFS.
- Imported data from SQL Server and DB2.
- Ran Hive queries to analyze the claim data.
- Scheduled Sqoop jobs using shell scripts.
- Involved in requirements gathering/analysis, design, development, testing and production rollout of reporting and analysis projects.
- Designed and developed monitoring apps using Java and PL/SQL, used extensively to alert on slowness, job failures, etc. (a minimal sketch follows this list).
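A minimal sketch of the kind of Java monitoring check described in the last bullet, using plain JDBC. The connection URL, credentials and the job_status table/columns are hypothetical placeholders; the real application raised alerts rather than printing to the console.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class JobMonitor {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details and schema; replace with the real job-status source
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";
        try (Connection conn = DriverManager.getConnection(url, "monitor_user", "secret");
             PreparedStatement stmt = conn.prepareStatement(
                     "SELECT job_name, status, elapsed_minutes FROM job_status "
                   + "WHERE status = ? OR elapsed_minutes > ?")) {

            stmt.setString(1, "FAILED");
            stmt.setInt(2, 60); // flag jobs running longer than an hour as slow

            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    // A real monitoring app would raise an alert (email/page) here
                    System.out.printf("ALERT: %s is %s after %d minutes%n",
                            rs.getString("job_name"), rs.getString("status"),
                            rs.getInt("elapsed_minutes"));
                }
            }
        }
    }
}
```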
Confidential
Data Analyst
Environment: Java, PL/SQL, Informatica, BMC Control-M, SQL Server, SSIS
Responsibilities:
- Resolved application issues related to the QNXT claim processing engine.
- Identified claims pending in the system and analyzed them.
- Set up claim processing rules based on CMS guidelines.
- Created queries to report data to the business.
- Analyzed accuracy of data.
- Designed and developed SSIS packages to export claim data and send it to reporting systems.
- Provided cycle support for the QNXT application through the BMC Control-M tool.
- Troubleshot and resolved DTS package failures.