Sr. Hadoop Developer Resume
Charlotte, NC
SUMMARY:
- Experience in all phases of Software Development Life Cycle (SDLC) including analysis, specification, software and database administration, development, maintenance, testing and documentation
- Hands-on experience implementing the Big Data ecosystem, including Hadoop MapReduce, NoSQL, Spark, Python, Hive, Impala, Sqoop, Flume, Kafka, Azure, and Oozie
- Proficient knowledge and hands-on experience writing shell scripts on Linux
- Experience in analysing the data using Hive User Defined Functions
- Expertise in all layers of Hadoop Framework - Storage (HDFS), Analysis (Spark, Hive and Impala), Engineering (Oozie Jobs and Workflows)
- Extensive usage of Sqoop, Flume, Oozie for data ingestion into HDFS & Hive warehouse
- Hands-on experience with performance improvement techniques for data processing in Hive, Impala, and Spark, using methods such as dynamic partitioning, bucketing, and file compression
- Expertise in Cloudera & Hortonworks distributions
- Experience with data formats such as JSON, Avro, Parquet, RC, and ORC, and compression codecs such as Snappy and bzip2
- Hands-on experience with Spark architecture and its components, including Spark SQL, Datasets, and DataFrames
- Worked on Spark to speed up existing Hadoop processing, utilizing SparkContext, Spark SQL, DataFrames, and RDDs
- Hands-on experience with real-time streaming into HDFS using Flume, Kafka, and Spark (see the sketch at the end of this summary)
- Hands-on experience with Spark application development using Python
- Worked on a POC to migrate on-premises servers to Azure Cloud
- Excellent ETL proficiency, including data extraction, transformation, and loading, database modeling, and data warehouse tools and technologies
- Worked extensively with Dimensional Modeling (Data Marts, Facts and Dimensions), Data Migration, loading high volume of data, Data Cleansing, ETL Processes
- Strong technical expertise in creating and maintaining database objects - Tables, Views, Materialized Views, Indexes, Sequences, Synonyms, Database Links and Database Directories
- Expertise in Performance Tuning and Query Optimization using various types of Hints, Partitioning, Bulking techniques and Indexes
- Working knowledge on different data sources ranging from Flat files, Excel, Oracle, SQL server and DB2 databases
- Expertise in Data Migration, data loading, and exporting using Import/Export, SQL*Loader, and UTL_FILE utilities
- Strong team player with the ability to work independently as well as in a team, adapt to a rapidly changing environment, and stay committed to learning; excellent communication, project management, documentation, and interpersonal skills
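The real-time ingestion pattern mentioned above (Kafka and Spark streaming into HDFS) can be sketched roughly as follows; this is a minimal illustration only, and the broker addresses, topic name, and HDFS paths are placeholders rather than actual project values.

```python
# Minimal PySpark Structured Streaming sketch: Kafka topic -> Parquet files on HDFS.
# Requires the spark-sql-kafka connector package; all names/paths below are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-to-hdfs-stream").getOrCreate()

# Read the raw event stream from Kafka.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
          .option("subscribe", "trade-events")
          .option("startingOffsets", "latest")
          .load())

# Kafka delivers key/value as binary; cast the payload to strings for downstream parsing.
parsed = events.select(col("key").cast("string"), col("value").cast("string"))

# Land the stream on HDFS as Parquet, with a checkpoint location for fault tolerance.
query = (parsed.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/streams/trade_events")
         .option("checkpointLocation", "hdfs:///checkpoints/trade_events")
         .outputMode("append")
         .start())

query.awaitTermination()
```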
TECHNICAL SKILLS:
Big Data: Hadoop, HDFS, MapReduce, Hive, Impala, Tez, Spark, Sqoop, Pig, HBase, Flume, Kafka, Oozie
Database: Oracle, SQL Server, Netezza, DB2, HBase, Mongo DB
Languages: SQL, HTML, Java, Python, UNIX Shell Scripting
ETL/BI Tools: SSIS, Informatica 8.x, OBIEE 10.1.3.x / 11.1.1.x, SSRS
Query Tools: SQL Developer, SQL Navigator, SQL*Plus, T-SQL, SQL Explorer, TOAD, SQL*Loader
Operating Systems: Windows 98/2000, Windows Server 2003, Solaris 7.0/8.0, UNIX, LINUX
Packages: MS Office Suite, MS Visio, MS Project Professional 2003
Other Tools: PyCharm, Autosys, Syncsort, Anthill, Maven, GitHub, Tortoise SVN, TIBCO DataSynapse, IBM Symphony
PROFESSIONAL EXPERIENCE:
Confidential, Charlotte, NC
Sr. Hadoop Developer
- Highly knowledgeable in the end-to-end functioning of the Capital Markets Counterparty Credit Risk Management process
- Worked on the Potential Future Exposure of Collateral Securities and Asset Management project, using multiple source systems such as ENDUR, CALYPSO, FENICS, GMI OTC, and BROADRIDGE
- Worked on the design, development, and delivery of software solutions in the FX Derivatives group for use by business users
- Coordinated with the business team and source teams to gather requirements
- Used Sqoop to import data from RDBMS source systems and Spark for data cleansing, then loaded the data into Hive staging and base tables; developed permanent connectors to automate this process
- Handled different file types such as JSON, XML, flat files, and CSV, using appropriate SerDes or parsing logic to load them into Hive tables
- Implemented Hive UDFs and performed tuning using partitioning and bucketing for better results
- Analysed the data using Impala queries and Spark to view transaction information and validate it
- Implemented optimized map joins to get data from different sources to perform cleaning operations before applying the algorithms
- Implemented a Spark job to extract data from RDBMS systems, which reduced job processing time (see the sketch at the end of this list)
- Implemented test scripts to support test driven development and continuous integration
- Developed workflows in Oozie to manage and schedule jobs on the Hadoop cluster for extracting data from sources on a daily, weekly, monthly, quarterly, and annual basis
- Used the Autosys scheduler to schedule the Oozie workflows
- Analysed production jobs in case of abends and fixed the issues
- Reduced the daily batch cycle time for some systems to less than 30 minutes using the fork and join concept in Oozie
- Worked on a POC to migrate the existing big data platform (on-premises Cloudera) to Azure
- Exported data from Hive tables to Netezza using Sqoop
- Created staging tables and developed workflows to extract data from different source systems in Hadoop and load it into these tables; the data from these staging tables is exported via SFTP to a third-party system to execute data models
- Used JIRA, TFS, and Kanban for task updates, code check-in, code deployment, and documentation
- Worked in an Agile development environment with monthly sprint cycles, dividing and organizing tasks; participated in daily scrums and other design-related meetings
- Participated in CDH updates and tested the regions once they were complete
- Used SQL Explorer and TOAD to view source data in DB2, Oracle, and Netezza
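As a rough illustration of the RDBMS-to-Hive ingestion flow above, the sketch below uses Spark's JDBC reader to pull a source table, applies basic cleansing, and writes a partitioned Hive staging table; the connection string, table, and column names are assumptions made only for this example.

```python
# Sketch of an RDBMS extract landed into a partitioned Hive staging table with PySpark.
# JDBC URL, credentials, table, and column names are illustrative placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("rdbms-to-hive-staging")
         .enableHiveSupport()
         .getOrCreate())

# Pull the source table over JDBC, splitting the read across executors by key range.
src = (spark.read.format("jdbc")
       .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")
       .option("dbtable", "TRADES.DAILY_POSITIONS")
       .option("user", "etl_user")
       .option("password", "********")
       .option("partitionColumn", "POSITION_ID")
       .option("lowerBound", 1)
       .option("upperBound", 10000000)
       .option("numPartitions", 8)
       .load())

# Basic cleansing before landing the data in the staging layer.
clean = src.dropDuplicates(["POSITION_ID"]).na.fill({"NOTIONAL": 0})

# Write a Hive staging table partitioned by business date (assumes the column exists).
(clean.write
 .mode("overwrite")
 .partitionBy("BUSINESS_DATE")
 .format("parquet")
 .saveAsTable("staging.daily_positions"))
```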
Environment: Hadoop, CDH, MapReduce, Hive, Pig, Sqoop, Kafka, Java, Spark, Oozie, Python, UNIX, Shell scripting, RDBMS, Azure, Autosys
Confidential, Charlotte, NC
Sr. Hadoop Developer
Responsibilities:
- Implemented Hive Ad-hoc queries to handle Member hospital data from different data sources such as Epic and Centricity
- Implemented Hive UDF's and did performance tuning for better results
- Analysed the data by performing Hive queries and running Pig Scripts to study patient and practitioner behaviour
- Implemented optimized map joins to get data from different sources to perform cleaning operations before applying the algorithms
- Used Sqoop to import and export data between Netezza and Oracle databases and HDFS/Hive
- Implemented a POC to introduce Spark transformations
- Worked on transforming data from MapReduce into HBase using bulk operations
- Implemented CRUD operations on HBase data using the Thrift API to get real-time insights (see the sketch at the end of this list)
- Developed workflow in Oozie to manage and schedule jobs on Hadoop cluster for generating reports on nightly, weekly and monthly basis
- Implemented test scripts to support test driven development and continuous integration
- Used JIRA and Confluence to update tasks and maintain documentation
- Worked in Agile development environment in sprint cycles of two weeks by dividing and organizing tasks. Participated in daily scrum and other design related meetings
- Used Sqoop to export the analysed data to a relational database for use by the data analytics team
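The HBase CRUD work over the Thrift API mentioned above can be illustrated with the happybase client; the host, table, row keys, and column families below are made up for the example.

```python
# Illustrative HBase CRUD via the Thrift gateway using the happybase client.
# Host, table, row keys, and column families are placeholders.
import happybase

connection = happybase.Connection("hbase-thrift-host", port=9090)
table = connection.table("patient_events")

# Create / update: put a row keyed by patient id and visit date.
table.put(b"patient123|20150101", {
    b"visit:facility": b"general-hospital",
    b"visit:practitioner": b"D-4521",
})

# Read: fetch one row, or scan all rows for a patient by key prefix.
row = table.row(b"patient123|20150101")
for key, data in table.scan(row_prefix=b"patient123|"):
    print(key, data)

# Delete: remove a single column (or omit `columns` to drop the whole row).
table.delete(b"patient123|20150101", columns=[b"visit:practitioner"])

connection.close()
```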
Environment: Hadoop, CDH, MapReduce, Hive, Pig, Sqoop, Flume, HBase, Java, Spark, Oozie, Linux, Python, UNIX
Confidential, Charlotte, NC
Sr. SQL Developer
Responsibilities:
- Worked in Capital Markets and gained expertise in Counterparty Credit Risk Management
- Performed mid-year and annual CCAR exercises per the stress values provided by the Federal Government
- Involved in gathering and analysing requirements and preparing business rules
- Coordinated with the front-end design team to provide them with the necessary stored procedures and packages, and insight into the data
- Wrote UNIX shell scripts to process files on a daily basis, such as renaming files, extracting dates from the files, unzipping files, and removing unwanted characters before loading them into base tables (see the sketch at the end of this list)
- Involved in the continuous enhancements and fixing of production problems
- Partitioned the fact tables and materialized views to enhance performance
- Handled errors using exception handling and added checkpoints extensively for ease of debugging and displaying error messages in the application
- Used Tortoise SVN to check in all code changes to the repository, generated Build Lives using AnthillPro, and later deployed the code onto Grid and Coherence machines
- Extracted data from multiple data sources such as .bin, .dat, and XML files; loaded the data into stage tables using Autosys jobs and UNIX Perl/shell scripts
- Created new simulation and valuation models for calculating exposure values used in credit risk process
- Received Shared Success Award for my contributions to the team
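The daily file-preparation steps above were implemented as UNIX shell scripts; purely as an illustration, a rough Python equivalent of the same flow is sketched below, assuming the business date is embedded in the file name and using placeholder paths.

```python
# Rough Python equivalent of the daily file-prep shell scripts described above.
# Directory paths and the file-naming convention are assumptions for illustration.
import gzip
import re
from pathlib import Path

INBOX = Path("/data/inbound")
STAGE = Path("/data/stage")

for gz_file in INBOX.glob("POSITIONS_*.csv.gz"):
    # Extract the business date embedded in the file name, e.g. POSITIONS_20130628.csv.gz
    match = re.search(r"_(\d{8})\.csv\.gz$", gz_file.name)
    if not match:
        continue
    business_date = match.group(1)

    # Unzip into the staging area under a normalized name, stripping control characters
    # that would otherwise break the downstream load into the base tables.
    staged = STAGE / f"positions_{business_date}.csv"
    with gzip.open(gz_file, "rt") as src, staged.open("w") as dst:
        for line in src:
            dst.write(re.sub(r"[\x00-\x1f]", "", line) + "\n")

    # Rename the processed source file so the next run does not pick it up again.
    gz_file.rename(gz_file.parent / (gz_file.name + ".done"))
```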
Environment: Oracle 9i/10g, SQL Server 2012, UNIX, HP ALM, Autosys, TIBCO DataSynapse, Anthill, Tortoise SVN
Confidential
Team Lead - Oracle Development
Responsibilities:
- Interacted with users to gather business requirements and analysed, designed, and developed the data feed and reporting systems
- Designed and developed packages, stored procedures, and functions to efficiently load, validate, and cleanse data; also created users and roles as needed for the applications (see the sketch at the end of this list)
- Created cursors, collections, and database triggers to maintain complex integrity constraints and implement complex business rules
- Worked on performance tuning using partitioning and indexing concepts (local and global indexes on partitioned tables)
- Created scripts for new tables, views, and queries for application enhancements using TOAD, and created new workflows for the company
- Performed Data Extraction, Transformation and Loading (ETL process) from Source to target systems
- Worked on Windows Batch scripting, scheduling jobs and monitoring logs
- Created UNIX Shell Scripts for Informatica feeds and to execute Oracle procedures
- Designed processes to extract, transform, and load data to the Data Mart
- Performed ETL processes using Informatica PowerCenter
- Optimized SQL used in reports and files to drastically improve performance
- Used various Informatica transformations such as Aggregator, Update Strategy, Lookup, Expression, Stored Procedure, Filter, Source Qualifier, Sequence Generator, and Router
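As a hypothetical sketch of how the load/validate packages above might be driven from a wrapper script, the snippet below calls a packaged procedure through cx_Oracle; the package, procedure, parameters, and connection details are invented for the example.

```python
# Hypothetical wrapper that invokes a load/validate package via cx_Oracle.
# Package, procedure, parameter, and connection names are placeholders.
import cx_Oracle

conn = cx_Oracle.connect("etl_user", "********", "dbhost:1521/ORCL")
cur = conn.cursor()

# OUT variable to receive the load status reported by the packaged procedure.
status = cur.var(cx_Oracle.STRING)

# Call the packaged procedure that loads, validates, and cleanses one day of feed data.
cur.callproc("FEED_LOADER_PKG.LOAD_AND_VALIDATE", ["2013-06-28", status])
print("load status:", status.getvalue())

conn.commit()
cur.close()
conn.close()
```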
Environment: Oracle 11g, Informatica, SQL, PL/SQL, SQL*Loader, SQL*Plus, Autosys, Tortoise SVN, TOAD, and UNIX
Confidential
Sr. Oracle Developer
Responsibilities:
- Worked closely with architects, designers and developers to translate data requirements into the physical schema definitions for Oracle
- Trapped errors such as INVALID_CURSOR and VALUE_ERROR using exception handlers
- Tuned databases and SQL scripts for optimal performance; redesigned and rebuilt schemas to meet performance targets
- Extensively involved in performance tuning and optimization of SQL queries
- Wrote stored procedures, functions, packages, and triggers using SQL to implement business rules and processes, and performed code debugging using TOAD
- Set up batch and production jobs through Autosys
- Created shell scripts to access and set up the runtime environment and to run Oracle stored procedures and packages
- Executed batch files to load database tables from flat files using SQL*Loader (see the sketch at the end of this list)
- Created UNIX Shell Scripts for automating the execution process
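The SQL*Loader batch loads above are typically driven by a small wrapper; the sketch below shows one way to invoke sqlldr from Python, with the control file, credentials, and data file names assumed for illustration.

```python
# Illustrative wrapper around SQL*Loader (sqlldr) for flat-file loads.
# Credentials, control file, and data/log file names are placeholders.
import subprocess

def run_sqlldr(control_file: str, data_file: str, log_file: str) -> int:
    cmd = [
        "sqlldr",
        "userid=etl_user/********@ORCL",
        f"control={control_file}",
        f"data={data_file}",
        f"log={log_file}",
        "errors=100",
    ]
    # sqlldr signals rejected rows / failures through its exit code, which we pass back.
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)
    return result.returncode

if __name__ == "__main__":
    raise SystemExit(run_sqlldr("positions.ctl", "positions.dat", "positions.log"))
```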
Environment: Oracle 9i, SQL, PL/SQL, TOAD, Unix, SQL Server 2003, XML
Confidential
Senior Systems Engineer - Oracle Development
Responsibilities:
- Created database objects such as tables, views, indexes, and sequences using Oracle 9i
- Developed SQL stored procedures, functions, and packages
- Fine-tuned procedures for maximum efficiency in various schemas across databases using Oracle hints and execution plans
- Worked as Production support for the backend database and reporting applications
- Created Debug Triggers and Breakpoints
- Developed row-level and statement-level triggers for auditing tables (see the sketch at the end of this list)
- Wrote complex queries involving multiple joins to generate user reports
- Tested, installed, documented and provided on call support for the applications
- Received Client Appreciation Award for my contributions to the project
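A typical row-level audit trigger of the kind mentioned above is sketched below, created from Python so the examples stay in one language; the table, audit table, and column names are assumptions, not the project's actual schema.

```python
# Illustrative row-level audit trigger created via cx_Oracle.
# Table, audit table, and column names are placeholders.
import cx_Oracle

AUDIT_TRIGGER_DDL = """
CREATE OR REPLACE TRIGGER accounts_audit_trg
AFTER INSERT OR UPDATE OR DELETE ON accounts
FOR EACH ROW
BEGIN
    -- Record who changed which account, how, and when.
    INSERT INTO accounts_audit (account_id, action, changed_by, changed_at)
    VALUES (
        NVL(:NEW.account_id, :OLD.account_id),
        CASE WHEN INSERTING THEN 'I' WHEN UPDATING THEN 'U' ELSE 'D' END,
        USER,
        SYSDATE
    );
END;
"""

conn = cx_Oracle.connect("app_owner", "********", "dbhost:1521/ORCL")
cur = conn.cursor()
cur.execute(AUDIT_TRIGGER_DDL)  # compiles the trigger in the schema
cur.close()
conn.close()
```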
Environment: Oracle 9i, SQL, PL/SQL, TOAD, Unix