Hadoop Developer Resume Profile
SUMMARY
- 7 years of extensive experience, including 1 year in Big Data and Big Data analytics.
- Experienced in installing and configuring Hadoop clusters on major Hadoop distributions.
- Hands-on experience in writing MapReduce jobs in Java.
- Hands-on experience in installing, configuring and using ecosystem components such as Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, Cassandra, Sqoop, Pig, Flume, Avro, Hortonworks and Talend.
- Hands-on experience working with ecosystem tools such as Hive, Pig, Sqoop and MapReduce.
- Strong knowledge of Hadoop, Hive and Hive's analytical functions.
- Efficient in building Hive, Pig and MapReduce scripts.
- Worked with the Hadoop stack and various big data analytics tools, including migration from different databases (SQL Server 2008 R2, Oracle, MySQL) to Hadoop.
- Successfully loaded files into Hive and HDFS from MySQL.
- Loaded datasets into Hive for ETL operations.
- Good knowledge of Hadoop cluster architecture and cluster monitoring.
- Good understanding of cloud configuration in Amazon Web Services (AWS).
- Experience using ZooKeeper, Hue and the Hortonworks Data Platform (HDP).
- In-depth understanding of data structures and algorithms.
- Experience in deploying applications on heterogeneous application servers: Tomcat, WebLogic, IBM WebSphere and Oracle Application Server.
- Strong written, oral, interpersonal and presentation communication skills.
- Implemented unit testing using JUnit during projects.
- Able to perform at a high level, meet deadlines and adapt to ever-changing priorities.
TECHNICAL SUMMARY:
- Big Data : Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig, Sqoop, Cassandra, Oozie, Flume.
- ETL Tools : Informatica, Talend.
- Java Technologies : Java 6, JavaHelp API
- Methodologies : Agile, Waterfall, UML, Design Patterns
- Database : Oracle 10g, 11g, MySQL, SQL Server 2008 R2, HBase.
- Application Server : Apache Tomcat 5.x, 6.0.
- Web Tools : HTML, XML, DTD, Schemas.
- Tools : SQL Developer, Toad, SQL*Loader.
- Operating System : Windows 7, Linux (Ubuntu).
- Testing API : JUnit
Professional Experience:
Confidential
Hadoop Developer
Description : Confidential is a leader in innovative medicines that address the unmet needs and challenges of people living with debilitating diseases. As a fully integrated global biopharmaceutical company, it applies its scientific expertise, proprietary technologies and global resources to develop products designed to make a meaningful difference in the way patients manage their disease.
Responsibilities:
- Set up Amazon EC2 servers and installed the required database servers and ETL tools.
- Used Amazon Elastic MapReduce (EMR) services for log file analysis and data warehousing.
- Installed Oracle Server 10g and SQL Server 2008 R2.
- Imported and exported database dumps in Oracle Server 10g and 11g (using IMPDP and EXPDP) and in SQL Server 2008 R2.
- Installed and deployed IBM WebSphere.
- Installed and utilized the Talend utility tool.
- Migrated Oracle Server from 10g to 11g.
- Created, validated and maintained scripts to load data from and into tables in Oracle PL/SQL and SQL Server 2008 R2.
- Wrote stored procedures and triggers.
- Converted Oracle scripts to SQL Server, then tested and validated them.
- Upgraded IBM Maximo database from 5.2 to 7.5.
- Thorough understanding of the Maximo Integration Framework (MIF) configuration.
- Analyzed, validated and documented the changed records for the IBM Maximo web application.
- Imported data from a MySQL database into Hive using Sqoop.
- Implemented OASIS-BI.
- Developed, validated and maintained HiveQL queries.
- Ran reports using Pig and Hive queries.
- Analyzed data with Hive and Pig.
- Designed Hive tables to load data to and from external files.
- Wrote and implemented Apache Pig scripts to load data from and store data into Hive.
Environments : Amazon EC2 Server, Oracle Server 10g, 11g, SQL Server 2008 R2, IBM WebSphere, IBM Maximo, Talend, OASIS-BI.
Confidential
Description : Confidential is developing a Big Data analytics workbench called OASIS-BI for the open source community Saarus, covering various business cases, and is announcing the release of OASIS-BI, which leverages the Hortonworks Data Platform to integrate data from IBM Maximo, ERP systems and various asset machines. OASIS-BI uses the source application's data structure and captures every insert, update and delete. The OASIS-BI Time Machine allows organizations to ask any question at any point in time without doing dimensional modeling.
Responsibilities:
- Set up Amazon EC2 servers and installed the required database servers and ETL tools.
- Installed Oracle Server 10g and SQL Server 2008 R2.
- Imported and exported database dumps in Oracle Server 10g and 11g (using IMPDP and EXPDP) and in SQL Server 2008.
- Created, validated and maintained scripts to load data from and into tables in Oracle PL/SQL and SQL Server 2008 R2.
- Created, validated and maintained Oracle scripts, stored procedures and triggers for monitoring DML operations.
- Converted Oracle scripts to SQL Server, then tested and validated them.
- Installed and deployed IBM WebSphere.
- Installed and utilized the Talend utility tool.
- Migrated Oracle Server from 10g to 11g.
- Wrote stored procedures and triggers.
- Upgraded IBM Maximo database from 5.2 to 7.5.
- Thorough understanding of the Maximo Integration Framework (MIF) configuration.
- Analyzed, validated and documented the changed records for the IBM Maximo web application.
- Imported data from a MySQL database into Hive using Sqoop.
- Developed, validated and maintained HiveQL queries.
- Ran reports using Pig and Hive queries.
- Analyzed data with Hive and Pig.
- Designed Hive tables to load data to and from external files.
- Wrote and implemented Apache Pig scripts to load data from and store data into Hive.
- Installed a 10-node cluster on the Hortonworks platform.
- Installed Hue on the clusters.
- Designed business models using SpagoBI, an analytics platform.
- Created QBE (Query By Example) models and calculated KPIs (Key Performance Indicators) for business purposes on the SpagoBI server.
- Created backend tables for OASIS-BI to capture and store change records.
- Protected sensitive data in tables using MySQL hashing and salting methods.
Environments: Amazon EC2 Server, Amazon Elastic server, HDP 1.3, 2.0, Oracle Server 10g, 11g, SQL Server 2008 R2, MySQL, Apache Pig, Hive 2.0, Sqoop, SpagoBI 4.0, 4.1, OASIS-BI, IBM WebSphere, IBM Maximo, Talend.
Confidential
Hadoop Developer
Responsibilities:
- Ingested data received from various relational database providers into HDFS for analysis and other big data operations.
- Wrote MapReduce jobs to perform operations such as copying data on HDFS and defining job flows on EC2 servers, and to load and transform large sets of structured, semi-structured and unstructured data.
- Created Hive tables to import large data sets from various relational databases using Sqoop, and exported the analyzed data back for visualization and report generation by the BI team.
- Managed and reviewed Hadoop log files, uploaded the final results and wrote queries to analyze them.
- Used ZooKeeper for various types of centralized configuration, SVN for version control, Maven for project management, Jira for internal bug/defect management, and JUnit for MapReduce unit testing.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning and slot configuration.
Environments: Hadoop 1.3, HDFS, Hive, Java (JDK 1.6), EC2, SOLR, ZooKeeper, SVN, Maven.
Confidential
ETL Developer
Description : Confidential is one of the oldest financial institutions in the Confidential. With a history dating back over 200 years, it is a leading global financial services firm with assets of 2.3 trillion, operating in more than 60 countries with more than 240,000 employees. It serves millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients.
Responsibilities:
- Analyze business needs and translate them into reports.
- Provide the business with thorough analysis on systems/risks.
- Research, analyze and document data issues.
- Design, build, create and validate reports, test process matrices and macros.
- Performing SQL code debugging.
- Automating reports through batch scripting.
- Design, create and maintain VBA and VBS scripts for automation of the reports.
- Writing Stored Procedures, Packages and Triggers.
- Design and build DSS using SQL Script.
- Design, write and maintain data requests and/or queries to support the business.
- Thorough understanding of the mortgage banking industry, with an emphasis on Default MIS (Management Information System).
- Performing data analysis to identify patterns and trends.
- Improve data processes/feeds by building control reports and researching and validating issues.
- Loading data into tables using SQL*Loader.
- Performing user acceptance testing (UAT).
Environments: Oracle 10g, 11g, TOAD 10.4.1.8, SQL*Loader, Windows 7 i-Space Generation X VDI on VMware, Oracle Applications R 11.5.10, JIRA.
Confidential
ETL Developer
Description : Confidential is an international biotechnology company headquartered in the Newbury Park section of Thousand Oaks, California. Data is extracted from various sources and loaded into staging tables, where it is consolidated, then calculated, and finally loaded into the data mart.
Responsibilities:
- Worked with business analysts to identify appropriate sources for Data warehouse and to document business needs for decision support data.
- Implemented ETL processes using Informatica to load data from flat files into the target Oracle data warehouse database.
- Performed data manipulations using various Informatica transformations.
- Involved in creating logical and physical data modeling with STAR and SNOWFLAKE schema.
- Wrote SQL overrides in Source Qualifier according to business requirements.
- Wrote pre-session and post-session scripts in mappings.
- Created Sessions and Workflow for designed mappings.
- Redesigned some of the existing mappings in the system to meet new functionality.
- Used Workflow Manager for creating, validating, testing, running and scheduling sequential and concurrent batches and sessions.
- Worked extensively on performance tuning of programs, ETL procedures and processes.
- Developed PL/SQL procedures for processing business logic in the database.
- Produced documentation as per the company standards and SDLC.
Environments: Informatica 9.x, Teradata, Oracle 10g, Windows 7, UNIX Shell Programming.
Confidential
ETL Developer
The project is to create a data warehouse in Confidential that is more efficient than the data warehouse previously built on Teradata. This warehouse reports the historical data stored in various databases and flat files. Data from different sources was brought into Netezza using Informatica ETL.
Responsibilities:
- Analyzed the source data coming from Oracle and flat files.
- Used Informatica Designer to create mappings using different transformations to move data to a Data Warehouse. Developed complex mappings in Informatica to load the data from various sources into the Data Warehouse.
- Responsible for creating different sessions and workflows to load the data to Data Warehouse using Informatica Workflow Manager.
- Identified bottlenecks in sources, targets and mappings and optimized them accordingly.
- Worked with NZLoad to load flat file data into Netezza tables.
- Good understanding of Netezza architecture.
- Assisted the DBA in identifying proper distribution keys for Netezza tables.
- Created mappings using pushdown optimization to achieve good performance in loading data into Netezza.
- Created and Configured Workflows, Worklets, and Sessions to transport the data to target warehouse Netezza tables using Informatica Workflow Manager.
Environment: Informatica PowerCenter 8.x, Flat files, Netezza 4x, Oracle, UNIX, WinSQL, Shell Scripting.
