Hadoop Analyst/Developer Resume
Houston, TX
SUMMARY:
- Over 10 years of experience in data management, architecture, design, and development of complex IT applications in large enterprises using Hadoop Big Data, Informatica PowerCenter, SQL, Unix, etc.
- 3+ years of experience in Hadoop Big Data environments, including design and development
- Experience in Big Data development using open-source tools including Hadoop Core, Sqoop, Pig, Hive, MapReduce, Scala, and Spark
- Experience in creating Hive internal/external tables, creating Oozie workflows, and pulling structured, semi-structured, unstructured, and streaming data from different sources and loading it.
- 6 years of experience in ETL Informatica, Oracle, and Unix environments, including design, development, testing, and scheduling.
- Worked extensively with complex Informatica mappings using transformations such as Expression, Filter, Joiner, Router, Union, Lookup, Stored Procedure, Aggregator, Update Strategy, Normalizer, and Sorter.
- Good experience in identifying performance bottlenecks and in performance tuning of sources, targets, mappings, sessions, and system resources.
- Experience in data modeling and dimensional modeling of large databases
- Good experience in Core Java development environments
- Working experience with SQL, PL/SQL, Informatica MDM, Business Objects, and Hyperion Essbase
- Good knowledge of creating statistical models and performing data analysis using R
- Good knowledge of data algorithms such as top-n, sentiment analysis, naive Bayes, and k-means clustering for data analytics.
- Fair knowledge of building machine learning models for predictive analysis, such as linear regression, logistic regression, and decision trees.
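As a minimal sketch of the k-means clustering mentioned above, written in plain Python (the data points and parameters here are hypothetical; real analytics work would typically use R or Mahout as listed in the skills section):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means on coordinate tuples: assign each point to its
    nearest centroid, then recompute each centroid as its cluster mean."""
    rnd = random.Random(seed)
    centroids = rnd.sample(points, k)  # pick k distinct starting points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        # New centroid = mean of assigned points (keep old one if cluster empty)
        centroids = [
            tuple(sum(axis) / len(cl) for axis in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

# Two well-separated groups converge to their respective means:
centroids = kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
# sorted(centroids) -> [(0.0, 0.5), (10.0, 10.5)]
```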
TECHNICAL SKILLS:
Hadoop Ecosystem: HDFS, YARN, Hive, HBase, Spark, Kafka, Flume, Sqoop, Pig, Oozie, ZooKeeper, MapReduce, Mahout, R
OLAP/Reporting Tools: Business Objects/ Hyperion Essbase
Database Tools: Hive, Oracle, Toad, SQL Server
ETL Tools: Informatica 9.1, Informatica MDM
Programming Languages: Unix Shell, Java, Python
Operating Systems: Windows 2000, Windows Server 2003, Windows XP, Linux, Unix
Core Skills: Requirements gathering, design and redesign techniques, database modeling, development, testing, and performance tuning
PROFESSIONAL EXPERIENCE:
Confidential, Houston, TX
Hadoop Analyst/Developer
Responsibilities:
- Defined templates and guidelines and configured tools for several data extraction/ingestion strategies
- Developed jobs for data ingestion into the Hadoop data lake, applied transformations, and loaded data into reporting systems
- Designed data models in Hive and created common data pipelines using Hive and Pig Latin scripts
- Stored thousands of flat files in the Hadoop file system and archived old data in a low-cost data lake
- Created POCs for data access from HDFS using Hive, Pig, HBase, Scala, and Sqoop
- Improved query performance using RDDs in SQL/Hive contexts in the Spark environment
- Used RDD join, filter, reduceByKey, and aggregation operations in PySpark
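The reduceByKey-style aggregation mentioned above can be sketched in plain Python (illustrative only; the sample keys and values are hypothetical, and in PySpark the equivalent call is `rdd.reduceByKey(lambda a, b: a + b)` on key-value pairs):

```python
def reduce_by_key(pairs, fn):
    """Plain-Python analogue of Spark's RDD.reduceByKey:
    merge all values sharing a key with a binary function."""
    acc = {}
    for key, value in pairs:
        acc[key] = fn(acc[key], value) if key in acc else value
    return acc

# Hypothetical per-state record counts, summed per key:
sales = [("TX", 5), ("OH", 3), ("TX", 2), ("NY", 1), ("OH", 4)]
totals = reduce_by_key(sales, lambda a, b: a + b)
# totals == {"TX": 7, "OH": 7, "NY": 1}
```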
Environment: Hortonworks HDP 2.1, Hive, HBase, Pig, Oozie, Sqoop, Kafka, Spark, Scala, MapReduce
Confidential, Cincinnati, OH
Hadoop Developer
Responsibilities:
- Set up a Cloudera-based Hadoop ecosystem for data ingestion into the Hadoop distributed file system
- Defined the centralized data ingestion strategy for acquiring data from several source systems
- Developed an end-to-end solution for data ingestion into EDW/data lakes
- Designed data models and developed common design patterns/data pipelines using Pig and Hive
- Developed data pipelines using open-source big data tools including Pig, Sqoop, Hive, and Unix shell for data ingestion into the HDFS data lake
- Developed jobs for data ingestion into the Hadoop data store, applied transformations, and loaded data into reporting platforms
Environment: Cloudera Hadoop, HDFS, Sqoop, Spark, Pig, Hive, Oracle, and R
Confidential, New York, NY
Informatica Lead
Responsibilities:
- Developed mappings, mapplets, and workflows involving various ETL processes using Informatica tools for the concurrent workflows project.
- Configured concurrent workflows for all data marts.
- Coordinated with the client and led the team efficiently.
- Analyzed and fixed bugs when running concurrent workflows; communicated with the onshore team for issue clarification, status updates, etc.
- Conducted and participated in peer code reviews of project deliverables.
- Contributed to performance tuning of mappings, sessions, and workflows
Environment: Informatica 8.1, Oracle, Unix
Confidential, Chicago, IL
Java Developer
Responsibilities:
- Served as a developer on this project.
- Developed code using the Struts framework, JSP, and Java Servlets.
- Participated in client meetings and reviewed test cases.
- Prepared the LLD and fixed bugs during the ST and UAT stages.
- Involved in resolving production issues
Environment: Java, Servlets, JSP, Struts, Hibernate, SQL Server 2008, JBoss
Confidential, Cincinnati, OH
Informatica MDM Tester
Responsibilities:
- Involved in extraction, transformation, and loading of data.
- Involved in requirements gathering
- Involved in designing the HLD and LLD for 4 milestones
- Gained working knowledge of developing mappings
- Completed end-to-end testing for the entire engagement without any escalations.
- Received appreciation from the business for on-time deliverables.
- Involved in developing Python scripts.
Environment: Informatica MDM 9.1, Oracle 10g, Python 3.1
Confidential, Greensboro, NC
ETL Developer
Responsibilities:
- Worked with business users to capture business-rule modifications during the development and testing phases, and analyzed claims data through a rigorous evaluation methodology.
- Wrote PL/SQL stored procedures and triggers to implement business rules and transformations.
- Worked with Informatica tools: Source Analyzer, Warehouse Designer, Mapping Designer, Transformations, Repository Manager, and Server Manager.
- Contributed to performance tuning of mappings, sessions, and workflows
- Communicated with the onshore team for issue clarification, status updates, etc.
- Conducted and participated in peer reviews of project deliverables.
- Documented Informatica mappings, the workflow process, and error handling of ETL procedures
- Used Business Objects for reporting and interacted with users to analyze various reports.
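The trigger-based enforcement of business rules described above can be illustrated with SQLite from Python (a sketch only: the actual work used Oracle PL/SQL, and the `claims` table and non-negative-amount rule here are hypothetical):

```python
import sqlite3

# Illustrative stand-in for an Oracle PL/SQL trigger: reject any claim
# row whose amount violates the business rule, at insert time.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE claims (id INTEGER PRIMARY KEY, amount REAL)")
con.execute("""
    CREATE TRIGGER reject_negative_amount
    BEFORE INSERT ON claims
    WHEN NEW.amount < 0
    BEGIN
        SELECT RAISE(ABORT, 'claim amount must be non-negative');
    END
""")

con.execute("INSERT INTO claims (amount) VALUES (120.50)")  # accepted
try:
    con.execute("INSERT INTO claims (amount) VALUES (-5)")  # blocked by trigger
except sqlite3.IntegrityError as exc:
    rejected = str(exc)
```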
Environment: Informatica PowerCenter 8.1/8.6, Microsoft SQL Server 7, SQL, Oracle 10g
Confidential, New York, NY
Informatica Developer
Responsibilities:
- Performed extensive analysis of business requirements and gathered information for the development of several small applications.
- Developed ETL mappings to extract data from Oracle and load it into the Netezza database.
- Performed error validation of data moving from Oracle to the Netezza database.
- Tested the mappings and checked the quality of deliverables.
- Communicated with the onshore team for issue clarification, status updates, etc.
- Conducted and participated in peer reviews of project deliverables.
- Analyzed and fixed bugs.
- Contributed to performance tuning of mappings, sessions, and workflows
Environment: Informatica PowerCenter 7.6, SQL Server 2005, Oracle 9i, TOAD, SQL*Plus, and Windows
Confidential
Production Support Engineer
Responsibilities:
- Performed daily loads and handled production failure issues for the Alizes application.
- Checked data quality on a daily basis.
- Participated in daily onsite/offshore coordination calls.
- Provided production monitoring and support.
Environment: Informatica 8.1, Oracle, Unix