- 8+ years of IT experience, including 3 years in software development, data warehousing, analytics, and data engineering using Hadoop, MapReduce, Pig, Hive, and other open-source tools and technologies.
- Worked with the ETL tool Informatica, Oracle Database, PL/SQL, Python, and shell scripts.
- Hands-on experience creating documents with MS Office products such as Word, Excel, and PowerPoint.
- Sound knowledge of Business Intelligence and reporting, including preparing dashboards in Tableau.
- Team player and a quick learner with strong relationship building & interpersonal skills.
- Good understanding of Operating System concepts, Multimedia and Web Design.
- Sound knowledge in Data Analytics, Database Management Systems and Object Oriented Analysis (OOA) and Design through UML.
- Experience in developing NoSQL database by using CRUD, Sharding, Indexing and Replication.
- Experience with ETL using Hive and MapReduce.
- Experience developing complex SQL queries with unions and multi-table joins, and working with views.
- Knowledge of HIPAA EDI transactions and implementation of solutions in a health care setting using HIPAA EDI.
- Involved in database design, creating tables, views, stored procedures, functions, triggers, and indexes. Strong experience in data warehousing and ETL using DataStage.
- Experience with various Business Intelligence tools and SQL databases.
- Expertise in all stages of the Software Development Life Cycle (SDLC): requirement analysis, design specification, coding, debugging, and testing.
- Well versed in Agile/Scrum and Waterfall methodologies.
- Experience in interacting with customers and working at client locations for real time field testing of products and services.
- Designed and implemented data ingestion patterns using Sqoop, Flume and Kafka.
- Very good experience in customer specification study, requirements gathering, system architectural design and turning the requirements into final product.
- Experience in installation, configuration, supporting and managing Hadoop clusters.
- Extensively used ODI (Oracle Data Integrator) to perform ELT from heterogeneous sources using the ODI tools Security Manager, Topology Manager, Designer, and Operator.
- Implemented standards and processes for Hadoop based application design and implementation.
- Expertise in components of the Hadoop ecosystem: Hive, Hue, Pig, Sqoop, HBase, Flume, Zookeeper, Oozie, and Apache Spark.
- Responsible for writing MapReduce programs using Java.
- Logical Implementation and interaction with HBase.
- Developed MapReduce jobs to automate transfer of data from HBase.
- Experience in developing Pig scripts and Hive Query Language.
- Performed data analysis using Hive and Pig.
- Used HBase in conjunction with Pig and Hive as required for real-time, low-latency queries.
- Managing and scheduling batch Jobs on a Hadoop Cluster using Oozie.
- Fluent with the core Java concepts like I/O, Multi-threading, Exceptions, RegEx, Collections, Data-structures and serialization.
- Experience in Object Oriented Analysis Design (OOAD) and development of software using UML Methodology, good knowledge of J2EE design patterns and Core Java design patterns.
Applications: Tableau 9.0, MS Office, Excel, Word, PowerPoint, Eclipse.
Operating System: Windows 98/2000/XP/NT/Vista, Linux.
Languages: Python, PHP, C, Java 1.6, XML, CSS.
BI/ETL Tools: OBIEE 12c/11g, Informatica 9.1, DataStage.
Hadoop Ecosystem: Hadoop MapReduce, Hive, Pig, HBase, HDFS, Zookeeper, Oozie, Sqoop.
Database: NoSQL, MySQL, Oracle 11g/10g.
Confidential, San Jose, CA
Hadoop Developer / Hadoop Admin
- Coordinated with business customers to gather business requirements.
- Installed and maintained the Hadoop cluster and Cloudera Manager.
- Imported data from relational databases into HDFS and exported it back using Sqoop.
- Responsible for managing data coming from different sources.
- Analyzed the Hadoop cluster and different Big Data analytic tools, including Pig, HBase, and Sqoop.
- Loaded and transformed large sets of structured and semi-structured data.
- Collected and aggregated large amounts of log data using Apache and staged the data in HDFS for further analysis.
- Analyzed data using Hadoop components Hive and Pig.
- Involved in running Hadoop streaming jobs to process terabytes of data.
- Gained experience in managing and reviewing Hadoop log files.
- Involved in writing Hive/Impala queries for data analysis to meet the business requirements.
- Worked on streaming the analyzed data to the existing relational databases using Sqoop for making it available for visualization and report generation by the BI team.
- Involved in creating the workflow to run multiple Hive and Pig jobs, each triggered independently by time and data availability.
- Developed Spark scripts by using Python as per the requirement.
- Developed Pig Latin scripts for the analysis of semi-structured data.
- Used Sqoop to import data from MySQL into HDFS on a regular basis.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Spark SQL.
Environment: Hadoop, Cloudera Manager, HDFS, Hive, Pig, HBase, Sqoop, SQL, Java (JDK 1.6), Eclipse, Python.
Confidential, San Francisco, CA
- Hands-on experience collecting log files and copying them into HDFS.
- Wrote MapReduce code to convert unstructured data into structured data and to load data from HDFS into HBase.
- Migrated the data from cluster into the AWS environment.
- Launched and set up a Hadoop cluster on AWS.
- Created tables on top of data on AWS S3 obtained from different data sources.
- Integrated Hive with HBase.
- Defined job flows and wrote simple to complex MapReduce jobs as required.
- Involved in creating Hive tables, loading them with data, and writing Hive queries.
- Implemented business logic by writing Pig and Hive UDFs for aggregate operations.
- Exported results into relational databases using Sqoop for visualization and report generation by the BI team.
- Worked with a NoSQL database and queried it as needed.
- Monitored the health of Map Reduce Programs which are running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hadoop MapReduce, HDFS, Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
- Used Cloudera Manager to monitor and manage the Hadoop cluster.
Environment: Hadoop, MapReduce, Cassandra, Cloudera Manager, HDFS, Hive, Pig, HBase, Sqoop, Oozie, AWS, SQL, Java (JDK 1.6), Eclipse.
Confidential, Boston, MA
- Installed and configured Hadoop on the Cloudera platform.
- Wrote HiveQL queries and Pig scripts for reporting.
- Developed the Map-Reduce programs and defined the job flows.
- Managed and reviewed Hadoop log files.
- Supported and troubleshot MapReduce programs running on the cluster.
- Loaded the data from Linux/UNIX file system into HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Created tables, loaded data, and wrote queries in Hive.
- Developed Linux/UNIX shell scripts to automate routine DBA tasks (e.g., database refreshes, backups, and monitoring).
- Modified SQL queries for batch and online processes.
- Managed the cluster through performance tuning and enhancement.
Environment: CDH 4.1.2, HDFS, HBase, MapReduce, Hive, Pig, Oozie, Eclipse.
Confidential, Houston, TX
- Involved in the Analysis, functional and technical specifications, development, deployment and testing of the project.
- Gathered analytical data requirements by interacting with business users.
- Requirement validation of the reports and drilldowns.
- Assisted in designing the repository based on business requirements and followed design best practices for the RPD and dashboards. Implemented star schema and snowflake schema methodologies.
- Customized the OBIEE Repository (physical, BMM, and presentation layers) and worked on the design of logical data model.
- Worked on repository and session variables.
- Performed production support for the project.
- Created groups in the repository and added users to the groups and granted privileges explicitly and through group inheritance.
- Generated Reports and Dashboards by using Report features like Pivot tables, charts and view selector.
- Exposure to Medical code sets: ICD, CPT and HCPCS.
- Proficiency in Informatica Designer Components (source analyzer, warehouse designer, mapping designer, mapplet designer, transformation developer, workflow manager and workflow monitor).
- Experience in installing, configuring, and customizing DAC (execution plans, subject areas, tables, tasks) and monitoring ETL processes using DAC and Informatica Workflow Manager.
Environment: OBIEE 220.127.116.11, Oracle 11g, SQL, Oracle SQL Developer, ODI.
Confidential, Warren, NJ
- Developed proof of concept to check the functional feasibility of the project.
- Used Informatica Power Center for Financial Data Extraction, Data Mapping and Data Conversion.
- Created mappings using various transformations like Source Qualifier, Lookup, Update Strategy, Router, Filter, Sequence Generator, and Joiner on the extracted source data according to the business rules and technical specifications.
- Developed simple and complex mappings for financial data using Informatica to load Dimension and Fact tables as per STAR schema techniques.
- Developed a number of Informatica Mappings, Mapplets and Transformations to load data from relational and flat file sources into the data warehouse.
- Designed and developed the OBIEE Metadata Repository (.RPD) of financial analytics using OBIEE Admin tool by importing the required objects (Dimensions and Facts) with integrity constraints into Physical Layer using connection pool, developing multiple Dimensions (Drill-Down, Hierarchies) and Logical Facts / Measures objects in Business Model Layer, and creating the Presentation catalogs in Presentation Layer.
- Identified granularity level of the Financial data required to be available for analysis.
- Created security settings to set up groups, access privileges, and query privileges, and implemented object-level as well as data-level security for end users using the OBIEE Admin tool.
- Worked extensively on OBIEE Answers to create the Financial Statement reports and Intelligence Dashboards (Financials) as per the detailed design requirements.
- Extensively used page prompts and dashboard prompts for filtering data values in financial reports.
- Created drill down charts and drill down tables to gather more information on Financial Analytics using navigation.
- Developed BI-Publisher end user reports which have financial formatting capabilities meeting the end-user needs and integrated Prompts between the BI-Publisher and OBIEE using the Presentation Variables.
- Used SQL queries and database programming using PL/SQL (writing Packages, Stored Procedures/Functions, and Database Triggers).
Environment: OBIEE 10.1.3.4, DAC, Oracle BI Apps 7.9.5/7.9.6, Informatica 8.1.1, Oracle EBS, BI Publisher, Windows XP.
- Designed ETL jobs as per business requirements.
- Developed ETL jobs following organization- and project-defined standards and processes.
- Developed unit test cases.
- Implemented UNIX shell scripts to invoke the DataStage jobs.
- Assisted Systems Administrator in DataStage installation and maintenance.
- Deployed to QA and then to Production.
- Performed unit testing.
- Coordinated with the onshore team to ensure issue-free delivery.
Environment: DB2, UNIX Shell Scripting, Oracle 10g, PL/SQL, DataStage.
- Analyzed requirements and developed modules.
- Prepared detailed test plans, acceptance criteria, and test scenarios for each project.
- Involved in performance, scalability, stress, and load testing.
- Tested full product suites, identifying problems and resolving them with the team.
- Worked with Complex SQL queries, Functions and Stored Procedures.
- Involved in coding, maintaining, and administering Servlets and JSP components to be deployed on a WebSphere application server.
- Developed programming module for loading data into data warehouse.
Environment: Java, SQL Server, JBuilder.
- Developed and maintained a detailed project plan.
- Maintained code through bug fixes and enhancements; designed and implemented new code as per business requirements.
- Provided technical support for business issues.
- Responsible for designing and developing GUI and Protocol Utilisation Modules.
- Implemented a Statistics-Based Packet Filtering Scheme against Distributed Denial-of-Service Attacks.
- Defined requirements for the detailed project plan.
- Recorded and managed project issues, escalating where necessary.
Environment: Java, WinPcap, Swing, MySQL.