- Over 6 years of IT experience in Big Data (Hadoop) as a Data Modeler and Data Analyst across architecture, design, and development.
- Solid understanding of big data technologies such as Hadoop and Hive.
- Developed efficient ETL jobs using Spark SQL and scripts to fulfill data preparation requirements for Business Intelligence (a brief HiveQL sketch follows this summary).
- Experience working with business stakeholders to determine needs and explain concepts to gain support for implementing solutions.
- Experience assisting Supply Chain Management (SCM) in implementing and managing the ongoing process.
- Good working knowledge of multi-tiered distributed environments and a good understanding of the Software Development Life Cycle (SDLC), including Agile and Waterfall methodologies.
- Strong experience using MS Excel and MS Access to export and analyze data based on business needs.
- Experience designing and implementing data structures and using common business intelligence tools for data analysis.
- Strong experience writing stored procedures, functions, triggers, and ad hoc queries using PL/SQL.
- Experience with data visualization and dashboard tools, including Tableau.
- Experienced in generating and documenting metadata while designing OLTP and OLAP system environments.
- Experienced in developing conceptual, logical, and physical data models using Erwin and Sybase PowerDesigner.
- Excellent understanding of and working experience with industry-standard methodologies such as the Software Development Life Cycle (SDLC), Rational Unified Process (RUP), Agile, and Waterfall.
- Experienced in integrating various relational and non-relational sources, such as DB2, Oracle, Netezza, SQL Server, NoSQL, COBOL, XML, and flat files, into a Netezza database.
- Highly proficient in writing SQL for relational data stores.
- Extensive experience in normalization (1NF, 2NF, 3NF, and BCNF) and denormalization techniques for improved database performance in data warehouse/data mart environments.
- Published Tableau data sources with multiple calculated fields to the Tableau portal for further report development.
- Experience using Teradata ETL tools and utilities such as BTEQ, MultiLoad, FastLoad, TPT, and FastExport.
- Experience using Informatica ETL tools and utilities such as Server, Client, BDM, and IDQ.
- Proficient in Oracle tools and utilities such as TOAD, SQL*Plus, and SQL Developer.
- Strong expertise in Metadata, Data Quality, Master Data Management (MDM) and Data Governance.
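Illustrative sketch of the Spark SQL / Hive data-preparation work summarized above. This is a minimal, hypothetical example only: the databases, tables, and columns (staging.stg_orders, dw.orders_prepared, order_dt, and so on) are not taken from any specific engagement.

```sql
-- Minimal HiveQL sketch of a BI data-preparation job (hypothetical names).
-- Target table is partitioned by order date and stored as ORC for faster scans.
CREATE TABLE IF NOT EXISTS dw.orders_prepared (
    order_id      BIGINT,
    customer_id   BIGINT,
    order_amount  DECIMAL(18,2),
    order_status  STRING
)
PARTITIONED BY (order_dt STRING)
STORED AS ORC;

-- Allow dynamic partitions so each order date lands in its own partition.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Cleanse and load from a raw staging table.
INSERT OVERWRITE TABLE dw.orders_prepared PARTITION (order_dt)
SELECT
    CAST(order_id AS BIGINT)              AS order_id,
    CAST(customer_id AS BIGINT)           AS customer_id,
    CAST(order_amount AS DECIMAL(18,2))   AS order_amount,
    UPPER(TRIM(order_status))             AS order_status,
    TO_DATE(order_ts)                     AS order_dt
FROM staging.stg_orders
WHERE order_id IS NOT NULL;
```

Partitioning plus a columnar format such as ORC is a common way to keep downstream BI queries scanning only the dates and columns they need.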
Programming Languages: SQL, C, C++, UNIX Shell Scripting.
SDLC Methodologies: Agile/SCRUM, Waterfall
Operating Systems: Windows, Linux, Unix
Data Modeling Tools: Erwin 9.x, MS Visio, SAP PowerDesigner
Reporting Tools: Crystal Reports XI, Tableau, Informatica PowerCenter (BDM, IDQ, Server & Client)
Other Tools: MS Office suite (Word, Excel, MS Project & Outlook), TOAD, BTEQ, SQL Assistant
Databases: Microsoft SQL Server, MySQL, Oracle, Teradata, Sybase, Hive
Version Controls: Git, GitHub, Bitbucket
Confidential, Westchester, PA
- Created documentation for onboarding new sites.
- Worked on multiple projects and work initiatives concurrently.
- Reviewed and analyzed business requirements to identify data sources and target systems.
- Queried SQL Server for data validation and developed validation worksheets in Excel.
- Developed test strategies and plans for Unit, Integration, System and User Acceptance Testing.
- Defined and created OLAP schemas (star schemas) for building cubes based on multiple data stores, used by SSRS reports (a simplified star-schema sketch follows this role's Environment line).
- Experienced with a variety of statistical and database programs (e.g., Stata, SAS, R, and SQL).
- Pipelined (ingest/clean/munge/transform) data for feature extraction toward downstream classification.
- Analyzed source systems for data acquisition and architected data gathering logic across different source systems.
- Designed logical and physical data models of medium to high complexity using PowerDesigner, according to company standards.
- Designed and delivered training so key WMS users could effectively utilize the system and enhance performance.
- Provided ongoing systems and software support to all functions within the facility.
- Communicated strategies and processes around data modeling and data architecture to cross-functional groups.
- Specified the overall data architecture for all areas and domains of the enterprise, including data acquisition, ODS, MDM, data warehouse, data provisioning, ETL, and BI.
- Generated reports and dashboards with Tableau Desktop to surface insights and provide an in-depth view for the end client.
- Created action filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.
- Utilized Tableau server to publish and share the reports with the business users.
- Created reports with Crystal Reports and scheduled them to run on a daily basis.
- Developed loading routines and data extracts with Informatica, Unix, SAS, and Oracle procedures.
- Generated process flow designs and conducted tuning for SQL and PL/SQL procedures, Informatica objects, and views.
- Performed data migration from an RDBMS to a NoSQL database and provided a complete picture of the data deployed across various data systems.
- Coordinated with data architects and data modelers to create new schemas that improved report execution time, and worked on creating optimized data mart reports.
- Used the Agile Scrum methodology across the different phases of the software development life cycle.
- Gathered and analyzed existing physical data models for in-scope applications and proposed changes to the data models according to the requirements.
- Used Informatica to profile tables, since profiling helps in understanding the content, structure, and relationships of the data in the source being analyzed (a basic SQL profiling sketch also follows the Environment line below).
- Also used an in-house Spark-based tool for table profiling.
Environment: Hadoop, Hive, MDM, PL/SQL, Oracle 12c, SQL Server 2015, NoSQL, MS Office (Word, PowerPoint, Access, Excel, Outlook), SAP PowerDesigner, MS Visio, Crystal Reports, Informatica (Server, Client, BDM, and IDQ), Tableau.
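A simplified sketch of the kind of star schema described above for cube building and SSRS reporting. The fact and dimension tables shown (fact_sales, dim_date, dim_product) are hypothetical, not taken from the actual project.

```sql
-- Hypothetical star schema: one fact table keyed to conformed dimensions.
CREATE TABLE dim_date (
    date_key     INT          PRIMARY KEY,  -- surrogate key, e.g. 20240131
    calendar_dt  DATE         NOT NULL,
    fiscal_year  INT          NOT NULL
);

CREATE TABLE dim_product (
    product_key  INT          PRIMARY KEY,
    product_code VARCHAR(50)  NOT NULL,
    category     VARCHAR(50)
);

CREATE TABLE fact_sales (
    date_key     INT            NOT NULL REFERENCES dim_date (date_key),
    product_key  INT            NOT NULL REFERENCES dim_product (product_key),
    units_sold   INT            NOT NULL,
    sales_amount DECIMAL(18,2)  NOT NULL
);
```

Keeping measures in a narrow fact table and descriptive attributes in the dimensions is what lets cubes and SSRS reports aggregate quickly along the dimension keys.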
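The table profiling mentioned above (content, structure, relationships) can also be approximated with plain SQL when a profiling tool is not available. A minimal sketch against a hypothetical customer_src table:

```sql
-- Minimal column-profiling sketch (hypothetical table and columns):
-- row count, distinct values, null percentage, and date range in one pass.
SELECT
    COUNT(*)                                        AS row_count,
    COUNT(DISTINCT customer_id)                     AS distinct_customer_ids,
    SUM(CASE WHEN email IS NULL THEN 1 ELSE 0 END)
        * 100.0 / NULLIF(COUNT(*), 0)               AS pct_null_email,
    MIN(created_dt)                                 AS min_created_dt,
    MAX(created_dt)                                 AS max_created_dt
FROM customer_src;
```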
- Fully involved in the requirements analysis phase.
- Participated in client and business calls.
- Prepared data flow diagrams.
- Supported the onshore analytics team by writing Hive Query Language (HiveQL) scripts to generate data sets from data coming from various systems.
- Migrated Oracle data to a Hadoop cluster and converted existing SQL scripts to HiveQL.
- Evaluated Hive features such as partitioning and file formats to benchmark Hive performance.
- The purpose of the data-loading utility was to migrate both static and changing data sets, such as transactional and accumulating snapshots, from Oracle to the Hadoop cluster.
- Gained familiarity with data analytics through a use case to find thresholds of line failure, generating different views of the data.
- Worked on data mining and data validation to ensure data accuracy between the warehouse and source systems.
- Developed SQL queries to fetch complex data from different tables and databases using joins and database links.
- Designed and developed SQL Server objects such as tables, views, indexes (clustered and non-clustered), stored procedures, and functions in Transact-SQL (a brief T-SQL sketch follows the Environment line below).
- Participated in JAD sessions and gathered information from business analysts, end users, and other stakeholders to determine requirements.
- Performed data analysis of existing databases using SQL to understand the data flow and the business rules applied to different databases.
- Performed data analysis and data profiling using complex SQL on various source systems and answered complex business questions by providing data to business users.
- Responsible for daily deliverables.
- Worked as a Data Analyst on requirements gathering, business analysis, and project coordination.
- Worked with the data analysis team to gather data profiling information.
- Responsible for analyzing business requirements and designing the implementation of the business solution.
- Developed servlets and JavaServer Pages (JSP) for an employee portal, and developed PL/SQL queries to generate reports based on client requirements.
- Coded JDBC calls in the servlets to access Oracle database tables.
Environment: Hadoop (Cloudera 5.x) and ecosystem tools, Stash, PuTTY, WinSCP, SQL, SQL Server, PL/SQL, MS Visio, T-SQL, SSIS, SSRS
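A brief, hypothetical T-SQL sketch of the kind of SQL Server objects described above; the table, index, and procedure names (dbo.Employee, IX_Employee_DepartmentId, dbo.usp_GetEmployeesByDepartment) are illustrative only.

```sql
-- Table with a clustered primary key and a non-clustered index on a lookup column.
CREATE TABLE dbo.Employee (
    EmployeeId   INT           NOT NULL CONSTRAINT PK_Employee PRIMARY KEY CLUSTERED,
    DepartmentId INT           NOT NULL,
    FullName     NVARCHAR(200) NOT NULL,
    HireDate     DATE          NOT NULL
);

CREATE NONCLUSTERED INDEX IX_Employee_DepartmentId
    ON dbo.Employee (DepartmentId);
GO

-- Simple stored procedure that benefits from the non-clustered index above.
CREATE PROCEDURE dbo.usp_GetEmployeesByDepartment
    @DepartmentId INT
AS
BEGIN
    SET NOCOUNT ON;

    SELECT EmployeeId, FullName, HireDate
    FROM dbo.Employee
    WHERE DepartmentId = @DepartmentId;
END;
GO
```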