Big Data Architect (Hadoop, Teradata Aster) Resume
Los Angeles, CA
SUMMARY
- 16 years of extensive experience in the complete Software Development Life Cycle (SDLC) covering Requirements Management, Data Analysis, Data Profiling, Data Modeling, System Analysis, Architecture and Design, Development, Testing and Deployment of business applications.
- Strong data modeling experience in ER diagrams, dimensional data modeling, conceptual/logical/physical design, star schema modeling, and snowflake modeling using tools such as Erwin and ER/Studio.
- Extensive experience in gathering business requirements, implementing business processes, identifying risks, performing impact analysis, and conducting JAD sessions to determine requirement feasibility.
- Created DDL scripts for implementing data modeling changes. Published data models in the model mart, created naming convention files, and coordinated with DBAs to apply the data model changes.
- Possess strong documentation and knowledge-sharing skills; conducted data modeling review sessions with user groups and participated in requirement sessions to assess requirement feasibility.
- Experienced in business requirements confirmation, data analysis, data modeling, logical and physical database design, and implementation.
- Proven knowledge in capturing data lineage, table and column data definitions, valid values, and other necessary information in data models.
- Experienced in preparing Business Process Re-engineering Models.
- Created/maintained functional, HLD, ETL, and database design documents with detailed descriptions of logical entities and physical tables.
- Hands-on Big Data Architect, responsible for the architecture, design and development of data-centric solutions
- Experience working with HDFS, MapReduce, Pig, Hive, and NoSQL techniques on big data.
- Strong understanding of the principles of Data warehousing, Fact Tables, Dimension Tables, star and snowflake schema modeling.
- Experience in backend programming including schema and table design, stored procedures, Triggers, Views, and Indexes.
- Experienced in database performance tuning: reducing the cost and cardinality of SQL, data access optimization, table partitioning, indexing, and writing complex SQL queries and PL/SQL blocks such as stored procedures, functions, triggers, cursors, and ETL packages.
- Experienced in normalization and denormalization processes and logical and physical data modeling techniques.
- Experience in big data concepts and the Hadoop framework architecture: Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, Flume, and Greenplum.
- Possess strong analytical, verbal, and interpersonal skills that help in communicating with business, SMEs, senior management, and team members.
- Energetic and self-motivated team player with excellent communication, interpersonal, and leadership skills; thrives in both independent and collaborative work environments.
TECHNICAL SKILLS
Databases: Hadoop (HDFS, Hive, Impala), Teradata, Oracle, MS Access, DB2, SQL Server
Data warehousing/ETL tools: Sqoop, Informatica PowerCenter, Business Objects, Teradata BTEQ, MLoad, FExp, FLoad, TPump scripts
Reporting Tools: QlikView, Cognos, Business Objects
GUI/Client/Server Tools: Erwin, TOAD, Showcase, Visio, MS Office, MS Project, SAP R/3, SAP BW BEx Explorer, Oozie
PROFESSIONAL EXPERIENCE
Big Data Architect (Hadoop, Teradata Aster)
Confidential, Los Angeles, CA
Responsibilities:
- NGTV Phase 1: DTV customer experience data via Adobe Analytics and Audience Manager servers.
- NFL OTT: capturing NFL order, transaction, refund, and chargeback data from Vindicia via API calls.
- CD1: agent/vendor compensation data for DTV agents selling AT&T products.
- CX: Self Care Access from Mobile Apps Master: capturing user actions related to billing activity on m.com.
- Adobe Products for DTVE Master: sending contractual data to Fox and Disney
- Roles and Responsibilities:
- Worked with engineering and digital entertainment teams to understand requirements and document incoming data to produce HLD, LOE, data profiles, data models, and ETL specs.
- Developed mapping specs to load data into the Hadoop detail tables (Hive) and to Sqoop monthly/weekly aggregated data into the EDW (Teradata) star schema model (see the sketch below).
- Worked with the ETL team implementing solutions on HBase tables to load large sets of structured, semi-structured, and unstructured data coming from UNIX, NoSQL, and a variety of portfolios.
- Developed UI specs for the reporting team to create reports in QlikView and/or Cognos.
- Developed mapping specs to send data extracts to SAS for analytics purposes.
- Conducted status meetings with the project team on progress, issues, UAT, and risk.
Environment: Erwin, Hadoop (Hue, HDFS, Hive, Pig, Flume, Sqoop, Oozie), QlikView, Cognos, SAS, Teradata, Aster, UNIX, Informatica, Autosys Scheduler.
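A minimal sketch of the Hive-to-Teradata flow those mapping specs describe, assuming hypothetical object names (dtv_edw.cust_exp_detail, dtv_edw.agg_cust_exp_monthly, EDW_STG.CUST_EXP_MONTHLY), a generic Teradata JDBC URL, and Hive's default field delimiter; the actual specs, schemas, and credentials differ.

```sh
#!/bin/ksh
# Sketch: build a monthly Hive aggregate, then Sqoop it into the Teradata EDW.
# All table names, paths, and connection details are illustrative placeholders.

LOAD_MONTH=${1:?usage: export_monthly_agg.ksh YYYYMM}

# 1. Aggregate the detail data in Hive for the requested month.
hive -e "
  INSERT OVERWRITE TABLE dtv_edw.agg_cust_exp_monthly PARTITION (load_month='${LOAD_MONTH}')
  SELECT customer_id, COUNT(*) AS page_views, SUM(session_secs) AS total_secs
  FROM   dtv_edw.cust_exp_detail
  WHERE  load_month = '${LOAD_MONTH}'
  GROUP  BY customer_id;
" || exit 1

# 2. Export the aggregate from HDFS into the Teradata staging table.
sqoop export \
  --connect jdbc:teradata://edwhost/DATABASE=EDW_STG \
  --username "$EDW_USER" --password-file /user/etl/.edw_pwd \
  --table CUST_EXP_MONTHLY \
  --export-dir /apps/hive/warehouse/dtv_edw.db/agg_cust_exp_monthly/load_month=${LOAD_MONTH} \
  --input-fields-terminated-by '\001' \
  -m 4 || exit 1
```

The detail data stays in Hive for drill-down, while only the aggregates land in the Teradata star schema, keeping the EDW load small.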
EBI Data Architect
Confidential, Los Angeles, CA
Responsibilities:
- Finance transformation: The current G/L system (FCS) is being replaced by PeopleSoft G/L (PGL)
- DDA Large Item Report: reports activity for items over $50K. The report is printed and distributed to 10 people in the Community Banking Risk Support (CBRS) group to identify fraudulent activity.
- Maturity Loan: data extracts from S&P for the loan activity file. Replace CIF numbers with customer names, sort the list by RC, create separate files for each RC, and email each file to the appropriate Market President and RM.
- Tandem replacement project: decommission Tandem and replace it with an in-house system whose database resides on Greenplum. Created a star schema for customers, accounts, and users for wire and stop transactions (see the sketch below).
- Roles and Responsibilities:
- Worked with the source team to understand new data and to confirm and document incoming data.
- Created data definition specifications and performed data profiling.
- Worked closely with the ETL team to develop mapping specs for Informatica jobs.
- Identified anomalies in the data received from the source in integration and UAT testing.
- Created new data models on par with the existing model.
- Denormalized tables (star schema) for better query performance.
- Worked with business users on data validations in UAT testing.
- Worked on HP Quality Center to track defects logged against data issues during system integration and UAT testing.
- Held weekly/daily risk, issue, and status meetings with the project team.
Environment: Erwin 8.2, Greenplum, Oracle 11g, TOAD, HP-UX, Sun OS, PL/SQL, Informatica, Autosys Scheduler.
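A minimal sketch of the wire/stop star schema described in the Tandem replacement project above, with hypothetical table and column names; the production model carried far more attributes and history handling.

```sh
#!/bin/ksh
# Sketch: create a small star schema on Greenplum for wire/stop transactions.
# Host, database, and object names are illustrative, not the production model.
psql -h gpmaster -d edw -U etl_user <<'SQL'
CREATE TABLE dim_customer (
    customer_key  bigint PRIMARY KEY,
    cif_number    varchar(20),
    customer_name varchar(120)
) DISTRIBUTED BY (customer_key);

CREATE TABLE dim_account (
    account_key   bigint PRIMARY KEY,
    account_no    varchar(20),
    account_type  varchar(10)
) DISTRIBUTED BY (account_key);

-- Fact table holding wire and stop-payment transactions.
CREATE TABLE fact_wire_stop_txn (
    txn_id        bigint,
    customer_key  bigint,
    account_key   bigint,
    txn_type      char(4),       -- 'WIRE' or 'STOP'
    txn_amount    numeric(15,2),
    txn_date      date
) DISTRIBUTED BY (txn_id);
SQL
```

The fact table keeps only keys and measures; descriptive attributes live in the dimensions, matching the denormalized star layout noted above.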
Teradata Architect
Confidential, Beaverton, OR
Responsibilities:
- Created data models using Erwin based on the business requirements.
- Interacted with business users and source system (SAP R/3, SAP BW) IT teams to define, agree on, and document incoming source-to-target data mapping specifications from the provisioning to the consumption layer.
- Worked closely with the ETL team to develop Korn shell scripts for BTEQ and Informatica jobs.
- Worked on a POC to explore implementing different solutions on big data (Hadoop).
- Identified anomalies in the data received from the source SAP BW.
- Reverse-engineered complex databases consisting of more than 200 tables.
- Modified the existing data models for the enhancement of the project.
- Normalized tables for better query performance.
- Worked on Export and Import operations in Teradata for data validations.
- Created a script that automatically performs data profiling (see the sketch below).
- Teradata performance tuning via EXPLAIN, PPI, AJIs, indexes, collecting statistics, or rewriting code.
- Performance tuning of ETL and Cognos queries on consumption layer.
- Worked on HP Quality Center to track defects logged against the physical model and data issues.
- Held weekly risk/issue meetings with the stakeholders.
Environment: Erwin 8.2, Teradata R13, SAP R/3, SAP BW, Oracle 10g, HP-UX, Sun OS, PL/SQL, Autosys Scheduler.
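A minimal sketch of the kind of Korn shell/BTEQ profiling script referenced above, assuming a hypothetical logon file and taking database, table, and column as parameters; the real script iterated over the full column list from DBC metadata.

```sh
#!/bin/ksh
# Sketch: quick data profile of one Teradata column via BTEQ.
# $HOME/.tdlogon is assumed to hold a ".LOGON tdpid/user,password;" line.
DB=${1:?database} TBL=${2:?table} COL=${3:?column}

bteq <<EOF
.RUN FILE = $HOME/.tdlogon;
.SET WIDTH 200;

SELECT  COUNT(*)                                         AS row_cnt
       ,COUNT(DISTINCT ${COL})                           AS distinct_cnt
       ,SUM(CASE WHEN ${COL} IS NULL THEN 1 ELSE 0 END)  AS null_cnt
       ,MIN(${COL})                                      AS min_val
       ,MAX(${COL})                                      AS max_val
FROM    ${DB}.${TBL};

.LOGOFF;
.QUIT;
EOF
```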
Sr. Teradata Architect
Confidential, Atlanta, GA
Responsibilities:
- Involved in project life cycle - from analysis to production implementation and support.
- Created and modified logical and physical data models for the ECDW.
- Extensively worked on defining processes for data extraction, transformation, and loading from flat files, Oracle, XML, and Teradata sources into Teradata using BTEQ, FastLoad, MultiLoad, Korn shell scripts, and Informatica.
- Created technical specifications for efficient ETL development in a Teradata environment. Accomplished a data feed to the Customer Profitability Analysis application (SQL Server) using FastExport (see the sketch below).
- Created technical specs, ETL documentation, data profiling, data mapping, and unit test specs conforming to internal guidelines.
- Accomplished Teradata performance tuning via EXPLAIN, PPI, AJIs, indexes, collecting statistics, or rewriting code.
- Performed end-to-end data quality and integrity testing; reverse-engineered and documented existing ETL program code.
- Assisted SIT/UAT testers in optimizing test scripts to ensure data accuracy and integrity, and provided production support.
Environment: Erwin 7.1, Informatica PowerCenter 8.5.1/8.6.1, ETL, Teradata V6R6 & R12, Oracle 10, HP-UX, Sun OS, SQL Server, Maestro Scheduler.
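A minimal sketch of a FastExport feed of the sort used for the Customer Profitability Analysis application, with hypothetical table names, logon values, and output path; the production job exported the full column set agreed in the technical specs.

```sh
#!/bin/ksh
# Sketch: FastExport a customer profitability feed to a flat file that is
# then handed to the SQL Server application. Object names are illustrative.
fexp <<'EOF'
.LOGTABLE EDW_WRK.cust_prof_fexp_log;
.LOGON tdpid/etl_user,password;

.BEGIN EXPORT SESSIONS 4;
.EXPORT OUTFILE /data/feeds/cust_profitability.dat MODE RECORD FORMAT TEXT;

SELECT  TRIM(c.customer_id)  || '|' ||
        TRIM(p.profit_month) || '|' ||
        TRIM(CAST(p.net_profit AS CHAR(18)))
FROM    EDW.customer        c
JOIN    EDW.cust_profit_mth p
ON      c.customer_id = p.customer_id;

.END EXPORT;
.LOGOFF;
EOF
```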
Tech Lead
Confidential
Responsibilities:
- Worked with business users and other data source owners to identify requirements, source file structures, loading frequency, and implications, and to resolve data quality issues.
- Prepared PowerPoint presentations, test cases, and data flow diagrams.
- Mentored development teams in program code development, test plan development, testing and result documentation, defect analysis, and bug fixing.
- Documented program specifications, unit and integration test specifications, and test results.
- Coded test SQL, analyzed test case results, and documented test cases/plans.
- Accomplished a data movement process that loads data from DB2 into Teradata through the development of Korn shell scripts, using Teradata SQL and utilities such as BTEQ, FastLoad, FastExport, MultiLoad, and Queryman (see the sketch below).
- Improved Informatica performance by identifying bottlenecks at the source, target, mapping, and session levels.
- Provided expertise and trained less experienced developers, creating the necessary training materials.
- Worked closely with the Project Manager and client to assess goals, and generated and/or maintained accurate module and status reports.
Environment: Informatica PowerCenter 7.4 (Designer, Repository Manager, Workflow Manager), TOAD, DB2, Oracle 9i, Teradata V2R5, Visio, PowerPoint, UNIX, Windows.
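A minimal sketch of the DB2-to-Teradata movement described above, assuming hypothetical table names, a pipe-delimited extract file, and an empty staging table as FastLoad requires; the real Korn shell scripts added restart logic and error checks.

```sh
#!/bin/ksh
# Sketch: pull a DB2 table to a delimited file, then FastLoad it into a
# Teradata staging table. Names and connection details are placeholders.

EXTRACT=/data/stage/customer.del

# 1. Unload from DB2 into a pipe-delimited file.
db2 connect to SRCDB user etl_user using "$DB2_PWD"
db2 "EXPORT TO ${EXTRACT} OF DEL MODIFIED BY COLDEL| SELECT cust_id, cust_name, open_dt FROM prod.customer"
db2 terminate

# 2. FastLoad the file into an empty Teradata staging table.
fastload <<EOF
LOGON tdpid/etl_user,password;
DATABASE EDW_STG;

BEGIN LOADING EDW_STG.CUSTOMER_STG
      ERRORFILES EDW_STG.CUSTOMER_ERR1, EDW_STG.CUSTOMER_ERR2;
SET RECORD VARTEXT "|";
DEFINE cust_id   (VARCHAR(18)),
       cust_name (VARCHAR(120)),
       open_dt   (VARCHAR(10))
FILE = ${EXTRACT};

INSERT INTO EDW_STG.CUSTOMER_STG (cust_id, cust_name, open_dt)
VALUES (:cust_id, :cust_name, :open_dt);

END LOADING;
LOGOFF;
EOF
```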
Manager
Confidential
Responsibilities:
- Worked with cross-functional teams and prepared detailed design documents for integrating the Confidential system with customers' ERP systems.
- Created vendor quotations/offers for computer hardware and software.
- Created LOE estimates for project IT costs.
- Led a team to develop an in-house socket control system to handshake and communicate between lower-level systems (PLC, WCS, SCADA) and end client systems (WMS, ERP), saving the company between 500K and 1 million on each project.
- Offloaded WCS system development to a third-party vendor, saving the company 250K per project.
- Moved from a ready-made SCADA package to an in-house object-based system, cutting customization costs to a minimum and saving the company around 200K per project.