Big Data Architect Resume
PROFESSIONAL PROFILE:
- Currently working as a Sr. Big Data Solutions Architect, with over 19 years of experience in the architecture, design, construction, and implementation of information systems.
- Have worked as an Information Architect, Data Architect, Integration Architect, Database Architect, Data Services Specialist, Data Migration Specialist, Data Modeler, and DBA.
- Extensive cross-industry domain experience, mainly in Financial Services, Retail, Insurance, Manufacturing, and Utilities.
- Extensive knowledge of BI, Big Data tools, Data Modeling, ETL, NoSQL, J2EE, EAI, SOA, SOI, Web Services, BPEL, JSON, XML, XSD, and WS-*.
PROFESSIONAL EXPERIENCE:
Confidential
Big Data Architect
Responsibilities:
- Worked as a Big Data Architect to build an Enterprise Data Lake enabling supply chain management to access and analyze data from various corporate systems.
- Architected/designed Confidential BigInsights and InfoSphere DataStage solutions to build an enterprise data pipeline ingesting data from various source systems into the Data Lake platform.
- Architected/designed the data lake ingestion flow to extract data from various corporate data storage systems such as DB2 tables, VSAM data sets, and flat files.
- Designed the big data authentication solution using LDAP/Kerberos, with authorization via UNIX groups and HDFS ACLs.
- Architected Hadoop data security using HDFS encryption zones (DEZ) and controlled data access using UNIX groups and HDFS ACLs.
- Architected/designed the integration solution using Oozie coordinators and Control-M.
- Worked with the technology manager and business stakeholders to demonstrate the strategic value of the Data Lake platform.
- Led the platform and data migration from BigInsights 4.1 to BigInsights 4.2.
- Designed a code generation framework using UNIX shell and Python to automate Hadoop code artifacts (BigSQL, Hive, HBase, Oozie coordinators, Oozie workflows).
- Designed data analysis and visualization using BigSQL, DSM, and Confidential BigSheets.
- Worked with Confidential Bluemix support to resolve platform issues and apply required patches.
Technology Used: Confidential BigInsights 4.1/4.2, Confidential InfoSphere DataStage, Python, shell script, Hive, HBase, BigSQL, Spark, Pig, BigSheets, Confidential DSM, Control-M, Oozie, DB2, OS/360, Linux
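The code generation framework above can be illustrated with a minimal sketch: templating a Hive DDL artifact from a simple table specification. Database, table, and column names here are hypothetical, not taken from the actual framework.

```python
# Minimal sketch of one code-generation step: rendering a Hive
# CREATE TABLE statement from a simple table specification.
# All names (sales_lake, daily_txn, etc.) are illustrative only.
from string import Template

HIVE_DDL = Template(
    "CREATE EXTERNAL TABLE IF NOT EXISTS ${db}.${table} (\n"
    "${columns}\n"
    ") STORED AS ORC LOCATION '${location}';"
)

def render_hive_ddl(db, table, columns, location):
    """columns: list of (name, hive_type) tuples."""
    cols = ",\n".join(f"  {name} {htype}" for name, htype in columns)
    return HIVE_DDL.substitute(db=db, table=table, columns=cols, location=location)

ddl = render_hive_ddl(
    "sales_lake", "daily_txn",
    [("txn_id", "STRING"), ("amount", "DECIMAL(12,2)"), ("txn_ts", "TIMESTAMP")],
    "/data/lake/sales/daily_txn",
)
print(ddl)
```

A real framework would drive many such templates (BigSQL, HBase, Oozie XML) from one shared table spec, keeping the generated artifacts consistent.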
Confidential
Sr. Data Architect
Responsibilities:
- Worked as a Data Architect on various agile sprint projects to architect and design data platform services provisioning a data access layer for the company's existing and new products.
- Architected, designed, and built a consolidated data store from multiple data sources.
- Designed and built the data model for the Alert and Trend application; designed batch jobs to automate the Construction Cost Index (CCI) process, reducing index publication frequency from monthly to daily.
- Designed and built the Material Unit Cost and square-foot model estimation data mart and cube to provision data APIs for the RSMO Alert and Trend web and mobile applications.
- Designed and built a data mart in Hive-managed repositories using the ELT design pattern.
- Developed Pig scripts to build job execution and data flows mapping data from source systems to the Hive warehouse.
- Architected, designed, and coded Python/BeautifulSoup scripts to web-scrape material, equipment, labor rate, and construction cost data from public websites (e.g., HomeDepot, YellowPages, wdol.gov).
- Designed and built Python/Sqoop scripts to exchange data from the Hive warehouse to Elasticsearch servicing the API layer.
- Architected an ELK stack (Elasticsearch, Logstash/Filebeat, Kibana) to aggregate application data and web/application log files for a 360-degree customer insight view.
Technology Used: Apache Pig, Apache Hive, Apache Sqoop, Microsoft SSIS, SQL Server 2014, Python, Elasticsearch, RabbitMQ, Logstash forwarder, Filebeat, Kibana, Power BI, HDInsight, SSDT, pyodbc
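The Hive-to-Elasticsearch hand-off described above can be sketched as follows: turning exported rows into an Elasticsearch `_bulk` request body. Index and field names are illustrative, and the sketch only builds the NDJSON payload rather than calling a live cluster.

```python
# Minimal sketch of the Hive-to-Elasticsearch hand-off: rows exported
# from the warehouse are turned into an Elasticsearch _bulk body
# (alternating action and document lines). Names are hypothetical.
import json

def to_bulk_body(index, id_field, rows):
    """Build an NDJSON _bulk payload: one index action + one document per row."""
    lines = []
    for row in rows:
        lines.append(json.dumps({"index": {"_index": index, "_id": row[id_field]}}))
        lines.append(json.dumps(row))
    return "\n".join(lines) + "\n"  # _bulk bodies must end with a newline

rows = [
    {"sku": "A100", "material": "lumber", "unit_cost": 4.25},
    {"sku": "B200", "material": "rebar", "unit_cost": 1.10},
]
body = to_bulk_body("material_cost", "sku", rows)
print(body)
```

In the real pipeline this body would be POSTed to the cluster's `_bulk` endpoint by an Elasticsearch client.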
Confidential
Sr. Enterprise Data and BigData Architect
Responsibilities:
- Developed and designed the big data architecture and supporting technology stacks.
- Provided hardware architectural guidance, planned and estimated cluster capacity, and created the roadmap for Hadoop cluster deployment.
- Provisioned, installed, configured, monitored, and maintained Hadoop components, mainly HDFS, Pig, Hive, Sqoop, Solr, Hue, and Oozie.
- Performed recovery from node failures and troubleshot common Hadoop cluster issues.
- Patched and upgraded the Cloudera cluster from CDH 4 to CDH 5.
- Provided design and technical assistance on challenging big data integration issues spanning multiple systems beyond Hadoop.
- Supported Hadoop developers and assisted in optimizing MapReduce jobs, Hive scripts, and other data ingest and extract activities.
- Defined and managed continual service improvement for the Hadoop platform and surrounding systems.
- Installed RHadoop packages in the Hadoop cluster to build the environment for the data science group to do prescriptive and descriptive modeling.
- Worked with the implementation vendor team to architect, design, and lead innovation projects: web/Akamai/app server log analysis and an operational dashboard using SolrCloud/Hunk/Tableau.
- Offloaded batch jobs from the mainframe to the Hadoop environment.
- Built the Hive/Impala schema for hourly sales transactions to support EDW reports; performed data wrangling and preparation from various data sources for Tableau extracts; built a system to efficiently archive and retrieve large image files; migrated Oracle data to MongoDB.
- Architected, designed, and implemented DB2 9.7 and Oracle 11g RAC database systems involving SAS, Oracle Applications, and MDM software.
- Built the architecture and technology stack for Enterprise Information data governance.
- Architected, designed, and integrated the enterprise business reporting tool and environment.
- Designed and implemented the integration/framework data hub layer data model.
- Supported and augmented the Oracle Apps/SAS database systems.
- Performed application/infrastructure planning, job design, and performance tuning for data integration using the Confidential InfoSphere suite of tools.
- Designed and implemented product MDM using InfoSphere Collaboration Server 9.x.
- Designed and modeled data integration and replication using Confidential IIS CDC and Oracle GoldenGate.
Technology Used: Cloudera CDH 5.1.3, Hortonworks HDP 2.0, Syncsort, MongoDB, Python, shell script, Java MapReduce, Hive, Impala, Pig, Sqoop, RHadoop, SolrCloud, Tableau, Splunk/Hunk, Oracle 11g RAC, DB2 10.5, Confidential IIS 11.3, Autosys, Oracle BI, AIX, J2EE, Confidential InfoSphere Information Governance Catalog, Confidential IIS CDC, Oracle GoldenGate
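The web/Akamai/app server log analysis feeding the operational dashboard can be sketched with a minimal parser: reading common-log-format lines and rolling up HTTP status counts. The sample lines and field layout are illustrative, not real Akamai logs.

```python
# Minimal sketch of the log-analysis idea behind the operational
# dashboard: parse common-log-format lines and count HTTP statuses.
# Sample records are fabricated for illustration.
import re
from collections import Counter

# host ident authuser [timestamp] "request" status bytes
LOG_RE = re.compile(r'\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+')

def status_counts(lines):
    """Return a Counter of HTTP status codes seen in the log lines."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            counts[m.group("status")] += 1
    return counts

sample = [
    '10.0.0.1 - - [01/Jan/2015:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2015:10:00:01 +0000] "GET /missing HTTP/1.1" 404 128',
    '10.0.0.1 - - [01/Jan/2015:10:00:02 +0000] "POST /cart HTTP/1.1" 200 64',
]
counts = status_counts(sample)
print(counts)
```

At cluster scale this same map/reduce shape (parse per line, aggregate per key) is what a MapReduce job or Hunk search would run over the full log volume.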
Confidential
Integration/Data Architect
Technology Used: ACE, Erwin 4.5, Confidential WebSphere Message Broker 7.0.1, Confidential WebSphere 7.0, JDK 1.6, Confidential IS DataStage, Oracle 10.x, J2EE, JSF, ARTS
Responsibilities:
- Built the POS solution architecture using Confidential ACE and Confidential middleware technologies.
- Designed the Enterprise Integration Layer (IL) to convert the proprietary data format (TLOG) to the retail-industry-standard (PosLog) XML format.
- Built the data model of the Operational Data Store (ODS) using ARTS modeling standards to create a central sales transaction repository.
- Designed the ETL layer to interface with the legacy back-end accounting and cash settlement systems; architected and designed the web application using the JSF framework to store and manage store attribute data.
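The TLOG-to-PosLog conversion above can be sketched as a small mapping step: a parsed proprietary transaction record becomes a PosLog-style XML element. The field and tag names here are simplified illustrations, not the full ARTS PosLog schema.

```python
# Minimal sketch of the TLOG-to-PosLog idea: map a parsed proprietary
# transaction record (hypothetical layout) into a small PosLog-style
# XML element. Tag names are simplified, not the full ARTS schema.
import xml.etree.ElementTree as ET

def tlog_to_poslog(txn):
    """txn: dict with store, sequence, and total fields (illustrative layout)."""
    root = ET.Element("Transaction")
    ET.SubElement(root, "RetailStoreID").text = txn["store"]
    ET.SubElement(root, "SequenceNumber").text = txn["sequence"]
    ET.SubElement(root, "Total").text = txn["total"]
    return ET.tostring(root, encoding="unicode")

xml_out = tlog_to_poslog({"store": "0042", "sequence": "000187", "total": "19.99"})
print(xml_out)
```

The real integration layer would validate the generated documents against the PosLog XSD before publishing them to the ODS.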
Confidential
Information Architect, Sr. Data Modeler, Team lead
Technology Used: Optim, Erwin 4.5, Confidential WebSphere 6.0, JDK 1.6, JIBX, CXF, Hibernate 3.2, Tibco EMS, Qarbon’s ViewletBuilder, ClearQuest, DB2 UDB
Responsibilities:
- Designed and developed the Test Data Management strategy, including processes, approaches, procedures, tools, and environments.
- Built the Test Data Catalogue solution architecture.
- Designed the data model for the Test Data Catalogue.
- Architected, designed, and led the Test Data Request application using Confidential ClearQuest and DB2.
- Developed Test Data Catalogue viewlets for knowledge transfer to the test data managers and leads.
Confidential
Sr. Data Modeler, Information Architect, Team Lead, Data Analyst
Technology Used: Oracle RAC 9i, Erwin 4.5, AbInitio 1.15.6, Confidential WebSphere 6.0, JDK 1.6, JIBX, CXF, Hibernate 3.2, Tibco EMS.
Responsibilities:
- Architected and designed the data provisioning hub from CAS to the end systems using a SOAP Request-Response MEP with Tibco/Hibernate.
- Architected, designed, and implemented a complex data archiving and purging solution.
- Designed the Payment object model and data persistence layer using J2EE/Hibernate/CXF/JIBX.
- Performed architecture, design, administration, and performance tuning of a three-node Oracle 9.x RAC.
- Performed status tracking, reporting, problem resolution, root-cause analysis, and issue escalation and resolution.
Confidential
Information Architect, Team Lead, Data Analyst, Data Modeler
Technology Used: Db2 LUW, Oracle 10g, WebSphere 5.0, WebSphere Message Broker, EJB 2.0, iBATIS, Oracle 360 Commerce, Sterling Commerce Yantra, WebSphere Portal, WPC, POSLog.
Responsibilities:
- Worked as an Information Architect for POS applications, including application packages such as 360Commerce (now Oracle Retail), Sterling Commerce Yantra, and StreamServe, along with J2EE, ESB, Web Services, DB2, SOI, SOA, BPEL, XSD, XML, and XPath. Also responsible for leading a project team in delivering the solution to the customer, and for managing the scope, planning, tracking, and change control aspects of the project.
- Led teams of DBAs and Data Analysts; responsible for Oracle Retail, database and technical architecture, administration, and performance tuning of 24x7 mission-critical applications, and DB2 replication.
- Implemented WPC's MDM integration with the POS.
- Designed and implemented the data archive and purge technical architecture for the store and central databases.
- Produced the store and central databases' physical and logical design using the ARTS data framework, along with capacity and scalability planning and backup and recovery.
- Implemented operational data reporting from different customer and sales data sources using an ODS layer.
- Reviewed and provided recommendations for the enterprise technical architecture strategy.
- Planned the data and software migrations, including phased rollout and cutover.
- Translated customer requirements into formal requirements and design documents, established specific solutions, and led the efforts, including programming and testing, that culminated in client acceptance of the results.
Confidential
Data Analyst, Data Modeler, Data Architect, Data Steward, Data Warehouse Expert
Technology Used: Db2 LUW, AbInitio, Cognos, Erwin, Confidential BDW (Banking Data Warehouse).
Responsibilities:
- Analyzed and data-modeled the EDW (Enterprise Data Warehouse) and Data Mart for the Basel II Accord.
- Analyzed data from 30+ source systems received from clients to ensure accuracy and completeness; profiled source system data using AbInitio Data Profiler and SAS DataFlux; resolved data discrepancies and performed data cleansing as necessary; defined data quality rules and participated in Data Stewardship and Data Governance.
- Built the TPR central database from 30-plus disparate source systems across lines of business, involving more than 15,000 tables, 65,000 data elements, and terabytes of data.
- Designed the enterprise schema for the leasing application and built logical and physical data models containing 1,500+ tables.
- Designed and developed a data mapping application for 30+ disparate source systems (COBOL, MS SQL Server, Oracle, and mainframe DB2) using MS Access and UNIX Korn shell scripts.
- Produced source-to-target mapping documents for 65,000 data elements based on data discovery results; designed the data masking application for the 30+ source systems.
- Tuned Data Mart reports using optimized queries and Materialized Query Tables.
- Designed the ETL logic for the EDW and the Credit Risk Data Mart.
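The data masking approach across the many source extracts can be sketched as deterministic masking: hashing each identifier with a salt so the same source value always yields the same pseudonym in every masked extract. The salt, field names, and pseudonym length here are illustrative choices, not details of the actual application.

```python
# Minimal sketch of deterministic masking: the same source identifier
# always maps to the same pseudonym, so masked extracts from different
# source systems still join correctly. Salt and names are illustrative.
import hashlib

SALT = "demo-salt"  # in practice a managed secret, never a literal in code

def mask_id(value, length=8):
    """Map an identifier to a stable, fixed-length pseudonym."""
    digest = hashlib.sha256((SALT + value).encode()).hexdigest()
    return digest[:length].upper()

a = mask_id("CUST-000123")
b = mask_id("CUST-000123")  # same input -> same pseudonym
c = mask_id("CUST-000124")  # different input -> different pseudonym
print(a, b, c)
```

Consistency across extracts is the key property: referential integrity between masked tables survives because every system's copy of an identifier masks to the same value.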
Confidential
Data Architect
Technology Used: Oracle 10g RAC, Linux.
Responsibilities:
- Performed a TPC-C Oracle 10g RAC performance benchmark for NAS and SAN storage on Red Hat Enterprise Linux.
- Proposed database architecture design based on benchmark.
Confidential
Data Service Architect, Database Designer, Database and Application Server Tuning Specialist
Technology Used: Oracle 9i RAC, Erwin, J2EE, Spring, Hibernate.
Responsibilities:
- As database tuning specialist/administrator and database architect, worked on the development, testing, integration, and production databases for the client project.
- As performance specialist, performed the application baseline and reviewed the client's Oracle 9.x database architecture against the Oracle-Sun-EMC best-practice methodology.
- As performance specialist, helped design, install, and instrument Veritas i3 (7.2) and TeamQuest (9.2).
- Optimized the database using tuned SQL queries and materialized views.
Confidential
Database Architect, Database administrator, Data Modeler
Technology Used: Db2 for Z (OS390), Oracle 9i, Erwin, InfoSphere DataStage, MicroStrategy.
Responsibilities:
- Designed and maintained the ODS, CDW, and Data Mart logical and physical data models to support NHTSA's critical reporting needs.
- Implemented detailed and efficient ODS and Data Mart structures.
- Administered Oracle 9.x (Sun Solaris) and DB2 7.x (OS/390) databases across the development, testing, integration, and production environments for the TREAD project.
- Performed database storage capacity planning, design, and monitoring.
Confidential
Database designer, Database administrator, Programmer, Data Analyst
Responsibilities:
- Designed the database architecture for the development, testing, integration, and production databases for nine client projects.
- Led a team planning and installing Oracle database and client software across geographies.
- Led an effort to implement application and database security using database programs, roles, profiles, privileges, and views.
Confidential
Sr. Oracle Database Administrator, Data Modeler
Responsibilities:
- Managed a five-member DBA team to administer, monitor, and support about 85 Oracle database instances in a large data center supporting Internet development, PeopleSoft HR, Optum warehouse management, data warehouse, Rational Pro, and custom web database systems.
Thermax Software and Systems, Arthur Andersen LLP (96 - Aug 99)
Oracle Database Administrator, SAP Basis, ABAP/4 Programmer
Responsibilities:
- Worked as an SAP DBA and BASIS programmer on SAP R/3 Material Management (MM) and Production Planning (PP).
- Administered Oracle databases running on Sun machines supporting Arthur Andersen Worldwide Risk Management and Treasury applications. Tools used: Oracle Enterprise Manager, Oracle Server Manager, Erwin 3.5.2, SAP Development Workbench (ABAP/4), PowerBuilder, Oracle Browser, SQL*Plus 3.3, SQL*Loader, and Bourne shell scripts.
- Performed Oracle Server installation, database planning and creation, middleware configuration on server and client machines, server cache optimization, space management and usage monitoring, security administration, database backup and recovery, and database performance monitoring and tuning.
- Created logical models, physical models, domain dictionaries, triggers, indexes, views, procedures/functions, and pre/post scripts using the Erwin data modeling tool.