
Hadoop Big-data Architect Resume


Melville, NY

SUMMARY

  • 14+ years of strong experience leading and architecting solutions in Big Data, Data Warehouse, BI, and Analytics.
  • Strong knowledge and experience in Data Architecture, Data Modeling, Metadata, Data Migration, Data Mining, Data Science, Data Profiling, MDM, Data Governance, Data Cleansing, Transformation, Integration, Data Import, and Data Export.
  • Extensive experience in architecting, loading, and analyzing large datasets with the Hadoop framework (MapReduce, HDFS, Pig, Hive, Flume, Sqoop, Spark, Scala) and NoSQL databases.
  • Strong management experience. Critical and assertive analyst, able to evaluate complex business problems, define solutions, establish project scope and requirements, and execute and maintain projects on time and on budget.
  • Solid record of achievements as Data Architect and Data Modeler, including very large databases, Data Warehouses, and Data Marts. Defined conceptual, logical, and physical data designs, including data warehouse features like partitions, Materialized Views, and bitmap indexes.
  • Experience in Hadoop analytics and relational-like databases such as Hive, Phoenix, Impala, and Aster.
  • Strong ETL background, working with different ETL tools like SQL Server Integration Services, Informatica, and IBM Information Server, and with in-house job controls built using IBM tools, C, and shell scripting.
  • Regular hands-on work with modeling, BI, and analytics tools like IBM Cognos, SAS, Erwin, Tableau, Power BI, and QlikView, and with programming languages such as Python, R, Scala, Java, C, C++, VB6, HTML, JavaScript, C#, and shell script.
  • Hands-on experience in Normalization (1NF, 2NF, 3NF, and BCNF) and Denormalization techniques for effective and optimum performance in OLTP and OLAP environments.
  • Experience in distributed computing architectures and NoSQL databases such as Cassandra, MongoDB, and HBase to solve big-data problems, using the Cloudera Hadoop Distribution (CDH).
  • Experienced in relational databases such as Informix, Oracle, DB2, SQL Server, and Teradata, creating models for Data Warehouses and their complete integration with Big Data, the ODS, and the Data Lake.
  • Developing and designing POCs using web services, ETL, storage, virtualization, application servers, data security, databases, DRP, and BI and Analytics tools.
  • Expertise in integrating data in different formats like spreadsheets, text files, JSON, XML files, sequential, VSAM, logs, structured, semi-structured, and unstructured, from different sources such as applications, DMs, core systems, EUC, external feeds, media, and RDBMS.
  • Expertise in Data Warehousing, Data Marts, Operational Data Store (ODS), and Dimensional Data Modeling (Star Schema and Snowflake Modeling for fact and dimension tables), including designing, developing, and implementing data models for enterprise-level applications and systems (a minimal star-schema sketch follows this summary).
  • Strong technical knowledge of Oracle suite products like OBIEE, e-Business Suite, and Siebel.
  • Good experience working with Oracle 9i, 10g and 11g versions using PL/SQL; creating Tables, Views, Materialized Views, Stored Procedures, Functions and Packages using tools like Toad for Oracle 12.0.
  • Extensive experience in data architecture, business architecture, application architecture, and technology architecture per the TOGAF Framework for Enterprise Architecture.
  • Deep knowledge of Data Management, Data Governance, and Database Administration (DBA), with extensive practice designing and developing with RDBMS databases like Oracle, SQL Server, DB2, PostgreSQL, and MySQL.
  • Extensive experience managing, designing, and developing BI projects, including BI technical architecture solutions, ETL mappings and transformations, OLAP processing, and visualization strategy using dashboards, views, graphs, etc.
  • Solid multinational project experience in diverse industries such as Banking, Retail, Health, and Insurance, using formal development methodologies.
  • Effective communicator, mediator, team leader, planner, and organizer.
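
To illustrate the dimensional-modeling summary point above, the following is a minimal, hypothetical star-schema sketch written as Spark SQL DDL with Hive support (one of the Python/Spark stacks listed under Technical Skills). All table and column names are illustrative placeholders, not objects from any engagement described in this resume.

    # Minimal star-schema sketch: one fact table with foreign keys into two
    # dimension tables, created through Spark SQL with Hive support.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("star-schema-sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Dimension: one row per customer, denormalized for OLAP-style reads.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS dim_customer (
            customer_key  BIGINT,
            customer_name STRING,
            segment       STRING,
            country       STRING
        ) STORED AS PARQUET
    """)

    # Dimension: calendar attributes keyed by a surrogate date key.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS dim_date (
            date_key       INT,
            calendar_date  DATE,
            fiscal_quarter STRING
        ) STORED AS PARQUET
    """)

    # Fact: one row per sale, partitioned by date key, a typical data
    # warehouse performance feature (comparable to the partitions noted above).
    spark.sql("""
        CREATE TABLE IF NOT EXISTS fact_sales (
            customer_key BIGINT,
            amount       DECIMAL(18, 2),
            quantity     INT
        )
        PARTITIONED BY (date_key INT)
        STORED AS PARQUET
    """)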

TECHNICAL SKILLS

Databases: Hive, HBase, HDFS, Oracle 10/11/12, Postgres, MS SQL Server, MySQL, DB2, NoSQL

Operating Systems: UNIX (AIX, Solaris, HP-Unix, Linux), MS Windows

Programming: Java, Scala, PL/SQL, T-SQL, HQL, Python, Eclipse, Bash, Perl, C-Shell, K-Shell, Pig

Designing Tools: Oracle Designer, Embarcadero, ERwin, TOAD, JDeveloper, UML, RUP.

BI Tools: Tableau, QlikView, OBIEE, Siebel, BI Publisher, Essbase, Hyperion, MicroStrategy.

ETL: Talend, Informatica, Oracle Data Integrator ODI, SSIS, Oracle Warehouse Builder (OWB).

ERP-CRM: Financials, Oracle e-Business Suite 11i, Siebel, PeopleSoft.

Web Environments: WebLogic, J2EE, JSP, HTML, XML, XSLT, DHTML, JavaScript, ADO, jQuery, CSS

Methodologies/Tools: TOGAF, SDLC (Waterfall, Prototyping, Agile), JIRA, Confluence, Stash, Jenkins

PROFESSIONAL EXPERIENCE

Confidential, Melville, NY

Hadoop Big-Data Architect

Responsibilities:

  • Support projects for the Enterprise Data Provisioning Platform to capture or provide data in the Enterprise Data Lake using a DaaS model, and identify the appropriate solution.
  • Receive business requirements and create documentation that includes technical specs and designs for ingestion and extraction data projects. Validate metadata completeness and ensure Data Governance and Architectural Principles are followed as part of the solution.
  • Identify timelines and onshore and/or offshore resources, coordinate resources such as business analysts, developers, testing analysts, and Production Support, and establish a bridge between technical resources and management.
  • Ingestion and extraction of multiple formats, structured, semi-structured, and unstructured, such as audio, XML, COBOL copybooks, delimited files, JSON, Excel, and HTML.
  • Perform data analysis using Hive, HBase, and HDFS.
  • Query and analyze metadata using database repositories in Postgres.
  • Develop Hadoop applications using tools like Podium, Bash, Python, Java, Pig, Hive, Sqoop, Oozie, and Flume.
  • Participating in Scrum meetings as tech lead.
  • Creating documentation such as technical design documents and use cases for the changes applied, DB scripts for DB changes, and testing and environment-setup specifications.
  • Analysis and review of assigned tasks and final development goals during team meetings (daily and weekly Scrum meetings).
  • Running and deploying DEV, SIT, PAT, and PROD builds using Git and Nexus through Jenkins.
  • Provide Big Data solutions to TD LOBs in the Hadoop ecosystem. Developed projects that ingest data into the Data Lake, transform the data using ETL tools like Talend to populate Data Marts for financial data models, and extract data to consumers as files or reports using BI tools. Projects included Advanced Analytics, Enterprise Customer Risk Rating (ECRR), MDM Gold extraction, and the Dodd-Frank US Regulatory System.
  • The ECRR project is a program that consolidates customer credit information from all systems in the bank and rates customers based on statistical scoring calculations to identify high-risk customers. The customer rating is used by the Anti-Money Laundering and High-Risk Customer groups to classify customers based on risk. The project took customer information from all systems in the bank and ingested it into the Data Lake; from the Data Lake, the information was extracted, transformed, and loaded into a relational data model in an Oracle database using Java code. The data was then prepared and pushed into a SAS database that uses analytical processing to calculate customer scores and feed them back to the Oracle database.
  • Capture and extraction of files to support the Dodd-Frank Act project. Ingested data from all available applications; the data is consolidated into the Data Lake and processed to rank customers based on their risks.
  • Architected and implemented a solution to populate the Data Lake with the information from one application.
  • The ES&O Web Application Data Lake ingestion project required loading database information. Designed the solution using Podium, an ETL tool that facilitates metadata capture, and implemented it with Sqoop imports (see the sketch after this list).
  • Enhanced security using Citrix to restrict application and user access to the Cloudera cluster.
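
The following is a minimal, hypothetical sketch of the ingestion pattern mentioned in the Podium/Sqoop bullet above: pull a relational table into HDFS with a Sqoop import, then sanity-check the landed data from Hive via Spark SQL. It assumes a Sqoop client on the PATH, a reachable source database, and a Hive external table already defined over the landing directory; the connection string, credentials file, table names, and paths are placeholders, not values from the actual project.

    import subprocess

    from pyspark.sql import SparkSession

    # Pull a source table into HDFS with a standard Sqoop import; credentials
    # come from a password file kept in HDFS rather than the command line.
    subprocess.run(
        [
            "sqoop", "import",
            "--connect", "jdbc:postgresql://source-db:5432/appdb",
            "--username", "etl_user",
            "--password-file", "/user/etl/.db_password",
            "--table", "web_application_events",
            "--target-dir", "/data/lake/raw/web_application_events",
            "--as-avrodatafile",
        ],
        check=True,
    )

    # Sanity-check the landed data through Hive (assumes an external table
    # raw.web_application_events is defined over the target directory).
    spark = (SparkSession.builder
             .appName("ingestion-check")
             .enableHiveSupport()
             .getOrCreate())

    spark.sql("""
        SELECT load_dt, COUNT(*) AS row_cnt
        FROM raw.web_application_events
        GROUP BY load_dt
        ORDER BY load_dt
    """).show()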

Environment: Cloudera Hadoop Distribution ecosystem, HDFS, MapReduce, Spark, Python, Scala, Pig, Hive, Impala, Java, Podium, Podium Publish, Sqoop, Oozie, Hue, Postgres, Bash, Kafka. Agile development tools using the Atlassian stack (Confluence, JIRA, Stash/Bitbucket, and Jenkins), with Nexus as repository manager.

Confidential, New York, NY

Business Solution Architect

Responsibilities:

  • As a business consultant, performed feasibility studies and analyzed and documented business and technical requirements/designs for new systems or enhancements to existing ones.
  • As a technical consultant, performed the project manager role for the implementation of information systems projects: defined plans, resources, and timelines, and followed up on their execution.
  • Controlled technical resources, and created and tracked project plans for technology projects.
  • Created blueprints for technological solutions to define the current and future state.
  • Evaluated and validated the architecture and security of a cloud solution for a global implementation of Strategic Sourcing, Procurement, Contract Management, and Supplier Management. Implemented the chosen solution, Ivalua, using agile methodology. Led the integration of complex interfaces on the multinational solution, including Security, Single Sign-On, Accounts Payable, and HR. Met with different divisions to define requirements, design solutions, define timelines, identify resources, and follow up on tasks. The project generated efficiencies by standardizing processes.
  • Advised on implementing a global PeopleSoft ERP and OBIEE solution, standardizing procedures and creating a center of excellence. The solution saved millions in infrastructure and IT maintenance costs. Built and controlled the infrastructure project plan to implement the first subsidiary, identified user needs, conducted status meetings, and ensured compliance.
  • Defined the processes, procedures, and Data Governance compliance for the implementation of the Procurement and Sourcing project.

Environment: AIX, Oracle RAC 12c, Java, Python, Erwin, PeopleSoft, SharePoint, OBIEE, Informatica, MS Project, Visio, PowerPoint, Excel.

Confidential

Application Architect, Oracle Siebel BI and DW

Responsibilities:

  • Designed and developed the ETL process to load CLM data into the DWH and into the OBIEE 11g repository to allow creation of OBIEE analyses.
  • Created specification designs based on business requirements, coordinated offshore development, and designed and developed OBIEE reports to match business reporting and analytical requirements.
  • Designed a BI solution for the Tender Management solution that used a Siebel database instance.
  • Worked on the DWH and DQM using Informatica 9 to extract data from the Siebel database and populate the EDW for analytics and reporting; built Informatica ETL mappings to extract, transform, and load data using different integration techniques (an illustrative sketch of this pattern follows the list).
  • Designed and created mappings and workflows using PowerCenter Designer, Workflow Manager, and Repository Manager for the Tender Management project.
  • Loaded data into the DWH from multiple Siebel CRM database sources and the MDM database, using a DQM solution for consolidation and cleansing purposes.
  • The BI solution used multiple analytical reports and dashboards, allowing the sales force to use BI Publisher from PDAs. OBIEE 11g and Informatica 9 ran on AIX servers.
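
The extraction itself was built with Informatica 9 mappings; purely as an illustration of the extract-transform-load pattern described above, the following Python sketch uses the oracledb driver with hypothetical connection details, tables, and columns (it is not the Informatica implementation).

    import datetime

    import oracledb

    # Placeholder connections: a read-only Siebel source and the EDW target.
    src = oracledb.connect(user="siebel_ro", password="***", dsn="siebel-db/SBL")
    tgt = oracledb.connect(user="edw_load", password="***", dsn="edw-db/EDW")

    last_load = datetime.datetime(2016, 1, 1)  # illustrative load window

    with src.cursor() as read_cur, tgt.cursor() as write_cur:
        # Extract: pull tender records changed since the last load window.
        read_cur.execute(
            "SELECT tender_id, account_id, status, amount "
            "FROM s_tender WHERE last_upd >= :since",
            since=last_load,
        )

        # Transform: a simple cleansing rule standardizing the status code.
        rows = [
            (tender_id, account_id, (status or "UNKNOWN").strip().upper(), amount)
            for tender_id, account_id, status, amount in read_cur
        ]

        # Load: bulk insert into a warehouse staging table.
        write_cur.executemany(
            "INSERT INTO stg_tender (tender_id, account_id, status, amount) "
            "VALUES (:1, :2, :3, :4)",
            rows,
        )
        tgt.commit()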

Environment: AIX, Informatica 9, Oracle RDBMS 12c, Siebel, OBIEE, IIS as HTTP server, QlikView.

Confidential

BI/DW Data Architect

Responsibilities:

  • Designed, developed, and implemented enterprise business intelligence solutions using Software Development Lifecycle.
  • Hands-on delivery with engagement in propositions, pre-sales tenders, client management, team leadership and training.
  • As BI Data Architect, increased customer satisfaction and expanded professional services by training more than 12 employees from the Fidelity Call Centre to use BI Analytics.
  • Generated cost reductions through the consolidation of hardware and software in the migration of the CIBC New York sub-ledger to the corporate General Ledger, saving thousands of dollars and implementation time by architecting and implementing a write-back application in OBIEE.
  • As Data Architect, mapped source data and the target database and defined Data Warehouse changes based on CIBC Mutual Funds Sales Reporting requirements, mapping source systems like CORFAX and UNITRAX to the current state of the enterprise data warehouse.
  • Defined and coordinated integration using OBIEE between multiple Hyperion Essbase cubes, the PeopleSoft ERP system, and Oracle databases, facilitating drill-through reporting from the GL into sub-ledger details.
  • Improved corporate financial analysis and statutory and regulatory reporting by creating dashboards, reports, and KPIs using OBIEE. Developed multiple stored procedures using PL/SQL (see the sketch after this list).
  • Coordinated finance report developers, keeping development on track and coordinating assignments and progress.
  • As Solution Architect for CIBC Business Banking, based on requirements, interviews, and analysis of the current state, defined an analytical solution including the strategic plan and implementation roadmap to implement customer profitability using data warehouse and BI components.
  • CIBC Mutual Funds Sales Reporting - Data Architect
  • Created a prototype analytical solution for the Mutual Fund Sales Reporting project, defining the Data Mart structure with Erwin, implementing it in Oracle, and using OBIEE as the analytical tool.
  • Identified Mutual Fund Sales Reporting business requirements and validated them against source systems and the existing data warehouse, defining a current-state mapping from manufacturer mutual fund source systems like CORFAX and UNITRAX into the Enterprise Data Warehouse.
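
As an illustration of the PL/SQL stored-procedure work mentioned above, the following hypothetical example creates and invokes a small reporting procedure through python-oracledb; the procedure, table, and column names are placeholders, not the actual CIBC objects.

    import datetime

    import oracledb

    conn = oracledb.connect(user="finrpt", password="***", dsn="dwh-db/DWH")

    with conn.cursor() as cur:
        # Create (or replace) a procedure that snapshots a daily sales KPI.
        cur.execute("""
            CREATE OR REPLACE PROCEDURE load_daily_sales_kpi (p_as_of IN DATE) AS
            BEGIN
                DELETE FROM kpi_daily_sales WHERE as_of_date = p_as_of;
                INSERT INTO kpi_daily_sales (as_of_date, total_amount)
                SELECT p_as_of, NVL(SUM(amount), 0)
                FROM fact_fund_sales
                WHERE trade_date = p_as_of;
                COMMIT;
            END;
        """)

        # Invoke the procedure for a given business date.
        cur.callproc("load_daily_sales_kpi", [datetime.date(2015, 3, 31)])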
