Big Data Architect Resume

Reston, VA

SUMMARY

  • Expertise in developing OLTP, OLAP, and Data Warehouse systems (using 3NF, Multidimensional Star and Snowflake schema designs), NoSQL databases, MDM and the DAMA-DMBOK framework for Data Governance, Data Mining, and statistical and historical/predictive Data Analytics Solutions.
  • Strong Data modeling (Logical and Physical) experience and high proficiency in using Erwin, ER/Studio, and Entity-Relationship Modeling with in-depth understanding of business applications, dataflow and the use of stored procedures/triggers in ETL development tasks.
  • Strong understanding of data structure/model classifications, data wrangling and data cleansing.
  • Extensive hands-on experience in APIs, Docker, Cloud and Virtualization technologies.
  • Have hands-on experience in implementing Big Data solutions using Hadoop/Storm and Spark, Azure HDInsight, IBM BigInsights, Cloudera-CDH, AWS, Google Cloud; loading data using Flume and Sqoop; and analyzing big data using Python, Pig, the Impala/Hive SQL editor in Hue, and the Ambari tool.
  • Have hands-on experience in SaaS, PaaS, IaaS, DaaS, and SDLC (Agile/Scrum framework - sprint planning, daily Scrum meeting, product/sprint backlog, sprint review, and sprint retrospective).
  • Significant training and experience in project management, unit/functional and system testing, user stories/requirements management, and team and task management using desktop/online Visual Studio TFS (Team Foundation Server)/Team Services, SharePoint, and IBM Rational DOORS.
  • Have demonstrated strong troubleshooting, problem-solving, leadership and teamwork skills, as well as the ability to accomplish tasks with minimal direction and supervision.
  • Have strong communication and presentation skills, with experience in translating complex business requirements into detailed functional and/or technical specifications, developing alternative data solutions/strategies to complex problems and recommending the best solution/strategy to the business process and project team.

TECHNICAL SKILLS

Operating Systems: Windows 7 EE, Windows Server 2008/2012, and UNIX/LINUX

Databases: MySQL, Oracle 11g/12c, SQL Server 2008-2016, DB2, Postgres, Teradata, DynamoDB, Cassandra, HBase, MongoDB, DocumentDB, Redshift, Impala.

Cloud Platform:  VMware, AWS, Salesforce, IBM, Cloudera, Microsoft Azure, Google Cloud.

Networking: TCP/IP, DNS, LAN, WAN, APIs, and RESTful Web Services.

Programming Languages: SQL, Hive, Pig, R, Python, JSON, Avro, XML, APIs, MapReduce, HDFS Shell, YARN, Kafka, Zookeeper, Oozie, Ambari tool.

Data Modeling: Design OLTP & OLAP using ERwin, ER Studio, Microsoft Visio.

ETL Tools: SSIS, SQL Loader, Informatica PowerCenter, Pentaho, Sqoop, Flume.

Machine Learning: WEKA, R, Python/Anaconda, SparkML, AML Studio, SSAS Data Mining.

BI Tools/Data Analysis:  Tableau, Excel, Power BI, SSIS, SSAS & SSRS, SAS, B.O, Google Analytics.

CRM & ERP Applications:  Intuit Quickbase, AWS, Salesforce Sales Cloud, OBIEE.

Requirements/Process Mgt:  IBM Rational DOORS, Six Sigma, Team Foundation Server online/desktop.

PROFESSIONAL EXPERIENCE

Confidential, Reston, VA

Big Data Architect

Responsibilities

  • Leverage Microsoft Azure (cloud platform) and Agile/Scrum methodology to build a data storage and analytics solution based on data collection and migration/integration, data streaming, real-time big data analytics, and data flow from a broad range of devices/web applications; analyze and integrate data streams in motion with back-office systems; use a cloud-based Hadoop/Spark solution (architecture includes Apache Accumulo and Zookeeper) for batch processing; and develop machine-learning solutions by integrating code from R and Python into the cloud-based Azure Machine Learning (AML) Studio.
  • Develop data architectures and data models/schemas using Erwin for complex, highly visible business initiatives/enterprise projects and also provide consulting support in cloud data management to project teams to support Mobile solutions and Clinical Workflow Applications.
  • Manage JSON/XML/CSV data flow/transfer in a cloud-based SQLDB and NoSQL Cassandra, and across MongoDB, HDFS, Oracle 11g, and SQL Server 2012 to perform real-time data analytics using Power BI, stream analytics, and Azure IoT microservice-related solutions.
  • Use AML service and Cortana Analytics to create and deploy cloud-based predictive analytics models that learn from existing/current data in order to forecast future outcomes and trends.
  • Provide guidance and support in developing KPIs, data quality metrics, and metadata capture, including evaluating alternative strategies for transactional and analytical systems to effectively share data between legacy data sources and cloud-based enterprise relational/non-relational systems.

Confidential, Washington DC

Sr. Data Architect/BI Developer

Responsibilities

  • Implemented Amazon AWS Data Lake (Amazon EC2, Amazon Redshift, and S3) that is leveraged in performing data migration and integration, writing complex SQL queries, analytical and aggregate functions to understand the current state of Confidential data assets and how those data assets can be leveraged to provide enhanced near real time decision making.
  • Performed data extraction and migration, data cleaning, analysis, and visualization using Informatica, Tableau Desktop 9.3 to support Redshift Data warehousing solution on AWS.
  • Implemented big data solution using CDH 5.0 and performed data streaming into HDFS from AWS and Arrow web servers using Sqoop/Flume and R.
  • Used Flume and Sqoop as the main Hadoop ETL tools for both batch and streaming data processing to extract, transform and load data coming from many heterogeneous systems into the integrated SQL Server 2012 operational data store and Teradata EDW solution.
  • Prototyped BI solutions using Tableau Desktop 9.3 and led the BI product design and the development of data models, testing, and integration of Sales ODS.
  • Developed reports using Tableau and materialized/updatable views for real-time data analytics.
  • Played a leading role in modeling and constructing the data architecture for operational data stores, data marts and the enterprise data warehouse, with data sources such as DB2, SQL Server 2012 and Oracle 11g.
  • Performed data mapping, data profiling, data cleansing, data standardization and consolidation using Informatica PowerCenter, including intensive data migration and the creation, execution, and documentation of test cases.
  • Used Informatica Designer, Workflow Manager and Repository Manager to create source and target definitions, design mappings, create repositories and establish users, groups and their privileges.
  • Extracted data from the databases (Oracle 11g, SQL Server 2012, and DB2) using Informatica to load it into a single Teradata warehouse repository and developed many stored procedures to manage different aspects of Confidential's database and data warehouse systems.

Confidential, Silver Spring, MD

Data Architect & Tableau Developer

Responsibilities

  • Served as Lead Architect to evaluate and restructure/upgrade the information systems applications, using AWS cloud (EC2, Redshift, S3 and Amazon RDS) for data processing, migration, integration, and storage, including implementing BI tools in claims administration and data analytics and providing recommendations to enhance data quality and the cloud DW solution.
  • Implemented data marts and data warehouse systems with de-normalized database schema objects using the star schema approach and performed data extraction from operational data stores and data marts on DB2, MySQL and SQL Server 2008/2012.
  • Made intensive use of ER/Studio and ERwin Data Modeler to create conceptual, logical, and physical data models and database schemas for both data Server 2012, and transformed/loaded the data into Amazon Redshift.
  • Responsible for analyzing large and complex data sets to uncover anomalies, discover key patterns/trends, and develop OLAP cubes and BI solutions using SSIS/SSAS and Tableau.
  • Worked with Tableau Desktop 9.x/Server to develop dashboard and scorecard reports on KPIs using rich graphic visualizations with drill-down and drop-down menu options and various reporting objects like Facts, Attributes, Hierarchies, Transformations, Filters, Prompts, Sets, Groups, and Parameters.
  • Worked on building queries to retrieve data into Tableau from SQL Server 2012, Oracle 11g, Teradata, MySQL and other external sources.
  • Implemented big data solution using CDH 4.7 and CDH 5.x to process semi-structured data.
  • Designed, reviewed, implemented and optimized data transformation processes in the Hadoop ecosystems including developing HiveQL scripts to parse the raw data, populate staging tables and store the refined data in partitioned tables in Teradata EDW.
  • Implemented MDM solutions using MS MDM service and IBM Initiate (MDM tool).
  • Created pivot tables, dimensions, cubes, and dashboards using Tableau, SSAS, and MS Add-ins for Excel (Power Pivot, Power Query, and Power View), representing aggregates in different data visualizations to highlight KPIs.

Confidential

Financial/Data Analyst

Responsibilities

  • Leveraged Global Banking application to analyze the risk factors involved in a loan and the possibility of default, helping the credit administrator make a reasonable decision on approving the loan and determining the interest rate, or denying the loan.
  • Collected all loan applicants’ information, analyzed their credit history, and used statistics to assess the risks associated with lending.
  • Reviewed and synthesized large amounts of financial data from Core Banking and Financial Banking System Applications to produce reports for management and decision-makers.
  • Performed data extraction and manipulation using Microsoft SQL Server Management Studio 2005/2008 and Microsoft Excel to create and manipulate spreadsheets with intensive use of financial and statistical functions, VLOOKUP, pivot tables, charts and graphs.
  • Performed a lead role in the re-engineering of SQL Server 2008 OLTP/OLAP operations into Microsoft Access with use of linked tables, complex queries, and macros in implementing database tasks automation; importing and exporting data, and using various tools to manipulate and present data in a comprehensible format.
  • Used MS Expression Builder and functions such as string functions, mathematical functions, date functions, logical functions, aggregate and group by functions to extract and manipulate data.
  • Created new MS Access databases, made major modifications to existing databases and their user interface forms, and corrected data integrity and constraint issues.
