Data Architect Resume
Richmond, VA
SUMMARY:
- Over 7 years of experience in Machine Learning and Data Mining with large datasets of structured and unstructured data, Data Acquisition, Data Validation, Predictive Modeling, and Data Visualization.
- Good experience in NLP with Apache Hadoop and Python.
- Extensive experience in Text Analytics, developing different Statistical Machine Learning, Data Mining solutions to various business problems and generating data visualizations using R, Python and Tableau.
- Hands-on experience with Spark MLlib utilities, including classification, regression, clustering, collaborative filtering, and dimensionality reduction.
- Designed the physical data architecture of new system engines.
- Hands-on experience implementing LDA and Naive Bayes; skilled in Random Forests, Decision Trees, Linear and Logistic Regression, SVM, Clustering, Neural Networks, and Principal Component Analysis, with good knowledge of Recommender Systems.
- Developing Logical Data Architecture with adherence to Enterprise Architecture.
- Expertise in transforming business requirements into analytical models, designing algorithms, building models, and developing data mining and reporting solutions that scale across massive volumes of structured and unstructured data.
- Experience working with data modeling tools such as Erwin, PowerDesigner, and ER/Studio.
- Skilled in using dplyr in R and pandas in Python for exploratory data analysis.
- Experience designing compelling visualizations using Tableau, and publishing and presenting dashboards and storylines on web and desktop platforms.
- Excellent verbal and written communication skills for working with clients and teams and for preparing and delivering effective presentations.
- Familiarity with agile principles (e.g. Scrum), facilitating workshops and prototyping.
- Adept in statistical programming languages such as R and Python, as well as Big Data technologies such as Hadoop and Hive.
- Strong experience in Software Development Life Cycle (SDLC) including Requirements Analysis, Design Specification and Testing as per Cycle in both Waterfall and Agile methodologies.
- Implemented Optimization techniques for better performance on the ETL side.
- Highly skilled in using visualization tools such as Tableau for creating dashboards.
- Experience working on Data quality tools Informatica IDQ (9.1), Informatica MDM (9.1).
- Experience in foundational machine learning models and concepts: regression, random forest, boosting, GBM, NNs, HMMs, CRFs, MRFs, deep learning.
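The exploratory data analysis mentioned above (dplyr/pandas) can be sketched briefly; this is a minimal pandas illustration with hypothetical column names, not taken from any actual project:

```python
import pandas as pd

# Hypothetical sample data standing in for a real source-system extract
df = pd.DataFrame({
    "region": ["East", "West", "East", "South", None],
    "revenue": [120.0, 95.5, None, 210.0, 80.0],
})

# Univariate summary statistics for a numeric column
summary = df["revenue"].describe()

# Missing-value counts per column, a first step before cleaning/imputation
missing = df.isna().sum()

# Bivariate view: mean revenue aggregated by region
by_region = df.groupby("region", dropna=True)["revenue"].mean()
```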
TECHNICAL SKILLS:
Databases: SQL Server 2017/2016/2014, MS Access, Oracle 11g/10g/9i, Sybase 15.02, and DB2 2016.
DWH / BI Tools: Microsoft Power BI, SSIS, SSRS, SSAS, Business Intelligence Development Studio (BIDS), Visual Studio, SAP Business Objects, SAP SE v 14.1(Crystal Reports) and Informatica 6.1.
Languages: HTML, DHTML, PL/SQL, SQL, T-SQL, C, C++, XML, HTTP, Matlab, Python.
Tools and Utilities: SQL Server 2016/2017, SQL Server Enterprise Manager, SQL Server Profiler, Import & Export Wizard, Visual Studio v14, .Net, Microsoft Management Console, Visual Source Safe 6.0, DTS, Crystal Reports, Power Pivot, ProClarity, Microsoft Office 2007/10/13, Excel Power Pivot, Excel Data Explorer, Tableau 8/10, JIRA.
Big Data Tools: Hadoop 2.7.2, Hive, Spark 2.1.1, Pig, HBase, Sqoop, Flume.
Data Modeling Tools: Erwin r9.6/9.5, ER/Studio 9.7, Star-Schema Modeling, Snowflake-Schema Modeling, FACT and dimension tables, Pivot Tables.
Database Design Tools and Data Modeling: Star Schema/Snowflake Schema modeling, Fact & Dimensions tables, physical & logical data modeling, Normalization and De-normalization techniques, Kimball & Inmon Methodologies.
Operating Systems: Microsoft Windows 8/7/XP, Linux and UNIX.
PROFESSIONAL EXPERIENCE:
Confidential, Milpitas, CA
Data Scientist
Responsibilities:
- Assisted the project with Python programming and coding, and periodically ran QA on the same.
- As an architect, designed conceptual, logical, and physical models using Erwin and built data marts using hybrid Inmon and Kimball DW methodologies.
- Implemented end-to-end systems for Data Analytics, Data Automation and integrated with custom visualization tools using R, Mahout, Hadoop and MongoDB.
- Worked with several R packages including knitr, dplyr, SparkR, CausalInfer, spacetime.
- Worked with Data governance, Data quality, data lineage, Data architect to design various models and processes.
- Performed thorough EDA, including univariate and bivariate analysis, to understand the intrinsic and combined effects of variables.
- Designed data models and data flow diagrams using Erwin and MS Visio.
- Independently coded new programs and designed tables to load and test the programs effectively for the given POCs using Big Data/Hadoop.
- Developed, Implemented & Maintained the Conceptual, Logical & Physical Data Models using Erwin for Forward/Reverse Engineered Databases.
- As an architect, implemented an MDM hub to provide clean, consistent data for an SOA implementation.
- Led the development and presentation of a data analytics data-hub prototype with the help of other members of the emerging solutions team.
- Established Data architecture strategy, best practices, standards, and roadmaps.
- Worked with Hadoop eco system covering HDFS, HBase, YARN and Map Reduce.
- Involved in business process modeling using UML.
- Performed data cleaning and imputation of missing values using R.
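The data cleaning and imputation above was done in R; as an equivalent sketch, median imputation in pandas looks like the following (hypothetical data, not from the actual project):

```python
import pandas as pd

# Hypothetical column with missing values; the work described used R,
# this is an equivalent pandas sketch of median imputation
df = pd.DataFrame({"age": [25.0, None, 40.0, None, 35.0]})

# Impute missing ages with the column median (computed over non-null values)
median_age = df["age"].median()
df["age"] = df["age"].fillna(median_age)
```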
Environment: Teradata 13.1, Informatica 6.2.1, Ab Initio, Business Objects, Oracle 9i, PL/SQL, Microsoft Office Suite (Excel, VLOOKUP, Pivot, Access, PowerPoint), Visio, VBA, MicroStrategy, Tableau, Erwin.
Confidential, New York, NY
Data Scientist
Responsibilities:
- Developed logical and physical data models that capture the current-state and potential-state data elements and data flows using ER/Studio.
- Reviewed and implemented the naming standards for the entities, attributes, alternate keys, and primary keys for the logical model.
- Created the conceptual model for the data warehouse using Erwin data modeling tool.
- Used external loaders such as MultiLoad, TPump, and FastLoad to load data into Oracle; performed database analysis, development, testing, implementation, and deployment.
- Designed and modeled the reporting data warehouse considering current and future reporting requirements.
- Worked with data compliance and data governance teams to maintain data models, metadata, and data dictionaries; defined source fields and their definitions.
- Created stored procedures using PL/SQL and tuned the databases and backend process.
- Worked with data scientists to create data marts for data-science-specific functions.
- Performed data analysis and data profiling using complex SQL on various source systems, including Teradata and SQL Server.
- Involved in analysis of business requirements, design and development of high-level and low-level designs, and unit and integration testing.
Environment: Erwin 8, Teradata 13, SQL Server 2008, Oracle 9i, SQL*Loader, PL/SQL, ODS, OLAP, OLTP, SSAS, Informatica Power Center 8.1.
Confidential, Michigan
Data Scientist
Responsibilities:
- Worked as a Data Modeler/Analyst to generate Data Models using Erwin and developed relational database system.
- Analyzed the business requirements of the project by studying the Business Requirement Specification document.
- Extensively used the Erwin Data Modeler tool to design the data models.
- Designed mappings to process the incremental changes that exist in the source tables. Whenever source data elements were missing in source tables, they were modified or added in consistency with the third-normal-form OLTP source database.
- Designed tables and implemented the naming conventions for Logical and Physical Data Models in Erwin 7.0.
- Designed logical and physical data models for multiple OLTP and Analytic applications.
- Extensively used the Erwin design tool & Erwin model manager to create and maintain the Data Mart.
- Wrote simple and advanced SQL queries and scripts to create standard and ad hoc reports for senior managers.
- Collaborated on the data mapping document from source to target and on the data quality assessments for the source data.
- Applied expert-level understanding of different databases in combination for data extraction and loading, joining data extracted from different databases and loading it into a specific database.
- Coordinated with various business users, stakeholders, and SMEs for functional expertise, review of designs and business test scenarios, UAT participation, and validation of financial data.
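The standard and ad hoc reporting queries described above can be sketched minimally; this uses an in-memory SQLite database with hypothetical table and column names (the actual work used SQL Server and Oracle):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical orders table standing in for a real OLTP source
cur.execute("CREATE TABLE orders (region TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("East", 100.0), ("East", 50.0), ("West", 75.0)],
)

# Ad hoc report: total and average order amount per region
cur.execute(
    """
    SELECT region, SUM(amount) AS total, AVG(amount) AS avg_amount
    FROM orders
    GROUP BY region
    ORDER BY total DESC
    """
)
report = cur.fetchall()
```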
Environment: Erwin r7.0, SQL Server 2000/2005, Windows XP/NT/2000, Oracle 8i/9i, MS DTS, UML, UAT, SQL*Loader, OOD, OLTP, PL/SQL, MS Visio, Informatica.
Confidential, Richmond, VA
Data Architect
Responsibilities:
- Configured the project on WebSphere 6.1 application servers.
- Communicated with other healthcare systems using Web Services (SOAP, WSDL, JAX-RPC).
- Deployed GUI pages using JSP, JSTL, HTML, DHTML, XHTML, CSS, JavaScript, and AJAX.
- Used SAX and DOM parsers to parse raw XML documents.
- Used RAD as Development IDE for web applications.
- Supported the testing team through system testing, integration testing, and UAT.
- Used Log4J logging framework to write Log messages with various levels.
- Conducted Design reviews and Technical reviews with other project stakeholders.
- Used Microsoft Visio and Rational Rose to design the use case diagrams, class model, sequence diagrams, and activity diagrams for the application's SDLC process.
- Created test plan documents for all back-end database modules
- Ensured quality in the deliverables.
- Implemented the project in Linux environment.
Environment: R 3.0, Erwin 9.5, Tableau 8.0, MDM, QlikView, MLlib, PL/SQL, HDFS, Teradata 14.1, JSON, Hadoop (HDFS), MapReduce, Pig, Spark, RStudio, Mahout, Java, Hive, AWS.
Confidential, Memphis, TN
Data Analyst
Responsibilities:
- Developed logical and physical data models that capture the current-state and potential-state data elements and data flows using ER/Studio.
- Involved in analysis of business requirements, design and development of high-level and low-level designs, and unit and integration testing.
- Reviewed and implemented the naming standards for the entities, attributes, alternate keys, and primary keys for the logical model.
- Performed data analysis and data profiling using complex SQL on various source systems, including Teradata and SQL Server.
- Designed and built the dimensions and cubes with star schema and snowflake schema using SQL Server Analysis Services (SSAS).
- Created the conceptual model for the data warehouse using Erwin data modeling tool.
- Worked with data scientists to create data marts for data-science-specific functions.
- Determined data rules and conducted Logical and Physical design reviews with business analysts, developers and DBAs.
- Translated business and data requirements into logical data models in support of enterprise data models, ODS, OLAP, OLTP, operational data structures, and analytical systems.
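The star-schema design work above can be illustrated with a minimal sketch: one dimension table, one fact table, and the typical join-and-aggregate query. This uses SQLite with hypothetical table names (the actual warehouse used SSAS and SQL Server):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Minimal star schema: a dimension table and a fact table referencing it
# (hypothetical names for illustration only)
cur.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE fact_sales "
    "(product_id INTEGER REFERENCES dim_product(product_id), qty INTEGER)"
)
cur.execute("INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget')")
cur.executemany("INSERT INTO fact_sales VALUES (?, ?)", [(1, 3), (1, 2), (2, 5)])

# Typical star-schema query: join the fact table to the dimension and aggregate
cur.execute(
    """
    SELECT p.name, SUM(f.qty)
    FROM fact_sales f JOIN dim_product p ON f.product_id = p.product_id
    GROUP BY p.name ORDER BY p.name
    """
)
rows = cur.fetchall()
```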
Environment: Erwin r9.0, Informatica 9.0, ODS, OLTP, Oracle 10g, Hive, OLAP, DB2, Metadata, MS Excel, Mainframes, MS Visio, Rational Rose, Requisite Pro, Hadoop, PL/SQL.