
Data Architect/Data Modeler Resume


New York, NY

SUMMARY

  • Overall 10 years of experience as a Data Architect/Modeler and Data Analyst, including designing, developing, and implementing data models for enterprise-level applications and systems.
  • Solid understanding of the architecture and workings of the Hadoop framework, involving the Hadoop Distributed File System (HDFS) and components such as Pig, Hive, Sqoop, and PySpark.
  • Experience in Microsoft Azure components such as Event Hub, Stream Analytics, ADW, HDInsight clusters, and Azure Data Factory.
  • Experience in AWS services such as S3, Glue, AWS Lambda, and Redshift.
  • Strong working experience in extracting, wrangling, ingesting, processing, storing, querying, and analyzing structured and semi-structured data.
  • Experience in developing ETL frameworks using Talend for extracting and processing data.
  • Strong experience with SQL Server and T-SQL in constructing joins, user-defined functions, stored procedures, views, indexes, user profiles, and data integrity constraints.
  • Knowledge of star schema and snowflake modeling, fact and dimension tables, and physical and logical modeling.
  • Experience in data transformation, data mapping from source to target database schemas, and data cleansing procedures.
  • Adept in programming languages such as Python, as well as Big Data technologies such as Hadoop and Hive.
  • Experience with data transformations utilizing SnowSQL in Snowflake.
  • Extensive experience working with RDBMS such as SQL Server, MySQL, and NoSQL databases such as MongoDB, HBase.
  • Experienced in creating shell scripts to push data loads from various sources on the edge nodes onto HDFS (a minimal sketch follows this list).
  • Strong knowledge of various Data warehousing methodologies and Data modeling concepts.
  • Strong experience developing visualizations with Tableau and deploying those visualizations to Tableau Server.
  • Ability to work effectively in cross-functional team environments, excellent communication and interpersonal skills.
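
The edge-node loads mentioned above are typically thin wrappers around the hdfs dfs CLI. The following is a minimal, illustrative Python sketch of that pattern (standing in for the shell scripts); the file paths and target HDFS directory are hypothetical.

```python
#!/usr/bin/env python3
"""Minimal sketch: push a local extract from an edge node into HDFS.

Illustrative only -- the local path and target HDFS directory are assumptions.
"""
import subprocess
import sys


def push_to_hdfs(local_path: str, hdfs_dir: str) -> None:
    # Make sure the landing directory exists, then upload, overwriting any prior load.
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", hdfs_dir], check=True)
    subprocess.run(["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir], check=True)


if __name__ == "__main__":
    # Usage: python push_to_hdfs.py /data/exports/claims_20230101.csv /landing/claims
    push_to_hdfs(sys.argv[1], sys.argv[2])
```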

TECHNICAL SKILLS

Big Data Tools: Hadoop Ecosystem (Hadoop 3.3), MapReduce, Spark 2.3, HBase 1.2, Hive 3.2.1, Pig 0.17, Sqoop 1.4, Kafka 2.8, Oozie 4.3

Data Modeling Tools: Power Designer, ER/Studio V17 and Erwin 9.7

Cloud Management: Azure Data Lake, Azure Databricks, Azure Data Factory, Azure Synapse, AWS, GCP, BigQuery, and Snowflake.

Databases: SQL Server 2019, Oracle 12/11g, Teradata 15/14, MySQL, DB2

Methodologies: RAD, JAD, RUP, UML, SDLC, Agile, Waterfall Model.

NoSQL Databases: Amazon DynamoDB, Cosmos DB, MongoDB and HBase

Programming Languages: SQL, PL/SQL, UNIX shell Scripting

Operating System: Windows, Unix, Sun Solaris

ETL Tools: Talend, Informatica 9.6/9.1, SSIS.

PROFESSIONAL EXPERIENCE

Confidential

Data Architect/Data Modeler

Responsibilities:

  • As a Data Architect/Modeler, responsible for building scalable distributed data solutions using Big Data technologies and data systems.
  • Led cloud migration initiatives moving environments hosted in AWS to Azure.
  • Responsible for the data architecture design delivery, data model development, review, and approval, and used Agile methodology for Data Warehouse development.
  • Involved in designing & managing the data integration architecture based on ad-hoc, continuous and scheduled requests or operations.
  • Designed and implemented migration strategies for traditional systems onto the Azure platform.
  • Involved in designing and building multi-tenant SaaS architectures and relational data models for enterprise applications.
  • Enhanced and maintained the data warehouse in Snowflake to support all reporting and BI needs.
  • Involved in building cloud solutions architecture using Microsoft Azure and Snowflake CDW (Cloud Data Warehouse).
  • Automated Cube refresh using the Azure Functions.
  • Designed and implemented secure data pipelines into a Snowflake data warehouse from on-premises and cloud data sources.
  • Developed conceptual solutions and created proof-of-concepts to demonstrate viability of solutions and performance.
  • Architected and implemented ETL and data movement solutions using Azure Data Platform services (Azure Data Lake, Azure Data Factory, Databricks, Delta Lake).
  • Built Azure Data Factory (V2) pipelines to ingest source system data into Azure Data Lake Storage.
  • Involved in creating the business case for the reference data platform and Data Governance.
  • Designed the ETL process using the Talend tool to load data from source systems to Snowflake through data transformations.
  • Wrote complex stored procedures and triggers to capture updated and deleted data from OLTP systems.
  • Architected solutions using MS Azure PaaS services such as Azure SQL Managed Instance, Azure Data Factory, Azure Synapse, Azure Data Lake, and HDInsight.
  • Created complex data transformations and automated data ingestion into information marts utilizing Azure Databricks.
  • Developed ETL pipelines in and out of the data warehouse using a combination of Python and Snowflake's SnowSQL.
  • Implemented row-level security policies in Azure Synapse Analytics (SQL DW) and within Power BI data models.
  • Analyzed data sources and requirements and business rules to perform logical and physical data modeling.
  • Responsible for choosing the Hadoop components (Hive, Pig, MapReduce, Sqoop, etc.).
  • Developed data marts in the Snowflake cloud data warehouse.
  • Created programs in Python to handle PL/SQL constructs such as cursors and loops, which are not supported by Snowflake (see the Python sketch below).
  • Implemented Copy activities and custom Azure Data Factory pipeline activities.
  • Generated the DDL of the target data model and attached it to the JIRA ticket to be deployed in different environments.
  • Developed PySpark scripts that read MSSQL tables and push the data into the Big Data platform, where it is stored in Hive tables (see the PySpark sketch below).
  • Created a POC of an Enterprise Data Lake Store using ADF V2, ADLS Gen 1/2, Azure SQL DW, Azure HDInsight, and Azure Databricks using Python and Power BI.
  • Used the Scrum development methodology.
  • Generated JSON files from the JSON models created for Zip Code, Group, and Claims using Snowflake DB.
  • Involved in creating notebooks for moving data from raw to stage and then to curated zones using Azure Databricks.
  • Designed and built a Data Discovery Platform for a large system integrator using Azure HDInsight components.
  • Developed different MapReduce applications on Hadoop.
  • Involved in scheduling Oozie workflow engine to run multiple Hive and Pig jobs.
  • Created and Configured Azure Cosmos DB.
  • Worked on different data formats such as JSON and performed machine learning algorithms in Python.
  • Performed structural modifications using MapReduce and Hive, and analyzed data using visualization/reporting tools (Tableau).
  • Worked at conceptual/logical/physical data model level using Erwin according to requirements.
  • Involved in architecting, designing, and developing complex Azure Analysis Services tabular databases, deploying them in Microsoft Azure, and scheduling cube refreshes through Azure Functions.
  • Worked with Azure Blob and Data Lake storage and loaded data into Azure Synapse Analytics (SQL DW).
  • Worked in the Data Factory editor to create linked services, tables, datasets, and pipelines.
  • Used ETL component Sqoop to extract the data from MySQL and load data into HDFS.
  • Implemented various Azure platforms such as Azure SQL Data Warehouse, Azure Synapse, Azure Analysis Services, HDInsight, Azure Data Lake and Data Factory.
  • Built, integrated, and deployed reports and dashboards using Power BI.
  • Designed and developed user defined functions, stored procedures, triggers for Cosmos DB.
  • Tuned and troubleshot Snowflake for performance and optimized utilization.
  • Created tables, views, secure views, user defined functions in Snowflake Cloud Data Warehouse.
  • Created Snowpipe for continuous data load.
  • Created Azure Data Lake Analytics U-SQL jobs for transforming data in Azure Data Lake Store to move from Raw to Stage and then to Curated zones.
  • Involved in debugging the applications monitored on JIRA using agile methodology.
  • Designed relational database models for small and large applications.
  • Created stored procedures, user-defined functions, views, and T-SQL scripts for complex business logic.
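
A minimal, illustrative PySpark sketch of the MSSQL-to-Hive load referenced in this list; the JDBC URL, credentials, and table names are hypothetical placeholders rather than actual project values, and the MSSQL JDBC driver is assumed to be on the Spark classpath.

```python
"""Minimal sketch: copy an MSSQL table into a partitioned Hive table with PySpark.

Illustrative only -- connection details and table names are assumptions.
"""
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("mssql_to_hive")
    .enableHiveSupport()
    .getOrCreate()
)

# Read the source table over JDBC (requires the MSSQL JDBC driver on the classpath).
source_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://sqlhost:1433;databaseName=claims")
    .option("dbtable", "dbo.claim_lines")
    .option("user", "etl_user")
    .option("password", "****")
    .load()
)

# Stamp the load date and land the data in a partitioned Hive table in the raw zone.
(
    source_df.withColumn("load_date", F.current_date())
    .write.mode("overwrite")
    .partitionBy("load_date")
    .saveAsTable("raw.claim_lines")
)
```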

Environment: Agile, Erwin R2SP2, Azure Data Lake, Azure Data Factory, Azure Synapse, Big Data, Snowflake DB, Azure Databricks, Cosmos DB, Hive 3.2.1, Pig 0.17, MapReduce, Sqoop, Python, PL/SQL, Oozie, Power BI, T-SQL.
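
Related to the project above: a minimal, illustrative Python sketch of the cursor/loop replacement referenced in the responsibilities, using the snowflake-connector-python library; the connection parameters, table, and business rule are hypothetical.

```python
"""Minimal sketch: replace PL/SQL-style cursor/loop logic with Python against Snowflake.

Illustrative only -- connection parameters, table, and business rule are assumptions.
"""
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",
    user="etl_user",
    password="****",
    warehouse="ETL_WH",
    database="EDW",
    schema="STAGE",
)

try:
    cur = conn.cursor()
    # Emulate a PL/SQL cursor: fetch the rows to review, then loop over them.
    cur.execute("SELECT claim_id, claim_amount FROM stage_claims WHERE status = 'PENDING'")
    for claim_id, claim_amount in cur.fetchall():
        new_status = "APPROVED" if claim_amount < 10000 else "REVIEW"
        cur.execute(
            "UPDATE stage_claims SET status = %s WHERE claim_id = %s",
            (new_status, claim_id),
        )
    conn.commit()
finally:
    conn.close()
```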

Confidential - New York, NY

Sr. Data Modeler/Data Architect

Responsibilities:

  • Worked as a Data Modeler/Architect to import and export data from different databases.
  • Involved in Agile methodologies, daily scrum meetings, and sprint planning.
  • Reviewed business requirements and composed source-to-target data mapping documents.
  • Integrated and automated data workloads to Snowflake Warehouse.
  • Worked with the data governance team to maintain data models and data dictionaries, and to define source fields and their definitions.
  • Developed Azure Data Factory pipelines to move data from source systems and transform it in Azure Blob Storage, and created ADLA (Azure Data Lake Analytics) U-SQL jobs to transform the data in Azure.
  • Managed Azure infrastructure (Azure Data Lake, Azure Data Factory, Azure SQL, Azure Storage) and recovery from a Recovery Services vault using the Azure Portal.
  • Responsible for all metadata relating to the EDW's overall data architecture, descriptions of data objects, access methods, and security requirements.
  • Designed and implemented effective Analytics solutions and models with Snowflake.
  • Analyzed Azure Data Factory, Azure Data Lake and Azure Data Bricks to build new ETL process in Azure.
  • Worked on developing ETL processes to load data from multiple data sources to HDFS using Kafka and Sqoop.
  • Recreated existing application logic and functionality in the Azure Data Lake, Data Factory, and SQL Data Warehouse environment.
  • Used PolyBase for the ETL process with Azure Data Warehouse to keep data in Blob Storage with almost no limitation on data volume.
  • Designed and developed operational and analytical reports and dashboards using Power BI, and Excel.
  • Developed application components interacting with HBase.
  • Worked with Azure Blob and Data Lake storage and loaded data into Azure Synapse Analytics (SQL DW).
  • Moved data from Oracle to HDFS and vice-versa using Sqoop to supply the data for Business users.
  • Validated the data feeds from the source systems to the Snowflake DW cloud platform (see the validation sketch below).
  • Developed the MDM metadata dictionary and naming conventions across the enterprise.
  • Used Azure Data Factory as an orchestration tool for integrating data from upstream to downstream systems.
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Developed HIVE queries for the analysis, to categorize different items.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Wrote Pig Scripts to generate Map Reduce jobs and performed ETL procedures on the data in HDFS.
  • Worked with DBA to create Best-Fit Physical Data Model from the logical Data Model using Forward Engineering in Erwin.
  • Created the template SSIS package that will replicate about 200 processes to load the data using Azure SQL.
  • Migrated Oracle database tables data into Snowflake Cloud Data Warehouse.
  • Created multiple scripts to automate ETL processes using PySpark from multiple sources.
  • Used Spark for data analysis and stored final computation results in HBase tables.
  • Used SSIS packages to populate data from Excel to the database, using lookup, derived column, and conditional split transformations to achieve the required data.
  • Scheduled Power BI datasets from Power BI service for automatic refresh of data.
  • Used Azure Data Factory extensively for ingesting data from disparate source systems.
  • Extracted the needed data from the server into HDFS and Bulk Loaded the cleaned data into HBase.
  • Documented operational problems by following standards and procedures using JIRA.
  • Designed OLAP cubes with star schema and multiple partitions using SSAS.
  • Generated Tableau Public Dashboard with constraints to show specific aspects for a different purpose.
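
A minimal, illustrative Python sketch of the Snowflake feed validation mentioned above, using the snowflake-connector-python library; the expected counts, table names, and connection details are hypothetical.

```python
"""Minimal sketch: validate a daily feed after it lands in the Snowflake DW.

Illustrative only -- expected counts, table names, and connection details are assumptions.
"""
import snowflake.connector

# Expected row counts per target table, e.g. taken from the source extract's control file.
EXPECTED_COUNTS = {"EDW.STAGE.MEMBERS": 125_000, "EDW.STAGE.CLAIMS": 980_000}

conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="****", warehouse="ETL_WH"
)
try:
    cur = conn.cursor()
    for table, expected in EXPECTED_COUNTS.items():
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        actual = cur.fetchone()[0]
        status = "OK" if actual == expected else "MISMATCH"
        print(f"{table}: expected={expected} actual={actual} -> {status}")
finally:
    conn.close()
```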

Environment: Erwin 9.8, Snowflake DW, Agile, PySpark, Python, Azure Data Lake, Azure Data Factory, Oozie 4.3, Pig 0.17, MapReduce, DBA, SSIS, Azure Databricks, Power BI, Tableau, HBase 1.2, JIRA, Sqoop 1.4, Kafka 1.1, MDM, OLAP.

Confidential - Minneapolis, Minnesota

Data Modeler/Data Architect

Responsibilities:

  • Worked as a Data Modeler/Architect to review business requirements and compose source-to-target data mapping documents.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
  • Moved ETL pipelines from SQL server to Hadoop Environment.
  • Defined the strategy and architecture required to integrate data across multiple systems, improve the existing data warehouse architecture and support our transition to AWS.
  • Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
  • Troubleshot and resolved complex production issues while providing data analysis and data validation.
  • Designed and developed end-to-end ETL processes from various source systems to the staging area and from staging to data marts, including data loads.
  • Used Hive to analyze the Partitioned and Bucketed data and compute various metrics for reporting.
  • Designed and developed end-to-end ETL processing from Oracle to AWS using Amazon S3, EMR, and Spark.
  • Used Spark Streaming to divide streaming data into batches as an input to Spark engine for batch processing.
  • Designed and developed Scala workflows for data pull from cloud-based systems and applying transformations on it.
  • Used DynamoDB Streams to trigger Lambda functions using Python.
  • Involved in using Sqoop for importing and exporting data between RDBMS and HDFS.
  • Transferred data from AWS S3 to AWS Redshift.
  • Developed Hive scripts for implementing control tables logic in HDFS.
  • Integrated data quality plans as a part of ETL processes.
  • Created and ran jobs on the AWS cloud to extract, transform, and load data into AWS Redshift using AWS Glue, with S3 for data storage and AWS Lambda to trigger the jobs (see the Lambda sketch below).
  • Automated regular AWS tasks such as snapshot creation using Python scripts.
  • Developed code in Spark SQL to implement business logic, with Python as the programming language.
  • Automated the Informatica jobs using UNIX shell scripting.
  • Implemented AWS Lambdas to drive real-time monitoring dashboards from system logs.
  • Used Teradata OLAP functions such as RANK, ROW_NUMBER, QUALIFY, CSUM, and SAMPLE.
  • Identified the entities and relationship between the entities to develop Conceptual Model using ERWIN.
  • Identified performance issues in existing sources, targets and mappings by analyzing the data flow, evaluating transformations and tuned accordingly for better performance.
  • Involved in administrative tasks, including creation of database objects such as database, tables, and views, using SQL, DDL, and DML requests.
  • Involved in the OLAP and OLTP data model for consistent relationship between tables and efficient database design.
  • Involved in Managing and troubleshooting the multi-dimensional data cubes developed in SSAS.
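
A minimal, illustrative AWS Lambda (boto3) sketch of the Glue-based load described above; the Glue job name, argument names, and event wiring are hypothetical, and the function is assumed to be subscribed to S3 put events.

```python
"""Minimal AWS Lambda sketch: start a Glue ETL job when an extract lands in S3.

Illustrative only -- the Glue job name and argument names are assumptions.
"""
import boto3

glue = boto3.client("glue")


def lambda_handler(event, context):
    # The S3 put event identifies the object that arrived; pass it to the Glue job.
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]

    response = glue.start_job_run(
        JobName="load_redshift_sales",
        Arguments={"--source_path": f"s3://{bucket}/{key}"},
    )
    return {"job_run_id": response["JobRunId"]}
```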

Environment: Erwin, Agile, Hive, Spark, Sqoop, Informatica, AWS, Teradata OLAP, OLTP, Python, AWS Lambda, SSAS, HDFS, DynamoDB.

Confidential - Jacksonville, FL

Data Modeler/Data Analyst

Responsibilities:

  • Worked as a Data Modeler/Analyst on requirements gathering, business analysis, and project coordination.
  • Involved in requirements gathering and database design and implementation of star schema and snowflake schema using ER/Studio.
  • Involved in developing the data warehouse for the database using Ralph Kimball's dimensional data mart modeling methodology.
  • Involved in normalization/de-normalization, normal forms, and database design methodology.
  • Created a process design architecting the flow of data from various sources to targets and was involved in extracting data from OLTP to OLAP.
  • Analyzed the data consuming the most resources and made changes to the back-end code using PL/SQL stored procedures and triggers.
  • Created Hive tables to push the data to MongoDB.
  • Worked on creating DDL, DML scripts for the data models.
  • Worked on debugging and identifying the unexpected real-time issues in the production server SSIS packages.
  • Used MongoDB to store data in JSON format and developed and tested many features of the dashboard using Python (see the MongoDB sketch below).
  • Documented logical, physical, relational and dimensional data models.
  • Involved in Integration of various data sources like DB2 and XML Files.
  • Wrote documentation for each package, including purpose, data source, column mappings, and transformations.
  • Worked with the ETL team to document the transformation rules for data migration from OLTP to Warehouse environment for reporting purposes.
  • Used forward engineering approach for designing and creating databases for OLAP model.
  • Produced reports using SQL Server Reporting Services (SSRS), creating various types of reports.
  • Created the DDL scripts using ER/Studio and source to target mappings to bring the data from Source to the warehouse.
  • Gathered the data from multiple data sources and validated source data as per requirements.
  • Designed and created data warehouse solutions by applying Ralph Kimball methodology including dimensional modeling and star schema.
  • Used SAS to extract, transform & load source data from transaction systems, generated reports, insights, and key conclusions.
  • Created Pivot tables in Excel to analyze data across several dimensions.
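
A minimal, illustrative Python (pymongo) sketch of storing JSON-formatted records in MongoDB for the dashboard work noted above; the connection string, database, collection, and input file are hypothetical.

```python
"""Minimal sketch: load JSON-formatted records into MongoDB for the dashboard.

Illustrative only -- connection string, database, collection, and file are assumptions.
"""
import json
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
collection = client["dashboard"]["policy_events"]

# Each line of the export file is assumed to be a single JSON document.
with open("policy_events.json", encoding="utf-8") as fh:
    documents = [json.loads(line) for line in fh if line.strip()]

if documents:
    result = collection.insert_many(documents)
    print(f"Inserted {len(result.inserted_ids)} documents")
```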

Environment: ER/Studio, DB2, OLAP, SSIS, PL/SQL, MongoDB, OLTP, SAS, MS Excel.

Confidential - Atlanta, GA

Data Analyst/Data Modeler

Responsibilities:

  • Worked with Data Analysis, Modeling and Profiling for multiple sources and answered complex business questions by providing data to business users.
  • Worked closely with business analysts to ensure business requirements were gathered and written on time and in a manner suitable for technical staff.
  • Worked on cloud technologies including Amazon EC2 and S3, supporting both the development and production environments.
  • Involved in logical and physical designs and transforming logical models into physical implementations.
  • Involved in extensive coding in Oracle forms/reports using PL/SQL.
  • Maintained SSIS packages for large scale data conversion, data import, and export processes.
  • Wrote complex SQL queries and PL/SQL stored procedures and converted them to ETL tasks.
  • Created and maintained documents related to business processes, mapping design, data profiles and tools.
  • Automated the process of uploading data into production tables using Unix.
  • Implemented logical and physical relational databases and maintained database objects in the data model using Power Designer.
  • Worked on Amazon Web Services (AWS) to integrate EMR with Spark, S3 storage, and Snowflake (see the S3 staging sketch below).
  • Participated in JAD sessions, gathered information from Business Analysts, end users and other stakeholders to determine the requirements.
  • Created SSIS Packages for import and export of data between Oracle database and others like MS Excel and Flat Files.
  • Involved in generating and documenting Metadata while designing OLTP and OLAP systems environment.
  • Coordinated dictionaries and other documentation across multiple applications.
  • Performed ad hoc analyses as needed, with the ability to interpret and explain the results.
  • Transformed all reports and dashboards previously built in Excel into Tableau.
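
A minimal, illustrative Python (boto3) sketch of staging local extract files to S3 for the EMR/Spark and Snowflake integration mentioned above; the bucket, prefix, and local directory are hypothetical.

```python
"""Minimal sketch: stage local extract files to S3 for downstream EMR/Spark processing.

Illustrative only -- bucket, prefix, and local directory are assumptions.
"""
import os
import boto3

LOCAL_DIR = "/data/extracts"
BUCKET = "analytics-landing"
PREFIX = "daily/"

s3 = boto3.client("s3")

for name in os.listdir(LOCAL_DIR):
    if name.endswith(".csv"):
        local_path = os.path.join(LOCAL_DIR, name)
        s3.upload_file(local_path, BUCKET, PREFIX + name)
        print(f"Uploaded {local_path} to s3://{BUCKET}/{PREFIX}{name}")
```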

Environment: Power Designer, Oracle, OLAP, OLTP, SSIS, Tableau, MS Excel, PL/SQL, Unix.
