
Data Engineer/Data Modeler Resume


Monroe, Louisiana

SUMMARY

  • 9+ years in the IT industry with hands-on experience in Data Engineering, Data Modeling, and Data Analysis.
  • Expert knowledge of the SDLC (Software Development Life Cycle) and Agile development methods; involved in all phases of projects.
  • Experience in automation and building CI/CD pipelines using Jenkins.
  • Experience in Microsoft Azure data storage and Azure Data Factory, Azure Data Lake Store (ADLS), AWS, and Redshift.
  • Good knowledge of streaming applications using Apache Kafka.
  • Good working experience with Hadoop data warehousing tools such as Hive and Pig, and in moving data onto the cluster using Sqoop.
  • Experience in designing Star Schema and Snowflake Schema for Data Warehouses using tools like Erwin, PowerDesigner, and ER/Studio.
  • Experience in integration of various relational and non-relational sources such as SQL Server, Teradata, Oracle, and NoSQL databases.
  • Experience in developing MapReduce programs using Apache Hadoop for analyzing Big Data per requirements.
  • Experience in modeling both OLTP/OLAP systems and Kimball and Inmon data warehousing environments.
  • Experience in extracting, transforming and loading (ETL) data from spreadsheets, database tables and other sources using Microsoft SSIS.
  • Strong skills in Data Analysis, Data Profiling, Data Validation, Data Cleansing, Data Verification, and Data Mismatch Identification.
  • Extensive experience in writing UNIX shell scripts and automation of the ETL processes using UNIX shell scripting.
  • Experience designing security at both the schema and accessibility levels in conjunction with the DBAs.
  • Used MS Excel and MS Access to load and analyze data based on business needs.
  • Ability to work on multiple projects at once while prioritizing the tasks based on team priorities.
  • Proven ability to convey rigorous technical concepts and considerations to non-experts.

TECHNICAL SKILLS

Big Data tools: Hadoop 3.3, HDFS, Hive 3.2.1, Pig 0.17, HBase 1.2, Sqoop 1.4, MapReduce, Spark, Kafka 2.8.

Cloud Services: AWS and Azure.

Data Modeling Tools: Erwin 9.8/9.7, ER/Studio V17

Database Tools: Oracle 12c/11g, Teradata 15/14.

Reporting tools: SQL Server Reporting Services (SSRS), Tableau, Crystal Reports, MicroStrategy, Business Objects

ETL Tools: SSIS, Informatica v10, Matillion.

Programming Languages: SQL, T-SQL, UNIX shell scripting, PL/SQL.

Operating Systems: Microsoft Windows 10/8/7, UNIX

Project Execution Methodologies: RUP, JAD, Agile, Waterfall, and RAD

PROFESSIONAL EXPERIENCE

Confidential

Data Engineer/Data Modeler

Responsibilities:

  • As a Data Engineer/Data Modeler, designed and deployed scalable, highly available, and fault-tolerant systems on Azure.
  • Worked in SCRUM (Agile) development environment with tight schedules.
  • Defined the business objectives comprehensively through discussions with business stakeholders and functional analysts and by participating in requirement-gathering sessions.
  • Designed and developed a continuous integration pipeline and integrated it using Jenkins.
  • Developed ETL pipelines in and out of the data warehouse using a combination of Python and SnowSQL.
  • Utilized the Matillion ETL solution to develop pipelines that extract and transform data from multiple sources and load it into Snowflake.
  • Designed and implemented database solutions in Azure Data Lake, Azure Data Factory, Azure Databricks, and Azure Synapse Analytics.
  • Used the Matillion Blob Storage component to load tables into the Snowflake stage layer.
  • Developed Python scripts to clean the raw data.
  • Worked at the conceptual, logical, and physical data model levels using Erwin according to requirements.
  • Involved in the creation and maintenance of the Data Warehouse and metadata repositories.
  • Designed and implemented ETL pipelines from Snowflake to the Data Warehouse using Apache Airflow.
  • Used Python Packages for processing HDFS file formats.
  • Worked on Microsoft Azure toolsets including Azure Data Factory pipelines, Azure Databricks, and Azure Data Lake Storage.
  • Worked with Data Governance and Data Quality to design various models and processes.
  • Encoded and decoded JSON objects using PySpark to create and modify DataFrames in Apache Spark (a sketch follows this list).
  • Installed and configured Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
  • Configured Input & Output bindings of Azure Function with Azure Cosmos DB collection to read and write data from the container whenever the function executes.
  • Developed MDM meta data dictionary and naming convention across enterprise.
  • Configured the ADF jobs, SnowSQL jobs triggering in Matillion using python.
  • Implemented Azure Data bricks clusters, notebooks, jobs and auto scaling.
  • Designed and implemented effective Analytics solutions and models with Snowflake.
  • Troubleshot and maintained ETL jobs running in Matillion.
  • Created Oozie workflows to manage the execution of Crunch jobs and Vertica pipelines.
  • Worked on Azure services like Azure Data Factory and Azure Synapse.
  • Supported solutions and constructed prototypes that incorporated Azure resources like Azure Data Factory, Azure Cosmos DB, and Databricks.
  • Worked on complex SQL queries and PL/SQL procedures and converted them to ETL tasks.
  • Tuned and troubleshot Snowflake for performance and optimized utilization.
  • Designed and built a Data Discovery Platform for a large system integrator using Azure HDInsight components.
  • Scheduled all staging, intermediate, and final core table loads to Snowflake in Matillion.
  • Created and Configured Azure Cosmos DB.
  • Provided ad-hoc queries and data metrics to the Business Users using Hive and Pig.
  • Developed visualizations and dashboards using Power BI.
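
A minimal sketch of the PySpark JSON handling referenced above. The column names, schema, and sample payload are illustrative assumptions, not the project's actual data.

```python
# Minimal PySpark sketch: decode JSON strings into typed DataFrame columns and
# encode selected columns back to JSON. All names below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, to_json, struct, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("json-encode-decode").getOrCreate()

# Hypothetical raw events: each row carries a JSON payload as a string.
raw = spark.createDataFrame(
    [("evt-1", '{"customer_id": "C100", "amount": 42.5}')],
    ["event_id", "payload"],
)

payload_schema = StructType([
    StructField("customer_id", StringType()),
    StructField("amount", DoubleType()),
])

# Decode: parse the JSON string into typed columns.
decoded = (
    raw.withColumn("data", from_json(col("payload"), payload_schema))
       .select("event_id", "data.customer_id", "data.amount")
)

# Encode: serialize selected columns back into a JSON string column.
encoded = decoded.withColumn("json_out", to_json(struct("customer_id", "amount")))
encoded.show(truncate=False)
```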

Environment: Hadoop 3.3, Agile, CI/CD, Azure, Matillion, Python 3.5, Erwin R2 SP2, Snowflake, MDM, Cosmos DB, Spark, SQL, PL/SQL, JSON.

Confidential - Monroe, Louisiana

Data Modeler/Data Engineer

Responsibilities:

  • Worked as a Data Modeler/Data Engineer to import and export data from different databases.
  • Involved in Agile development methodology as an active member in Scrum meetings.
  • Used AWS S3 buckets to store files, ingested the files into Snowflake tables using Snowpipe, and ran deltas using data pipelines.
  • Implemented a phasing and checkpoint approach in the ETL process to prevent data loss and maintain uninterrupted data flow against process failures.
  • Generated database scripts using Forward Engineering in the data modeling tool.
  • Designed Fact and Dimension tables and created Conceptual, Logical, and Physical Data Models using Erwin.
  • Responsible for data lineage, maintaining data dictionary, naming standards and data quality.
  • The objective of this project was to build a data lake as a cloud-based solution in AWS using Apache Spark.
  • Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them.
  • Developed ETL data mapping and loading logic for MDM loading from internal and external sources.
  • Implemented Referential Integrity using primary key and foreign key relationships.
  • Implemented installation and configuration of a multi-node cluster in the cloud using Amazon Web Services (AWS) EC2.
  • Widely used normalization methods and performed normalization to 3NF.
  • Designed and created Data Quality baseline flow diagrams, which include error handling and test plan flow data.
  • Performed Verification, Validation and Transformations on the Input data (Text files) before loading into target database.
  • Used Erwin for reverse engineering to connect to existing database and ODS.
  • Used AWS Cloud for infrastructure provisioning and configuration.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
  • Worked closely with the Data Stewards to ensure correct and related data is captured in the data warehouse as part of Data Quality checks.
  • Extracted data using SQL queries and transferred it to Microsoft Excel and Python for further analysis.
  • Validated the data feed from the source systems to Snowflake DW cloud platform.
  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS (a sketch follows this list).
  • Created Queries and Tables using MySQL.
  • Developed dashboards using Tableau Desktop.
  • Created numerous reports using ReportLab and other Python packages; installed numerous Python modules.
  • Handled performance requirements for databases in OLTP and OLAP models.
  • Designed and developed a complete CDC (change data capture) module in Python and deployed it in AWS Glue using the PySpark library.
  • Created mapping tables to find out the missing attributes for the ETL process.
  • Developed various T-SQL stored procedures, triggers, views and adding/changing tables for data load, transformation and extraction.
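
A minimal sketch of the Kafka-to-HDFS streaming flow noted above, using Spark Structured Streaming. The broker address, topic name, and HDFS paths are illustrative assumptions.

```python
# Minimal Spark Structured Streaming sketch: read events from a Kafka topic and
# persist them to HDFS as Parquet. Broker, topic, and paths are hypothetical,
# and the spark-sql-kafka package must be available on the cluster.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

# Subscribe to an assumed Kafka topic on an assumed broker.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("subscribe", "orders")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers the value as binary; cast it to string before writing out.
parsed = events.select(col("value").cast("string").alias("raw_event"))

# Write the stream to HDFS with a checkpoint directory for fault tolerance.
query = (
    parsed.writeStream.format("parquet")
    .option("path", "hdfs:///data/streams/orders")
    .option("checkpointLocation", "hdfs:///checkpoints/orders")
    .start()
)
query.awaitTermination()
```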

Environment: Erwin 9.8, AWS, MySQL, Python, Hadoop 3.0, NoSQL, ETL, Sqoop 1.4, MDM, OLAP, OLTP, ODS, Tableau, Agile.

Confidential - Chicago, IL

Sr. Data Analyst/Data Modeler

Responsibilities:

  • Performed data analysis, data modeling, and data profiling using complex SQL queries, with Facets as the source and Oracle as the database (a profiling sketch follows this list).
  • Created physical and logical data models using Erwin.
  • Translated business concepts into XML vocabularies by designing XML Schemas with UML.
  • Designed both 3NF data models for ODS and OLTP systems and dimensional data models using Star and Snowflake schemas.
  • Extensively used Star Schema methodologies in building and designing the logical data model into dimensional models.
  • Developed batch processing solutions by using Data Factory, Azure SQL and Azure Databricks.
  • Worked with data investigation, discovery, and mapping tools to scan every single data record from many sources.
  • Worked with developers on data Normalization and De-normalization, performance tuning issues, and provided assistance in stored procedures as needed.
  • Worked with the MDM systems team on technical aspects and report generation.
  • Extensively created SSIS packages to clean and load data to data warehouse.
  • Designed and developed T-SQL stored procedures to extract, aggregate, transform, and insert data.
  • Performed data mining on Claims data using very complex SQL queries and discovered claims patterns.
  • Created PL/SQL procedures and triggers, generated application data, created users and privileges, and used Oracle import/export utilities.
  • Created and maintained data model standards, including master data management (MDM).
  • Facilitated in developing testing procedures, test cases and User Acceptance Testing (UAT).
  • Developed and maintained the data dictionary to create metadata reports for technical and business purposes.
  • Created customized reports using OLAP tools such as Crystal Reports for business use.
  • Generated periodic reports based on statistical analysis of data from various time frames and divisions using SQL Server Reporting Services (SSRS).
  • Involved in extensive data analysis on Teradata and Oracle systems, querying and writing in SQL.
  • Developed database triggers and stored procedures using T-SQL cursors and tables.
  • Loaded multi-format data from various sources like flat files, Excel, and MS Access, and performed file system operations.
  • Extensively created tables and queries to produce additional ad-hoc reports.
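
A minimal sketch of the kind of SQL-driven data profiling described in this role, using Python and pandas. The connection details, table, and column names are illustrative assumptions.

```python
# Minimal data-profiling sketch: pull a sample of claims rows via SQL and
# compute basic profile metrics with pandas. Connection string, table, and
# column names are hypothetical.
import pandas as pd
import cx_Oracle  # assumes the Oracle client libraries are installed

conn = cx_Oracle.connect("app_user", "app_password", "db-host:1521/ORCLPDB")  # assumed DSN

claims = pd.read_sql(
    "SELECT claim_id, member_id, claim_amount, service_date "
    "FROM claims WHERE ROWNUM <= 100000",
    conn,
)

# Basic profile: null counts, distinct counts, and data types per column.
profile = pd.DataFrame({
    "nulls": claims.isna().sum(),
    "distinct": claims.nunique(),
    "dtype": claims.dtypes.astype(str),
})
print(profile)

# Numeric range summary helps flag outliers and data-mismatch candidates.
print(claims["claim_amount"].describe())
```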

Environment: Erwin, Azure, SQL, MDM, PL/SQL, SSIS, SSRS, OLAP, OLTP, T-SQL, UNIX, MS Excel.

Confidential - Boston, MA

Data Analyst/Data Modeler

Responsibilities:

  • Worked as a Data Analyst/Data Modeler on requirements gathering, business analysis, and project coordination.
  • Conducted JAD sessions to review the data models, involving SMEs, developers, testers, and analysts.
  • Translated business requirements into working logical and physical data models for Data warehouse, Data marts and OLAP applications.
  • Responsible for the development and maintenance of Logical and Physical data models, along with corresponding metadata, to support Applications.
  • Involved in requirement gathering and database design and implementation of star-schema, snowflake schema/dimensional data warehouse using ER/Studio.
  • Worked extensively with MicroStrategy report developers in creating data marts and developing reports.
  • Worked with the Data Analysis team to gather Data Profiling information.
  • Worked on Data Mining and data validation to ensure the accuracy of the data between the warehouse and source systems.
  • Created DB2 objects such as databases, tables, indexes, triggers, stored procedures etc.
  • Performed detailed data analysis and identified the key facts and dimensions necessary to support the business requirements.
  • Generated Data dictionary reports for publishing on the internal site and giving access to different users.
  • Wrote complex SQL, PL/SQL procedures, functions, and packages to validate data and support the testing process.
  • Created SSIS packages to load data from flat files, Excel, and Access to SQL Server using connection managers.
  • Developed all the required stored procedures, user-defined functions, and triggers using T-SQL and SQL.
  • Produced various types of reports using SQL Server Reporting Services (SSRS).
  • Wrote UNIX shell scripts to invoke all the stored procedures, parse the data and load into flat files.
  • Used MS Visio to represent the system under development in graphical form by defining use case, activity, and workflow diagrams.

Environment: ER/Studio, DB2, PL/SQL, SSIS, MicroStrategy, MS Excel, T-SQL, UNIX, OLAP, OLTP.
