
Sr. Data Engineer Resume


SUMMARY

  • 8+ years of professional experience in IT working with various legacy database systems, including experience with Big Data technologies.
  • Hands-on experience with the complete Software Development Life Cycle (SDLC) using Agile and hybrid methodologies.
  • Expert in using Python for data engineering and modeling.
  • Vast experience in data management, including data analysis, gap analysis, and data mapping.
  • Excellent knowledge of Confidential Azure services, Amazon Web Services, and their management.
  • Transformed and retrieved data using Pig, Hive, SSIS, and MapReduce.
  • Extensive experience interacting with system users to gather business requirements, and involved in developing projects.
  • Imported and exported data with Sqoop between HDFS and relational database systems.
  • Excellent knowledge of and extensive experience with NoSQL databases (HBase).
  • Excellent knowledge of Ralph Kimball's and Bill Inmon's approaches to data warehousing.
  • Responsible for designing and building a Data Lake using Hadoop and its ecosystem components.
  • Strong background in data modeling tools, including Erwin, ER Studio, and Power Designer.
  • Solid, in-depth understanding of information security concepts, data modeling, and RDBMS concepts.
  • Hands-on experience writing queries, stored procedures, functions, and triggers.
  • Good knowledge of converting complex RDBMS (Oracle, MySQL & Teradata) queries into Hive Query Language.
  • Good knowledge of OLAP, OLTP, Business Intelligence, and data warehousing concepts, with emphasis on ETL and business reporting needs.
  • Developed generic SQL procedures and complex T-SQL statements for report generation.
  • Experience in Performance Tuning and query optimization techniques in transactional and Data Warehouse Environments.
  • Hands-on experience with data modeling using star and snowflake schemas.
  • Performed structural modifications using MapReduce and Hive, and analyzed data using visualization/reporting tools.
  • Experience in developing custom UDFs in Python to extend Hive and Pig Latin functionality (see the sketch after this list).
  • Experience in extracting source data from Sequential files, XML files, CSV files, transforming and loading it into the target Data warehouse.
  • Excellent knowledge of Microsoft Office, with an emphasis on Excel.
  • Experience in Data mining with querying and mining large datasets to discover transition patterns and examine financial reports.
  • Excellent experience generating ad-hoc reports using Crystal Reports.
  • Hands-on experience with data integrity processes and data modeling concepts.
  • Effectively planned and managed project deliverables in an onsite/offshore model, improving client satisfaction.
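
A minimal sketch of the kind of Python streaming UDF used to extend Hive, as referenced above; the script name, column layout, and normalization rule are illustrative assumptions:

```python
#!/usr/bin/env python
# Hypothetical Hive streaming UDF: normalizes phone numbers to 10 digits.
# Registered and invoked from HiveQL, e.g.:
#   ADD FILE normalize_phone.py;
#   SELECT TRANSFORM (id, phone) USING 'python normalize_phone.py' AS (id, phone)
#   FROM contacts;
import re
import sys

for line in sys.stdin:
    # Hive streams each row to the script as tab-separated fields on stdin
    record_id, raw_phone = line.rstrip("\n").split("\t")
    digits = re.sub(r"\D", "", raw_phone)  # keep digits only
    normalized = digits[-10:] if len(digits) >= 10 else ""
    # Emit the transformed row back to Hive, tab-separated
    print("%s\t%s" % (record_id, normalized))
```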

TECHNICAL SKILLS

Big Data Tools: HBase 1.2, Hive 2.3, Pig 0.17, HDFS, Sqoop 1.4, Kafka 1.0.1, Oozie 4.3, Hadoop 3.0

Data Modeling Tools: Erwin Data Modeler 9.8, ER Studio v17, and Power Designer 16.6.

Databases: Oracle 12c, Teradata R15, MS SQL Server 2016, DB2.

Cloud Platform: AWS, Azure, Google Cloud, CloudStack/OpenStack

Programming Languages: SQL, PL/SQL, UNIX shell Scripting, Python, Spark.

Methodologies: JAD, System Development Life Cycle (SDLC), Agile, Waterfall Model.

Operating System: Windows, Unix

ETL Tools: Informatica 9.6/9.1 and Tableau.

PROFESSIONAL EXPERIENCE

Confidential

Sr. Data Engineer

Responsibilities:

  • As a Sr. Data Engineer, provided technical expertise in Hadoop technologies as they relate to the development of analytics.
  • Implemented data pipelines in Python.
  • Followed SDLC methodologies throughout the course of the project.
  • Participated in JAD sessions for design optimizations related to data structures and ETL processes.
  • Loaded and transformed large sets of structured, semi structured and unstructured data using Hadoop/Big Data concepts.
  • Used Windows Azure SQL Reporting Services to create reports with tables, charts, and maps.
  • Developed a data model (star schema) for the sales data mart using Erwin tool.
  • Extracted data from multiple databases using Sqoop import queries and ingested it into Hive tables.
  • Developed SQL scripts for creating tables, Sequences, Triggers, views and materialized views.
  • Performed several ad-hoc data analyses on the Azure Databricks analytics platform, tracked on a Kanban board.
  • Used Azure reporting services to upload and download reports.
  • Worked on debugging and identifying the unexpected real-time issues in the production server SSIS packages.
  • Loaded real time data from various data sources into HDFS using Kafka.
  • Developed MapReduce jobs for data cleanup in Python (see the sketch at the end of this section).
  • Defined extract-transform-load (ETL) and extract-load-transform (ELT) processes for the Data Lake.
  • Participated in integration of MDM (Master Data Management) Hub and data warehouses.
  • Wrote Pig scripts to load processed data from HDFS into MongoDB.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Extracted the needed data from the server into HDFS and bulk-loaded the cleaned data into HBase.
  • Designed the Data Marts in dimensional data modeling using star and snowflake schemas.
  • Worked on reading multiple data formats on HDFS using Python.
  • Involved in database development by creating Oracle PL/SQL Functions, Procedures and Collections.
  • Worked with the Oozie workflow scheduler to manage Hadoop jobs with control flows.
  • Prepared Tableau reports and dashboards with calculated fields, parameters, sets, groups, and bins, and published them to the server.
  • Translated business requirements into SAS code for use within internal systems and models.
  • Migrated ETL processes from RDBMS to Hive to simplify data manipulation.

Environment: Erwin 9.8, Big Data 3.0, Hadoop 3.0, Azure, SQL, Sqoop 1.4, ETL, HDFS, Kafka 1.1, Python, MapReduce, SSIS, MDM, Pig 0.17, MongoDB, Hive 2.3, HBase 1.2, Oracle 12c, PL/SQL, Tableau, Oozie 4.3, SAS.
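
A minimal sketch of the Python data-cleanup logic run as a Hadoop Streaming MapReduce job, as referenced above; the delimiter, column count, and paths are assumptions:

```python
#!/usr/bin/env python
# Hypothetical Hadoop Streaming mapper for data cleanup. Submitted with, e.g.:
#   hadoop jar hadoop-streaming.jar \
#     -input /raw/orders -output /clean/orders \
#     -mapper cleanup_mapper.py -file cleanup_mapper.py
import sys

EXPECTED_FIELDS = 5  # assumed column count of the source extract

for line in sys.stdin:
    fields = [f.strip() for f in line.rstrip("\n").split(",")]
    # Drop rows with the wrong column count or a missing primary key
    if len(fields) != EXPECTED_FIELDS or not fields[0]:
        continue
    print(",".join(fields))
```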

Confidential - Nashville, TN

Data Engineer / Data Modeler

Responsibilities:

  • As a Sr. Data Engineer, I was responsible for all data management aspects of the project.
  • Used the Agile Scrum methodology to work through the phases of the software development life cycle.
  • Key player in maintaining data pipelines and flows.
  • Conducted JAD sessions, wrote meeting minutes, and documented the requirements.
  • Worked on data modeling and advanced SQL with columnar databases on AWS.
  • Imported data from RDBMS to HDFS and Hive using Sqoop on a regular basis.
  • Performed Data Analysis using procedures and functions in PL/SQL.
  • Developed stored procedures in SQL Server to standardize DML transactions such as insert, update and delete from the database.
  • Wrote T-SQL procedures to generate DML scripts that modified database objects dynamically based on user inputs.
  • Worked on normalization and de-normalization techniques for OLTP systems.
  • Analyzed data using Hadoop components Hive and Pig.
  • Extensively designed and developed the HBase target schema.
  • Used Sqoop to import data into HDFS and Hive from other data systems.
  • Generated DDL statements for the creation of new ER/Studio objects such as tables, views, indexes, packages, and stored procedures.
  • Generated ad-hoc SQL queries using joins, database connections, and transformation rules to fetch data from the Teradata database (see the sketch at the end of this section).
  • Used reverse engineering to connect to existing database and create graphical representation (E-R diagram).
  • Managed database design and implemented a comprehensive snowflake schema with shared dimensions.
  • Created Hive tables to store data and wrote Hive queries.
  • Designed mapping documents and mapping templates for ETL developers.
  • Analyzed the SQL scripts and designed the solution for implementation in Python.
  • Facilitated in developing testing procedures, test cases and User Acceptance Testing (UAT).
  • Designed and Developed Oracle PL/SQL and Shell Scripts, Data Import/Export, Data Conversions and Data Cleansing.
  • Presented the data scenarios via ER/Studio logical models and Excel mockups to visualize the data better.
  • Created graphical representations in the form of entity-relationship diagrams to elicit more information.

Environment: ER/Studio, Agile, Hadoop 3.0, AWS, HDFS, SQL, PL/SQL, Hive 2.3, Sqoop 1.4, T-SQL, OLTP, Pig 0.17, HBase 1.2, Teradata 15, ETL, Oracle 12c.
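
A minimal sketch of the ad-hoc Teradata pull reworked in Python, as referenced above; the teradatasql driver, connection details, and schema/column names are assumptions:

```python
# Hypothetical sketch: fetch a parameterized Teradata join into pandas for analysis.
import pandas as pd
import teradatasql  # Teradata's DB-API 2.0 driver (assumed available)

QUERY = """
SELECT o.order_id, o.order_dt, c.segment, o.amount
FROM   sales.orders o
JOIN   sales.customers c ON c.customer_id = o.customer_id
WHERE  o.order_dt >= ?
"""

# Placeholder host and credentials
with teradatasql.connect(host="tdprod", user="etl_user", password="...") as con:
    df = pd.read_sql(QUERY, con, params=["2018-01-01"])

# Simple profile of the fetched extract by customer segment
print(df.groupby("segment")["amount"].agg(["count", "sum", "mean"]))
```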

Confidential

Data Analyst/Engineer

Responsibilities:

  • Worked on AWS Redshift and RDS, implementing models and data on both services.
  • Followed the Agile methodology, with daily scrums to discuss project-related information.
  • Performed data analysis and data profiling using complex SQL on various source systems, including Oracle.
  • Worked with MDM systems team with respect to technical aspects and generating reports.
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
  • Worked on Normalization and De-Normalization techniques for both OLTP and OLAP systems
  • Implemented Dimensional model for the Data Mart and responsible for generating DDL scripts using Erwin.
  • Involved in the complete SSIS life cycle: creating SSIS packages and building, deploying, and executing the packages in all environments.
  • Created SQL scripts for database modification and performed multiple data modeling tasks at the same time under tight schedules.
  • Loaded data into Hive tables from the Hadoop Distributed File System (HDFS) to provide SQL-like access to Hadoop data.
  • Created mappings, mapplets, sessions, and workflows using Informatica to replace the existing stored procedures.
  • Developed Data mapping, Data Governance, Transformation and Cleansing rules for the Data Management.
  • Designed and created Data Marts as part of a data warehouse.
  • Developed Hive queries to process the data for visualization.
  • Repopulated the static tables in the data warehouse using PL/SQL procedures and SQL*Loader.
  • Translated business concepts into XML vocabularies by designing XML schemas with UML.
  • Involved in the design and development of user interfaces and customization of Reports using Tableau.
  • Created the data model for the Subject Area in the Enterprise Data Warehouse (EDW).
  • Involved with data profiling for multiple sources and answered complex business questions by providing data to business users.
  • Developed reports for users in different departments in the organization using SQL Server Reporting Services (SSRS).
  • Performed dicing and slicing of data using pivot tables to identify the churn-rate pattern, and prepared reports as required (see the sketch at the end of this section).

Environment: Erwin 9.7, Redshift, Agile, MDM, Oracle 12c, SQL, HBase 1.1, UNIX, NoSQL, OLAP, OLTP, SSIS, Informatica, HDFS, Hive, XML, PL/SQL, Tableau, SSRS.
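
A minimal sketch of the pivot-table slicing used to surface the churn-rate pattern, as referenced above; the extract file and column names are illustrative:

```python
# Hypothetical sketch: churn rate by month and plan via a pandas pivot table.
import pandas as pd

activity = pd.read_csv("monthly_activity.csv")  # assumed customer-month extract

churn = pd.pivot_table(
    activity,
    values="churned",  # 0/1 churn flag per customer-month row
    index="month",
    columns="plan",
    aggfunc="mean",    # mean of a 0/1 flag is the churn rate
)
print(churn.round(3))
```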

Confidential

Data Analyst/Data Modeler

Responsibilities:

  • Worked effectively in a Data Analyst/Data Modeler role, reviewing business requirements and composing source-to-target data mapping documents.
  • Used Sybase Power Designer tool for relational database and dimensional data warehouse designs.
  • Extensively used star schema methodologies in building and designing logical and physical dimensional data models.
  • Designed and developed databases for OLTP and OLAP Applications.
  • Used SQL Profiler and Query Analyzer to optimize DTS package queries and stored procedures.
  • Created SSIS packages to populate data from various data sources.
  • Extensively performed data analysis using Python pandas (see the sketch at the end of this section).
  • Extensively involved in the modeling and development of Reporting Data Warehousing System.
  • Developed reports and visualizations using Tableau Desktop as per the requirements.
  • Extensively used Informatica tools- Source Analyzer, Warehouse Designer, Mapping Designer.
  • Involved in the migration and conversion of reports from SSRS.
  • Migrated data from SQL Server to Oracle and uploaded as XML-Type in Oracle Tables.
  • Performed Data Dictionary mapping, extensive database modeling (Relational and Star schema) utilizing Sybase Power Designer.
  • Involved in performance tuning and monitoring of both T-SQL and PL/SQL blocks.
  • Worked on creating DDL, DML scripts for the data models.
  • Created volatile and global temporary tables to load large volumes of data into the Teradata database.
  • Involved in designing Business Objects universes and creating reports.
  • Provided production support to resolve user issues for applications using Excel VBA.
  • Worked with Tableau in analysis and creation of dashboard and user stories.
  • Involved in extensive data validation using SQL queries and back-end testing.
  • Performed gap analysis of current state versus desired state and documented requirements to control the gaps identified.

Environment: Sybase Power Designer, SQL, PL/SQL, Teradata 14, Oracle 11g, XML, Tableau, OLAP, OLTP, SSIS, Informatica, SSRS, Python, T-SQL, Excel.
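
A minimal sketch of the pandas-based data analysis and validation described above; the file and column names are assumptions:

```python
# Hypothetical sketch: basic data-quality profile of a source extract.
import pandas as pd

df = pd.read_csv("source_extract.csv", parse_dates=["created_dt"])

profile = pd.DataFrame({
    "nulls": df.isna().sum(),                       # null count per column
    "null_pct": (df.isna().mean() * 100).round(2),  # null percentage
    "distinct": df.nunique(),                       # cardinality per column
})
print(profile)
print("duplicate rows:", df.duplicated().sum())
print("future-dated rows:", (df["created_dt"] > pd.Timestamp.today()).sum())
```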
