
Sr. Data Engineer Resume

Eden Prairie, MN

SUMMARY

  • Over 7 years of IT experience with legacy database systems and data model implementation and maintenance.
  • Experience in designing data pipelines that capture data from streaming web sources as well as RDBMS sources.
  • Good experience in Data Modeling and Data Analysis; proficient in gathering business requirements and handling requirements management.
  • Experience in developing conceptual and logical models and physical database designs for OLTP and OLAP systems using ER/Studio, Erwin and Sybase Power Designer.
  • Good knowledge of Software Development Life Cycle (SDLC) methodologies like Waterfall and Agile.
  • Effective in using the open-source languages Python and Scala.
  • Experience in building on the Azure stack (ADW, Data Factory, Cosmos DB, Event Hubs, Stream Analytics).
  • Experience in data management and implementation of Big Data applications using Spark and Hadoop frameworks.
  • Highly proficient with AWS resources such as EMR, S3 buckets, EC2 instances, RDS and others.
  • Expert in writing and optimizing SQL queries in Oracle and SQL Server.
  • Responsible for data cleansing and transformation of time-series data with PySpark (a minimal sketch follows this list).
  • Working knowledge of importing and exporting data into HDFS and Hive using Sqoop.
  • Excellent experience with Big Data technologies (e.g., Hadoop, BigQuery, Hive, HBase).
  • Solid in-depth understanding of information security, data modeling and RDBMS concepts.
  • Experience in data analysis using Hive and Pig Latin.
  • Expertise in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatch.
  • Extensive experience in development of T-SQL, Oracle PL/SQL Scripts, Stored Procedures and Triggers for business logic implementation.
  • Expertise in SQL Server Analysis Services (SSAS) and SQL Server Reporting Services (SSRS) tools.
  • Excellent experience in troubleshooting test scripts, SQL queries, ETL jobs, data warehouse/data mart/data store models.
  • Experience in Data transformation, Data mapping from source to target database schemas, Data Cleansing procedures.
  • Experience in writing and executing unit, system, integration and UAT scripts in data warehouse projects.
  • Involved in analysis, development and migration of stored procedures, triggers, views and other related database objects.
  • Strong experience in using Excel and MS Access to load and analyze data based on business needs.
  • Performed extensive data profiling and analysis to detect and correct inaccurate data in databases and to track data quality.
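
The following is a minimal PySpark sketch of the time-series cleansing and transformation work noted above; the table name (sensor_readings) and the columns (sensor_id, reading_ts, value) are illustrative assumptions, not details from an actual engagement.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("timeseries-cleansing").getOrCreate()

    # Hypothetical Hive table of raw sensor readings.
    raw = spark.table("sensor_readings")

    cleaned = (
        raw
        .withColumn("reading_ts", F.to_timestamp("reading_ts"))  # normalize timestamps
        .dropDuplicates(["sensor_id", "reading_ts"])              # remove duplicate readings
        .filter(F.col("value").isNotNull())                       # drop missing measurements
    )

    # Example downstream transformation: hourly averages per sensor.
    hourly = (
        cleaned
        .groupBy("sensor_id", F.date_trunc("hour", "reading_ts").alias("reading_hour"))
        .agg(F.avg("value").alias("avg_value"), F.count("*").alias("reading_count"))
    )

    hourly.write.mode("overwrite").saveAsTable("sensor_readings_hourly")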

TECHNICAL SKILLS

Big Data tools: Hadoop 3.0, HDFS, Hive 2.3, Pig 0.17, Scala, HBase 1.2, Sqoop 1.4, Kafka 1.1, Oozie 4.3

Data Modeling Tools: Erwin 9.8/9.7, Sybase Power Designer, Oracle Designer, ER/Studio V17

Database Tools: Oracle 12c/11g, Teradata 15/14

ETL Tools: SSIS, Informatica v10.

Reporting Tools: SSRS, Tableau, Crystal Reports

Project Execution Methodologies: JAD, Agile, SDLC.

Programming Languages: SQL, T-SQL, Python, UNIX shell scripting, PL/SQL.

Operating Systems: Microsoft Windows 10/8/7, UNIX, and Linux

PROFESSIONAL EXPERIENCE

Confidential - Eden Prairie, MN

Sr. Data Engineer

Responsibilities:

  • Worked on importing and exporting data from Oracle and Teradata into HDFS and HIVE using Sqoop.
  • Extensively created data pipelines in the cloud using Azure Data Factory.
  • Involved in all phases of the Software Development Life Cycle (SDLC) throughout the project.
  • Migrated SQL databases to Azure Data Lake, Azure SQL Database, Databricks and Azure SQL Data Warehouse, and controlled and granted database access.
  • Resolved recurring issues by conducting and participating in JAD sessions with the users, modelers and developers.
  • Built the model on the Azure platform using Python and Spark for model development and Plotly Dash for visualizations.
  • Created dimensional model for the reporting system by identifying required dimensions and facts using Erwin.
  • Exported analyzed data from HDFS using Sqoop for generating reports.
  • Developed solutions using Spark SQL, Spark Streaming and Kafka to process web feeds and server logs (a hedged sketch follows this list).
  • Worked with the PySpark framework.
  • Automated data processing with Oozie, scheduling data loads into the Hadoop Distributed File System.
  • Used Sqoop to move data between RDBMS sources and HDFS.
  • Involved in running Hadoop streaming jobs to process terabytes of text data.
  • Involved in the design of the data warehouse using star schema methodology and converted data from various sources into SQL tables.
  • Implemented an Azure cloud solution using HDInsight, Event Hubs, Cosmos DB, Cognitive Services and Key Vault.
  • Implemented business logic using Pig scripts and joins to transform data into AutoZone custom formats.
  • Imported bulk data into HBase using MapReduce programs.
  • Involved in normalization/de-normalization, normal forms and database design methodology.
  • Maintained PL/SQL objects like packages, triggers, procedures etc.
  • Used Sqoop and Azure Data Factory to move data from HDFS to relational database systems and vice versa.
  • Involved in several facets of MDM implementations including Data Profiling, Metadata acquisition and data migration.
  • Worked with ETL team members to implement data acquisition logic and resolve data defects.
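
A hedged sketch of the Spark Structured Streaming plus Kafka pattern referenced in this list; the broker address, the server-logs topic, the log schema and the output paths are illustrative assumptions, not project specifics.

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import StructType, StructField, StringType, TimestampType

    spark = SparkSession.builder.appName("log-stream").getOrCreate()

    # Assumed JSON layout of a server-log event.
    log_schema = StructType([
        StructField("host", StringType()),
        StructField("url", StringType()),
        StructField("status", StringType()),
        StructField("event_time", TimestampType()),
    ])

    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker1:9092")   # hypothetical brokers
           .option("subscribe", "server-logs")                  # hypothetical topic
           .load())

    logs = (raw
            .select(F.from_json(F.col("value").cast("string"), log_schema).alias("log"))
            .select("log.*"))

    # Count requests per host in 5-minute windows and land the results in HDFS as Parquet.
    counts = (logs
              .withWatermark("event_time", "10 minutes")
              .groupBy(F.window("event_time", "5 minutes"), "host")
              .count())

    query = (counts.writeStream
             .outputMode("append")
             .format("parquet")
             .option("path", "/data/logs/host_counts")              # hypothetical HDFS path
             .option("checkpointLocation", "/checkpoints/host_counts")
             .start())
    query.awaitTermination()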

Environment: Erwin 9.8, Oracle 12c, Teradata 15, HDFS, Hive 2.3, Sqoop 1.4, Azure Data Factory, Spark SQL, Kafka 1.1, Oozie 4.3, Hadoop 3.0, Cosmos DB, Pig 0.17, MapReduce, HBase 1.2, PL/SQL, MDM, ETL, Python.

Confidential - Franklin Lakes, NJ

Data Engineer

Responsibilities:

  • Translated business concepts into XML vocabularies by designing XML Schemas with UML
  • Involved in Agile methodologies, daily scrum meetings and sprint planning.
  • Used AWS's secure global infrastructure and its range of features to secure data in the cloud.
  • Communicated and presented default customer profiles along with reports generated using Python.
  • Worked extensively on ER Studio in several projects in both OLAP and OLTP applications.
  • Created and ran Sqoop jobs with incremental loads to populate Hive external tables.
  • Extracted the data from Oracle into HDFS using Sqoop.
  • Created customized reports using OLAP tools such as Crystal Reports for business use.
  • Involved in managing and reviewing Hadoop log files.
  • Worked on Data Mining and data validation to ensure the accuracy of the data between the warehouse and source systems.
  • Facilitated in developing testing procedures, test cases and User Acceptance Testing (UAT)
  • Wrote MapReduce jobs to discover trends in data usage by users (a hedged mapper/reducer sketch follows this list).
  • Wrote Hive queries for data analysis to meet business requirements.
  • Exported data using Sqoop from HDFS to Teradata on a regular basis.
  • Worked on Normalization and De-normalization concepts and design methodologies.
  • Wrote T-SQL statements for retrieval of data and Involved in performance tuning of T-SQL queries and Stored Procedures.
  • Worked with and transformed structured, semi-structured and unstructured data and loaded it into HBase.
  • Participated in the creation of Business Objects Universes using complex and advanced database features.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Reviewed Complex ETL Mappings and Sessions based on business user requirements.
  • Used Excel sheets, flat files and CSV files to generate Tableau ad-hoc reports.
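
A hedged Python sketch of the MapReduce-style usage aggregation mentioned in this list, written as a Hadoop Streaming mapper/reducer pair; the tab-delimited input layout (user_id, bytes_read) is an assumption.

    import sys

    def mapper():
        # Emit (user_id, bytes_read) pairs from tab-delimited log records.
        for line in sys.stdin:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue  # skip malformed records
            print(f"{fields[0]}\t{fields[1]}")

    def reducer():
        # Keys arrive sorted, so per-user usage can be summed in a single pass.
        current_user, total = None, 0
        for line in sys.stdin:
            user_id, bytes_read = line.rstrip("\n").split("\t")
            if user_id != current_user and current_user is not None:
                print(f"{current_user}\t{total}")
                total = 0
            current_user = user_id
            total += int(bytes_read)
        if current_user is not None:
            print(f"{current_user}\t{total}")

    if __name__ == "__main__":
        # e.g. hadoop-streaming -mapper "usage_mr.py map" -reducer "usage_mr.py reduce"
        (reducer if len(sys.argv) > 1 and sys.argv[1] == "reduce" else mapper)()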

Environment: ER Studio, XML, Python, OLAP, OLTP, AWS, HDFS, Sqoop, Hadoop, MapReduce, Teradata, T-SQL, HBase, Hive, ETL.

Confidential - Rockville, MD

Data Analyst/Data Engineer

Responsibilities:

  • Used Python to preprocess data and explore it for insights (a minimal pandas sketch follows this list).
  • Connected directly to AWS Redshift through Tableau to extract live data for real-time analysis.
  • Worked at conceptual/logical/physical data model level using Erwin according to requirements.
  • Created customized reports using OLAP tools such as Crystal Reports for business use.
  • Defined the primary keys (PKs) and foreign keys (FKs) for the entities and created dimensional models (star and snowflake schemas) using the Kimball methodology.
  • Worked on debugging and identifying unexpected real-time issues in the production-server SSIS packages.
  • Involved in loading data from UNIX file system to HDFS.
  • Created multiple automated reports and dashboards sourced from data warehouse using Tableau/SSRS
  • Designed SSIS Packages to transfer data from flat files to SQL database tables.
  • Developed the batch program in PL/SQL for OLTP processing and used UNIX shell scripts to run it via crontab.
  • Developed and maintained data dictionary to create metadata reports for technical and business purpose.
  • Worked on importing and cleansing high-volume data from various sources such as Teradata, Oracle and flat files.
  • Worked on Informatica utilities: Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer and Transformation Developer.
  • Wrote complex SQL queries for validating data against different kinds of reports generated by Business Objects XIR.
  • Created various PL/SQL stored procedures for dropping and recreating indexes on target tables.
  • Created data masking mappings to mask the sensitive data between production and test environment.
  • Collected, analyzed and interpreted complex data for reporting and/or performance trend analysis.
  • Wrote T-SQL procedures and views to facilitate data access for reporting needs.
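
A minimal pandas sketch of the kind of Python preprocessing and trend exploration described in this list; the file name (claims.csv) and the column names are hypothetical.

    import pandas as pd

    df = pd.read_csv("claims.csv", parse_dates=["claim_date"])

    # Basic cleansing: drop exact duplicates, standardize text fields, fill missing amounts.
    df = df.drop_duplicates()
    df["status"] = df["status"].str.strip().str.upper()
    df["claim_amount"] = df["claim_amount"].fillna(0)

    # Simple exploratory summary to look for trends by month and status.
    monthly = (df
               .assign(claim_month=df["claim_date"].dt.to_period("M"))
               .groupby(["claim_month", "status"])["claim_amount"]
               .agg(["count", "sum", "mean"]))
    print(monthly.head(20))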

Environment: Erwin, Redshift, Python, OLAP, UNIX, HDFS, PL/SQL, OLTP, SQL, SSRS, SSIS, Tableau, Teradata, Oracle, Informatica, T-SQL.

Confidential - Sparks, NV

Data Analyst/Data Modeler

Responsibilities:

  • Performed data analysis and profiling of source data to better understand the sources (a lightweight profiling sketch follows this list).
  • Created ER diagrams using Power Designer modeling tool for the relational and dimensional data modeling.
  • Involved in writing, testing, and implementing triggers, stored procedures and functions at Database level using PL/SQL.
  • Used a forward-engineering approach for designing and creating databases for the OLAP model.
  • Ensured the feasibility of the logical and physical design models.
  • Provided investigation and root cause analysis support for operational issues.
  • Designed a star schema for the detailed data marts and plan data marts involving conformed dimensions.
  • Created tables, views, sequences, indexes, constraints and generated SQL scripts for implementing physical data model.
  • Created ETL jobs and custom transfer components to move data from Oracle source systems to SQL Server using SSIS.
  • Worked on snowflaking the dimensions to remove redundancy.
  • Designed and Developed Oracle PL/SQL and Shell Scripts, Data Import/Export, Data Conversions and Data Cleansing.
  • Created or modified the T-SQL queries as per the business requirements.
  • Used Teradata utilities (FastLoad, MultiLoad, TPump) to load data.
  • Generated Sub-Reports, Cross-tab, Conditional, Drill down reports, Drill through reports and Parameterized reports using SSRS.
  • Used data profiling tools and techniques to ensure data quality for data requirements.
  • Used Normalization methods up to 3NF and De-normalization techniques for effective performance in OLTP systems.
  • Designed and produced client reports using Access, Tableau and SAS.
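
A lightweight sketch of source-data profiling along the lines described in this list, using pandas; the extract file name (source_extract.csv) is a hypothetical stand-in for an actual source table.

    import pandas as pd

    df = pd.read_csv("source_extract.csv")

    # Column-level profile: data type, null percentage, distinct count and sample values.
    profile = pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_pct": (df.isna().mean() * 100).round(2),
        "distinct": df.nunique(),
        "sample_values": pd.Series({c: df[c].dropna().unique()[:3].tolist() for c in df.columns}),
    })
    print(profile)

    # Numeric columns: min/max/mean help spot outliers and bad loads.
    print(df.describe().T)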

Environment: Power Designer, OLAP, PL/SQL, ETL, Oracle, T-SQL, Teradata, SSRS, SQL, SAS, Access, Tableau, OLTP.
