
Data Engineer Resume


Bothell, WA

PROFESSIONAL SUMMARY:

  • More than 7 years of IT experience with special emphasis on Analysis, Design, Development and Testing of ETL methodologies across all phases of Data Warehousing.
  • 5 years of strong experience in designing and implementing Data Mart / Data Warehouse applications using various ETL tools.
  • Expert in writing and optimizing SQL queries in Oracle, MS SQL Server, Netezza and Teradata.
  • Hands-on experience implementing Slowly Changing Dimension (Type I, II and III) methodologies, Incremental Loads and Change Data Capture (CDC); an SCD Type II sketch appears at the end of this summary.
  • In-depth understanding of Star Schema, Snowflake Schema, Normalization (1NF, 2NF, 3NF), Fact tables and Dimension tables.
  • Experience in optimizing and performance tuning of Mappings and implementing complex business rules by creating reusable Transformations, Mapplets and Tasks.
  • Hands-on expertise in Data Warehouse programming concepts such as SQL Server Stored Procedures, PL/SQL, Tableau, Teradata, JavaScript and HTML.
  • Knowledge of Teradata BTEQ, FastLoad, FastExport and MultiLoad scripts.
  • Queried Vertica and SQL Server for data validation and developed validation worksheets in Excel to validate Tableau dashboards.
  • Extensively used SQL and PL/SQL for development of Procedures, Functions, Packages and Triggers.
  • Good knowledge of Normalization, Fact tables and Dimension tables, as well as OLAP and OLTP systems.
  • Experience in implementing Data Warehouse, Data Mart, ODS, OLTP and OLAP solutions aligned with project scope. Familiar with Top-Down and Bottom-Up Data Warehouse approaches.
  • Experience working with data modelers to translate business rules/requirements into conceptual/logical dimensional models, and worked with complex denormalized and normalized data models.
  • Extensive experience with the Informatica ETL tool: designing Workflows, Worklets, Tasks and Mappings, and scheduling and monitoring Workflows and sessions using Informatica PowerCenter 9.1/8.x/7.x.
  • Experienced with Tableau Desktop and Tableau Server, with a good understanding of Tableau architecture.
  • Excellent understanding and knowledge of NoSQL databases such as HBase and Cassandra.
  • Expert knowledge in real time data analytics using Apache Storm.
  • Expertise in Java/J2EE technologies such as Core Java, Spring, Hibernate, JDBC, JSON, HTML, Struts, Servlets, JSP, JBoss and JavaScript.
  • Experience in designing both time driven and data driven automated workflows using Oozie.
  • Experience in ETL operations using Sqoop, Hive, Pig, HBase, Teradata and Tableau.
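
The following is a minimal sketch of the SCD Type II pattern referenced above, in Oracle-style SQL. The dim_customer and stg_customer tables, their columns and the dim_customer_seq sequence are hypothetical placeholders, not objects from any project listed below.

  -- Step 1: expire the current row for any customer whose tracked attributes changed.
  UPDATE dim_customer d
     SET d.effective_end_dt = CURRENT_DATE,
         d.current_flag     = 'N'
   WHERE d.current_flag = 'Y'
     AND EXISTS (SELECT 1
                   FROM stg_customer s
                  WHERE s.customer_id = d.customer_id
                    AND (s.customer_name <> d.customer_name
                         OR s.customer_segment <> d.customer_segment));

  -- Step 2: insert a new current version for changed customers (expired above,
  -- so they no longer have a current row) and for brand-new customers.
  INSERT INTO dim_customer
         (customer_key, customer_id, customer_name, customer_segment,
          effective_start_dt, effective_end_dt, current_flag)
  SELECT dim_customer_seq.NEXTVAL, s.customer_id, s.customer_name, s.customer_segment,
         CURRENT_DATE, DATE '9999-12-31', 'Y'
    FROM stg_customer s
    LEFT JOIN dim_customer d
           ON d.customer_id = s.customer_id
          AND d.current_flag = 'Y'
   WHERE d.customer_id IS NULL;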

TECHNICAL SKILLS:

Data Warehousing: Informatica PowerCenter, PowerConnect, PowerExchange, Informatica PowerMart, Informatica Web Services, Vertica, Informatica MDM 10.1/9.x, OBIEE 11g/10g, Oracle Data Integrator 12c/11g, OBIA/BI Apps 11g/7.9.6.x/7.9.5, Oracle Data

Business Tools: Tableau 8.X/9.X, Business Objects XI R2, OLAP/OLTP, Power BI, Cognos 8, MS Access

Big Data: Hadoop, MapReduce 1.0/2.0, Pig, Hive, HBase, Sqoop, Oozie, Zookeeper, Kafka, Spark, Flume, Storm, Impala, Scala, Mahout, Hue, Tez, HCatalog, Cassandra

Databases and Related Tools: DB2, MySQL, Vertica, MongoDB, Oracle 10g/9i/8i/8/7.x, MS SQL Server 2012/2008, Teradata, Netezza, Sybase ASE, PL/SQL, Hive, T-SQL, NoSQL, HDFS, TOAD 8.5.1/7.5/6.2, DB2 UDB

Languages: Java/J2EE, Scala, Python, HTML, SQL, Spring, Hibernate, JDBC, JavaScript, PHP

Operating System: Mac OS, Unix, Linux (Various Versions), Windows 2003/7/8/8.1/XP

Web Development: HTML, JavaScript, XML, PHP, JSP, Servlets

Application Servers: Apache Tomcat, WebLogic, WebSphere

Tools: Eclipse, NetBeans

EXPERIENCE:

Confidential, Bothell, WA

DATA ENGINEER

Responsibilities:

  • Collaborating with business teams and data owners to implement and maintain enterprise-level business analytics and Data Warehousing solutions.
  • Developed custom ETL solutions and batch and real-time data ingestion pipelines to move data into and out of Hadoop.
  • Developed a custom aggregation framework in PySpark and PySQL to aggregate the data.
  • Currently working on a Teradata to HP Vertica data migration project, working extensively with the Vertica COPY command to load data from extract files into Vertica (a COPY sketch appears after this list). Monitored ETL jobs and validated the data loaded into the Vertica DW.
  • Analyzed extracted data across source, stage and load files. Analyzed long-running queries against the Vertica DB and optimized Vertica projection segmentation, table partitioning and Tuple Mover moveout and mergeout tasks.
  • Analysis, architecture, design, development and implementation of ETL processes, data migration, data conversion, metadata management, reference data management modules on Huntsman EHS data as per requirements using SQL, PL/SQL, Python and Java.
  • Developed SQL queries, PL/SQL programming Packages, Procedures, and Functions to meet various user/business requirements.
  • Expertise in Python: analyzed existing SQL scripts and designed the solution for implementation in PySpark; developed a custom ETL pipeline in Python.
  • Development and testing of Extract, Transformation and Loading (ETL), data management and data quality solution modules based on design specifications.
  • Experience working with the Hive data warehouse tool: creating tables, distributing data through partitioning and bucketing, and writing and optimizing Hive queries. Built a real-time pipeline for streaming data using Kafka and Spark Streaming.
  • Experienced with NoSQL databases such as HBase, MongoDB and Cassandra; wrote a Storm topology to accept events from a Kafka producer and emit them into Cassandra.
  • Participated in creating data quality procedures and data profiles for member and claims data before loading into EDW batch tables. Worked on creating batch jobs and scripts that load data using TPT and FastExport utilities.
  • Experience in Dimensional Modelling (Star and Snowflake schemas) and Physical Modelling, i.e. developing Metadata, Mappings, Tasks and Workflows and migrating data from source systems into the Data Warehouse environment.
  • Hands-on experience with vsql and Vertica for processing large data sets.
  • Involved in planning, designing and developing the reusable ETL components.
  • Involved in Unit testing, System testing and User Acceptance Testing.
  • Identified and gathered requirements for Business Objects reports. Built, tested, enhanced and deployed BI reports and dashboards using Crystal Reports, Webi, BOE platform tools and Tableau.
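
Below is a minimal sketch of the Vertica loading and projection-tuning work described in this list. The stage.orders_stg table, its columns and the file paths are hypothetical placeholders.

  -- Bulk load delimited extract files directly into ROS storage; rejected rows
  -- and exceptions are written to side files for validation.
  COPY stage.orders_stg
  FROM '/data/extracts/orders_*.dat'
  DELIMITER '|'
  NULL ''
  REJECTED DATA '/data/rejects/orders_stg.rej'
  EXCEPTIONS '/data/rejects/orders_stg.exc'
  DIRECT;

  -- Example projection sorted on the common join/filter keys and hash-segmented
  -- across all nodes so the data is spread evenly.
  CREATE PROJECTION stage.orders_stg_p1 AS
  SELECT order_id, customer_id, order_dt, order_amt
  FROM stage.orders_stg
  ORDER BY customer_id, order_dt
  SEGMENTED BY HASH(order_id) ALL NODES;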

Environment: Hadoop, Hive, Apache Spark, Apache Kafka, Apache Cassandra, HBase, SQL, Sqoop, Flume, Oozie, Java (JDK 1.6), Eclipse, Tableau, Teradata 13.x, Teradata SQL Assistant.

Confidential, Chicago, IL

DATA ENGINEER

Responsibilities:

  • Analysis, Architecture, Design, Development and implementation of Data warehouse and enterprise application development projects based on a provided set of business requirements.
  • Worked in three different profiles, including Hadoop ETL reporting, Mainframe job monitoring and scheduling, and user administration. Proposed and developed a backup and recovery architecture for recurring ETL (Extract, Transform and Load) reports.
  • Developed SQL queries, PL/SQL programming Packages, Procedures, and Functions to meet various user/business requirements. Worked on multiple ETL tools to transform the data from Oracle, Mainframe, DB2, Flat file to target Oracle, Netezza & Teradata on a large Data Warehouse.
  • Designed and implemented appropriate ETL mappings to extract and transform data from various sources to meet requirements.
  • Customized and developed the OBIEE Physical Layer, Business Model and Mapping layer.
  • Used the Teradata Aster bulk load feature to bulk load flat files into Aster. Used Aster UDFs to unload data from staging tables and client data for SCD processing that resided in the Aster database.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
  • Experience using Sqoop to migrate data between HDFS and MySQL or Oracle; deployed Hive and HBase integration to perform OLAP operations on HBase data (a mapping sketch appears after this list).
  • Assisted with batch processes using FastLoad, BTEQ, UNIX shell and Teradata SQL to transfer, clean up and summarize data.
  • Designed and published visually rich and intuitive Tableau dashboards for executive decision making. Created various views in Tableau such as tree maps, heat maps, scatter plots, geographic maps, line charts and pie charts.
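
The sketch below illustrates the kind of Hive-over-HBase mapping used for the OLAP-style queries mentioned in this list. The claims_hbase table, the underlying HBase table name and the column family mapping are hypothetical placeholders.

  -- External Hive table backed by an HBase table via the HBase storage handler.
  CREATE EXTERNAL TABLE claims_hbase (
    claim_id     STRING,
    member_id    STRING,
    claim_amount DOUBLE,
    service_dt   STRING
  )
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES (
    "hbase.columns.mapping" = ":key,d:member_id,d:claim_amount,d:service_dt"
  )
  TBLPROPERTIES ("hbase.table.name" = "claims");

  -- OLAP-style rollup over the HBase-backed data from Hive.
  SELECT member_id,
         substr(service_dt, 1, 7) AS service_month,
         SUM(claim_amount)        AS total_claim_amount
    FROM claims_hbase
   GROUP BY member_id, substr(service_dt, 1, 7);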

Environment: Hadoop, Hive, Apache Spark, Tableau, MySQL, Apache Mesos, UNIX Shell Scripting, HBase, Teradata SQL, Flume, Oozie, DB2, Teradata 13.x, Teradata SQL Assistant, Toad, Windows XP, MS Office Suite and delimited flat files.

Confidential, SFO, CA

ETL/INFORMATICA DEVELOPER

Responsibilities:

  • Extensive experience with Data Extraction, Transformation and Loading (ETL) from heterogeneous data sources across multiple relational databases such as Oracle, Netezza, Teradata, DB2, SQL Server and MS Access; worked on integrating data from flat files (fixed-width, delimited, CSV and XML) into a common reporting and analytical data model using Informatica.
  • Experience with Teradata utilities such as FastLoad, FastExport, MultiLoad, TPump and TPT, and in creating BTEQ scripts (a FastLoad sketch appears after this list). Strong knowledge of OBIEE as a Business Intelligence tool and of data extraction using Informatica as the ETL tool.
  • Used UNIX commands and UNIX shell scripting to interact with the server, move flat files and load the files onto the server.
  • Worked extensively on tuning the current ETL processes to improve performance by implementing database partitioning, increasing block size and data cache size, and using SQL overrides.
  • Worked on tuning ETL dimension and fact table loads by applying optimization techniques and tuning mappings and database objects, improving performance, availability and throughput.
  • Experience in optimizing and performance tuning of Mappings and implementing the complex business rules by creating re-usable transformations, Mapplets and Tasks.
  • Involved in designing, developing and documenting the ETL (Extract, Transformation and Load) strategy to populate the Data Warehouse from various source system feeds using Informatica and PL/SQL scripts.
  • Loaded data into the Teradata database using load utilities (FastExport, FastLoad and MultiLoad). Experience with Oracle utilities such as SQL*Loader and TOAD; worked extensively with PL/SQL, developing several scripts to handle different scenarios.
  • Extensively used transformations such as Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy, Sequence Generator and Stored Procedure.
  • Created and used Filters, Quick Filters, table calculations and parameters in Tableau reports. Published Tableau dashboards to Tableau Server.
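
A minimal FastLoad control script of the kind referenced above is sketched below; the logon string, staging table, column layout and file path are hypothetical placeholders. FastLoad utility commands wrap the embedded SQL INSERT.

  LOGON tdprod/etl_user,etl_password;
  DATABASE stg_db;

  /* Error tables must not exist before the load starts. */
  DROP TABLE stg_db.sales_stg_err1;
  DROP TABLE stg_db.sales_stg_err2;

  BEGIN LOADING stg_db.sales_stg
        ERRORFILES stg_db.sales_stg_err1, stg_db.sales_stg_err2
        CHECKPOINT 100000;

  SET RECORD VARTEXT "|";

  DEFINE order_id  (VARCHAR(18)),
         store_id  (VARCHAR(10)),
         sale_date (VARCHAR(10)),
         sale_amt  (VARCHAR(18))
  FILE = /data/inbound/sales_daily.dat;

  INSERT INTO stg_db.sales_stg (order_id, store_id, sale_date, sale_amt)
  VALUES (:order_id, :store_id, :sale_date, :sale_amt);

  END LOADING;
  LOGOFF;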

Environment: Informatica PowerCenter 9.1, Informatica Data Studio, Windows, Netezza, DbVisualizer, Aginity Workbench for Netezza, PuTTY, WinSCP, NZSQL, Teradata 13.x, Teradata SQL Assistant, Tableau, UNIX Shell Scripting, PL/SQL, Oracle 11g/10g/9i, MS SQL Server, Toad, HP Quality Center, Windows XP, MS Office Suite and delimited flat files.

Confidential

Oracle PL/SQL Developer

Responsibilities:

  • Worked for commerce clients on designing and developing databases for POS systems. The system was developed to maintain information about material availability, finished goods, delivery details, purchases, vendor information and catalog information.
  • Created PL/SQL Procedures, Packages and Functions for the billing module (a procedure sketch appears after this list).
  • Developed Forms and Reports.
  • Developed custom reports for various modules as per client requirements.
  • Query optimization and tuning of SQL queries.
  • Unit Testing on Forms and Reports and PL/SQL Stored Procedures, Functions, Triggers, Packages.
  • Developed Oracle Stored procedures and database objects for the whole billing database.
  • Created Stored Procedures, Functions and Triggers using SQL*Plus to be invoked by shell scripts and forms.
  • Tuned the billing module to improve performance as the number of transactions grew.
  • Developed Oracle Reports to show users various employee reports covering salary slabs, promotions, tax, loans and advances.
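
A minimal sketch of the kind of PL/SQL billing procedure described above is shown below; the billing_invoice and billing_line_item tables, their columns and the apply_late_fee procedure name are hypothetical placeholders.

  CREATE OR REPLACE PROCEDURE apply_late_fee (
      p_invoice_id IN billing_invoice.invoice_id%TYPE,
      p_fee_pct    IN NUMBER DEFAULT 1.5
  ) AS
      v_balance billing_invoice.balance_due%TYPE;
  BEGIN
      -- Lock the invoice row and read the outstanding balance.
      SELECT balance_due
        INTO v_balance
        FROM billing_invoice
       WHERE invoice_id = p_invoice_id
         FOR UPDATE;

      -- Record the late fee as a line item and roll it into the balance.
      INSERT INTO billing_line_item (invoice_id, item_type, amount, created_dt)
      VALUES (p_invoice_id, 'LATE_FEE', v_balance * p_fee_pct / 100, SYSDATE);

      UPDATE billing_invoice
         SET balance_due = balance_due + (v_balance * p_fee_pct / 100)
       WHERE invoice_id = p_invoice_id;

      COMMIT;
  EXCEPTION
      WHEN NO_DATA_FOUND THEN
          RAISE_APPLICATION_ERROR(-20001, 'Invoice not found: ' || p_invoice_id);
  END apply_late_fee;
  /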

Environment: Oracle 9i RDBMS Enterprise Edition, Microsoft Windows Server 2003, PL/SQL, Oracle 9i Application Server Enterprise Edition, SQL*Loader.
