
Senior Data Engineer Resume

Austin, Texas

SUMMARY

  • 10 years of professional IT experience in Data Warehousing, ETL (Extract, Transform, and Load) and Data Analytics.
  • 7 years of extensive experience in Informatica Power Center 10.x/9.x/8.x, including 3 years as Lead Developer responsible for supporting enterprise-level ETL architecture.
  • 2.5 years of experience in Hadoop 2.0. Led development of enterprise-level solutions utilizing Hadoop ecosystem tools such as Spark, MapReduce, Sqoop, Pig, Hive, HBase, Phoenix, Oozie, Flume, streaming JARs, custom SerDes, etc. Worked on proofs of concept with Kafka and Storm.
  • Experience with Hortonworks, Cloudera, and Amazon EMR Hadoop distributions.
  • Worked on Java EE 7 and 8. Developed ETL/Hadoop-related Java code, created RESTful APIs using the Spring Boot framework, built web applications using Spring MVC and JavaScript, developed coding frameworks, etc.
  • Well-versed in relational database management systems (RDBMS) including Oracle, MS SQL Server, MySQL, Teradata, DB2, Netezza, and MS Access. More than 5 years of experience in Teradata.
  • Proficient in SQL, T-SQL, BTEQ, and PL/SQL (Stored Procedures, Functions, Triggers, Cursors, and Packages).
  • Extensive experience in developing UNIX shell scripts, Perl, Windows batch scripts, JavaScript, and PowerShell to automate ETL processes.
  • Exposure to NoSQL databases such as MongoDB, HBase, and Cassandra. Created Java apps to handle data in MongoDB and HBase. Used Phoenix to provide a SQL layer on HBase.
  • Experience with Talend’s Data Integration, ESB, MDM and Big Data tools.
  • Exposure to HL7’s FHIR specification and its related Java API, HAPI. Created FHIR APIs/web services to store and manage resources in MongoDB.
  • Healthcare domain knowledge including Facets, CareAdvance, Care Analyzer, HL7, EDI, NCPDP, EMR, HEDIS, NCQA, URAC, etc.
  • Hands on experience in various open source Apache technologies such as NiFi, Hadoop, Avro, ORC, Parquet, Spark, HBase, Phoenix, Kite, Drill, Presto, Drools, Talend, Airflow, Falcon, Flume, Ranger, Ambari, Kafka, Oozie, ZooKeeper, Karaf, Camel, JMeter, etc.
  • Experience in Elasticsearch and MDM solutions.
  • Worked on message-oriented architecture with RabbitMQ and Kafka as message broker options. Used Talend ESB to exchange messages with AMQP and JMS clients.
  • Well-versed in version control and CI/CD tools such as SVN, Git, SourceTree, Bitbucket, etc.
  • Experience in Amazon Web Services (AWS) products including S3, EC2, EMR, and RDS.
  • Strong experience in design and development of Business Intelligence solutions using data modeling, dimensional modeling, ETL processes, data integration, OLAP, and client/server applications.
  • Extensive experience in agile software development methodology.

TECHNICAL SKILLS

RDBMS: Oracle, Teradata, DB2, SQL Server 2012/2008, MySQL, MS Access

NoSQL:  MongoDB, HBase, Cassandra

ETL Tools: Informatica Power Center / Power Exchange, Informatica Data Quality (IDQ), Talend, Apache NiFi

Languages: Java EE, Python, JavaScript

SQL:   ANSI SQL, T-SQL, PL/SQL, BTEQ

Scripting: UNIX Shell Scripting, Perl, Windows Batch Script, VB, PowerShell, YAML

Version Control / CI-CD: SVN, Git, SourceTree, Bitbucket, Bamboo, JIRA, Confluence

IDE: SQL*Plus, SQL Developer, TOAD, SQL Navigator, Query Analyzer, SQL Server Management Studio, SQL Assistant, Eclipse, Postman

Analytics/BI:  MicroStrategy, SPSS, IBM Cognos, OBIEE, Business Objects

Other: HTML, CSS, jQuery, Thymeleaf, XML, JSON, PHP, MS Visio, Erwin, Confluence, Spring Framework, SQL*Loader, Tidal, Oozie, SAS, R

PROFESSIONAL EXPERIENCE

Confidential, Austin, Texas

Senior Data Engineer

Responsibilities:

  • Developed Spark jobs in Java to perform ETL from SQL Server to Hadoop (a sketch appears after this list).
  • Designed HBase schema based on data access patterns. Used Phoenix to add a SQL layer on HBase tables and created indexes on Phoenix tables for optimization (a sketch appears after this list).
  • Integrated Spark with the Drools engine to implement a business rule management system.
  • Benchmarked compression algorithms and file formats (Avro, ORC, Parquet, and Sequence File) for Hive, MapReduce, Spark, and HBase.
  • Analyzed Stored Procedures to convert business logic into Hadoop jobs.
  • Used SSIS and SSRS for BI projects.
  • Worked on various POCs including Apache NiFi as a data flow/orchestration engine, Talend for ESB and Big Data solutions, Elasticsearch as an indexing engine for MDM, and SMART on FHIR with MongoDB for a FHIR-based app interface.
  • Used Cassandra to build a next-generation storage platform, employing CQL for data manipulation.
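
For illustration, a minimal sketch of the SQL Server-to-Hadoop extract described above, written as a Spark job in Java that reads over JDBC and lands the result as an ORC-backed Hive table. The connection string, table names, columns, and filter logic are placeholders, not the original project code:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class SqlServerToHadoopEtl {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("SqlServerToHadoopEtl")
                .enableHiveSupport()
                .getOrCreate();

        // Read the source table from SQL Server over JDBC (placeholder connection details).
        Dataset<Row> claims = spark.read()
                .format("jdbc")
                .option("url", "jdbc:sqlserver://sqlhost:1433;databaseName=stage_db")
                .option("dbtable", "dbo.claims")
                .option("user", "etl_user")
                .option("password", "****")
                .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
                .load();

        // Example transformation: keep finalized claims and drop audit columns.
        Dataset<Row> transformed = claims
                .filter("claim_status = 'FINAL'")
                .drop("created_by", "updated_by");

        // Land the result as an ORC-backed Hive table, overwriting on each run.
        transformed.write()
                .mode(SaveMode.Overwrite)
                .format("orc")
                .saveAsTable("edw.claims_final");

        spark.stop();
    }
}

For large source tables, the JDBC read would typically also set partitionColumn, lowerBound, upperBound, and numPartitions so the extract is parallelized across executors.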
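
Likewise, a hedged sketch of the Phoenix SQL layer over HBase, using Phoenix's JDBC driver from Java (the Phoenix client JAR must be on the classpath). The table, key design, index, and ZooKeeper quorum are hypothetical examples of the pattern, not the original schema:

import java.sql.Connection;
import java.sql.Date;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixOnHBaseExample {
    public static void main(String[] args) throws Exception {
        // Phoenix exposes HBase tables through a standard JDBC connection string
        // pointing at the ZooKeeper quorum (placeholder hosts below).
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk1,zk2,zk3:2181")) {
            conn.setAutoCommit(true);

            try (Statement stmt = conn.createStatement()) {
                // The composite primary key becomes the HBase row key, so it mirrors
                // the dominant access pattern (lookups by member, then date).
                stmt.execute("CREATE TABLE IF NOT EXISTS member_event ("
                        + " member_id VARCHAR NOT NULL,"
                        + " event_date DATE NOT NULL,"
                        + " event_type VARCHAR,"
                        + " payload VARCHAR"
                        + " CONSTRAINT pk PRIMARY KEY (member_id, event_date))");

                // Secondary index to serve lookups by event_type without a full scan.
                stmt.execute("CREATE INDEX IF NOT EXISTS idx_event_type"
                        + " ON member_event (event_type) INCLUDE (payload)");
            }

            // Phoenix uses UPSERT rather than INSERT.
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPSERT INTO member_event (member_id, event_date, event_type, payload)"
                    + " VALUES (?, ?, ?, ?)")) {
                ps.setString(1, "M1001");
                ps.setDate(2, new Date(System.currentTimeMillis()));
                ps.setString(3, "CLAIM_SUBMITTED");
                ps.setString(4, "{\"amount\": 125.50}");
                ps.executeUpdate();
            }

            // Queries on the indexed column are served by the secondary index.
            try (Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                         "SELECT member_id, event_date FROM member_event"
                         + " WHERE event_type = 'CLAIM_SUBMITTED'")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + " " + rs.getDate(2));
                }
            }
        }
    }
}
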
Confidential, Dayton, Ohio 

Senior ETL Developer / Lead Hadoop Developer

Responsibilities:

  • Led enterprise level ETL architecture design initiatives.
  • Developed Informatica mappings, sessions, and workflows to load transformed data into EDW from various source systems such as SQL Server, Teradata, and Flat Files.
  • Extracted data from heterogeneous sources and performed complex business logic on the data using Informatica transformations (e.g. Router, Normalizer, Lookup, Rank, Aggregator, and Union) and/or SQL scripts (e.g. BTEQ and T-SQL).
  • Designed a Batch Audit Process in batch/shell script to monitor each ETL job and report status, including table name, start and finish time, number of rows loaded, and job status.
  • Developed a general-purpose Change Data Capture (CDC) process based on an audit table for a standard incremental ETL process.
  • Supported production jobs and developed several automated processes to handle errors and notifications. Also tuned the performance of slow jobs through design improvements and configuration changes to Informatica code.
  • Led the company’s Hadoop initiatives for Big Data solutions. Played a key role in deploying Hadoop infrastructure to production.
  • Implemented a layered architecture for Hadoop to modularize the design. Developed framework scripts to enable quick development. Designed reusable shell scripts for Hive, Sqoop, and Pig jobs. Standardized error handling, logging, and metadata management processes. Developed a Batch Audit Process on the Hadoop gateway node.
  • Developed Disaster Recovery backup solution using DistCp utility.
  • Set up initial Java architecture for MapReduce and Spark codes.
  • Assisted in setting up real-time message consumption architecture in Hadoop from Tibco EMS using Apache Flume and Spark Streaming (a sketch appears below). Flattened canonical XML messages using a Hive custom serializer/deserializer (SerDe).
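
As a rough sketch of that consumption path (not the production code), Spark Streaming can receive events pushed by a Flume Avro sink via the spark-streaming-flume module and land the raw XML payloads in HDFS, where the custom Hive SerDe flattens them downstream. The host, port, and paths below are placeholders:

import java.nio.charset.StandardCharsets;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.flume.FlumeUtils;
import org.apache.spark.streaming.flume.SparkFlumeEvent;

public class FlumeXmlIngest {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("FlumeXmlIngest");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        // A Flume agent with an Avro sink pushes events to this receiver (placeholder host/port).
        JavaReceiverInputDStream<SparkFlumeEvent> events =
                FlumeUtils.createStream(jssc, "edge-node", 41414);

        // Extract the raw XML payload from each Flume event body.
        JavaDStream<String> xmlMessages = events.map(e ->
                StandardCharsets.UTF_8.decode(e.event().getBody()).toString());

        // Land each micro-batch in HDFS for downstream Hive processing.
        xmlMessages.foreachRDD((rdd, time) ->
                rdd.saveAsTextFile("hdfs:///landing/ems/xml-" + time.milliseconds()));

        jssc.start();
        jssc.awaitTermination();
    }
}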

Datafactz, Novi, MI (Feb 2012 - Oct 2013)

ETL Developer

Responsibilities:

  • Developed Stored Procedures using PL/SQL for ETL processes.
  • Designed and developed complex Informatica mappings. Used Normalizer, Lookup, Router, XML Source, Stored Procedure, etc. to meet the requirements.
  • Performed a Data Analyst role, preparing source-to-target mappings to populate data mart tables.
  • Assisted in data and dimension modeling.
Confidential, Edison, NJ

SQL Developer / Senior Developer

Responsibilities:

  • Worked as a SQL/ETL developer for multiple clients in different domains including healthcare, finance, and retail.
  • Wrote SQL scripts and Stored Procedures to perform business logic on data.
  • Used Informatica, Data Integrator, SSIS, SSRS, OBIEE, etc. as Business Intelligence tools.
