Data Engineer Resume Dallas, TX - Hire IT People

SUMMARY

7+ years of experience as a Data Engineer and Data Analyst, including designing, developing and implementing data models for enterprise-level applications and systems.
Full life cycle implementation experience of Big Data Pipelines.
Excellent understanding of the Software Development Life Cycle (SDLC), with good working knowledge of testing methodologies, disciplines, tasks, resources and scheduling.
Extensive knowledge of Big Data, Hadoop, MapReduce, Hive, NoSQL databases and other emerging technologies.
Good experience in building pipelines using Azure Data Factory and moving the data into Azure Data Lake Store.
Experience in developing MapReduce programs using Apache Hadoop to analyze big data as per requirements.
Good working experience using Sqoop to import data into HDFS from RDBMS and vice versa.
Experience in creating tables, constraints, views, and materialized views using ERwin, ER Studio, and SQL Modeler.
Extensive experience in Text Analytics, generating data visualizations using Python and creating dashboards using tools like Tableau.
Experience streaming data from cloud (AWS, Azure) and on-premises sources using Spark.
Hands-on experience in Normalization and De-Normalization techniques for optimum performance in relational and dimensional database environments.
Good experience in the Agile software delivery process using Scrum.
Excellent SQL programming skills; developed Stored Procedures, Triggers, Functions and Packages using SQL and PL/SQL.
Excellent knowledge of Ralph Kimball's and Bill Inmon's approaches to Data Warehousing.
Experience in working on Distributed storage for analysis and processing of large data sets using Apache Hadoop.
Expert in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatch.
Experience working with Teradata and batch processing data using distributed computing.
Excellent knowledge of, and extensive experience using, NoSQL databases (HBase).
Experience in designing and implementing data structures and using common business intelligence tools for data analysis.
Hands-on experience in data modeling with Star schema and Snowflake schema.
Experience using MS Excel and MS Access to extract and analyze data based on business needs.
Extensive experience with ETL and reporting tools such as SQL Server Integration Services (SSIS) and SQL Server Reporting Services (SSRS).
Good communication skills, a strong work ethic, and the ability to work efficiently in a team, with good leadership skills.

TECHNICAL SKILLS

Big Data Tools: HBase 1.2, Hive 2.3, Pig 0.17, HDFS, Sqoop 1.4, Kafka 1.0.1, Oozie 4.3, Hadoop 3.0, Spark

Methodologies: JAD, System Development Life Cycle (SDLC), Agile, Waterfall Model.

ETL Tools: Informatica 9.6/9.1 and Tableau.

Data Modeling Tools: Erwin Data Modeler 9.8, ER Studio v17, and Power Designer 16.6.

Databases: Oracle 12c, Teradata R15, MS SQL Server 2016, DB2.

Cloud Platform: AWS, Azure, Google Cloud, Cloud Stack/Open Stack

Programming Languages: SQL, PL/SQL, Python, UNIX shell Scripting

Operating System: Windows, Unix

PROFESSIONAL EXPERIENCE

Confidential - Dallas, TX

Data Engineer

Responsibilities:

Worked on Big Data requirement analysis, and developed and designed solutions for ETL and Business Intelligence platforms.
Involved in all phases of the SDLC, including Requirements Collection and Design & Analysis of the Customer Specifications from Business Analysts.
Developed data pipelines using Kafka and Storm to store data in HDFS and performed real-time analytics on the incoming data.
Worked on reading multiple data formats on HDFS using Python.
Performed Reverse Engineering of the current application using Erwin, and developed Logical and Physical data models for Central Model consolidation.
Implemented Custom Azure Data Factory pipeline Activities and SCOPE scripts.
Created SQL tables with referential integrity, constraints and developed queries using SQL, SQL*PLUS and PL/SQL.
Created Data Dictionary and Data Mapping from Sources to the Target in MDM Data Model.
Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts.
Imported the complete data from RDBMS to HDFS cluster using Sqoop.
Created customized reports using OLAP tools such as Crystal Reports for business use.
Developed data marts for the base data in Star and Snowflake schemas.
Worked on claims data and extracted data from various sources such as flat files, Oracle and Mainframes.
Created complex SQL queries using Views, Indexes, Triggers, Roles, Stored Procedures and User Defined Functions, and worked with different methods of logging in SSIS.
Enhanced and optimized product Spark code to aggregate, group and run data mining tasks using the Spark framework.
Designed and built a Data Lake using Hadoop and its ecosystem components.
Created a new data model that embeds NoSQL submodels within a relational data model by applying hybrid data modelling concepts.
Implemented Partitioning, Dynamic Partitions and Buckets in Hive for efficient data access.
Actively involved in SQL and Azure SQL DW code development using T-SQL
Implemented Spark using Python (PySpark) and Spark SQL for faster testing and processing of data.
Handled structured and unstructured data and applied ETL processes.
Developed reports for users in different departments in the organization using SQL Server Reporting Services (SSRS).
Developed JSON scripts for deploying pipelines in Azure Data Factory (ADF) that process data using the Cosmos Activity.
Successfully generated consumer group lag metrics from Kafka using its API.
Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team using Tableau.
Extracted the needed data from the server into HDFS and bulk loaded the cleaned data into HBase.
Defined job workflows as per their dependencies in Oozie.
Developed MapReduce programs for data analysis and data cleaning.
Developed a batch program in PL/SQL for OLTP processing and used Unix shell scripts scheduled through crontab to run it.

Environment: Erwin 9.8, Big Data 3.0, ETL, Hadoop 3.0, NoSQL, SQL, PL/SQL, Azure, HDFS, Python, Kafka 1.1, OLAP, Oracle 12c, Sqoop 1.4, SSIS, PySpark, T-SQL, Hive 2.3, HBase 1.2, SSRS, API, Oozie 4.3, Cosmos, Tableau, MapReduce, OLTP.

Confidential - Houston, TX

Data Analyst/Data Engineer

Responsibilities:

Implemented data pipelines as per the data models.
Worked in an Agile environment and used the Rally tool to maintain user stories and tasks.
Configured AWS EC2 instances, S3 buckets and cloud services, and architected the flow of data to and from AWS.
Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
Applied Python skills, with proven expertise in adopting new tools and technical developments.
Wrote complex SQL and PL/SQL procedures, functions and packages to validate data and support the testing process.
Designed and Implemented Error-Free Data Warehouse-ETL and Hadoop Integration.
Loaded real-time data from various data sources into HDFS using Kafka.
Performed data analysis and data profiling using complex SQL on various source systems, including Oracle and Teradata.
Designed and developed Oracle PL/SQL and shell scripts for data import/export, data conversions and data cleansing.
Generated and documented metadata while designing OLTP and OLAP system environments.
Assisted in the oversight for compliance to the Enterprise Data Standards, data governance and data quality.
Carried out effective data profiling to eradicate anomalies between source and target data.
Developed stored procedures in SQL Server to standardize DML transactions such as insert, update and delete from the database.
Developed a dimensional model based on Star and Snowflake schemas to build the data warehouse.
Developed prototype solutions to verify capabilities for new systems development, enhancement, and maintenance of MDM.
Developed Pig Latin scripts for the analysis of semi-structured data.
Performed Tableau administration using Tableau admin commands.
Developed all the required stored procedures, user defined functions and triggers using T-SQL and SQL.
Worked with Looker, ESB (Enterprise Service Bus), API, AWS EMR, Ranger, and Hadoop technologies.
Involved in the complete SSIS life cycle: creating, building, deploying and executing SSIS packages in all environments.
Used Sqoop to import data into HDFS and Hive from other data systems.
Produced various types of reports using SQL Server Reporting Services (SSRS).
Created/Modified shell scripts for scheduling various data cleansing scripts and ETL loading process.

Environment: Erwin 9.8, Hadoop 3.0, Agile, Oracle 12c, SQL, PL/SQL, Teradata 15, AWS, Oozie 4.3, Hive 2.3, Pig 0.17, OLAP, OLTP, Kafka 1.1, HDFS, ETL, Tableau, T-SQL, SSIS, SSRS.

Confidential - Nashville, TN

Data Analyst/Data Modeler

Responsibilities:

Performed data analysis on the source data in order to understand the relationships between the entities.
Participated in JAD sessions and gathered information from Business Analysts, end users and other stakeholders to determine the requirements.
Created a dimensional model for the reporting system by identifying required dimensions and facts using ER/Studio.
Designed logical and physical data models using data provisioning and consumption techniques.
Used existing Deal Model in Python to inherit and create object data structure for regulatory reporting.
Handled performance requirements for databases in OLTP and OLAP models.
Used Teradata Fast Export utility to export large volumes of data from Teradata tables and views for processing and reporting needs.
Extensively created SSIS packages to clean and load data to data warehouse.
Used SQL for extract, transform and load (ETL) methodology and processes.
Created PL/SQL procedures and triggers, generated application data, created users and privileges, and used the Oracle import/export utilities.
Translated business concepts into XML vocabularies by designing XML Schemas with UML.
Used normalization and de-normalization techniques to achieve optimum performance of the database.
Designed and developed the data warehouse using T-SQL and SQL.
Deployed SSRS reports to Report Manager and created linked reports, snapshots, and subscriptions for the reports and worked on scheduling of the reports.
Worked on Data Mining and data validation to ensure the accuracy of the data between the warehouse and source systems.
Developed scripts that automated DDL and DML statements used in the creation of databases, tables, constraints, and updates.
Created Data Dictionaries, Source to Target Mapping Documents and documented Transformation rules for all the fields.
Developed ad-hoc reports using Tableau Desktop and Excel.

Environment: ER/Studio, SQL, PL/SQL, OLAP, OLTP, Python, Teradata, ETL, SSIS, XML, T-SQL, SSRS, Tableau, Excel, Oracle.

Confidential

Data Analyst

Responsibilities:

Worked in data management performing data analysis, gap analysis, and data mapping.
Wrote SQL extracts against two sources to bring in the data required for project reporting.
Created stored procedures using PL/SQL and tuned the databases and backend processes.
Worked on debugging and identifying the unexpected real-time issues in the production server SSIS packages.
Conducted GAP analysis and data mapping to derive requirements for existing systems enhancements for a project.
Involved in extensive data validation using SQL queries and back-end testing.
Evaluated data profiling, cleansing, integration and extraction tools (e.g. Informatica).
Designed and developed T-SQL stored procedures to extract, aggregate, transform, and insert data and developed SQL Stored procedures to query dimension and fact tables in data warehouse.
Worked with SQL Server Reporting Services (SSRS) to author, manage, and deliver both paper-based
Prototyped data visualizations in Tableau using charts, drill-downs and parameterized controls to highlight the value of analytics in executive decision support.
Performed data mining on claims data using very complex SQL queries and discovered claims patterns.
Developed Enterprise Data Dictionary for reusable objects like domains, attachments, defaults, reference values, User data types and reusable procedural logic.
Wrote complex SQL queries for validating the data against different kinds of reports generated by Business Objects.
Extensively created tables and queries to produce additional ad-hoc reports.
Performed data validation on the flat files that were generated in UNIX environment using UNIX commands as necessary.

Environment: SQL, PL/SQL, SSIS, SSRS, T-SQL, Informatica, Tableau, XML, UNIX.
