Data Engineer Resume
Nashville, TN
SUMMARY
- Overall 5 years of experience as a Data Engineer, including designing, developing, and implementing data models for enterprise-level applications and systems.
- Expert in writing and optimizing SQL queries in Oracle and SQL Server.
- Experience in Data transformation, Data mapping from source to target database schemas, and Data Cleansing procedures.
- Good experience working with analysis tools like Tableau for regression analysis, pie charts, and bar graphs.
- Extensive experience in development of T-SQL, Oracle PL/SQL Scripts, Stored Procedures and Triggers for business logic implementation.
- Expertise in SQL Server Analysis Services (SSAS) and SQL Server Reporting Services (SSRS) tools.
- Involved in writing SQL queries and PL/SQL programs; created new packages and procedures, and modified and tuned existing procedures and queries.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and from RDBMS to HDFS.
- Experience in designing, building, and implementing a complete Hadoop ecosystem comprising MapReduce, HDFS, Hive, and MongoDB.
- Experience with various Teradata utilities like FastLoad, MultiLoad, BTEQ, and Teradata SQL Assistant.
- Extensively worked with Teradata utilities BTEQ, FastExport, and MultiLoad to export and load data to/from different source systems, including flat files.
- Good at System analysis, ER/Dimensional Modeling, Database design, and implementing RDBMS-specific features.
- Experience working with NoSQL databases (HBase, Cassandra, and MongoDB), database performance tuning, and data modeling.
- Strong experience in writing SQL, PL/SQL, and Transact-SQL programs for Stored Procedures, Triggers, and Functions.
- Experienced in Data Scrubbing/Cleansing, Data Quality, Data Mapping, Data Profiling, Data Validation in ETL.
- Excellent working experience in Scrum / Agile framework and Waterfall project execution methodologies.
- Good knowledge of big data tools like Hadoop, Azure Data Lake, and AWS Redshift.
- Hands-on experience in Normalization and Denormalization techniques for effective and optimum performance in OLTP and OLAP environments.
- Experience in designing and developing POCs using Scala, Spark SQL, and MLlib libraries, then deploying them on the YARN cluster.
- Extensive experience in Extraction, Transformation and Loading (ETL) of data from multiple sources into Data Warehouse and Data Mart.
- Good experience in Data Modeling and Data Analysis; proficient in gathering business requirements and handling requirements management.
- Experience in configuring and administering the Hadoop Cluster using major Hadoop Distributions like Apache Hadoop and Cloudera.
TECHNICAL SKILLS
Cloud Platform: AWS, Amazon Redshift, Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Data Lake, and Azure Data Factory
Big Data Tools: Hadoop 3.0, MapReduce, HBase 1.2, Hive 2.3, Pig 0.17, Flume 1.8, Sqoop 1.4, Cloudera Manager, Neo4j, Cassandra 3.11
Programming Languages: SQL, PL/SQL, UNIX shell Scripting
OLAP Tools: Tableau, SAP BusinessObjects (BO), SSAS, and Crystal Reports 9
Databases: Oracle 12c/11g, Teradata R15/R14, MS SQL Server 2016/2014, DB2.
Data Modeling Tools: Erwin Data Modeler, Erwin Model Manager, ER Studio v17, and Power Designer 16.6.
Testing and defect tracking Tools: HP/Mercury Quality Center, WinRunner, MS Visio, and Visual SourceSafe
Methodologies: Agile, System Development Life Cycle (SDLC), Waterfall Model
PROFESSIONAL EXPERIENCE
Confidential - Nashville, TN
Data Engineer
Responsibilities:
- As a Data Engineer, developed data analytics solutions on a Hadoop-based platform and engaged clients in technical discussions.
- Used Agile (SCRUM) methodologies for Software Development.
- Wrote complex Hive queries to extract data from heterogeneous sources (Data Lake) and persist the data into HDFS.
- Installed, Configured and Maintained the Hadoop cluster for application development and Hadoop ecosystem components.
- Created data integration and technical solutions for Azure Data Lake Analytics, Azure Data Lake Storage, Azure Data Factory, Azure SQL databases and Azure SQL Data Warehouse for providing analytics.
- Involved in all phases of data mining, data collection, data cleaning, developing models, validation and visualization.
- Worked on Hive queries to categorize data of different wireless applications and security systems.
- Responsible for loading and transforming huge sets of structured, semi-structured, and unstructured data.
- Extensively involved in writing PL/SQL, stored procedures, functions and packages.
- Planned, developed, and migrated servers, relational databases (SQL), and websites to Microsoft Azure.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
- Designed and implemented scalable Cloud Data and Analytical architecture solutions for various public and private cloud platforms using Azure.
- Configured Azure SQL Database with Azure Storage Explorer and SQL Server.
- Developed web services using the Python programming language.
- Used Sqoop to extract and load data into the Data Lake environment (MS Azure), where it was accessed by business users.
- Developed numerous MapReduce jobs in Scala for data cleansing and for analyzing data in Impala.
- Used Hive to analyze data ingested into HBase via Hive-HBase integration and computed various metrics for reporting on the dashboard.
- Developed Pig scripts for change data capture and delta record processing between newly arrived data and existing data in HDFS.
- Optimized Hive queries to extract the customer information from HDFS.
- Analyzed partitioned and bucketed data using Hive and computed various metrics for reporting.
- Used Python scripts to update content in the database and manipulate files.
- Built Azure Data Warehouse Table Data sets for Power BI Reports.
Environment: Hive 2.3, HDFS, Hadoop 3.0, Azure, PL/SQL, SQL, Python 3.7, MapReduce, Scala 2.1, Impala 3.3
Confidential - Washington, DC
Data Analyst/Data Engineer
Responsibilities:
- Worked as a Data Analyst/Data Engineer to review business requirements and compose source-to-target data mapping documents.
- Participated in JAD meetings to gather requirements and understand the end users' system.
- Followed Agile development methodology as an active member in Scrum meetings.
- Performed Data Analysis and Data Profiling and worked on data transformations and data quality rules.
- Developed SQL joins, queries, views, test tables, and scripts, and tuned SQL in the development environment.
- Extensively used SAS procedures such as MEANS and FREQ, along with other statistical calculations, for Data validation.
- Designed and developed a Data Lake using Hadoop for processing raw and processed claims via Hive.
- Worked with AWS Redshift and RDS to implement models and data.
- Performed Data Analysis and extensive Data validation by writing several complex SQL queries.
- Responsible for data lineage, maintaining data dictionary, naming standards and data quality.
- Developed SAS macros for data cleaning and reporting, and to support routine processing.
- Created SQL scripts to find Data quality issues and to identify keys, Data anomalies, and Data validation issues.
- Worked with NoSQL databases like HBase in creating HBase tables to load large sets of semi-structured data coming from various sources.
- Actively involved in T-SQL programming to implement Stored Procedures, Functions, cursors, and views for different tasks.
- Performed Hive programming for applications that were migrated to big data using Hadoop.
- Used MS Visio for business flow diagrams and defined the workflow.
- Performed Data analysis for the existing Data warehouse and changed the internal schema for performance.
- Wrote SQL Stored Procedures and Views, and coordinated and performed in-depth testing of new and existing systems.
- Performed Data Manipulation using MS Excel pivot tables and produced various charts for mock reports.
- Performed reverse engineering of the dashboard requirements to model the required data marts.
- Cleansed, extracted, and analyzed business data on a daily basis and prepared ad-hoc analytical reports using Excel and T-SQL.
- Created Data Migration and Cleansing rules for the Integration Architecture (OLTP, ODS, DW).
- Handled performance requirements for databases in OLTP and OLAP models.
- Conducted meetings with business and development teams for data validation and end-to-end data mapping.
Environment: SQL, SAS, Hadoop 3.0, AWS, NoSQL, HBase 1.2, T-SQL, Hive 2.1, MS Visio 2014, MS Excel 2014, OLTP, OLAP
Confidential - Alexandria, VA
Data Analyst
Responsibilities:
- Analyzed daily data reporting operations; identified and developed insights to reduce data quality issues and process gaps.
- Responsible to the stakeholders for timely delivery & completion of assigned projects.
- Developed stored procedures, SQL joins, and SQL queries to retrieve data for analysis.
- Used Microsoft Excel tools such as pivot tables, graphs, charts, and Solver to perform quantitative analysis.
- Created complex SQL queries to fetch data per the software needs and created visualization dashboards.
- Developed all the required stored procedures, user defined functions and triggers using T-SQL and SQL.
- Performed detailed data analysis and identified the key facts and dimensions necessary to support the business requirements.
- Used MS Visio and Rational Rose to represent the system under development graphically by defining use case, activity, and workflow diagrams.
- Wrote complex SQL and PL/SQL procedures, functions, and packages to validate data and support the testing process.
- Performed Data Analysis and Data validation by writing SQL queries using SQL Assistant.
- Translated business concepts into XML vocabularies by designing XML Schemas with UML.
- Gathered business requirements through interviews and surveys with users and business analysts.
- Created reports to retrieve data using Stored Procedures.
- Worked in data management performing data analysis, gap analysis, and data mapping.
Environment: SQL, MySQL, Excel, MS Visio, PL/SQL, XML, T-SQL
Confidential
Data Analyst
Responsibilities:
- Worked with Data Analyst for requirements gathering, business analysis and project coordination.
- Developed stored procedures in SQL Server to standardize DML transactions such as insert, update and delete from the database.
- Developed SQL queries to fetch complex data from different tables in databases using joins and database links.
- Performed Data analysis of existing databases using SQL to understand the data flow and business rules applied to different databases.
- Created an SSIS package to load data from flat files, Excel, and Access into SQL Server using Connection Manager.
- Developed all the required stored procedures, user defined functions and triggers using T-SQL and SQL.
- Produced various types of reports using SQL Server Reporting Services (SSRS).
- Performed detailed data analysis and identified the key facts and dimensions necessary to support the business requirements.
- Used MS Visio and Rational Rose to represent the system under development graphically by defining use case, activity, and workflow diagrams.
- Wrote complex SQL and PL/SQL procedures, functions, and packages to validate data and support the testing process.
- Performed Data Analysis and Data validation by writing SQL queries using SQL Assistant.
- Translated business concepts into XML vocabularies by designing XML Schemas with UML.
- Gathered business requirements through interviews and surveys with users and business analysts.
- Worked on Data Mining and data validation to ensure the accuracy of the data between the warehouse and source systems.
- Created reports to retrieve data using Stored Procedures.
- Ensured the compliance of the extracts to the Data Quality Center initiatives.
- Worked in data management performing data analysis, gap analysis, and data mapping.
Environment: SQL, SSIS, SSRS, T-SQL, MS Visio, Rational Rose, PL/SQL, XML
