Data Architect/Senior Data Engineer Resume

Northbrook, IL

SUMMARY

  • Over 12 years of experience in technology solutions, including the design and implementation of end-to-end data-driven platforms, with 4 years of experience in Hadoop ecosystem tools such as MapReduce, HDFS, Hive, HBase, Kafka, and Spark
  • Experience in developing solutions and leading data migration projects to AWS services such as EMR, Redshift, S3, and Kinesis
  • Experienced in all stages of the software development life cycle (SDLC), including requirements, analysis and design, implementation, integration and testing, deployment, and maintenance, in Waterfall and Agile methodologies
  • Built large-scale data pipelines and data-centric applications using Big Data tooling like Hadoop, Spark, Hive, Oozie, and Airflow in a production setting
  • Experienced in data modeling for OLAP, OLTP, Enterprise Data warehouse and ODS applications
  • Experienced in Apache Spark cluster and stream processing using Spark Streaming
  • Experienced in using distributed computing platforms such as AWS (S3, Redshift, and EMR), Hadoop, and Spark, together with Python, MapReduce, and SQL, to solve big data problems
  • Hands-on experience with MPP query engines like Spark SQL
  • Strong experience in writing Python applications using libraries such as Pandas, NumPy, and SciPy
  • Experienced in performance tuning and optimization on both RDBMS and Hadoop data platforms
  • Experienced in continuous integration (CI) and continuous deployment (CD) using Team Foundation Server (TFS) and Git
  • Experience in developing MapReduce programs using Apache Hadoop for analyzing big data
  • Experienced in creating reporting models and dashboards using PowerBI and Tableau
  • Strong knowledge of database schema designs such as Star and Snowflake schemas
  • Strong experience in data warehouse design using the Kimball and Inmon approaches
  • Expertise in working with transactional databases: SQL Server 2008 R2/2012/2014/2016
  • Experienced in SQL Server database design, development, and implementation, including SQL queries, stored procedures, and server-level and query-level performance tuning
  • Provided thought leadership and architectural expertise and managed cross-team integration
  • Strong experience in leading teams and managing the onshore-offshore model
  • Excellent communication, analytical and problem-solving skills

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, MapReduce, HBase, Hive, Sqoop, Kafka, Spark

Data Languages: T-SQL, Python, PySpark

Cloud Technologies: AWS S3, Redshift, AWS EMR, AWS Kinesis, Glue Data Catalog

Orchestration tools: Airflow, Oozie, Zookeeper

ETL/Data Integration framework: Microsoft Business Intelligence Development Studio (BIDS), SSIS, SQL Server Data Tools (2012), Talend

Tuning and Optimization Tools: SQL Profiler, Query Analyzer, Tuning Advisor

Data visualization tools: Tableau, Power BI

Database System: MS SQL Server 2008 R2/2012/2014/2016, PostgreSQL

CI/CD: Team Foundation Server (TFS), Git

Automation and configuration management: PowerShell, VBA

Project management tools: JIRA, Confluence, Workfront, Asana, Visio

PROFESSIONAL EXPERIENCE

Confidential - Northbrook, IL

Data Architect/Senior Data Engineer

Responsibilities:

  • Responsible for creating and mapping Business Requirement Documents (BRD) into technology solutions
  • Worked with various business units and stakeholders to design architecture roadmaps for data initiatives
  • Worked closely with EAs, engineering, and data teams to design and implement business-critical enterprise data solutions
  • Led various cross-functional teams in building and deploying data platform solutions for the business in the cloud and on-premises
  • Migrated big data and analytics workloads from on-premises Hadoop to Amazon EMR
  • Created batch data pipelines for ingesting data from various sources using Apache Spark and PySpark
  • Used Spark SQL and the DataFrame API to load structured and semi-structured information into Spark clusters
  • Tuned the performance of Hadoop clusters, Hive/Spark queries, and MapReduce workloads
  • Scheduled and orchestrated data pipelines using Apache Airflow
  • Created RESTful APIs for data ingestion in several data applications using Python and the Django framework
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data into Hive through Sqoop
  • Designed and implemented data pipelines for loading data from different sources (SQL Server and flat files) into HDFS using Sqoop, then into partitioned Hive tables
  • Tuned Hive to improve performance and resolved performance issues in both scripts
  • Enhanced the data infrastructure framework by evaluating new and existing technologies and techniques to create efficient processes around data extraction, aggregation, and analytics on both SQL and Hadoop platforms
  • Performed data migrations for legacy applications from SQL Server 2008 R2 to SQL Server 2012, 2014, and 2017
  • Developed MOLAP Cubes with Star and Snowflake Schema using MS SQL Server Analysis Services (SSAS)
  • Designed data pipelines for extracting, transforming, and loading data from various sources such as flat files, APIs, and different servers using SSIS in SQL Server Data Tools on the RDBMS platform
  • Designed current databases by adding tables, relationships, creating stored procedures, views, and performance tuning of T-SQL queries
  • Created data pipelines to extract large data sets from various sources like flat files, data marts, servers/APIs using SSIS
  • Deployed SSIS packages and automated builds for CI/CD using TFS and Git
  • Increased efficiency of existing platforms through performance tuning and optimization using clustered and non-clustered indexes, SQL Profiler, and execution plans
  • Developed Enterprise-wide Power BI reports and dashboards from multiple data sources using data blending
  • Extracted and integrated data from various sources such as APIs, flat files, and Google Analytics into Power BI
  • Led onshore and offshore teams in developing end-to-end data platforms per business requirements

Environment: Hadoop 3.0, HBase, Sqoop, Kafka, HDFS, AWS S3, Amazon EMR, MSSQL Server 2008 R2/2012/2016, T-SQL, SSIS, SSAS, OLTP, MOLAP, Power BI 2.4, Power Query, Visual Studio 2017, Python 3.7.1, APIs, TFS, Git, Agile-Kanban

Confidential, Chicago, IL

Data Engineer

Responsibilities:

  • Worked as lead Data Engineer managing BI production support and operations for various business units on Hadoop, AWS Cloud, and RDBMS platforms
  • Elicited business data requirements from clients and translated into technical specifications for development team
  • Created POCs for data migration projects from SQL Servers to Hadoop using Sqoop and Hive
  • Created various mappings between source systems and target systems on MS SQL as well as Hadoop platforms
  • Developed simple to complex MapReduce Jobs using Hive and Pig
  • Worked on data ingestion using Sqoop between HDFS and relational database systems in both directions
  • Designed Big Data pipelines using HDFS, Hive, Sqoop, Oozie and Zookeeper along with Apache Spark
  • Created Hive Tables, loaded with information, and wrote Hive queries
  • Created Hive external tables and designed information models in Hive
  • Migrated ETL jobs to Python scripts that perform transformations, joins, and aggregations before loading into HDFS
  • Worked with cross-functional teams (Product and Engineering, Account, and Data teams) in designing data solutions using T-SQL, SSIS, and SSAS
  • Identified gaps in processes, code, testing, and environments, resulting in a 20% increase in efficiency
  • Created technical design documents for various data solutions on Microsoft stack using SSIS packages and ETL process
  • Worked with architects in legacy data migration projects from SQL Server /2014 platforms
  • Designed, developed, and implemented SQL Server database used for reporting, along with procedures to extract, transform and load data from production OLTP databases using SSIS
  • Developed and deployed ETL process/packages and SSIS packages using TFS and Git
  • Created data pipelines to extract, transform and load the data from various sources like flat files, APIs, databases using SSIS
  • Performed web scraping, data manipulation, and data wrangling using Python 2.7 for Google and Bing Media data
  • Created ODS and data marts on the SQL platform to support reporting models for cross-functional teams such as Marketing Insights and Call Center
  • Worked closely with engineering teams to design and build scalable data solutions to support fast growth and continually aligned the BI roadmap to the needs of the business on traditional as well as big data platforms
  • Created test cases for ETL mappings and design documents for production support
  • Responsible for managing and leading onshore-offshore model for BI operations at LFO
  • Responsible for reporting and managing project metrics to monitor project progress, risks/blockers, and scope control
  • Worked on data analysis, data aggregations using Power Query and Power BI for reporting
  • Created business reporting models for Media Insight and Marketing Analytics Teams using Power BI

Confidential, Chicago, IL

Database/ETL Developer

Responsibilities:

  • Worked with stakeholders to create business requirement documents (BRD) and mapped them into system requirements documents for various data solutions
  • Designed, developed, tested, and implemented relational databases and stored procedures on SQL Server 2012
  • Created the deployment scripts and managed the code in Microsoft Team Foundation Server (TFS)
  • Designed and built extract, transform, and load (ETL) processes, procedures, and code for large sets of complex data
  • Optimized query performance using execution plans, SQL Profiler, and Tuning Advisor
  • Performed quality assurance and testing of the SQL Server environment
  • Transformed complex business trends into analytic and reporting requirements

Environment: MSSQL Server 2012, SSIS, T-SQL, Data Integration, Data Quality, Data Management, Visual Studio 2012, Agile-Kanban, SSMS, SQL Profiler, Performance tuning, clustered indexes, Query optimization

Confidential, Chicago, IL

Database Developer

Responsibilities:

  • Gathered business requirements, mapped functional requirements into technical specifications for the development team, and created a requirements traceability matrix for BI applications
  • Worked with the Tellabs Insight Analytics (SM) team to identify KPIs by extracting data from various flat files and APIs and performing data transformation and integration
  • Implemented complex SQL queries, schema design, normalization, and performance tuning using SQL Profiler
  • Developed stored procedures, tables, views, scripts, etc., for transformation of data for various data platforms
  • Developed T-SQL stored procedures, user-defined functions, views and ad hoc queries to support reporting requirements
  • Assisted in modeling and designing databases and data marts on the RDBMS platform using SQL Server 2008 R2/2012
  • Wrote data analysis requirements, developed reporting and insights, defined support requirements and processes
  • Worked with cross-functional teams to design standard reports for Tellabs using SSRS in an Agile environment
  • Managed offshore and onshore resources on multiple client engagements
  • Created data models to show key components of source systems and perform profiling of source data
  • Developed ETLs using SSIS and SQL for data analysis and KPI reporting for Tellabs Analytics team
  • Automated the data loading process by creating SQL Server Agent jobs

Environment: MSSQL Server 2008 R2/2012, SSIS, SSRS, T-SQL, Data Integration, Data Management, Visio, Visual Studio, Waterfall, SQL Profiler, Performance tuning, clustered indexes, Query optimization

Confidential, California

Business Data Analyst, Intern

Responsibilities:

  • Extracted data from multiple sources such as flat files and APIs and loaded it into data marts and data warehouses for Market Monitoring reports using SSIS and BIDS 2010
  • Wrote SQL queries and helped develop stored procedures and triggers for the Market Monitoring Assessment report
  • Worked with the team to manage the data process on the Microsoft BI stack behind Market Monitoring preparation, release, and evaluation
  • Created Views and triggers to reduce database complexities for the end users and facilitated reporting as part of process improvement
  • Performed query tuning and optimization using indexing strategies for improving performance
  • Created temporary SAS data sets for analysis, processed data and generated summary reports using PROCs in SAS

Environment: MSSQL Server 2008 R2, SSIS, SAS 9.2, T-SQL, Agile, Ad hoc reports, BIDS 2010, SQL Profiler, Performance tuning, clustered indexes, Query optimization

Confidential

Application Developer

Responsibilities:

  • Participated in project planning sessions with business analysts and team members to analyze business IT requirements and translated them into a working model
  • Wrote queries to generate reports for analysis of data using SQL Server Reporting Services (SSRS)
  • Analyzed and interpreted data feed sources (FTP connections, direct server queries, raw data files) to migrate data from various data formats into SQL Server databases using daily automated import processes (DTS and SSIS packages, etc.)
  • Developed test cases based on test matrix including test data preparation for Data Completeness, Data Transformations, Data quality, Performance and scalability
  • Developed ETL test scripts based on technical specifications/Data design documents and Source to Target mappings.
  • Tested a number of complex ETL mappings, mapplets and reusable transformations for daily data loads
  • Documented software defects using bug tracking system and reported defects
  • Performed black-box testing and end-to-end testing to uncover bugs and errors in the developed solutions
  • Designed test plans and test cases, and developed and executed automation scripts using IBM Rational Functional Tester
  • Performed system and regression testing for web portal applications using Rational Functional Tester (RFT)
  • Performed test activity management, execution and reporting using Rational Test Manager (RTM)

Environment: MSSQL Server 2008, SSIS, Visual Studio, Data Management, ETL testing, Excel, IBM Rational Functional Tester, Rational Test Manager, OLTP, SQL Profiler, Performance tuning, clustered indexes, Query optimization

Confidential

Application Developer

Responsibilities:

  • Analyzed business requirements and reviewed mapping documents to validate source-to-target mappings for database testing
  • Developed and maintained databases, maintenance routines, and safeguards (procedures, scripts, functions) to increase performance
  • Worked on defining the transformation rules for extraction and loading process
  • Created T-SQL objects and scripts (stored procedures, triggers, functions) to maintain data integrity across all databases and database servers
  • Validated data in the warehouse, dashboards, and reports to ensure data quality in the data mart used for reporting
  • Prepared system test plan, test scripts and test cases for automation of various modules within a continuous deployment environment
  • Wrote several SQL scripts to validate data integrity in the application using various Data Definition Language (DDL) statements
  • Maintained and enhanced the existing automation frameworks using QuickTest Professional (QTP)
  • Performed integration, system, and regression testing in different environments (UAT/SIT/Dev) and platforms
  • Designed traceability metrics for end-to-end testing of various applications
  • Responsible for troubleshooting the legacy systems and support data connections between them

Environment: MSSQL Server 2008, T-SQL, SSIS, BIDS 2010, Data Quality, Data Management, Visual Studio, Waterfall, QTP, Performance tuning, clustered indexes, Query optimization
