Data Architect/Senior Data Engineer Resume Northbrook, IL - Hire IT People

SUMMARY

Over 12 years of experience in technology solutions, design, and implementation of end-to-end data-driven platforms, including 4 years of experience with Hadoop ecosystem tools such as Hadoop MapReduce, HDFS, Hive, HBase, Kafka, and Spark
Experience in developing solutions and leading data migration projects to AWS services such as EMR, Redshift, S3, and Kinesis
Experienced in all stages of the software development life cycle (SDLC), including requirements, analysis and design, implementation, integration and testing, deployment, and maintenance, in Waterfall and Agile methodologies
Built large-scale data pipelines and data-centric applications using Big Data tooling like Hadoop, Spark, Hive, Oozie, and Airflow in a production setting
Experienced in data modeling for OLAP, OLTP, Enterprise Data warehouse and ODS applications
Experienced in Apache Spark cluster and stream processing using Spark Streaming
Experienced in using distributed computing platforms such as AWS (S3, Redshift, and EMR), Hadoop, and Spark, together with Python, MapReduce, and SQL, to solve big data problems
Hands-on experience with MPP query engines like Spark SQL
Strong experience in writing applications in Python using libraries such as Pandas, NumPy, and SciPy
Experienced in performance tuning and optimization process on both RDBMS and Hadoop data platforms
Experienced in continuous integration (CI) and continuous deployment (CD) using Team Foundation Server (TFS) and Git
Experience in developing MapReduce programs using Apache Hadoop for analyzing big data
Experienced in creating reporting models and dashboards using PowerBI and Tableau
Strong knowledge of database schema design, including star and snowflake schemas
Strong experience in data warehouse design using the Kimball and Inmon approaches
Expertise in working with transactional databases: SQL Server 2008 R2/2012/2014/2016
Experienced in SQL server database design, development, and implementation, developing SQL queries, Stored Procedures, server level/query level performance tuning
Provided thought leadership and architectural expertise and managed cross-team integration
Strong experience in leading teams and managing the onshore-offshore model
Excellent communication, analytical and problem-solving skills

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, MapReduce, HBase, Hive, Sqoop, Kafka, Spark

Data Languages: T-SQL, Python, PySpark

Cloud Technologies: AWS S3, Redshift, AWS EMR, AWS Kinesis, Glue Data Catalog

Orchestration tools: Airflow, Oozie, Zookeeper

ETL/Data Integration framework: Microsoft Business Intelligence Development Studio (BIDS), SSIS, SQL Server Data tools (2012), Talend.

Tuning and Optimization Tools: SQL profiler, Query analyzer, Tuning advisor

Data visualization tools: Tableau, Power BI

Database System: MS SQL Server 2008 R2/2012/2014/2016, PostgreSQL

CI/CD: Team Foundation Server (TFS), Git

Automation and configuration management: PowerShell, VBA

Project management tools: JIRA, Confluence, Workfront, Asana, Visio

PROFESSIONAL EXPERIENCE

Confidential - Northbrook, IL

Data Architect/Senior Data Engineer

Responsibilities:

Responsible for creating and mapping Business Requirement Documents (BRD) into technology solutions
Worked with various business units and stakeholders to design architecture roadmaps for data initiatives
Worked closely with EAs, engineering & data teams to design and implement business-critical enterprise data solutions
Led various cross-functional teams in building and deploying data-platform solutions for the business in the cloud and on-premises
Migrated big data and analytics workloads from on-premises Hadoop to Amazon EMR
Created batch data pipelines for ingesting data from various sources using Apache Spark and PySpark
Used Spark SQL and Data Frames API to load structured and semi-structured information into Spark Clusters
Performance tuning of the Hadoop clusters, Hive/Spark Queries and Map Reduce workloads
Scheduled and orchestrated data pipelines using Apache Airflow
Created RESTful APIs for several data applications using Python and the Django framework for data ingestion
Loaded and transformed large sets of structured, semi-structured, and unstructured data into Hive through Sqoop
Designed and implemented data pipelines for loading data from different sources (such as SQL Server and flat files) into HDFS using Sqoop, then loaded it into partitioned Hive tables
Tuned Hive to improve performance and resolved performance issues in both scripts
Enhanced the data infrastructure framework by evaluating new and existing technologies and techniques to create efficient processes around data extraction, aggregation, and analytics on both SQL and Hadoop platforms
Performed data migrations for legacy applications from SQL Server 2008 R2 to SQL Server 2012/2014 and 2017
Developed MOLAP Cubes with Star and Snowflake Schema using MS SQL Server Analysis Services (SSAS)
Designed data pipelines for extracting, transforming, and loading data from various sources like flat files, APIs and different servers using SQL Server Data Tools (SSIS) on RDBMS platform
Designed current databases by adding tables, relationships, creating stored procedures, views, and performance tuning of T-SQL queries
Created data pipelines to extract large data sets from various sources like flat files, data marts, servers/APIs using SSIS
Deployed SSIS packages and automated builds for CI/CD using TFS and Git
Increased efficiency of existing platforms through performance tuning and optimization, creating clustered/non-clustered indexes and using SQL Profiler and execution plans
Developed Enterprise-wide Power BI reports and dashboards from multiple data sources using data blending
Extracted and Integrated data from various sources like APIs, flat files, Google Analytics into Power BI
Led onshore-offshore teams in developing end-to-end data platforms as per business requirements

Environment: Hadoop 3.0, HBase, Sqoop, Kafka, HDFS, AWS S3, Amazon EMR, MSSQL Server 2008R2/2012/2016, T-SQL, SSIS, SSAS, OLTP, MOLAP, Power BI 2.4, Power Query, Visual Studio 2017, Python 3.7.1, APIs, TFS, Git, Agile-Kanban

Confidential, Chicago, IL

Data Engineer

Responsibilities:

Worked as Data Engineer lead managing BI production support/operations for various business units on Hadoop, AWS Cloud, and RDBMS platforms
Elicited business data requirements from clients and translated into technical specifications for development team
Created POCs for data migration projects from SQL Servers to Hadoop using Sqoop and Hive
Created various mappings between source systems and target systems on MS SQL as well as Hadoop platforms
Developed simple to complex MapReduce Jobs using Hive and Pig
Worked on data ingestion using Sqoop from HDFS to Relational Database Systems and vice-versa
Designed Big Data pipelines using HDFS, Hive, Sqoop, Oozie and Zookeeper along with Apache Spark
Created Hive Tables, loaded with information, and wrote Hive queries
Created Hive external tables and designed information models in Hive
Migrated ETL jobs to Python scripts for transformations, joins, and aggregations before loading into HDFS
Worked with cross-functional teams (Product and Engineering, Account and Data teams) in designing data solutions using T-SQL, SSIS and SSAS
Identified gaps in processes, code, testing, and environments, resulting in a 20% increase in efficiency
Created technical design documents for various data solutions on Microsoft stack using SSIS packages and ETL process
Worked with architects in legacy data migration projects from SQL Server /2014 platforms
Designed, developed, and implemented SQL Server database used for reporting, along with procedures to extract, transform and load data from production OLTP databases using SSIS
Developed and deployed ETL process/packages and SSIS packages using TFS and Git
Created data pipelines to extract, transform and load the data from various sources like flat files, APIs, databases using SSIS
Performed web scraping, data manipulation and data wrangling using Python 2.7 for Google and Bing Media data
Created ODS and data marts for reporting models for various cross-functional teams, such as Marketing Insights and call center teams, on the SQL platform
Worked closely with engineering teams to design and build scalable data solutions to support fast growth and continually aligned the BI roadmap to the needs of the business on traditional as well as big data platforms
Created test cases for ETL mappings and design documents for production support
Responsible for managing and leading onshore-offshore model for BI operations at LFO
Responsible for the reports and managing project metrics to monitor project progress, risks/blockers & scope control
Worked on data analysis, data aggregations using Power Query and Power BI for reporting
Created business reporting models for Media Insight and Marketing Analytics Teams using Power BI

Confidential, Chicago, IL

Database/ETL Developer

Responsibilities:

Worked with stakeholders in creating business requirement documents (BRD) and mapped into system requirements documents for various data solutions
Designed, developed, tested, and implemented relational databases and stored procedures on SQL Server 2012
Created the deployment scripts and managed the code in Microsoft Team Foundation Server (TFS)
Designed and built extract, transform, and load (ETL) processes, procedures, and code for large sets of complex data
Optimized query performance using execution plans, SQL Profiler, and Tuning Advisor
Performed quality assurance and testing of the SQL Server environment
Transformed complex business trends into analytic and reporting requirements

Environment: MSSQL Server 2012, SSIS, T-SQL, Data Integration, Data Quality, Data Management, Visual Studio 2012, Agile-Kanban, SSMS, SQL Profiler, Performance tuning, clustered indexes, Query optimization

Confidential, Chicago, IL

Database Developer

Responsibilities:

Gathered business requirements, mapped functional requirements into technical specifications for the development team, and created a requirements traceability matrix for BI applications
Worked with Tellabs Insight Analytics (SM) team in identifying the KPIs by extracting data from various flat files and APIs, performed data transformation and integration.
Implemented complex SQL queries, Schema Designing, Normalization, and Performance Tuning using SQL profiler
Developed stored procedures, tables, views, scripts, etc., for transformation of data for various data platforms
Developed T-SQL stored procedures, user-defined functions, views and ad hoc queries to support reporting requirements
Assisted in modeling/designing databases and data marts on RDBMS platform using SQL Server 2008 R2/2012
Wrote data analysis requirements, developed reporting and insights, defined support requirements and processes
Worked with cross-functional teams to design the standard reports for Tellabs in an Agile environment using SSRS
Managed offshore and onshore resources on multiple client engagements
Created data models to show key components of source systems and perform profiling of source data
Developed ETLs using SSIS and SQL for data analysis and KPI reporting for Tellabs Analytics team
Automated the data loading process by creating SQL Server Agent jobs

Environment: MSSQL Server 2008R2/2012, SSIS, SSRS, T-SQL, Data Integration, Data Management, Visio, Visual Studio, Waterfall, SQL Profiler, Performance tuning, clustered indexes, Query optimization

Confidential, California

Business Data Analyst, Intern

Responsibilities:

Performed Data extraction from multiple sources like flat files, APIs etc. and loaded into data marts and data warehouses for Market monitoring reports using SSIS and BIDS 2010
Wrote SQL queries and helped develop stored procedures and triggers for the Market Monitoring Assessment report
Worked with team in managing the data process on Microsoft BI stack behind Market Monitoring preparation, release, and evaluation.
Created Views and triggers to reduce database complexities for the end users and facilitated reporting as part of process improvement
Performed query tuning and optimization using indexing strategies for improving performance
Created temporary SAS data sets for analysis, processed data and generated summary reports using PROCs in SAS

Environment: MSSQL Server 2008R2, SSIS, SAS 9.2, T-SQL, Agile, Adhoc reports, BIDS 2010, SQL Profiler, Performance tuning, clustered indexes, Query optimization

Confidential

Application Developer

Responsibilities:

Participated in project planning sessions with business analysts and team members to analyze business IT requirements and translated business requirements into a working model
Wrote queries to generate reports for analysis of data using SQL Server Reporting Services (SSRS)
Analyzed and interpreted data feed sources (FTP connections, direct server queries, raw data files) to migrate data from various data formats into SQL Server databases using daily automated import processes (DTS and SSIS packages, etc.)
Developed test cases based on test matrix including test data preparation for Data Completeness, Data Transformations, Data quality, Performance and scalability
Developed ETL test scripts based on technical specifications/Data design documents and Source to Target mappings.
Tested a number of complex ETL mappings, mapplets and reusable transformations for daily data loads
Documented software defects using bug tracking system and reported defects
Performed black box testing and end to end testing to uncover bugs/errors for the developed solutions
Designed test plan and test cases, developed and executed automation scripts using IBM Rational Functional Tester
Performed System and Regression testing for web portal applications using Rational Functional tester (RFT)
Performed test activity management, execution and reporting using Rational Test Manager (RTM)

Environment: MSSQL Server 2008, SSIS, Visual Studio, Data Management, ETL testing, Excel, IBM Rational Functional tester, Rational Test Manager, OLTP, SQL Profiler, Performance tuning, clustered indexes, Query optimization

Confidential

Application Developer

Responsibilities:

Analyzed business requirements, reviewed the Mapping documents to validate the source and target mapping for database testing
Developed and maintained databases, maintenance routines, and safeguards (procedures, scripts, functions) to increase performance
Worked on defining the transformation rules for extraction and loading process
Created T-SQL objects & scripts used to maintain data integrity across all databases throughout database servers (Stored Procedures, Triggers, Functions).
Validated data in warehouse, dashboards, and reports for ensuring data quality in DataMart for reporting
Prepared system test plan, test scripts and test cases for automation of various modules within a continuous deployment environment
Wrote several SQL scripts to validate data integrity in the application using various Data Definition Language (DDL) statements
Maintained and enhanced the existing automation frameworks using QuickTest Professional (QTP)
Performed integration, system, and regression testing in different environments (UAT/SIT/Dev) and platforms
Designed traceability metrics for end-to-end testing of various applications
Responsible for troubleshooting the legacy systems and support data connections between them

Environment: MSSQL Server 2008, T-SQL, SSIS, BIDS 2010, Data Quality, Data Management, Visual Studio, Waterfall, QTP, Performance tuning, clustered indexes, Query optimization
