Manager, Data Engineering Resume
Chicago, IL
PROFILE SUMMARY:
- 13+ years of global IT experience focused on data, data modeling, and big data technologies, including traditional databases, data warehousing / Business Intelligence / Analytics concepts and tools, and big data and data science tools.
- Experience in creating and maintaining conceptual, logical and physical data models for OLTP and OLAP systems (SQL Server and Oracle) as well as NoSQL systems (Cassandra).
- Comfortable working in machine-learning-based product development
- Solid understanding of the basics of Machine Learning: the workflow of ML projects, data pre-processing, feature selection/extraction, and analyzing different model representations
- Familiar with the popular algorithms available in ML
- Sufficient mathematics/statistics knowledge for ML
- Ability to see the big picture of implementing ML-based solutions with various data / big data technologies
- Ability to implement ML-based data solutions using Confidential Big Data and Machine Learning services
- Experienced in designing/developing three PoCs on Machine Learning (ML) / Natural Language Processing (NLP)
- Currently working on a personal project on face detection using OpenCV, dlib, OpenFace, Python and Google's FaceNet, in combination with big data tools, to detect faces in streaming video
- Hands-on experience in big data tools: Apache Spark, HDFS, MapReduce, YARN, Hive and Sqoop.
- Experience in NoSQL databases: Cassandra and MongoDB
- Researched, compared and benchmarked NoSQL databases: Cassandra, MongoDB and CouchDB
- Experience developing complex database objects like Stored Procedures, Functions, SSIS Packages and Triggers using T-SQL (MS SQL Server) and PL/SQL (Oracle).
- Accomplished data / big data professional with hands-on experience in data architecture, data modeling, big data ecosystem, data analytics, NoSQL data stores, enterprise data management, master data management, data governance, metadata management, enterprise data warehouses, business intelligence, and relational-database management systems.
- Experience in end-to-end implementation of enterprise-level solutions, from requirement analysis and system study through design, coding, testing, debugging, documentation and deployment.
- Proven ability to coordinate with cross-functional teams in onsite-offshore as well as onsite-onsite models for analyzing existing systems and designing and implementing database / big data solutions.
- Solid understanding of database theory, distributed database theory and principles and paradigms of distributed systems.
- Extensive experience in system integration architecture (ETL) combining data from disparate source systems and disparate data formats.
- Strong understanding of Application architecture design and Cloud computing.
- Good experience in manual/automated provisioning of AWS cloud services using the configuration management tools Ansible/Chef and the Python AWS API Boto.
- Experience in dimensional modeling, data warehousing and business intelligence setup.
- Experience in RDBMS: Oracle, MS SQL Server, PostgreSQL and MySQL.
- Hands-on experience in logical and physical data modeling for data warehousing / Business Intelligence roadmaps.
- Experience in Master Data Management (MDM) assessment, analysis, design, and implementation for a Healthcare Major.
- Experience with data flow diagrams, data dictionary, database normalization techniques, entity relation modeling and design techniques.
- Hands-on experience in ETL techniques using SSIS and stored procedures, and in analysis and reporting using tools such as SSAS, SSRS, Crystal Reports, Qlikview and Tableau.
KEY RESULT AREAS:
- Architected, designed, developed and deployed a big data solution with a Machine Learning model for social media sentiment analysis, used as a key accelerator for the company's analytics initiatives; the model was trained on 45 GB of historical data, with a live Twitter stream as input. The application was hosted in the AWS cloud, with live data streamed and processed every 3 seconds.
- Proposed and built a Data Warehousing / Business Intelligence / Big Data solution for sales and marketing programs that helps track the ROI of ad campaigns for a New York-based media company. The historical data size was 4 TB, with expected growth of 2 TB per year.
- Successfully modeled data marts in Cassandra using dimensional modeling techniques
- Led/managed an offshore data services team of 9 members for a Boston-based medical CME company for 4 consecutive years without a single escalation, and built many data marts and MDM solutions.
- Architected, designed, developed and deployed DevOps MVP for the current company to be used as key accelerator for the company’s DevOps initiatives
- Implemented many utility tools for various clients in various roles.
- Good exposure to developing configuration and orchestration scripts using the DevOps tools Ansible, Chef and Jenkins
- Ability to understand business requirements and create data models that address the needs
- Team player with strong communication, analytical and organizational skills with expertise in interfacing with project teams for successful execution
- Quick learner of new tools/technologies in any technology area, especially business intelligence / data analytics.
CURRENT INTEREST AREAS:
- Data science, machine learning, natural language processing
- Blockchain technologies
- Distributed data processing technologies
- Data warehouse and Business Intelligence
- Search engines
- Open source technologies
- NoSQL databases
BUSINESS SKILLS:
Computational and Adaptive Thinker, Leadership and Delegation, Capacity Planning (Forecasting and Modeling), Change Management, Risk Management, Transition Management, Program Management, Cross-Cultural Management, Virtual Collaboration, Strategy Formulation, Tactical Implementation Plans, Networking and Relationship Building, Organizational Behavior and Structure, Emotional Intelligence, Entrepreneurial Skills, Innovation and Data Analytics Research Management.
TECHNICAL SKILLS:
Languages: Python, Shell Scripting, C++, Scala, T-SQL, ANSI SQL
Data Modeling: OLTP, OLAP, Dimensional Modeling and E-R Modeling
RDBMS: SQL Server 2005/2008/2012, MS Access
NoSQL DB: Cassandra, MongoDB
Frameworks: Hadoop Ecosystem, Apache Spark
Web Related: HTML, ASP.NET, XML, XSLT, JavaScript
Tools and Utilities: PyCharm, IPython Notebook, SQL Developer, SQL*Plus, SSMS, SSIS, SSRS, SSAS, Business Objects, Crystal Reports X & XI, Qlikview, Tableau, Visual Studio 2010, Confidential Virtual Works, Maven, Jenkins, Eclipse, Git/GitHub, Ansible, VirtualBox, Chef
Search Engine: Elasticsearch
Cloud Platform: AWS cloud platform (EC2, RDS, S3, VPC, IAM, Redshift), Confidential Platform (Big Data and ML)
Web/App Server: Tomcat, JBoss, IIS
Domain Knowledge: Energy Billing, Healthcare, Event Management & Aircraft Maintenance, Advertisement Media
EXPERIENCE SUMMARY:
Confidential, Chicago, IL
Manager, Data Engineering
Responsibilities:
- Solution Architecture - Represent Google to its clients and help implement GCP-based big data solutions using various big data and ML services such as Cloud Storage, Dataflow, Cloud Pub/Sub, Cloud SQL, Cloud Datastore, Bigtable, BigQuery and Cloud ML Engine
Environment: Confidential Big Data and ML platform
Confidential, Hamilton, NJ
Data Architect
Responsibilities:
- Solution Architecture - Deployment to on-premises servers using AWS CodeDeploy
- Implemented a DevOps dashboard for the end-to-end DevOps pipeline using Capital One's Hygieia
- AWS provisioning automation using Boto3, Terraform and Jenkins
Environment: AWS CodeDeploy, Python, Boto3, Jenkins, Hygieia, CentOS 7, EC2, S3
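A minimal sketch of the Boto3-based provisioning automation described above, assuming AWS credentials are configured; the AMI ID, instance type, and tags below are hypothetical, and the request is built as plain data so it can be checked before any AWS call:

```python
# Minimal sketch of Boto3-style EC2 provisioning (parameters are hypothetical).
# Building the request as a plain dict keeps it testable without AWS access;
# in real use the dict is passed to boto3's ec2_client.run_instances(**params).

def build_run_instances_params(ami_id, instance_type, count, tags):
    """Assemble keyword arguments for an EC2 run_instances call."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        "TagSpecifications": [{
            "ResourceType": "instance",
            "Tags": [{"Key": k, "Value": v} for k, v in sorted(tags.items())],
        }],
    }

params = build_run_instances_params("ami-0abcd1234", "t2.micro", 2,
                                    {"Team": "DataEng", "Env": "dev"})
# With boto3 installed and credentials configured, provisioning would be:
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   ec2.run_instances(**params)
print(params["MaxCount"])  # 2
```

Keeping the parameter assembly separate from the API call is also what makes this style easy to drive from Jenkins, since the same dict can be logged or validated before provisioning.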
Confidential
Data Architect
Responsibilities:
- Data modeling: conceptual and logical models using relational theory, and a physical model for the Cassandra database
- Analyzed the dataset, chose features and participated in discussions on selecting the ML algorithm (Numenta HTM was used)
- Provisioned, configured and performance-tuned Mesos, Cassandra, Kafka and Spark clusters with 5 nodes each
- Designed ETL in Python & Scala
- Provided key inputs on ETL-related issues in integrating with different data sources and destinations
Environment: Mesos, Kafka, Spark Streaming, Cassandra, Scala, Python, Numenta HTM (Machine Learning algorithm), AngularJS, CentOS 7
Confidential, New York
Big Data Architect
Responsibilities:
- Sole resource for the end-to-end PoC, from data modeling to sample report development
- Played a key role in requirement gathering
- Modeled data marts using dimensional modeling techniques in the Cassandra database
- Provided technical leadership for provisioning, configuring and performance-tuning Mesos, Cassandra, Kafka and Spark clusters with 3 nodes each
- Designed ETL in Spark and Scala
Environment: SQL Server, SSIS, SSAS and Tableau, Visio, Windows, Scala, Linux, Spark, Mesos, Kafka and Cassandra
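As an illustration of dimensional modeling in Cassandra as mentioned above (the table and column names are hypothetical), a Cassandra data mart is modeled query-first: dimension attributes are denormalized into the fact row, and the partition key is chosen to match the report's access pattern rather than normalized into star-schema dimension tables:

```python
# Illustrative sketch of a Cassandra data-mart table (names are hypothetical).
# Unlike an RDBMS star schema, Cassandra tables are designed per query:
# here the reporting query is "sales for a given day", so the day is the
# partition key and dimension attributes are denormalized into each row.

def sales_by_day_ddl(keyspace):
    """Build the CQL DDL for a denormalized daily-sales fact table."""
    return (
        f"CREATE TABLE {keyspace}.sales_by_day (\n"
        "  sale_date date,\n"      # partition key: one partition per day
        "  sale_id timeuuid,\n"    # clustering key: orders rows within a day
        "  product_name text,\n"   # denormalized product dimension
        "  region text,\n"         # denormalized geography dimension
        "  amount decimal,\n"
        "  PRIMARY KEY (sale_date, sale_id)\n"
        ");"
    )

ddl = sales_by_day_ddl("marts")
print("PRIMARY KEY (sale_date, sale_id)" in ddl)  # True
```

The trade-off in this style is duplication of dimension values across rows in exchange for single-partition reads per report query.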
Confidential, Atlanta
Data Architect
Responsibilities:
- Attended a month-long workshop with the client's other key IT professionals and product experts to understand the legacy systems.
- Created unified data model using E-R modelling techniques and Microsoft Visio
- Gathered requirements for the new product and created a universal data model to be used for their new product.
- Actively involved in the design of replication techniques based on the Directed Acyclic Graph theory and Git
- Played a key role in selecting database technologies for the product by comparing and benchmarking the database systems PostgreSQL, MongoDB, CouchDB and Cassandra.
- The product is now under development.
Environment: Data Modeling, Microsoft Visio, PostgreSQL, MongoDB, CouchDB and Cassandra
Confidential
Data Science Architect
Responsibilities:
- Conceived and designed the application
- Developed a module in Spark and Scala to process one month of downloaded Twitter archives (45 GB) and produce training data; a model was created using the Word2Vec and NaiveBayes libraries in Spark
- Developed a module in Spark and Scala to stream the live Twitter feed and classify the tweets with the NaiveBayes model
- Applied Natural Language Processing techniques and Machine Learning algorithms
Environment: Spark MLlib, Spark Streaming, Scala, AngularJS, Ubuntu 14.04 LTS
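The train-then-classify idea above can be sketched in plain Python with a tiny multinomial Naive Bayes (the actual modules used Spark MLlib's NaiveBayes in Scala over 45 GB of archives; the sample tweets below are invented for illustration):

```python
# Plain-Python sketch of Naive Bayes tweet classification (illustrative data).
import math
from collections import Counter, defaultdict

def train_nb(samples):
    """samples: list of (tokens, label). Returns class counts, word counts, vocab."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab

def classify(tokens, class_docs, word_counts, vocab):
    """Pick the label maximizing log P(label) + sum of log P(token | label)."""
    total_docs = sum(class_docs.values())
    best_label, best_score = None, float("-inf")
    for label, doc_count in class_docs.items():
        score = math.log(doc_count / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            # Laplace smoothing so unseen tokens do not zero out the class
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

samples = [("love this great".split(), "pos"),
           ("awesome love it".split(), "pos"),
           ("hate this awful".split(), "neg"),
           ("terrible hate it".split(), "neg")]
model = train_nb(samples)
print(classify("love it".split(), *model))    # pos
print(classify("hate this".split(), *model))  # neg
```

In the Spark version the same per-class counting is distributed over the cluster, which is what makes the 45 GB training set tractable.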
Confidential
Data Architect
Responsibilities:
- Single point of contact; interacted with the client team to understand the modeling and data warehousing requirements and pain points
- Responsible for the architectural design of the ETL solution using Linux shell scripting
- Responsible for the design and development of ETL stored procedures in PL/SQL to convert text/CSV-formatted data to XML
- Provided necessary input for technical direction
- Responsible for evaluating technologies and tools for test automation, build management and deployment
- Set up and maintained software development tools spanning source control, continuous integration/delivery and code review
- Implemented and managed code promotion automation between environments
- Provided support for development through scripts and tools writing
- Worked in an Agile development environment, collaborating with developers
- Ensured incident and change management procedures
- Contributed in everyday scrum calls and project discussions
- Planned and supported in development releases
Environment: Shell scripting, PL/SQL stored procedures, XML and XSLT
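A minimal sketch of the text/CSV-to-XML conversion described above, written here in Python with the standard library rather than the PL/SQL used in the project (the column names are hypothetical):

```python
# Sketch of the CSV-to-XML ETL step (the project implemented this in PL/SQL
# stored procedures; this stdlib version shows the same transformation).
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_xml(csv_text, root_tag="rows", row_tag="row"):
    """Convert CSV text (first line = headers) into an XML document string."""
    root = ET.Element(root_tag)
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(root, row_tag)
        for column, value in record.items():
            ET.SubElement(row, column).text = value  # one element per column
    return ET.tostring(root, encoding="unicode")

xml_doc = csv_to_xml("id,name\n1,Alice\n2,Bob")
print(xml_doc)
# <rows><row><id>1</id><name>Alice</name></row><row><id>2</id><name>Bob</name></row></rows>
```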
Confidential
Data Architect
Responsibilities:
- Analyzed and documented root cause for issues
- Developed database architectural strategies at the modeling, design and implementation stages to address business requirements
- Involved in defining logical and physical data models in SQL Server and data dictionary
- Designed and developed ETL packages/stored procedures in T-SQL / SSIS
- Responsible for effort estimation, work breakdown and task deliverables on each development cycle
- Responsible for upgrading and performance tuning
- Supported development teams by reviewing their data model designs for adherence to data strategic direction, integration and integrity compliance, and data administration standards
Environment: SQL Server 2008 and SSIS
Confidential, Horsham, PA
Database Architect/Database Developer
Responsibilities:
- Responsible for understanding and analyzing technical architecture problems, challenges and complexities
- Implemented architectural changes and worked on increasing the performance of the SQL Server database
- Created and reviewed data models for new components using the ERwin tool.
- Reviewed stored procedures to increase performance and meet standards and compliance
- Involved in fine-tuning complex scripts and stored procedures, reducing execution time
- Developed and maintained database security and control procedures
- Responsible for setting up SQL standards for Stored Procedures developed by the client team
- Involved in rewriting some of the stored procedures with old features and the ones that were no longer supported
- Created database tables and wrote database triggers applicable to business rules
- Debugged various defects related to implementation of business logic
- Created technical design documents for enhancements and attended technical design reviews
- Created issue resolution documents and Unit Test Plans
- Coordinated with the front-end design team in KBM Template Editor to provide them with necessary stored procedures and the necessary insights into the data
- Involved in the continuous enhancements and fixing of production problems
- Handled errors using exception handling extensively, for ease of debugging and for displaying error messages in the application
- Used the SQL Server SSIS tool to build high-performance data integration solutions, including extraction, transformation and load packages for data warehousing
- Extracted data from the XML file and loaded it into the database
- Provided technical support to team members
Environment: Windows 7, KBM Template Editor Tool, SQL Server 2005/2008
Confidential, Boston, MA
Database/Business Intelligence Architect/Developer
Responsibilities:
- Involved in preparing high level design document
- Responsible for architecting Database & DWBI
- Data modeling using OLTP & data warehousing concepts and the ERwin tool
- Involved in Data Modeling for Data Marts
- Conceived and engineered an admin tool for Business Objects - to reduce repetitive cumbersome work of updating data sources and passwords for all the reports in report repository at once
- Conceived and developed SQL Server compressed backup tools using MS VDI SDK and .Net for 2005 and older versions - this was used by the client to back up and restore their large databases of more than 100 GB size and achieved more than 70% compression during backup process
- Modeled and designed SSAS cube for the client’s marketing team
- Set up Oracle 11g and Linux based data warehousing for the client’s parent company
- The following tasks were performed as part of the Oracle 11g and Linux setup:
- Designed and implemented a data warehouse with four data marts
- Developed ETL Stored Procedures & Functions to pull data from source database to data warehouse databases using SQL Server T-SQL and SSIS
- Developed both inbound and outbound interfaces to load data into target database and extract data from database to flat files
- Created reports using Webi Reports (Business Objects), Crystal Report, SSRS and Qlikview
- Worked closely with Administrator for proper backup and recovery plans
- Wrote UNIX shell scripts to run SQL scripts daily
- Additional responsibilities and functions:
- Active Participation in all Business Development meetings of DWBI Practice
- Prepared Qlikview Business Development Plan
- Delivery Head for Qlikview Partnership program
- Qlikview lead for the DWBI team
- Pre-Sales activities for Qlikview Sales & Implementation Projects
Environment: Windows 2008, Linux, SQL Server 2005, 2008, Microsoft Business Intelligence Stack (SSIS & SSRS), Oracle 11g, SQL Developer 3.0, Business Objects, Qlikview & VB.Net
Confidential
System Analyst (Database Developer)
Responsibilities:
- Involved in setting up development and QA environments for managing more than 60,000 stored procedures
- Involved in creating stored procedures in SQL Server for new components
- Fine-tuned procedures/SQL queries in SQL Server for getting maximum efficiency in various databases
- Involved in logical & physical database layout design
- Tuned and optimized queries by altering database design, analyzing different query options and indexing strategies
- Involved in continuous enhancements and fixing of production problems
- Involved in database maintenance tasks
- Involved in designing & creating various kinds of triggers, including INSTEAD OF triggers responding to system events
- Created/modified various reports according to the business requirements
- Performed R&D on integrating the client's system with other ERP solutions using Crystal Reports 10.0
- Involved in integrating a third-party Gantt chart ActiveX control into the client's ERP AMR Module's Central Planning component using VB 6.0 and JavaScript
- Created an Excel-based configuration manager tool for internal use of the client's system, for compliance with CMM Level 5 requirements, using VBA scripts
Environment: SQL Server 2005/2008, Crystal Reports 10.0, VB 6.0, VB.NET, VBA and JavaScript
Confidential
Developer
Responsibilities:
- Involved in understanding and analyzing the project requirements
- Developed stored procedures and functions in SQL Server
- Involved in writing code for database triggers on tables in SQL Server
- Developed packages as per the requirements
- Implemented robust stored procedures, functions and triggers to implement existing as well as new functionalities
- Implemented the required functionalities using various constraints, relationship and views
- Created inter-tool reports using Crystal Reports 8
- Created UI and middleware modules using VB 6.0
Environment: SQL Server 2000, Crystal Reports 8 & VB 6.0
