
Sr. Data Engineer Resume


South SFO, CA

SUMMARY

  • 6+ years of diversified IT experience in end-to-end data analytics platforms (ETL, BI, Java), including Big Data/Hadoop, Java/J2EE development, Informatica, data modeling, and systems analysis, in the Banking, Finance, Insurance, and Telecom domains.
  • Worked for 4 years with the AWS Big Data/Hadoop ecosystem on the implementation of a Data Lake.
  • Hands-on experience with the Hadoop framework and its ecosystem, including the Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, Sqoop, Flume, and Spark.
  • Experience across the layers of the Hadoop framework - storage (HDFS), analysis (Pig and Hive), and engineering (jobs and workflows) - extending functionality by writing custom UDFs.
  • Extensive experience developing data warehouse applications using Hadoop, Informatica, Oracle, Teradata, and MS SQL Server on UNIX and Windows platforms, and experience creating complex mappings with various transformations and developing strategies for Extraction, Transformation, and Loading (ETL) using Informatica 9.x/8.x.
  • Proficient in Hive Query Language and experienced in Hive performance optimization using Static Partitioning, Dynamic Partitioning, Bucketing, and Parallel Execution.
  • As a Data Architect, designed and maintained high-performance ELT/ETL processes.
  • Experience with Java, J2EE, JDBC, Collections, Servlets, JSP, Struts, Spring, Hibernate, JSON, XML, REST, SOAP web services, Groovy, MVC, Eclipse, and WebLogic, WebSphere, and Apache Tomcat servers.
  • Working experience with functional programming in Scala, as well as Java.
  • Extensive knowledge of data modeling, data conversion, data integration, and data migration, with specialization in Informatica PowerCenter.
  • Expertise in extracting, transforming, and loading data from heterogeneous systems such as flat files, Excel, Oracle, Teradata, and MS SQL Server.
  • Good working experience with UNIX/Linux commands, scripting, and deploying applications to servers.
  • Strong skills in algorithms, data structures, object-oriented design, design patterns, documentation, and QA/testing.
  • Experienced in working as part of fast-paced Agile teams, with exposure to testing in Scrum teams and Test-Driven Development.
  • Excellent domain knowledge in Insurance, Telecom, and Banking/Finance.
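The Hive bucketing mentioned above assigns each row to a fixed bucket by hashing the bucketing column, which is what enables bucketed map-side joins and sampling. A minimal Python sketch of the idea (CRC32 stands in for Hive's actual Java hash, so the bucket numbers are illustrative only):

```python
import zlib

def bucket_for(key: str, num_buckets: int) -> int:
    """Assign a row to a bucket by hashing its bucketing column.

    Hive computes a Java hash of the column value; CRC32 is used
    here only to keep the example deterministic and self-contained.
    """
    return zlib.crc32(key.encode("utf-8")) % num_buckets

# Rows with the same key always land in the same bucket, so two
# tables bucketed the same way can be joined bucket-by-bucket.
rows = ["cust_001", "cust_002", "cust_001", "cust_003"]
buckets = [bucket_for(r, 4) for r in rows]
```

The same property underlies `CLUSTERED BY (...) INTO n BUCKETS` in a Hive table definition.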

TECHNICAL SKILLS

  • Machine Learning/AI
  • Deep Learning
  • OpenCV
  • Python
  • R
  • R-studio
  • PyTorch
  • TensorFlow
  • C#
  • HTML
  • CSS
  • JavaScript
  • MATLAB
  • Jupyter Notebooks
  • Google Colab
  • MS Office suite
  • Outlook
  • Data Visualization
  • Tableau
  • MySQL
  • SQL
  • Spark
  • SSAS
  • SSMS
  • SSIS
  • Amazon Web Services (AWS)

PROFESSIONAL EXPERIENCE

Confidential, South SFO, CA

Sr. Data Engineer

Responsibilities:

  • Gathered and documented MDM application, conversion, and integration requirements.
  • Interacted with business analysts and developers to identify requirements and to design and implement the database schema.
  • Performed codebase maintenance and quality checks for Microsoft Azure.
  • Recreated existing application logic and functionality in the Azure Data Lake, Data Factory, SQL Database, and SQL Data Warehouse environment.
  • Developed dashboards and visualizations to help business users analyze data and provided data insight to upper management, with a focus on products such as SSRS and Power BI.
  • Documented and maintained database system specifications, diagrams, and connectivity charts.
  • Participated in T-SQL code reviews and technical quality standards reviews with the development teams.
  • Involved in query optimization to increase performance.
  • Supported solution architects in problem analysis and solution design.
  • Developed and optimized stored procedures, views, and user-defined functions for the application.
  • Supported the Data and Analytics/Transformation Architecture teams building a data strategy aligned with the global strategic direction: developed canonical and other models as required, implemented data architecture platforms, solutions, and data services, developed the MDM foundation, and participated in the design and implementation of a unified data warehouse.
  • Developed physical data models and created DDL scripts to create the database schema and database objects.
  • Created clustered and non-clustered indexes to improve data access performance.
  • Identified relationships between tables and enforced referential integrity using foreign key constraints.
  • Created Functional Design Documents and Transaction Definition Documents.
  • Implemented metadata standards, data governance and stewardship, master data management, ETL, ODS, data warehouse, data marts, reporting, dashboards, analytics, segmentation, and predictive modeling.
  • Designed dashboards and reports, parameterized reports, and predictive analysis in Power BI.
  • Created dashboards with combination charts and custom charts based on requirements.
  • Deployed and managed user permissions for reports and dashboards on the Power BI web portal.
  • Created DAX queries to generate calculated columns in Power BI.
  • Evaluated data profiling, cleansing, integration, and extraction tools (e.g., Informatica).
  • Responsible for database backup and restoration using SQL Server native tools.
  • Partnered closely with business and IT teams to meet deadlines for design and development deliverables and to maintain audit and compliance needs.

Environment: SQL, Business Objects XIR2, ETL Tools Informatica 8.6/9.1, Oracle 11g, Enterprise BI in Azure with Azure Data Lake/Synapse, Microsoft Power BI
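A DAX calculated column, as used in the Power BI work above, adds a derived column evaluated row by row over a model table. An analogous derivation in pandas (the table and column names here are hypothetical, not taken from the actual reports):

```python
import pandas as pd

# Hypothetical sales table; in Power BI this would be the model
# table the DAX calculated column is defined on.
sales = pd.DataFrame({
    "Revenue": [1200.0, 850.0, 430.0],
    "Cost":    [700.0, 600.0, 500.0],
})

# Equivalent of a DAX calculated column such as
#   Margin = Sales[Revenue] - Sales[Cost]
sales["Margin"] = sales["Revenue"] - sales["Cost"]

# A row-wise conditional, like DAX IF(Sales[Margin] > 0, "Profit", "Loss")
sales["Outcome"] = sales["Margin"].apply(lambda m: "Profit" if m > 0 else "Loss")
```

Like its DAX counterpart, the derived column is computed once per row and then behaves like any other column in downstream visuals or queries.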

Confidential, Watertown, MA

Data Engineer

Responsibilities:

  • Analyzed client product data and ingested it into Master Data Management (MDM) with compliance oversight of data governance standards.
  • Investigated market sizing, competitive analysis, and positioning for product feasibility.
  • Wrote SQL scripts for various MDM tables that link each customer's demographic details with their associated products, and mapped them to Persistent IDs that uniquely identify each client.
  • Automated the mastering of customers' daily transactions and their ingestion into MDM.
  • Automated regular analysis reporting by building a BI platform, involving ETL of COSMOS log files, data modeling and database creation, and design and development of SSRS reports.
  • Performed data management projects and fulfilled ad-hoc requests to user specifications using data management software and tools such as Perl, Toad, MS Access, Excel, and SQL. Wrote SQL scripts to test the mappings and developed a traceability matrix of business requirements mapped to test scripts, ensuring that any change control in requirements leads to a test case update.
  • Generated graphs and reports using the ggplot2 package in RStudio for analytical models.
  • Developed and implemented R statistical analyses for business forecasting.
  • Performed time series analysis using Tableau.
  • Worked with AWS S3, AWS Glue, and Amazon DynamoDB to extract and transform data from various data sources and ingest it into MDM.
  • Developed various workbooks in Tableau from multiple data sources.
  • Created dashboards and visualizations using Tableau Desktop.
  • Created dashboards in Power BI to visualize data.
  • Later used Alteryx Designer to blend the data and validate data lineage.
  • Performed analysis using SAS JMP.
  • Wrote connectors to extract data from databases.
  • Analyzed mainframe data to generate reports for business users.
  • Identified and recorded defects with the information required for issues to be reproduced by the development team.

Environment: Tableau, SQL, Business Objects XIR2, ETL Tools Informatica 8.6/9.1, Oracle 11g, Teradata V2R12/R13.10, Teradata SQL Assistant 12.
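The Persistent-ID mapping described above gives each unique customer a stable surrogate key, so that repeated source records resolve to one master record. A minimal sketch of the idea; the matching key (exact match on normalized name and date of birth) and the UUID scheme are simplified assumptions, not the production MDM logic:

```python
import uuid

def assign_persistent_ids(records):
    """Map each unique customer key to one stable surrogate ID.

    Matching here is an exact match on (name, dob); a real MDM
    mastering process would apply fuzzy matching and
    survivorship rules instead.
    """
    persistent_ids = {}
    mastered = []
    for rec in records:
        key = (rec["name"].strip().lower(), rec["dob"])
        if key not in persistent_ids:
            persistent_ids[key] = str(uuid.uuid4())
        mastered.append({**rec, "persistent_id": persistent_ids[key]})
    return mastered

# Hypothetical source records: the first two are the same customer.
records = [
    {"name": "Ann Lee", "dob": "1990-01-01", "product": "checking"},
    {"name": "ann lee", "dob": "1990-01-01", "product": "mortgage"},
    {"name": "Bob Roy", "dob": "1985-05-05", "product": "savings"},
]
mastered = assign_persistent_ids(records)
```

Because the ID is keyed on the customer rather than the source row, all of a customer's products link back to a single Persistent ID.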

Confidential

Data Scientist

Responsibilities:

  • Implemented Natural Language Processing (NLP), PySpark, and statistical methods to improve existing tracing systems by 7%. Hands-on work experience with time series models such as RNN and LSTM.
  • Created and executed 30+ forecasting models to revamp the number plate tracing system.
  • Collected, studied, and interpreted 10M-record datasets. Prepared reports and performed accurate and efficient data management.
  • Worked with AWS SageMaker to quickly build, train, and deploy machine learning models.
  • Developed and coded statistical programs using PyTorch and TensorFlow for 50+ clients (e.g., hotel businesses), on structured and unstructured data. Extracted insights using visualization software such as Tableau.
  • Built predictive machine learning, simulation, and statistical models using Python.
  • Generated comprehensive analytical reports by running SQL queries against current databases to conduct data analysis.
  • Maintained and developed complex SQL queries, stored procedures, views, functions, and reports that meet customer requirements using Microsoft SQL Server.
  • Performed data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python.
  • Used pandas, NumPy, Seaborn, SciPy, Matplotlib, and scikit-learn in Python to develop various machine learning algorithms.
  • Performed information extraction using NLP algorithms coupled with deep learning (ANN and CNN) in Keras and TensorFlow.
  • Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization; also performed gap analysis.
  • Performed data manipulation, data preparation, normalization, and predictive modeling. Improved efficiency and accuracy by evaluating models in Python.
  • Generated reports and visualizations based on the insights, mainly using Tableau, and developed dashboards for the company insight teams.
  • Proficient in predictive modeling, data mining methods, factor analysis, ANOVA, hypothesis testing, the normal distribution, and other advanced statistical and econometric techniques.
  • Designed and executed 100+ MySQL database queries from Python using the Python MySQL connector and a MySQL database.
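The feature scaling step mentioned above can be done directly with NumPy. A minimal min-max scaling sketch; the feature values are made up for illustration:

```python
import numpy as np

def min_max_scale(x: np.ndarray) -> np.ndarray:
    """Rescale a feature column to the [0, 1] range.

    Min-max scaling keeps the shape of the distribution but puts
    all features on a comparable scale before model training.
    """
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo)

# Hypothetical feature column, e.g. customer ages.
ages = np.array([20.0, 30.0, 40.0, 60.0])
scaled = min_max_scale(ages)
```

In practice scikit-learn's `MinMaxScaler` does the same transform while remembering the training-set min/max so the identical scaling can be applied to new data.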

Confidential

Data Scientist

Responsibilities:

  • Managed a $1M project for the Royal Bank of Canada, optimizing SQL and RDBMS databases.
  • Developed 50+ data mining models for the existing system using OpenCV, NumPy, Matplotlib, SciPy, and pandas.
  • The initial project was loan application pre-approval using supervised and unsupervised learning.
  • The second project used computer vision to analyze customer body language, so that banks could authorize transactions using only customers' faces and behavioral characteristics.
  • Detected human behavior based on tracking trajectory analysis and pattern recognition.
  • Performed transfer learning to identify human emotions and detect faces.
  • Used the OpenCV SSD detector to detect faces, and used FaceNet, which produces a 1024-feature embedding, to recognize and compare faces.
  • Hands-on experience with analytical techniques including sampling, clustering, decision trees, SVMs, Random Forests, regressions, and deep learning.
  • Handled monthly offshore calls, reviewed code changes, and communicated efficiently with senior management to hand over features and time-critical tasks.
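FaceNet-style recognition, mentioned above, compares two faces by the distance between their embedding vectors rather than by pixels. A sketch of just the comparison step, with made-up 4-dimensional embeddings (real FaceNet embeddings are far larger) and an illustrative threshold:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(a: np.ndarray, b: np.ndarray, threshold: float = 0.9) -> bool:
    # The threshold is illustrative; in practice it is tuned on a
    # labeled face-verification set.
    return cosine_similarity(a, b) >= threshold

# Made-up embeddings: two near-identical faces and one different face.
face_a = np.array([0.1, 0.8, 0.3, 0.5])
face_b = np.array([0.12, 0.79, 0.31, 0.49])
face_c = np.array([0.9, 0.1, 0.7, 0.2])
```

The same comparison works with Euclidean distance on L2-normalized embeddings, which is how the original FaceNet paper frames verification.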
