BI & Data Engineer Sr Resume
Buffalo, NY
SUMMARY
- Data scientist with 7 years of experience in financial services, technology, and e-commerce.
- Over 4 years of experience spanning the entire data science project life cycle, including data acquisition, data cleaning, data manipulation, data mining, machine learning algorithms, data validation, and data visualization.
- Expertise in transforming business requirements into analytical models, applied algorithms, and reporting solutions that scale across massive volumes of structured and unstructured data.
- Experienced with linear and logistic regression, Bayesian inference, SVMs, neural networks, ANOVA, Gaussian mixture models, recommendation systems, and maximum likelihood estimation.
- Strong skills in statistical methodologies and dimensionality reduction methods such as PCA, correspondence analysis, and variable clustering.
- Worked with testing and validation using k-fold cross-validation and regularization.
- Extensive experience in developing time series models, including but not limited to ARIMA and GARCH, using SAS 9.4, SAS Enterprise Miner, SAS Enterprise Guide, and SAS/JMP.
- Worked with Python 3.3 to develop machine learning algorithms such as decision trees, random forests, lasso regression, and k-means clustering, using the NumPy, Pandas, scikit-learn, SFrame, SciPy, Matplotlib, and NLTK packages.
- Strong ability to write and optimize diverse SQL queries; working knowledge of RDBMS and NoSQL databases such as MySQL, SQL Server, HBase, Cassandra, and MongoDB.
- Deep understanding of text mining and generating data visualizations, delivering projects using various R packages such as ggplot2, dplyr, caret, twitteR, NLP, rjson, openNLP, tm, googleVis, and Shiny.
- Working knowledge of MapReduce with Hadoop and Spark. Good knowledge of the big data ecosystem, including Hadoop 2.0 (HDFS, Hive, Pig, Impala), Splunk, and Spark (Spark SQL, MLlib, Spark Streaming).
- Excellent performance in building and publishing customized interactive reports and dashboards with custom parameters and user filters using Tableau 10.3/9+.
- Working knowledge of HTML5/CSS3/JavaScript.
- Excellent understanding of SDLC, Agile, and Scrum.
- Effective team player with strong communication and interpersonal skills; adapts quickly to new technologies and new business lines.
TECHNICAL SKILLS
BI Tools: Tableau 9.4/9.2, SharePoint 2016/2013, MS Office (Word/Excel/PowerPoint/Visio), ELK
Languages: Python 3.3/2.7, R 3, SQL, SAS 9.4, VBA, HiveQL, Pig Latin
Big Data Tools: Hadoop 2 (Hive, HDFS, Pig, Impala), Spark 2.1 (Spark SQL, MLlib), MapReduce, Splunk 6.0, SSIS
Operating Systems: Windows 10/8/7, UNIX, Linux
Packages: Python (NumPy, Pandas, scikit-learn, SFrame, SciPy, Matplotlib, NLTK); R (ggplot2, dplyr, caret, twitteR, NLP, openNLP, rjson, tm, googleVis, Shiny)
Database: Oracle 11g, MS Access 2013, SQL Server 2014/2012, MySQL 5.5, HBase 1.2, MongoDB 3.2, Cassandra 3.0
PROFESSIONAL EXPERIENCE
Confidential - Buffalo, NY
BI & Data Engineer Sr
Responsibilities:
- Extracted, transformed, and loaded transactional data from Oracle 12c and MS SQL Server
- Developed interactive dashboards to support user studies for different products - Zelle, Real time payments, Mobile Next Gen, Enterprise Message Hub, Digital card self-services, and Confidential &T Insurance Agency sales, using Tableau
- Monitored transactional database logs to identify inconsistent data formats and unexpected data loss
- Improved Tableau processing performance and reduced load times by scheduling extract refreshes, reducing calculation processing levels, and optimizing queries
- Applied various machine learning algorithms and statistical models (decision trees, logistic regression, gradient boosting machines) to build predictive models using scikit-learn in Python
- Worked continuously with the technology center to integrate Python scripts with Tableau in support of advanced data modeling and text analysis
- Created data mapping documents and followed bank-wide data architecture regulations and best practices to support ETL work
- Applied time series models to forecast customer enrollments and transactions, in support of adjusting promotions and strategies to meet KPIs
- Responsible for software license management; created a team-owned database on MS SQL Server and extracted, transformed, and loaded machine log data
- Visualized software license management analysis results in Tableau dashboards and presented findings to executive management on a weekly basis
- Helped automate the bank-wide software registration process by using license reports
- Led Tableau training programs across the bank; served as the bank's Tableau trainer
- Documented requirements, including the available code to be implemented using Spark, Hive, and HDFS
Environment: Tableau 10.3/9+, Python 3, NumPy, Pandas, NLTK, scikit-learn, Oracle 12c, SQL Server 2016/2012/2008
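The scikit-learn modeling described above might look roughly like the following sketch. This is illustrative only, on synthetic data; none of the features or names are from the actual bank project:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for transactional features (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Compare a linear baseline with a gradient boosting machine
models = {
    "logistic": LogisticRegression(max_iter=1000),
    "gbm": GradientBoostingClassifier(random_state=0),
}
aucs = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    aucs[name] = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
```

Comparing held-out AUC across a simple baseline and a boosted model is a common way to justify the final model choice.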
Confidential - Stamford, CT
Data Scientist
Responsibilities:
- Continuously collected business requirements throughout the project life cycle
- Worked on data cleaning and reshaping; generated segmented subsets using NumPy and Pandas in Python
- Applied various machine learning algorithms and statistical models such as decision trees, logistic regression, and gradient boosting machines to build predictive models using the scikit-learn package in Python
- Identified the variables that significantly affect the target
- Conducted model optimization and comparison using a stepwise function based on AIC values
- Worked on model selection based on confusion matrices, minimizing the Type II error
- Generated a cost-benefit analysis to quantify the impact of model implementation compared with the former situation
- Generated data analysis reports using Matplotlib and Tableau; successfully delivered and presented the results to C-level decision makers
- Wrote and optimized complex SQL queries involving multiple joins and advanced analytical functions to perform data extraction and merging from large volumes of historical data stored in Oracle 11g, validating the processed data in target database
- Developed Python scripts to automate data sampling process. Ensured the data integrity by checking for completeness, duplication, accuracy, and consistency
Environment: Tableau 9.4, Python 3.3, Numpy, Pandas, Matplotlib, Scikit-Learn, Machine Learning, Oracle 11g, SQL
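Model selection that minimizes Type II error, as described above, usually amounts to choosing a decision threshold from the confusion matrix. A minimal sketch on synthetic data (all names illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# Synthetic binary classification data (illustrative only)
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

clf = LogisticRegression().fit(X, y)
proba = clf.predict_proba(X)[:, 1]

def type2_errors(threshold):
    """Count false negatives (Type II errors) at a given decision threshold."""
    pred = (proba >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
    return fn

# Lowering the threshold trades false positives for fewer false negatives
fn_low, fn_high = type2_errors(0.3), type2_errors(0.7)
```

Because every positive prediction made at the higher threshold is also made at the lower one, the false-negative count can only decrease as the threshold drops.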
Confidential - Hartford, CT
Data Scientist
Responsibilities:
- Performed time series modeling (ARIMA) with SAS to capture data patterns and traffic trends, and forecast occupancy rates for different parking lots
- Communicated effectively with the business development team to implement and complete initiatives that could increase opportunities
- Delivered data interpretation efficiently by creating interactive analysis reports in Tableau to identify business solutions and support business decisions on marketing and operations
- Implemented and delivered all requirements outlined in the contractual agreement between the company and the university
- Prepared and executed complex SAS/PROC SQL involving multiple joins and advanced analytical functions to validate the processed data in target database
- Searched and collected data from external sources and integrated it with the primary database. Created a Spark SQL context to load data from JSON files and performed SQL queries
- Extracted and compiled data, conducted data manipulation to ensure data quality, consistency, and integrity using SFrame in Python
Environment: SAS 9.4/SAS Studio, Hadoop 2, Spark, SparkSql, MS Office (Excel), Tableau 9.4, Python, SFrame
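The occupancy forecasting above was done with ARIMA in SAS; as a rough Python-side illustration of the same idea, here is a deliberately simplified AR(1) fit and one-step forecast on synthetic data (not the actual SAS procedure or data):

```python
import numpy as np

# Simulate an AR(1) series as a stand-in for occupancy-rate data
rng = np.random.default_rng(2)
phi_true, n = 0.8, 300
series = np.zeros(n)
for t in range(1, n):
    series[t] = phi_true * series[t - 1] + rng.normal(scale=0.1)

# Estimate the AR coefficient by least squares on lagged values
x, y = series[:-1], series[1:]
phi_hat = float(x @ y / (x @ x))

# One-step-ahead forecast from the last observation
forecast = phi_hat * series[-1]
```

A production ARIMA fit would also difference the series and select p, d, q orders; this sketch only shows the autoregressive core.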
Confidential - West Hartford, CT
Data Scientist
Responsibilities:
- Collected business requirements and data analysis needs from other departments
- Worked on data cleaning and ensured data quality, consistency, and integrity using NumPy and SFrame in Python
- Experimented with text mining on customer complaints using NLTK in Python
- Used k-means clustering technique to identify outliers and to classify unlabeled data
- Applied Principal Component Analysis method in feature engineering to analyze high dimensional data
- Applied various machine learning algorithms and statistical models (decision trees, lasso regression, multivariate regression) to identify key features using the scikit-learn package in Python
- Evaluated models using k-fold cross validation, log loss function
- Ensured the model had a low false positive rate; validated the model by interpreting the ROC plot
- Built repeatable processes in support of implementation of new features and other initiatives
- Created various types of data visualizations using Tableau
- Performed data parsing and profiling on large volumes of varied data to learn about behavior across various features, based on transactional data, call center history, customer personal profiles, etc.
- Processed primary quantitative and qualitative market research and loaded the survey responses into the database in preparation for data exploration
- Developed Python scripts to automate the data sampling process. Ensured data integrity by checking for duplication, completeness, accuracy, and validity
Environment: Python 3.3, Hadoop 2, Tableau 9.4, Numpy, SFrame, Scikit-Learn, nltk
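The k-means outlier identification mentioned above can be sketched as flagging points far from their assigned cluster center. Synthetic data; the threshold rule is an illustrative assumption, not the project's actual cutoff:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two inlier blobs plus a handful of far-away outliers (synthetic)
rng = np.random.default_rng(3)
blob_a = rng.normal(loc=0.0, scale=1.0, size=(100, 2))
blob_b = rng.normal(loc=5.0, scale=1.0, size=(100, 2))
outliers = rng.normal(loc=20.0, scale=0.5, size=(5, 2))
X = np.vstack([blob_a, blob_b, outliers])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Distance of each point to its assigned cluster center
dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)

# Flag points more than 3 standard deviations beyond the mean distance
flagged = dist > dist.mean() + 3 * dist.std()
```

The same distances can also feed a quantile-based cutoff when the outlier rate is roughly known in advance.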
Confidential
Data Analyst
Responsibilities:
- Extracted and amalgamated information from working data; gathered primary and secondary competitive intelligence for distribution in impactful bi-weekly and monthly reports
- Performed initial descriptive data analysis on datasets using SAS; generated statistical reports with PROC UNIVARIATE and PROC FREQ
- Conducted hypothesis tests and analysis on the content of clinical datasets to assess quality, completeness, and volumes of data
- Coordinated with the research team and system owners to understand the origins, contents, and structure of datasets, ensuring that research objectives could be met
- Effectively communicated the results and reported to colleagues and partners
- Created decision-driving competitive intelligence reporting from scientific conferences
- Developed comprehensive knowledge of drug development and commercial landscapes
Environment: SAS 9.4, SAS Enterprise Guide, SAS Studio, SQL server 2012, SPSS, Microsoft Office 2013 (Access/PowerPoint/Word/Excel)
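The descriptive analysis above was done in SAS with PROC UNIVARIATE and PROC FREQ; a rough pandas analogue of those two steps looks like this (synthetic clinical-style data, illustrative column names):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a clinical dataset (illustrative only)
rng = np.random.default_rng(4)
df = pd.DataFrame({
    "age": rng.integers(18, 90, size=100),
    "arm": rng.choice(["treatment", "control"], size=100),
})

univariate = df["age"].describe()   # count, mean, std, quartiles (cf. PROC UNIVARIATE)
freq = df["arm"].value_counts()     # one-way frequency table (cf. PROC FREQ)
missing = df.isna().sum()           # completeness check per column
```

Running these per column is a quick way to assess quality, completeness, and volume before deeper analysis.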
Confidential
Business Data Analyst
Responsibilities:
- Accomplished client studies covering buying behaviors, client profiles, and segmentation
- Analyzed the traffic and business performance of commercial and marketing operations as part of the continuous improvement of digital devices. Created data visualizations using Shiny in R to track the performance of business campaigns (newsletters, mailings)
- Implemented strategic initiatives with historical data; built and tested predictive models to better estimate the impact of new campaigns
- Developed a recommendation system by applying collaborative filtering and content-based filtering on a large-scale dataset, improving the accuracy and promptness of customized recommendations
- Created materials emphasizing product knowledge, brand heritage, website user experience, and luxury service to support CRM initiatives and drive sales results
- Prepared and executed complex SQL queries involving multiple joins and advanced analytical functions to validate data in target database
Environment: MySQL, R, dplyr, caret, mle2, Shiny
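The collaborative filtering half of the recommendation system above can be sketched as item-based filtering on a rating matrix. This toy example is an assumption about the approach, not the actual system (which also combined content-based filtering):

```python
import numpy as np

# Rows are users, columns are items; 0 means "not rated" (toy data)
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Cosine similarity between item rating columns
norms = np.linalg.norm(R, axis=0)
sim = (R.T @ R) / np.outer(norms, norms)

def predict(user, item):
    """Score an unrated item from the user's ratings, weighted by item similarity."""
    rated = R[user] > 0
    return float(R[user, rated] @ sim[item, rated] / sim[item, rated].sum())

# User 0 has not rated item 2; score it from their other ratings
score_item2 = predict(0, 2)
```

Items most similar to what a user already rated highly dominate the weighted average, so item 2 (similar to the item user 0 disliked) scores low here.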
Confidential
Business Analyst
Responsibilities:
- Participated in data entry, data extraction using SQL queries with MySQL
- Identified the key parameters by clearly defining treatment and control groups and marking target audiences who would be incremental and profitable to business
- Conducted A/B testing for the implementations of new initiatives and conducted documentation in support of the Web design team
- Created various types of charts to visualize data analysis results
- Successfully generated decision-driving reports
Environment: Python 2.7, MySQL, R, MS Office (Excel/PowerPoint/Word), Pandas
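The A/B testing mentioned above typically comes down to a two-proportion z-test on conversion counts. A minimal hand-rolled sketch with made-up numbers (not the actual experiment's data):

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z-statistic for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical: control converts 200/5000, treatment 260/5000
z = two_proportion_z(200, 5000, 260, 5000)
significant = abs(z) > 1.96  # two-sided test at the 5% level
```

The same statistic is what off-the-shelf A/B testing tools report; computing it by hand makes the pooled-variance assumption explicit.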