
Sr. Data Analyst Resume


Reston, VA

PROFESSIONAL SUMMARY:

  • Over 7 years of experience in Analytics, Visualization, Data Modeling, Machine Learning & Reporting.
  • Independently led analytics, visualization, modeling, and reporting; ran business meetings with clients, managed SLAs, and provided actionable insights to managers and C-level executives.
  • Developed sophisticated statistical models for financial services to optimize operations across geographic locations.
  • Experience in descriptive & inferential statistics, forecasting models, classification, traditional data science techniques of regression & clustering, data mining, sentiment analysis, risk analysis, platform integrations, A/B testing and web-application development & deployment in Python, R, SPSS, MATLAB, SAS E-miner, Alteryx, Tableau and VBA (Excel).
  • Built data models by extracting and stitching data from various sources, and integrated systems with R to enable efficient data analysis.
  • Expertise in automated back-end retrieval, data manipulation, cleansing, quality control and anomaly detection using functional programming in Python & R.
  • Experience in Multiple polynomial regression, Support Vector Regression, Decision Tree Regression, Random Forest Regression using Python & R.
  • Experience in Logistic Regression, K-NN, Support Vector Machines, Kernel SVM, Naïve Bayes, Decision Tree Classification, Random Forest Classification.
  • Experience in K-means Clustering and Hierarchical Clustering using Python & MATLAB (a minimal K-means sketch appears after this list).
  • Experience in image processing & classification using MATLAB (AlexNet).
  • Developed Apriori & Eclat association-rule mining models in Python.
  • Expertise in building Reinforcement Learning models such as Thompson Sampling in R and Upper Confidence Bound in Python.
  • Experience in developing Natural Language Processing models using classical as well as deep learning algorithms in Python.
  • Experience in Artificial Neural Network and Convolutional Neural Network development for data mining & statistical analysis in Python, SAS E-miner & Alteryx.
  • Expertise in Dimensionality Reduction techniques development & deployment such as Principal Component Analysis, Linear Discriminant Analysis, Kernel PCA and Sliced Inverse Regression in MATLAB, Python & R.
  • Data Mining models development using Decision Trees, Random Forest in Python, TreeNET in SPM-8 (Salford Systems) and MARS (Multivariate adaptive regression spline) in Python & Alteryx.
  • Static web-page development & deployment with machine learning & analytics back-end processing using Streamlit (Python) & R-Shiny.
  • Worked on Back-end web-app framework development using Django & Streamlit (Python).
  • Expertise in Excel Macros, Pivot Tables, VLOOKUP, and other advanced functions.
  • Built a phase analytics model using Survival Analysis (Non-parametric: Kaplan-Meier & Semi-parametric: Cox Proportional Hazard Model), Markov Modelling (Hidden Markov Model & Markov Chains) and Monte Carlo Simulation in Python & R.
  • Worked on secondary products (derivatives) pricing models, swaps and interest rate derivatives such as the Ho-Lee Model, Hull-White Model and Black-Derman-Toy Model using Python, MATLAB & VBA (Excel).
  • Expertise in developing mathematical models, probabilistic models and dynamical systems models for manufacturing, healthcare, financial, consumer mapping (retail), marketing (retail & e-commerce) and research industry.
  • Python modules primarily worked on: pandas, numpy, scikit-learn, keras, requests, BeautifulSoup, PyTorch, Matplotlib, Seaborn, SciPy and TensorFlow.
  • R packages primarily worked on: ggplot2, data.table, dplyr, tidyr, Shiny, plotly, knitr, mlr3, XGBoost, Caret, Lubridate & Leaflet.
  • Good knowledge and understanding of the web programming language JavaScript.
  • Experience in verifying the interconnection of databases with the user interface.
  • Expertise in Marketing & Customer Analytics focused on Market basket analysis, Campaign measurement, Private brand strategy, Sales forecasting, Customer segmentation and lifetime value analyses and Marketing mix modeling.
  • Developed complex database objects like Stored Procedures, Functions, Packages and Triggers using SQL and PL/SQL.
  • Proficient in Big Data, Hadoop and NoSQL databases such as MongoDB & HBase.
  • Experienced in writing and optimizing SQL queries in Oracle, SQL Server & DB2.
  • Experience in installing, configuring and maintaining the databases like PostgreSQL, Oracle, Big Data HDFS systems.
  • Used DFAST Modelling and Solutions for expected loss calculations and viewing the results in a dashboard for further insights.
  • Experienced in designing star schema (identification of facts, measures and dimensions), Snowflake schema for Data Warehouse, ODS Architecture by using tools like Power Designer and Microsoft Visio.
  • Experienced in designing Architecture for Modeling a Data Warehouse by using tools like Erwin, Power Designer and E-R Studio.
  • Large data set manipulation, data cleaning, quality control, and data management (Python, R, SQL).
  • Developed robust software for integrating multiple sensors and tracking systems.
  • Aware of SDLC, Waterfall, Test-driven development (TDD) and Agile/Scrum Methodologies.
  • Excellent skills in System Analysis, Documentation and Designing the Technical Specifications for Business Processes.
  • Proficient in creating customized and interactive Tableau dashboards using multiple data sources - Oracle, SQL Server, Netezza, SAS datasets, CSV files & TDEs
  • Excellent understanding of concepts like Star-schema, Snowflake schema using fact and dimension tables and relational databases (Oracle, SQL), Teradata, MS Access and client/server applications.
  • Combined multiple data sources using blending in Tableau to keep track of sensor data, operator inputs, and real-time systems, all in one dashboard.
  • Identified business data requirements and translated them into logical data models.
  • Skilled in accessing and transforming massive datasets through filtering, grouping, aggregation, and statistical calculation.
  • Good experience in dashboard design using large data sets.
  • Experienced in creating different visualizations using Bars, Lines, Pies, Maps, Scatter plots, Gantt charts, Bubbles, Histograms, Bullets, Heat maps and Highlight tables
  • Expert in developing user objects (filters, prompts, metrics, Sets and Groups, Calculated Fields) to develop reports.
  • Created ad-hoc reports based on business needs
  • Worked on creating interactive, real-time dashboards using drill down and customization.
  • Created incremental refreshes for data sources on Tableau server
  • Scheduled reports to create instances and used the instance manager to feed current data, optimizing the performance of the data feeding the reports.
  • Designed the architecture of Tableau security by customizing the access levels and creating various user groups by assigning row and column level security at application level, folder level and user level.
  • Experienced in troubleshooting, performance tuning of reports and resolving issues within Tableau Server and Reports.
  • Established best practices for Enterprise Tableau Environment, Application Intake and Development processes
  • Extensive experience in production support and issue resolution that includes debugging of reports and related issues, dashboard design issues, archiving and performance issues.
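
As a concrete illustration of the clustering work summarized above, the following is a minimal K-means sketch in Python with scikit-learn; the synthetic data and the two customer features are hypothetical assumptions for illustration, not taken from any engagement.

    # Minimal K-means sketch (scikit-learn); the data and feature names
    # are hypothetical, for illustration only.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(42)
    # Hypothetical customer features: annual spend and visit frequency.
    X = rng.normal(loc=[500.0, 12.0], scale=[150.0, 4.0], size=(200, 2))

    X_scaled = StandardScaler().fit_transform(X)  # scale features before clustering
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_scaled)

    print(kmeans.cluster_centers_)  # centroids in scaled feature space
    print(kmeans.labels_[:10])      # cluster assignments for the first 10 customers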

TECHNICAL SKILLS:

Business Intelligence Tools: Tableau Desktop 10.x/9.x/8.x, Tableau Server, Tableau Online, Microsoft Power BI, SAP Business Objects, Crystal Reports and SAS.

Servers: Application Servers (WAS, Tomcat) and Web Servers (IIS, HIS, Apache).

Operating Systems: Windows 10/8/7/VISTA/NT/XP and LINUX/UNIX.

Databases: MS SQL Server, Oracle, HP Vertica, IBM DB2, Teradata, MS Access, SAP HANA, SAP BW, AWS S3, Snowflake, PostgreSQL, Netezza, MongoDB, HBase and Cassandra.

Programming Languages: Python, R, JavaScript, SQL, PL/SQL, MATLAB & Java

Other Tools: Xcelsius 2011/2008, Universe Designer, IDT, MongoDB, AWS, Tableau, MS Office Suite, Scala, NLP, MariaDB, SAS, Spark, Simulink (MATLAB), Orange-3 and Elasticsearch packages.

WORK EXPERIENCE:

Confidential, Reston, VA

Sr. Data Analyst

Responsibilities:

  • Performed exploratory data analysis such as calculation of descriptive statistics, detection of outliers, assumptions testing and factor analysis in Python and R.
  • Utilized Spark, Python and R with a broad variety of machine learning methods including classification, regression and dimensionality reduction, based on domain knowledge and customer business objectives.
  • Innovated and leveraged machine learning, data mining and statistical techniques to create new, scalable solutions for business problems.
  • Applied effective software development processes to customize and extend the computer vision and image processing techniques to solve new problems.
  • Worked with machine learning algorithms like regressions (linear, logistic, etc.), clustering and classification, SVMs and decision trees.
  • Managed containers using Docker by writing Dockerfiles, set up automated builds on Docker Hub, and installed and configured Kubernetes.
  • Built and maintained Docker container clusters managed by Kubernetes, Linux, Bash and Git on GCP; utilized Kubernetes and Docker for the runtime environment of the CI/CD system to build, test and deploy.
  • Automated application and MySQL container deployment in Docker using Python and monitored them using Nagios.
  • Virtualized servers in Docker per test- and dev-environment requirements and configured automation using Docker containers.
  • Performed exploratory data analysis using R and generated various graphs and charts for analyzing the data using Python libraries.
  • Documented logical, physical, relational and dimensional data models. Designed the Data Marts in dimensional data modeling using star and snowflake schemas.
  • Enabled speedy reviews and first mover advantages by using Oozie to automate data loading into the Big Data Hadoop Distributed File System and PIG to pre-process the data.
  • Transformed the logical data model into a physical data model in Erwin, ensuring primary key and foreign key relationships in the PDM, consistency of data attribute definitions and primary index considerations.
  • Performed modeling and exponential smoothing for multivariate time series data (see the forecasting sketch after this list).
  • Developed a machine learning system that predicted purchase probability for a particular offer based on a customer’s real-time location data and past purchase behavior; these predictions are being used for mobile coupon pushes.
  • Used Pandas, NumPy, Seaborn, SciPy, Matplotlib, Scikit-learn and NLTK in Python for developing various machine learning algorithms, and utilized algorithms such as linear regression, multivariate regression, naive Bayes, Random Forests, K-means & KNN for data analysis.
  • Checked back-end database connectivity using JavaScript and JDBC connections to the databases.
  • Performed Source System Analysis, database design, data modeling for the warehouse layer using MLDM concepts and package layer using Dimensional modeling.
  • Handled importing data from various data sources, performed transformations using Hive, Map Reduce, and loaded data into HDFS.
  • Created Hive queries that helped analysts spot emerging trends by comparing fresh data with EDW tables and historical metrics, and processed the data using HQL (SQL-like) on top of MapReduce.
  • Created tables, sequences, synonyms, joins, functions and operators in Netezza database.
  • Created and implemented MDM data model for Consumer/Provider for HealthCare MDM product from Variant.
  • Analyzed the web log data using the HiveQL to extract number of unique visitors per day, page views, visit duration, most purchased product on website and managed and reviewed Hadoop log files.
  • Performed Data Analysis and Data Profiling and worked on data transformations and data quality rules.
  • Working knowledge of software development methodologies such as waterfall, agile, lean, XP, etc.
  • Deployed the project on AWS and connected RDS (SQL) to AWS using Python.
  • Worked with Pandas/NumPy to analyze the data and draw inferences from Seaborn visualizations.
  • Used PostgreSQL for the DB and various Python libraries for data analysis and predictive modeling.
  • Formulated procedures for integrating Tableau with data sources and delivery systems.
  • Interacted extensively with end users on requirements gathering, analysis and documentation.
  • Used Python to create regression models, clustering algorithm and other data mining methods to explore the fast growth opportunities of the clients.
  • Classified customers by RFM analysis, clustering and regression models with R, selected customers with high value and improved their retention rate by sending ads and coupons.
  • Worked extensively with Advanced Analysis Actions, LOD (FIXED, INCLUDE, EXCLUDE) calculations, parameters, background images and maps.
  • Involved in troubleshooting, performance tuning of reports and resolving issues within Tableau Desktop/Server reports.
  • Worked with AWS Redshift directly from Tableau to access the cloud data warehouse and quickly perform visual analysis.
  • Created SSIS Packages using Pivot Transformation, Execute SQL Task, Data Flow Task, etc., to import data into the data warehouse.
  • Performed administrative tasks, including creation of database objects such as database, tables, and views, using SQL DCL, DDL, and DML requests.
  • Coded new tables, views and modifications as well as PL/pgSQL stored procedures, data types, triggers and constraints in PostgreSQL databases.
  • Built and published customized interactive reports and dashboards, report scheduling using Tableau server.
  • Used SQL Loader to load data from the Legacy systems into Oracle databases using control files extensively.
  • Used Oracle External Tables feature to read the data from flat files into Oracle staging tables.
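
As referenced in the time-series bullet above, the following is a minimal forecasting sketch using exponential smoothing from statsmodels; the monthly series is synthetic and the model settings (additive trend and seasonality) are illustrative assumptions.

    # Minimal exponential-smoothing forecast sketch (statsmodels);
    # the series and model settings are illustrative assumptions.
    import numpy as np
    import pandas as pd
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    # Hypothetical monthly series with trend and yearly seasonality.
    idx = pd.date_range("2018-01-01", periods=48, freq="MS")
    y = pd.Series(100 + 2 * np.arange(48)
                  + 10 * np.sin(np.arange(48) * 2 * np.pi / 12), index=idx)

    model = ExponentialSmoothing(y, trend="add", seasonal="add",
                                 seasonal_periods=12).fit()
    print(model.forecast(6))  # forecast the next 6 months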

Environment: Teradata, PostgreSQL, Big Data Hadoop, Python, MapReduce, Time series analysis, ARIMA models, MDM, SQL Server, DB2, Tableau, SAS/Graph, SAS/SQL

Confidential, Santa Monica, CA

Sr. Data Analyst

Responsibilities:

  • Utilized domain knowledge and application portfolio knowledge to play a key role in defining the future state of large business technology programs.
  • Provided architectural leadership in shaping strategic business technology projects, with an emphasis on application architecture.
  • Participated in all phases of data mining, data collection, data cleaning, developing models, validation, and visualization and performed Gap analysis.
  • Developed MapReduce/Spark Python modules for machine learning & predictive analytics in Hadoop on AWS. Implemented a Python-based distributed random forest via Python streaming.
  • Used Pandas, NumPy, seaborn, SciPy, Matplotlib, Scikit-learn, NLTK in Python for developing various machine learning algorithms and utilized machine learning algorithms such as linear regression, multivariate regression, naive Bayes, Random Forests, K-means, & KNN for data analysis.
  • Installed and configured PostgreSQL databases and optimized postgresql.conf for performance improvement.
  • Forecasted based on exponential smoothing, ARIMA modeling, statistical algorithms, statistical analysis and transfer function models.
  • Conducted studies and rapid plots, using advanced data mining and statistical modeling techniques to build solutions that optimize the quality and performance of data.
  • Demonstrated experience in the design and implementation of statistical models, predictive and descriptive models, enterprise data models, metadata solutions and data life cycle management in both RDBMS and Big Data environments.
  • Created ecosystem models (e.g. conceptual, logical, physical, canonical) that are required for supporting services within the enterprise data architecture (a conceptual data model defining the major subject areas used, an ecosystem logical model defining standard business meaning for entities and fields, and an ecosystem canonical model defining the standard messages and formats to be used in data integration services throughout the ecosystem).
  • Worked on database design, relational integrity constraints, OLAP, OLTP, Cubes and Normalization (3NF) and De-normalization of database.
  • Utilized Spark, Scala, Hadoop, HBase, Kafka, Spark Streaming, MLlib and Python with a broad variety of machine learning methods including classification, regression, dimensionality reduction, etc.
  • Experience in implementing Python alongside various libraries such as Matplotlib for charts and graphs, MySQLdb for database connectivity, PySide, Pandas DataFrames and NumPy.
  • Wrangled data, worked on large datasets (acquired and cleaned the data) and analyzed trends through visualizations built with Matplotlib in Python.
  • Collaborated with data engineers, and wrote and optimized SQL queries to perform data extraction from SQL tables.
  • Performed data integrity checks, data cleaning, exploratory analysis and feature engineering using R and Python.
  • Gathered user requirements, analyzed and designed software solutions based on the business requirements.
  • Experienced in creating Prototypes, Wireframes and Mock-Up Screens to visualize Graphical User Interface (GUI). Gained insights and verified the validity of multiple datasets by mining data and executing structured queries in pandas.
  • Implemented sentiment analysis using TextBlob and VADER to determine trends in USDA dietary guidelines (see the sentiment sketch after this list).
  • Built web scrapers using RESTful APIs to collect raw data.
  • Created visualizations in Tableau to view temporal trend of agriculturally relevant occurrence data.
  • Worked on customer segmentation using an unsupervised learning technique - clustering.
  • Designed and implemented system architecture for Amazon EC2 based cloud-hosted solution for client.
  • Tested Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.
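
As referenced in the sentiment-analysis bullet above, the following is a minimal text-scoring sketch with TextBlob and VADER; the sample sentences are made up for illustration and are not actual USDA text.

    # Minimal sentiment-scoring sketch (TextBlob + VADER); sample
    # sentences are hypothetical, for illustration only.
    from textblob import TextBlob
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    texts = [
        "The new dietary guidelines are clear and helpful.",
        "The recommendations are confusing and poorly organized.",
    ]

    analyzer = SentimentIntensityAnalyzer()
    for text in texts:
        polarity = TextBlob(text).sentiment.polarity           # -1 (negative) to +1 (positive)
        compound = analyzer.polarity_scores(text)["compound"]  # VADER compound score
        print(f"{polarity:+.2f}  {compound:+.2f}  {text}")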

Environment: Big Data Hadoop, Python, Teradata, PostgreSQL, Tableau, Netezza, SAS/Graph, SAS/SQL, SAS/Access, Time-series analysis

Confidential

Data Science Consultant

Responsibilities:

  • Performed data profiling to learn about turnover behavior across various features before the hiring decision, when no on-the-job behavioral data is available.
  • Performed preliminary data analysis using descriptive statistics and handled anomalies such as removing duplicates and imputing missing values.
  • Applied various machine learning algorithms and statistical modeling techniques like decision trees, text analytics, natural language processing (NLP), supervised and unsupervised learning, regression models, social network analysis, neural networks, deep learning, SVM and clustering to identify volume, using the Scikit-learn package in Python.
  • Involved in creating Hive tables, loading the data and writing Hive queries that run internally via MapReduce.
  • Designed and managed API system deployment using fast HTTP server and Amazon AWS architecture
  • Set up databases in AWS using RDS and configured backups to an S3 bucket.
  • Performed data cleaning and feature selection using Pandas & Numpy
  • Conducted a hybrid of hierarchical and K-means cluster analysis and identified meaningful segments through a discovery approach.
  • Built an Artificial Neural Network using TensorFlow in Python to identify a customer's probability of canceling the connection (churn rate prediction); see the ANN sketch after this list.
  • Understood the business problems and analyzed the data using appropriate statistical models to generate insights.
  • Knowledge of AWS Lambda, Auto Scaling, CloudFront and RDS.
  • Gained knowledge of deploying apps using AWS CloudFormation.
  • Developed NLP algorithms coupled with deep learning for topic extraction and sentiment analysis.
  • Identified and assessed available machine learning and statistical analysis libraries (including regressors, classifiers, statistical tests and clustering algorithms).
  • Worked with the NLTK library for NLP data processing and pattern finding.
  • Categorized comments into positive and negative clusters from different social networking sites using sentiment analysis and text analytics.
  • Ensured that the model has a low false positive rate for text classification and sentiment analysis on unstructured and semi-structured data.
  • Addressed overfitting by implementing regularization methods like L2 and L1.
  • Used Principal Component Analysis in feature engineering to analyze high-dimensional data.
  • Created and designed reports that use gathered metrics to infer and draw logical conclusions from past and future behavior.
  • Performed data cleaning, feature scaling and feature engineering using the pandas and NumPy packages in Python.
  • Communicated the results to the operations team to support the best decisions.
  • Collected data needs and requirements by interacting with the other departments.
  • Built an in-depth understanding of the problem domain and available data assets.
  • Researched, designed, implemented and evaluated machine learning approaches and models.
  • Communicated findings and obstacles to stakeholders to help drive delivery to market.
  • Developed and automated the data manipulation process for the above using stored procedures/views in SQL Server.
  • Developed the code as per the client's requirements using SQL, PL/SQL and data warehousing concepts.
  • Automated the scraping and cleaning of data from various data sources in R.
  • Developed the bank's loss forecasting process using relevant forecasting and regression algorithms in R.
  • Wrangled data, worked on large datasets (acquired and cleaned the data) and analyzed trends through visualizations built with Matplotlib in Python.
  • Collaborated with data engineers, and wrote and optimized SQL queries to perform data extraction from SQL tables.
  • Performed data integrity checks, data cleaning, exploratory analysis and feature engineering using R and Python.
  • Used R, Python and Tableau to develop a variety of models and algorithms for analytic purposes.
  • Analyzed large sales data of Cisco devices and used R for predicting the future sales using regression models.
  • Performed Data Analysis and Data Profiling and worked on data transformations and data quality rules.
  • Created SSIS Packages using Pivot Transformation, Execute SQL Task, Data Flow Task, etc. to import data into the data warehouse.
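
As referenced in the churn-prediction bullet above, the following is a minimal sketch of a small feed-forward ANN in TensorFlow/Keras for estimating churn probability; the synthetic features, stand-in labels and layer sizes are illustrative assumptions.

    # Minimal churn-prediction ANN sketch (TensorFlow/Keras); the data
    # and architecture are illustrative assumptions.
    import numpy as np
    import tensorflow as tf

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10)).astype("float32")  # hypothetical customer features
    y = (X[:, 0] + X[:, 1] > 0).astype("float32")      # stand-in churn label

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(10,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # churn probability
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(X, y, epochs=5, batch_size=32, verbose=0)

    print(model.predict(X[:5], verbose=0))  # predicted churn probabilities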

Environment: Oracle, SAS, SQL, Tableau, TOAD for data analysis, MS Excel, Netezza

Confidential

Data Engineer

Responsibilities:

  • Used star schema methodologies extensively in building and designing logical data models as dimensional models (see the fact-table sketch after this list).
  • Developed star- and snowflake-schema based dimensional models to develop the data warehouse.
  • Designed Context Flow Diagrams, Structure Chart and ER- diagrams.
  • Worked on database features and objects such as partitioning, change data capture, indexes, views and indexed views to develop an optimal physical data model.
  • Tested Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.
  • Worked with SQL Server Integration Services to extract data from several source systems, transform the data and load it into the ODS.
  • Worked with SMEs and other stakeholders to determine the requirements to identify entities and attributes to build conceptual, logical and physical data models.
  • Worked with SQL, SQL*Plus, Oracle PL/SQL stored procedures, triggers and SQL queries, and loaded data into the Data Warehouse/Data Marts.
  • Worked with the DBA group to create a best-fit physical data model from the logical data model through forward engineering in Erwin.
  • Generated ad-hoc SQL queries using joins, database connections and transformation rules to fetch data from legacy SQL Server database systems.
  • Reviewed business requirements and analyzed data sources from Excel/Oracle SQL Server for design, development, testing and production rollover of reporting and analysis projects.
  • Performed data analysis and data profiling using complex SQL.
  • Involved in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatch.
  • Designed data model, analyzed data for online transactional processing (OLTP) and Online Analytical Processing (OLAP) systems.
  • Wrote and executed customized SQL code for ad-hoc reporting duties and used other tools for routine tasks.
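
As referenced in the star-schema bullet above, the following is a minimal sketch of deriving a dimension table with surrogate keys and a fact table from a source extract using pandas; all table and column names are hypothetical.

    # Minimal star-schema load sketch (pandas); table and column names
    # are hypothetical, for illustration only.
    import pandas as pd

    # Hypothetical source extract.
    sales = pd.DataFrame({
        "order_date": ["2020-01-05", "2020-01-06"],
        "product": ["Widget", "Gadget"],
        "amount": [19.99, 34.50],
    })

    # Product dimension with generated surrogate keys.
    dim_product = sales[["product"]].drop_duplicates().reset_index(drop=True)
    dim_product["product_key"] = dim_product.index + 1

    # The fact table references the dimension by surrogate key, not by name.
    fact_sales = sales.merge(dim_product, on="product")[
        ["order_date", "product_key", "amount"]]
    print(fact_sales)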

Environment: Erwin, SQL Server 2005, PL/SQL, SQL, T-SQL, ETL, OLAP, OLTP, SAS, Oracle 9i, and Clear Quest.

Confidential

Data Engineer

Responsibilities:

  • Wrote and maintained technical and functional specifications to document database intentions and requirements.
  • Implemented BI solution framework for end-to-end business intelligence projects.
  • Involved in Business requirements gathering and developed prototypes from business requirement documentation for interactive dashboards.
  • Mentored business users to effectively make use of Tableau Desktop for creating reports and dashboards.
  • Developed various KPIs, dashboards and network graphs useful for the business to make decisions.
  • Analyzed data modeling in Cognos Framework Manager and created stored procedures in SQL, avoiding calculations on the front end.
  • Replaced Cognos PowerPlay and Report Studio with interactive KPI dashboards in Tableau.
  • Performed system integration testing supported by data validation, and integrated the reports and dashboards with the portal and back-end data warehouse.
  • Worked on Agile methodologies.
  • Developed Test cases and automation scripts for end-to-end testing and data validation for various reports and dashboards.
  • Expertise in writing complex SQL queries and stored procedures, and in performance tuning by optimizing the queries.
  • Integrated Tableau Desktop with R and Python to use their scripting functions in calculated fields.
  • Published Tableau data sources to the server for the business to use for ad-hoc purposes and to export data not available in reports.
  • Prepared dashboards using calculations and parameters in Tableau and created calculated fields, groups, sets, hierarchies, etc.
  • Worked with customers to understand data needs and provided services.
  • Tested troubleshooting methods, devised innovative solutions, and documented resolutions for inclusion in the knowledge base for support team use.
  • Used R to generate regression models to provide statistical forecasting.
  • Worked on multiple datasets containing 2 billion values of structured and unstructured data about web application usage and online customer surveys.
  • Designed data profiles for processing, including running SQL queries and using Python and R for data acquisition and data integrity, covering dataset comparison and dataset schema checks (see the schema-check sketch after this list).
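
As referenced in the schema-check bullet above, the following is a minimal dataset schema-check sketch in pandas; the expected schema and the stand-in data are hypothetical assumptions.

    # Minimal schema-check sketch (pandas); the expected schema and data
    # are hypothetical, for illustration only.
    import pandas as pd

    expected_schema = {"user_id": "int64", "event_time": "object", "duration": "float64"}

    # Stand-in for a real extract, e.g. pd.read_csv(...).
    df = pd.DataFrame({"user_id": [1, 2],
                       "event_time": ["2021-01-01", "2021-01-02"],
                       "duration": [3.5, 7.2]})

    missing = set(expected_schema) - set(df.columns)
    mismatched = {c: str(df[c].dtype) for c in expected_schema
                  if c in df.columns and str(df[c].dtype) != expected_schema[c]}

    print("missing columns:", missing or "none")
    print("dtype mismatches:", mismatched or "none")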

Environment: Workday, Oracle Sales Cloud, SQL, R/RStudio, Python, MS Excel, Tableau.

EDUCATION:

Master of Science in Financial Analytics, Bentley University, Waltham, MA, USA

The University of Glasgow
