
Data Scientist/ Analyst Resume

Kent, OH


  • Around 4 years of IT experience in data science, data analysis, visualization, and machine learning.
  • Strong experience in Business and Data Analysis, Data Profiling, Data Migration, Data Integration and Metadata Management Services.
  • Knowledge of the CRISP-DM methodology for prediction
  • Experienced with machine learning algorithms such as logistic regression, random forest, KNN, SVM, neural networks, linear regression, lasso regression, and k-means
  • Implemented Bagging and Boosting to enhance the model performance.
  • Experience working with Python 3.5/2.7 (NumPy, pandas, Matplotlib, NLTK, and scikit-learn)
  • Experience implementing data analysis with various analytic tools, such as Anaconda 4.0, Jupyter Notebook 4.x, and Alteryx
  • Comprehensive knowledge and experience in normalization/de-normalization, data extraction, data cleansing and data manipulation
  • Solid ability to write and optimize diverse SQL queries; working knowledge of RDBMSs such as SQL Server 2008, Oracle, Redshift, and Netezza
  • Expert in Informatica PowerCenter 9.x, 8.x (Designer, Workflow Manager, Workflow Monitor), PowerConnect, and PowerExchange.
  • Experienced in the Analysis, Design, Development, Testing, and Implementation of Data Warehouse solutions for Financial and Retail Sectors.
  • Excellent in analyzing and documenting the business requirements in functional and technical terminology.
  • Experience working with Agile and Waterfall methodologies.
  • Excellent interpersonal and communication skills.
  • Worked on Data Cleaning and Statistical techniques like Regression Estimates, Time Series Analysis and Cohort Analysis.
  • Extensive experience in project management best practices, processes, & methodologies including Rational Unified Process (RUP) and SDLC
  • Ability to understand current business processes and implement more efficient ones.
  • Strong knowledge of open source search technologies: Elasticsearch, Solr, and Lucene
  • Excellent analytic, logical, programming and problem solving skills
  • Experience in designing and developing SOAP- and REST-based web services.
  • In-depth understanding of Hadoop architecture and its components, such as HDFS, MapReduce, and YARN.
  • Experience in implementation of machine learning programs in Python.


Machine Learning: Prediction, Classification, Clustering and Time series algorithms

Programming Languages: Python, SQL, Visual Basic, C#, C++

Tools: Microsoft Office Pro (Word, PowerPoint, Excel, Access), QuickBooks, Crystal Reports, SQL Server

Platforms: Windows 95/98/NT/2000/XP/2003/2007, Linux

Database: MySQL, Oracle, MongoDB

Reporting: Tableau, QlikView, D3.js, and Excel


Confidential, Kent, OH

Data Scientist/ Analyst


  • Worked on fixed income trading, structured fixed income portfolios, econometrics, and financial time series analysis using advanced analysis methods such as PCA, autocorrelation, GARCH, and Kalman filtering, with critical use of software such as MATLAB and Python.
  • Managed and coded application development projects using C++ and Python for clinical trials, market research, and capital markets trading risk management systems.
  • Managed a global large-scale data analysis project on a multicore system using MapReduce ("divide and conquer") on Hadoop to produce deliverables in brief timeframes.
  • Served on speaker panels as both a moderator and speaker on topics such as data science, quantitative finance, and information systems. Delivered customized in-depth training on financial concepts and risk management practices.
  • Performed data analysis and data profiling using complex SQL on various source systems, including Oracle and Teradata.
  • Wrote several shell scripts in UNIX Korn shell for file transfers, error logging, data archiving, log file checks, and cleanup.
  • Conducted gap analysis to assess the variance between system capabilities and business requirements.
  • Interacted with teams in AFS, ACBS, and InfoLease to extract the information for the reports.
  • Involved in defining the source to target data mappings, business rules, business and data definitions
  • Performed metrics reporting, data mining, and trend analysis in a helpdesk environment using Access
  • Interacted closely with business users, analysts and developers. Wrote software for quantitative analysis of capital markets in statistical languages: MATLAB and Python.
  • Performed Bayesian time series and econometric analysis of exogenous market variables, modeled in open source software.
  • Collected historical data and third-party data from different data sources
  • Worked on data cleaning and ensured data quality, consistency, and integrity using pandas and NumPy
  • Worked on outlier identification with box plots and K-means clustering using pandas and NumPy
  • Participated in feature engineering, such as generating feature intersections, feature normalization, and label encoding with scikit-learn preprocessing
  • Modeled customers to discover untapped business opportunities.


Jr. Data Scientist/ Analyst


  • Member of the design team; developed an application to load data coming from different sales systems, validate it, and load it into targets.
  • Gathered requirements and performed data modeling for new requirements
  • Created logical and physical data models using Erwin
  • Designed 3NF schemas for the OLTP applications
  • Identified and evaluated various distributed machine learning libraries such as Mahout, MLlib (Apache Spark), and R.
  • Evaluated the performance of various classification and regression algorithms using R to predict future power.
  • Worked closely with business, data governance, SMEs and vendors to define data requirements.
  • Worked with data investigation, discovery and mapping tools to scan every single data record from many sources.
  • Designed the prototype of the Data mart and documented possible outcome from it for end-user.
  • Developed extraction, transformation, and loading (ETL) of data from different source systems using Informatica PowerCenter tools: Mapping Designer, Repository Manager, Workflow Manager, and Workflow Monitor
  • Created complex mappings using transformations such as Source Qualifier, Sequence Generator, Lookup, Joiner, Filter, Update Strategy, Rank, and Aggregator.
  • Implemented Slowly Changing Dimension Type 1 and Type 2 for maintaining Targets.
  • Created workflows and worklets, taking into consideration the interdependencies between sessions and mappings, using tasks such as command, assignment, control, and session tasks.
  • Enhanced the performance of mappings and data access
  • Involved in preparing test plan and testing for ETL development.
  • Proactively engaged with product and development teams to define next-generation product features, specifications, and requirements, and researched existing web technologies to design and implement these requirements
  • Performed data formatting, which involved cleaning up the data.
  • Designed and prepared technical specifications and guidelines.
  • Developed and maintained high-performance, highly available, scalable data processing software frameworks and data models.
  • Validated data integrity by running different APIs in Elasticsearch
  • Adopted best engineering practices and developed high-quality, maintainable code.
