Data Scientist Resume
Houston, Texas
PROFESSIONAL SUMMARY:
- 6 years of experience working with large structured datasets in Data Visualization, Data Acquisition, Predictive Modeling, Data Validation, Machine Learning, and Data Mining.
- Results-oriented quick learner with excellent communication and presentation skills.
- Proficient in Predictive Modeling, Data Mining Methods, Factor Analysis, ANOVA, hypothesis testing, normal distribution and other advanced statistical and econometric techniques.
- Developed predictive models using Decision Tree, Random Forest, Logistic Regression, Cluster Analysis, and Neural Networks.
- Experienced across the full software development life cycle (SDLC), including Agile and Scrum methodologies.
- Skilled in Advanced Regression Modeling, Correlation, Multivariate Analysis, Model Building, Business Intelligence tools and application of Statistical Concepts.
- Experienced in Machine Learning and Statistical Analysis with Python's scikit-learn.
- Strong SQL programming skills, with experience in working with functions, packages and triggers.
- Experienced in developing applications with Visual Basic for Applications (VBA) and VB.
- Familiar with Decision Forests, natural language processing (NLP), and related techniques.
- Experienced in using Python to manipulate data for loading and extraction, and worked with Python libraries such as Matplotlib, NumPy, SciPy, and pandas for data analysis.
- Worked with tools such as R, Stata, Scala, SAS, MATLAB, and SPSS to develop neural networks and cluster analyses.
- Worked with RDBMS including MySQL, DB2 and Oracle SQL.
- Worked with NoSQL databases including HBase, Cassandra, and MongoDB.
- Experienced in Data Integration, Validation, and Data Quality controls for ETL processes and Data Warehousing using MS Visual Studio, SSIS, SSAS, and SSRS.
- Proficient in Tableau and R-Shiny data visualization tools to analyze and obtain insights into large datasets and to create visually powerful, actionable interactive reports and dashboards.
- Experienced in Big Data with Hadoop, HDFS, MapReduce, and Spark.
- Experienced in Spark 2.1, Spark SQL and PySpark.
- Automated recurring reports using SQL and Python and visualized them on BI platforms such as Tableau.
- Worked in development environments using Git and virtual machines (VMs).
- Expertise in defining project scope after gathering business requirements, including constraints.
- Excellent communication skills. Works successfully in fast-paced, multitasking environments, both independently and in collaborative teams. Self-motivated, enthusiastic learner.
PROFESSIONAL EXPERIENCE:
Confidential, Houston, Texas
Data Scientist
Responsibilities:
- Performed statistical analysis of data using SAS and SPSS. Applied descriptive and inferential methodologies to identify disease trends & warning signals and to undertake impact assessment
- Monitored, collated, and synthesized information to produce reports and program-related documents
- Worked to implement Meta Data & Data Standards (MDDS) Coding as published by NIC to better map districts & health facilities
- Performed other administrative tasks like drafting & designing informational materials
- Discovered and tracked customer behavior data to identify trends and business impact.
- Profiled raw data sets across platforms and developed KPIs and dashboards to measure product performance.
- Performed quantitative analysis of product sales trends to recommend pricing decisions.
- Leveraged data using the BI tool Tableau for data visualization and storytelling.
- Monitored and enhanced performance of existing Sales and Marketing models, including customer lifetime value models and retention models, using statistical techniques.
- Worked on noise reduction methods, including exponential smoothing and the Fast Fourier Transform, and compared them with regression methods.
- Created T-SQL stored procedures to improve the performance of aggregate queries dynamically.
- Assisted in logical and physical Modeling for OLAP and OLTP systems.
- Analyzed data for scientists, research scholars, and medical doctors.
- Performed regular research and gathered new statistical evidence at every opportunity.
- Published manuscripts in scientific journals and presented papers at various conferences.
- Used SPSS and Minitab software to randomize, analyze, and interpret data.
- Summarized findings, created reports, & maintained database of statistical information.
Environment: Machine learning, R Studio, Erwin R 9.0, Metadata, Hive, Rational Rose, MS Excel, Mainframes, MS Visio, ODS, OLTP, Oracle 10g, Informatica 9.0, OLAP, DB2.
Confidential, Minneapolis, Minnesota
Data Scientist
Responsibilities:
- Set up storage and data analysis tools on Amazon Web Services cloud computing infrastructure.
- Built an algorithm to identify customers likely to purchase insurance.
- Fine-tuned the Machine Learning algorithm in Python, using the NLTK and scikit-learn packages, to meet acceptable standards.
- Gathered, analyzed and translated business requirements into relevant analytic approaches.
- Designed Logistic Regression, Support Vector Machine, and Random Forest models and evaluated them using precision, recall, and F-score.
- Analyzed the dataset, performed feature selection, and created new features for designing the predictive model.
- Hands-on experience with Data Cleaning processes, such as handling missing values (replacing with the mean, forward/backward fill, or removing entire rows, columns, or values), removing outliers, and normalizing and scaling data.
- Developed and designed NoSQL procedures for data export/import and for converting data.
- Prepared comprehensive documented observations, analyses and interpretations of results including technical reports, summaries, protocols and quantitative analyses.
- Handled importing data from various data sources, performed transformations using Hive, MapReduce, and Pig, and loaded data into HDFS in Hadoop.
- Worked with stakeholders to troubleshoot issues and communicated findings to team members, leadership, and stakeholders to ensure a clear understanding of models and optimization.
- Analyzed, transformed, and contextualized a variety of ingested data - social data, GIS data, POI and AOI data, and some consumer behavior data for building direct marketing predictive models.
- Contributed to implementing NLP to identify, extract, summarize, and categorize relevant qualitative financial information (such as sentiment, feedback, and news) according to specific structures (templates) from a source text (digital news) to support decision-making.
- Classified text documents using Naive Bayes algorithm for Sentiment analysis and gathering insights from a large volume of unstructured/text data.
- Applied customer segmentation with clustering algorithms and developed geodemographic customer segmentation models.
Environment: Machine Learning, NLP, R, SAS, Hadoop, Spark, Python, HDFS, Pig, Hive, Microsoft Word and Microsoft Excel, HBase, MapReduce, Tableau, NoSQL and AWS.
Confidential, Bartlesville, OK
Data Modeler
Responsibilities:
- Responsible for preparing data and exploratory analysis for machine learning to develop models
- Created standard data summaries, extracted data subsets, split data, and created data partitions
- Created various types of data visualizations using R and Tableau
- Gathered requirements for various data mining projects
- Loaded data from Hive and imported it into R for data analysis and visualization
- Performed exploratory analysis and model building to develop predictive insights
- Ground-up experience in data understanding, hypothesis formulation, data preparation, and model building.
- Responsible for development of configuration, mapping and Java beans for the persistence layer (Object-Relational Mapping) of Hibernate.
- Accountable for the business requirements gathering process and for converting requirements into functional and technical requirements (HLDs, LLDs)
- Performed statistical data analysis, modeling/machine learning, data visualization, and reporting of big data related to digital advertising.
- Created a revenue optimization algorithm to divert click traffic to different advertisers throughout the day to maximize revenue.
Environment: HTML, CSS, JavaScript, Java, Agile methodology, SQL, Pig, Tableau, Hive, HBase, MapReduce, R, Microsoft Visio, MS Excel, Microsoft Project, and MS Word.
Confidential
Java Developer
Responsibilities:
- Designed asynchronous messaging using Java Message Service (JMS) to exchange critical business data and events among J2EE components and the legacy system.
- Involved in injecting dependencies into code using the Inversion of Control (IoC) features of the Spring Framework
- Exposed and consumed REST web services to retrieve data on different contracts from different clients, and exposed warehouse inventory details for consumer tracking
- Involved in integrating the business layer with DAO layer using Hibernate ORM.
- Responsible for development of configuration, mapping and Java beans for the persistence layer (Object-Relational Mapping) of Hibernate.
- Involved in build management and build resolution activities of e-commerce project.
- Involved in configuring and deploying the application using WebSphere.
- Managed transactions using Hibernate configurations
- Wrote and modified database stored procedures, triggers, functions, and PL/SQL scripts.
- Used CVS as the version control system to check in and check out the data.
- Involved in writing shell scripts for deploying the application on UNIX.
Environment: HTML, CSS3, JavaScript, jQuery, Java, AJAX, J2EE, Spring, JDBC, JSP, Web Services, REST, Oracle, JUnit 4, SVN.
Confidential
Java Developer
Responsibilities:
- Knowledge of the Struts Tiles framework for layout management.
- Worked on the design, analysis, development, and testing phases of the application.
- Used JDBC for the Database connectivity.
- Developed user interface using JSP and HTML.
- Configured Spring-managed beans.
- Consistently met deadlines as well as requirements for all production work orders.
- Involved in projects using Java and Java EE web applications to create fully integrated client management systems.
- Developed and integrated the application using the Eclipse IDE.
- Executed SQL statements to search for contractors based on criteria.
- Involved in building, testing and debugging of JSP pages in the system.
- Developed JUnit tests for server-side code.
- Involved in the development of front end screens using technologies like JSP, HTML, AJAX and JavaScript.
- Involved in multi-tiered J2EE design utilizing Spring (IoC) architecture and Hibernate.
Environment: HTML, CSS, XML Schema, SOAP, JavaScript, PL/SQL, Java, J2EE, JSP, Hibernate, Struts, JUnit, AJAX, JDBC.
