Data Scientist and Data Analyst Resume
Chicago, IL
SUMMARY:
- More than 7 years of experience in customer engagement, data management, and analytics across digital and social media, with proven technical and analytical abilities; strong analytics and data management skills.
- 5 years of experience as a data scientist across various organizations, with strong problem-solving and advanced statistical analysis skills.
- Able to build predictive models such as generalized linear models, decision trees, neural networks, and ensembles.
- Experience performing model selection and evaluation.
- Experience designing and maintaining databases with Python, as well as testing and implementing applications built with Python.
- Understanding of using Tableau to analyze trends and visualize an organization's data, connecting data sources ranging from text and Excel files to databases and big data queries.
- Experience in SQL programming: designing databases, stored procedures, and reports, and creating and updating applications for internal company operations using complex SQL.
- Experience designing efficient algorithms in statistical programming environments.
- Experience integrating and cleansing disparate or incomplete data sets to deliver insight.
- Understanding of traditional business intelligence and data warehousing concepts (ETL, data modeling, and reporting).
- Able to identify meaningful insights from complex datasets and develop machine learning models that improve prediction.
- Maintain awareness of emerging analytics and big data technologies.
- Experience designing efficient algorithms with programming languages and tools for data manipulation and statistical analysis.
- Experience designing efficient data mining and text mining frameworks with related tools.
- Self-motivated, with a hands-on entrepreneurial spirit and exceptional drive, and able to work in a fast-paced, multi-disciplinary environment.
TECHNICAL SKILLS:
Microsoft Office (Outlook, Excel, Word, PowerPoint) and other computer applications; Tableau/QlikView; market research; data understanding; HMM, LSTM, mixture modeling, stochastic modeling; Spark, neural networks, AI; Teradata, MySQL, SQL Server, Oracle, SQLite, PostgreSQL; R (packages: stats, zoo, Matrix, data.table, openssl); generalized linear models, logistic regression, boxplots, K-means clustering; SVN, PuTTY, WinSCP, Redmine (bug tracking, documentation, Scrum); MapReduce, HDFS, Eclipse, Anaconda; Python 2.7 and 3.3 (packages: NumPy, SciPy, Pandas, scikit-learn, Matplotlib, seaborn); data visualization platforms such as Tableau; Spark 2.x (2.3), Spark SQL, Spark Streaming, Hadoop; Java 1.7/1.8, Maven, Scala
Python versions: 2.7 and 3.3 (packages: NumPy, SciPy, Pandas, scikit-learn, Matplotlib, seaborn). Strong SQL programming knowledge and experience; Hadoop, including Hive, HDFS, and MapReduce
PROFESSIONAL EXPERIENCE:
Confidential, Chicago, IL
Data Scientist and Data Analyst
Responsibilities:
- Predicted customer flow using models such as linear regression, logistic regression, decision trees, random forests, ensembles (bagging, boosting), support vector machines, neural networks, KNN, K-means clustering, XGBoost, graph/network analysis, and time series analysis, using Python with pandas, scikit-learn, NumPy, SciPy, and other Python libraries.
- Trained and tested different data science models, then evaluated, compared, and selected the best models based on classification accuracy for prediction and forecasting. Evaluated models with K-fold cross-validation and confusion matrices (see the cross-validation sketch after this list).
- Used Apache Zeppelin to bring data ingestion, data exploration, visualization, sharing, and collaboration features to Hadoop and Spark. Combined Zeppelin, Spark SQL, and Spark MLlib to simplify exploratory data science.
- Used Tableau to analyze trends and visualize the organization's data, connecting it to data sources ranging from text and Excel files to databases and big data queries, and represented data in different views, applying filters and formatting, creating sets and groups, generating trend lines, and performing forecasting.
- Used Spark SQL to process data in ETL workflows.
- Designed databases, stored procedures, reports, and data input interfaces using SQL.
- Created and updated applications for internal company operations using complex SQL programming. Worked with cloud compute (e.g., AWS, Azure), NoSQL databases, Apache Flume, and Apache Sqoop.
- Performed big data analysis with Hadoop, MapReduce, NoSQL, Pig/Hive, Spark/Shark, MLlib, Scala, NumPy, SciPy, Pandas, and scikit-learn.
- Built and tested ensemble models such as bootstrap aggregating (bagged decision trees and random forests), gradient boosting, XGBoost, and AdaBoost to improve accuracy, reduce variance and bias, and improve model stability.
- Used the Azure Machine Learning Model Management tool to manage and deploy machine learning workflows and models into production.
- Used the Keras deep learning library to develop an LSTM recurrent neural network (RNN) in Python for time series analysis predicting customer counts (see the LSTM sketch after this list).
- Used H2O to import files, build models, and improve models for data analytics and prediction.
- Used H2O Flow, the notebook-style user interface for H2O, to capture, rerun, present, and share workflows. Created POC documents on new technical products and applications.
- Applied advanced statistical methods, including statistical natural language processing with AI solutions, for sentiment analysis, mining unstructured data, and creating insights.
- Used Spark Streaming together with Kafka to power real-time sentiment analysis, crisis management, and service adjustment.
- Worked with AR (autoregressive), MA (moving average), and ARIMA (autoregressive integrated moving average) time series models to predict customer counts (see the ARIMA sketch after this list).
- Used Python for web programming, working with data types such as strings, lists, tuples, and dictionaries, along with other Python functions.
- Copied data from Amazon S3 to HDFS, from HDFS to Amazon S3, and between Amazon S3 buckets
- Coordinated with different functional teams to implement models and monitor outcomes.
- Investigated and identified patterns in big data platforms such as Hadoop to generate business insights.
- Sourced data for product enhancements and research & development activities as part of a team, with an orientation toward generating practical business insights and solutions and delivering value.
- Drove management decisions through descriptive and inquisitive data analytics using tools such as SAS and machine learning.
- Worked with core Java, Spring, Spring MVC, Hibernate, web services, JAXB, SQL, JMS, RESTful services, JSF, JDBC, JAX-WS, JSP, Servlets, EJB, XML, XSLT, and Unix shell scripting.
- Created RESTful microservices using Spring Boot with Spring MVC.
- Used SoapUI to create and test both RESTful and SOAP services.
- Used convolutional neural networks (ConvNets or CNNs) for image recognition, classification, and object detection with the TensorFlow and Keras libraries.
- Used Splunk ES for continuous monitoring, incident response, and running a security operations center, improving security operations and increasing detection and investigation capabilities through advanced analytics.
- Used K-means clustering to group similar data for exploratory data mining.
- Detected and documented anomalies using SVM, K-means clustering, and KNN models (see the anomaly-detection sketch after this list).
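A minimal sketch of the evaluation step referenced above (K-fold cross-validation plus a confusion matrix), assuming scikit-learn; the candidate models, fold count, and synthetic data stand in for the originals.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import cross_val_score, train_test_split

    # Synthetic stand-in for the customer dataset.
    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

    # Compare candidate models by mean 5-fold accuracy.
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=42),
    }
    for name, model in candidates.items():
        scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
        print(name, scores.mean())

    # Inspect the winning model's errors via a confusion matrix on a held-out split.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    best = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)
    print(confusion_matrix(y_test, best.predict(X_test)))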
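A minimal sketch of the Keras LSTM forecaster described above, assuming TensorFlow's bundled Keras; the 7-day window, layer sizes, and synthetic daily counts are illustrative assumptions.

    import numpy as np
    from tensorflow.keras.layers import LSTM, Dense
    from tensorflow.keras.models import Sequential

    # Synthetic stand-in for a daily customer-count series.
    series = 100 + 20 * np.sin(np.arange(400) / 10.0)
    window = 7  # predict the next day from the previous 7 days

    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    X = X.reshape((X.shape[0], window, 1))  # (samples, timesteps, features)

    model = Sequential([
        LSTM(32, input_shape=(window, 1)),
        Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=10, batch_size=32, verbose=0)

    next_day = model.predict(X[-1:])  # forecast for the day after the series ends
    print(next_day[0, 0])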
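A minimal sketch of the ARIMA forecasting referenced above, assuming statsmodels; the (2, 1, 1) order and synthetic counts are illustrative, not tuned values.

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    # Synthetic stand-in for a daily customer-count series.
    counts = 100 + 20 * np.sin(np.arange(200) / 8.0)

    # AR order 2, one level of differencing, MA order 1.
    fitted = ARIMA(counts, order=(2, 1, 1)).fit()
    print(fitted.forecast(steps=7))  # next week's predicted counts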
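A minimal sketch of one of the anomaly-detection approaches named above: cluster with K-means (scikit-learn), then flag points unusually far from their assigned centroid. The cluster count and 99th-percentile cutoff are illustrative assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 2))  # stand-in for customer feature vectors

    km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)

    # Distance of each point to its assigned cluster centroid.
    dists = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
    cutoff = np.percentile(dists, 99)
    anomalies = X[dists > cutoff]
    print(len(anomalies), "candidate anomalies")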
Confidential, Columbus, OH
Sr. Data Scientist/ Machine Learning Engineer
Responsibilities:
- Work with big data analysis tools such as Hadoop, Scala, NumPy, SciPy, Pandas, MapReduce, NoSQL, Pig/Hive, Spark/Shark, MLlib, and scikit-learn.
- Work with machine learning algorithms such as linear and logistic regression, decision trees, random forests, support vector machines, neural networks, KNN, and deep learning.
- Plan, monitor and review various data and metrics.
- Use operational research methods to develop forecasts and predictive analytic sets.
- Set standards criteria and verify that the standards are being met.
- Use statistical analysis to generate predictive analytic sets and mitigation plans for requested datasets
- Document observations and establish predictability and trend reports to develop daily operational plans for allocation of resources.
- Research and compile data to recommend sustainable and executable plans.
- Maintain daily documentation of special and unusual occurrences and other data pertinent to the daily operation.
- Provide input on graphical representation of data to assist business in answering questions or making decisions.
- Develop custom data models and algorithms to apply to data sets
- Use predictive modeling to increase and optimize customer experience, revenue generation and other business outcomes.
- Use Python for web programming, working with data types such as strings, lists, tuples, and dictionaries, along with other Python functions.
- Use Tableau to analyze trends and visualize the organization's data, connecting it to data sources ranging from text and Excel files to databases and big data queries, and represent data in different views, applying filters and formatting, creating sets and groups, generating trend lines, and performing forecasting.
- Design databases, stored procedures, reports, and data input interfaces using SQL.
- Create and update applications for internal company operations using complex SQL programming
- Coordinate with departments to implement models and monitor outcomes.
- Utilize discovery, real-world evidence, published data, and publicly available information and tools to provide economic and integrated models to answer key questions in all phases.
- Advance the use of integrated quantitative approaches, focusing on economic and predictive modeling and simulation in partnership with other internal groups and external organizations.
- Maintain awareness of external data, models and IT environment to collaborate across relevant internal stakeholders to enhance Merck capabilities in modeling, simulation, and data visualization.
- Work closely with external collaborators and partners to assure that new methodologies, models and advanced analytics are fully utilized for our products.
- Share work product and relevant best practices across the Merck enterprise.
Confidential, Raleigh, NC
Sr. Data Scientist/Big Data Analyst
Responsibilities:
- Design, develop, and program methods, processes, and systems to consolidate and analyze unstructured, diverse big data sources to generate actionable insights and solutions for client services and product enhancement.
- Analyze complex business problems and issues using data from internal and external sources to provide insight to decision-makers.
- Identify and interpret trends and patterns in datasets to locate influences.
- Construct forecasts, recommendations and strategic/tactical plans based on business data and market knowledge.
- Create specifications for reports and analysis based on business needs and required or available data elements.
- Provide consultation to users and lead cross-functional teams to address business issues.
- Produce datasets and reports for analysis using system reporting tools.
- Use advanced mathematical and statistical concepts and theories to analyze and collect data and construct solutions to business problems.
- Perform complex statistical analysis on experimental or business data to validate and quantify trends or patterns identified by business analysts.
- Construct predictive models, algorithms, and probability engines to support data analysis or product functions; verify model and algorithm effectiveness based on real-world results.
- Design experiments and methodologies to generate and collect data for business use.
- Work with big data analysis tools such as Hadoop, Scala, NumPy, SciPy, Pandas, MapReduce, NoSQL, Pig/Hive, Spark/Shark, MLlib, and scikit-learn.
- Work with machine learning algorithms such as linear and logistic regression, decision trees, random forests, support vector machines, neural networks, KNN, and deep learning.
Confidential, Chandler, AZ
Data Scientist/ Data analyst
Responsibilities:
- Design market response analytics models and approaches
- Design and write modules for analytics platforms using Python, R, and Spark MLlib (see the pipeline sketch after this list).
- Extract data using SQL
- Create various prototypes for research and development purposes
- Document and present methodology inside and outside the company
- Partner with our Software Engineering department to build best-in-class web-based analytical solutions.
- Develop intricate algorithms based on deep-dive statistical analysis and predictive data modeling, used to deepen relationships, strengthen longevity, and personalize interactions with customers.
- Analyze and process complex data sets using advanced querying, visualization and analytics tools.
- Identify, measure and recommend improvement strategies for KPIs across all business areas.
- Work with big data analysis tools such as Hadoop, Scala, NumPy, SciPy, Pandas, MapReduce, NoSQL, Pig/Hive, Spark/Shark, MLlib, and scikit-learn.
- Work with machine learning algorithms such as linear and logistic regression, decision trees, random forests, support vector machines, neural networks, KNN, and deep learning.
- Responsible for the definition, design, construction, integration, testing, and support of reliable and reusable software solutions, addressing business opportunities. Includes systems analysis, creation of specifications, coding, testing, and implementation of application programs and data interfaces.
- Assure that application designs are consistent with industry best practices for application attributes (including scalability, availability, maintainability, and flexibility).
- Experience in Java, Spark, NoSQL databases such as MongoDB/Cassandra, and cloud platforms (AWS/GCP/Azure).
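A minimal sketch of an analytics module of the kind described above, assuming PySpark's MLlib Pipeline API; the column names ("f1", "f2", "label") and toy rows are hypothetical.

    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("response-model").getOrCreate()

    # Toy stand-in for a market-response training set.
    df = spark.createDataFrame(
        [(0.5, 1.2, 0), (1.3, 0.4, 1), (0.2, 2.1, 0), (2.2, 0.9, 1)],
        ["f1", "f2", "label"],
    )

    # Assemble raw columns into a feature vector, then fit a logistic regression.
    assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
    lr = LogisticRegression(featuresCol="features", labelCol="label")
    model = Pipeline(stages=[assembler, lr]).fit(df)

    model.transform(df).select("label", "prediction").show()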
Confidential
Agricultural Data Analyst and Data Engineer
Responsibilities:
- Identify, measure, and recommend improvement strategies for KPIs across all business areas.
- Conduct the analysis of billions of customer transaction records
- Write software to clean and investigate large, messy data sets of numerical and textual data (see the cleaning sketch after this list).
- Integrate with external data sources and APIs to discover interesting trends
- Build machine learning models from development through testing and validation to production delivery for customers.
- Design rich data visualizations to communicate complex ideas to customers or company leaders
- Develop and deploy machine learning / statistical learning models for predictive analytics.
- Perform full-cycle data management, from collection and cleaning to processing.
- Identify potential data sets (internal and external) which could be used to enhance analytics
- Stay up to date with the latest in statistical and machine learning methods
- Apply sound software and architectural development practices in development and deployment of models as software products.
- Use cloud and distributed computing platforms for model development and deployment
- Communicate results to business stakeholders and decision makers.
- Combine alternative sources of data with fundamental equity research to determine points of inflection for publicly traded companies
- Work directly with fundamental equity research analysts and PMs to incorporate a quantamental, data-informed approach to investment ideas.
- Source and evaluate new alternative, nontraditional datasets for potential investment ideas and signals
- Work closely with fundamental equity analysts and portfolio managers to identify and develop data-oriented solutions that contribute to investment research.
- Present and communicate data-driven research and ideas that help identify changes in business conditions and shape the narrative on an investment decision.
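A minimal sketch of the kind of cleaning code described above, assuming pandas; the file name, column names, and cleaning rules are hypothetical.

    import pandas as pd

    df = pd.read_csv("transactions.csv")  # hypothetical input file

    # Normalize column names, coerce types, and tidy text fields.
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    df["date"] = pd.to_datetime(df["date"], errors="coerce")
    df["crop"] = df["crop"].str.strip().str.lower()

    # Drop rows that failed to parse and exact duplicate records.
    df = df.dropna(subset=["amount", "date"]).drop_duplicates()
    print(df.describe(include="all"))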