
Data Scientist/Data Analyst Resume


Minneapolis, MN

SUMMARY

  • 6 years of experience as a Data Scientist and Data Analyst, with domain experience in Banking, Finance, Insurance and Retail.
  • Experience in Statistical Analysis, Data Mining and Machine Learning using R, Python and SQL
  • Professional experience in Machine Learning algorithms such as Linear Regression, Logistic Regression, Random Forests, Decision Trees, K-Means Clustering and Association Rules
  • Expertise in manipulating large data sets from multiple sources (SQL, Hadoop)
  • Experience in analyzing data using HiveQL, Spark and Spark SQL.
  • Hands on experience in importing, cleaning, transforming, and validating data and making conclusions from the data for decision-making purposes
  • Categorized content in Python with Scikit-learn, using classification algorithms such as Support Vector Machines (SVM), K-Nearest Neighbours (KNN) and Logistic Regression (a minimal sketch follows this summary).
  • In-depth knowledge of Machine Learning and NLP algorithms and techniques
  • Working knowledge of Spark and Spark SQL
  • Good understanding and knowledge of NoSQL databases like HBase and Cassandra.
  • Extensive hands-on experience and high proficiency with structured, semi-structured and unstructured data, using a broad range of data science programming languages and big data tools.
  • Actively used Agile methodology to develop and deliver projects within sprint timelines.
  • Strong experience in Data Migration and conversion projects.
  • Strong experience in RDBMS like DB2
  • Hands on Experience in building applications using build tools (Maven and Gradle).
  • Experience using HDFS, Ambari, Spark and Hortonworks Data Platform (HDP) to migrate a code base from R to Python using Spark ML
  • Expertise in working with Agile/Scrum methodology using Rally and the ScrumWorks Pro tool.
  • Well versed in system analysis, ER/Dimensional Modeling, Database design and implementing RDBMS specific features.
  • Experience with different RDBMS like Oracle 9i/10g/11g, SQL Server 2005/2008, MySQL
  • Extensive involvement in developing T-SQL and Oracle PL/SQL scripts, stored procedures and triggers for business logic execution
  • Experience in designing visualizations using Tableau, and publishing and presenting dashboards and storylines on web and desktop platforms
  • Developed dashboards adhering to visualization best practices to communicate a story
  • Competent communicator and confident presenter in reporting analytical findings to members of senior management
  • Responsible for the design and development of Java/J2EE and Big Data applications, and involved in application testing.
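
A minimal sketch of the kind of content categorization described in the summary, assuming TF-IDF features feeding the SVM, KNN and Logistic Regression classifiers; the documents, labels and category names below are placeholders, not project data:

from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

# Placeholder documents and categories (hypothetical, for illustration only)
texts = ["credit card dispute raised by customer",
         "mortgage refinance rate question",
         "savings account overdraft fee",
         "retail order refund request",
         "store loyalty points not applied",
         "online purchase delivery delayed"]
labels = ["banking", "banking", "banking", "retail", "retail", "retail"]

for name, clf in [("SVM", LinearSVC()),
                  ("KNN", KNeighborsClassifier(n_neighbors=3)),
                  ("LogReg", LogisticRegression(max_iter=1000))]:
    pipe = make_pipeline(TfidfVectorizer(), clf)   # TF-IDF features into the classifier
    pipe.fit(texts, labels)
    print(name, pipe.predict(["question about my credit card bill"]))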

TECHNICAL SKILLS

Scripts: Python, UNIX Shell, WINDOWS batch, awk, sed, HTML, XML.

Database: PostgreSQL, DB2 (SQL), HIVE, IBM-IMS DB/DC, UNIX-DB, Cassandra

Operating Systems: WINDOWS 95/98/NT/XP, UNIX, LINUX, MS-DOS, MVS-ES.

Software Languages: Java, C, C++, Fortran, COBOL, PL/I, Assembly Language, MIPS, Lisp.

Platforms: Spark, Sun Solaris, Sparc, IBM PowerPC, INTEL 80X86, Pentium-X, IBM Mainframes.

Developer Environment: Visual Studio, Eclipse.

Big data technologies: Hadoop, Spark, Sqoop, Hive, Pig, HBase

PROFESSIONAL EXPERIENCE

Confidential, Minneapolis, MN

Data Scientist/Data Analyst

Responsibilities:

  • Gathered and analyzed business requirements, interacted with various business users, project leaders, developers and took part in identifying different data sources
  • Compiled data from multiple data sources, used SQL and Python packages for data extraction, loading and transformation
  • Wrote code in Python, Scala, Spark and Hive to implement the product
  • Performed data cleaning, feature scaling, feature engineering and feature prioritization using the pandas and NumPy packages in Python
  • Built predictive models and machine learning algorithms
  • Developed a mixed effects linear regression model using the lme4 package to predict student MAP test scores from historical data
  • Performed exploratory data analysis (EDA), summarized descriptive statistics
  • Handled anomalies in the data by removing duplicates, imputing missing values and treating nulls using Python Scikit-learn (see the sketch after this list)
  • Participated in implementation of the Hadoop platform and Big Data technologies (Spark, Hive, Pig and HBase) for Amazon Web Services S3, Redshift and Athena use cases.
  • Helped in configuring Cassandra database
  • Visualized the data with box plots and scatter plots to understand its distribution, using Tableau and Python libraries
  • Developed a binary classification model to predict the risk of school dropouts with 89% accuracy using two-class boosted decision trees (an analogous Python sketch follows the Environment line below)
  • Collaborated with data scientists to prototype predictive models for converting data to insights
  • Involved in creating charts and graphs of data from different sources using the Matplotlib and SciPy libraries in Python
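
A minimal sketch of the cleaning steps listed above, assuming duplicates are dropped with pandas and missing numeric values are imputed with Scikit-learn's SimpleImputer; the column names and values are hypothetical:

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical student records with a duplicate row and missing values
df = pd.DataFrame({
    "student_id": [101, 101, 102, 103],
    "map_score":  [210.0, 210.0, np.nan, 225.0],
    "attendance": [0.95, 0.95, 0.80, np.nan],
})

df = df.drop_duplicates()                      # remove exact duplicate rows
num_cols = ["map_score", "attendance"]
df[num_cols] = SimpleImputer(strategy="median").fit_transform(df[num_cols])   # impute missing numerics
print(df)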

Environment: Python 2.x/3.x, R Programming, SQL (Structured Query Language), Anaconda 3.x, Jupyter Notebooks, R Studio, Tableau Desktop, SQL Server 2012, Azure ML Studio, Jira, Git, Microsoft Excel
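
The dropout-risk bullet above describes a two-class boosted decision tree model (the environment lists Azure ML Studio); the sketch below shows an analogous gradient-boosted classifier in scikit-learn trained on synthetic placeholder data, not the original model or features:

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic placeholder features (e.g. attendance, GPA, prior test score) and a synthetic label
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) < 0).astype(int)   # 1 = at risk

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier().fit(X_train, y_train)   # boosted decision trees
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))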

Confidential, Dallas, TX

Data Scientist / Data Analyst

Responsibilities:

  • Converted data into insights by predicting and modelling future outcomes
  • Utilized MS SQL, Tableau and other dashboard tools for data intelligence and analysis
  • Involved with data analysis, primarily identifying data sets, source data, source metadata, data definitions and data formats.
  • Developed Python scripts for cleaning and pre-processing client data for consumption by the data pipeline, reducing data-cleaning effort from 5 days to less than 5 hours.
  • Analyzed customer data in Python using Matplotlib and Seaborn to track correlations in customer behaviour and define user segments for process and product improvements (see the sketch after this list).
  • Involved in coding and debugging to solve business problems.
  • Implemented a custom-built machine learning application in Scala
  • Prepared reports that interpret customer behaviour, market opportunities and conditions, market results, trends, and investment level.
  • Solved clients' analytics problems and effectively communicated results and methodologies
  • Installed and configured Hive and wrote Hive UDFs.
  • Experience with NoSQL databases (HBase, Accumulo, Cassandra), in-memory databases (Redis, GridGain, Ignite), batch and streaming data processing (Spark, MapReduce, Kafka, Kinesis) and cloud services (AWS)
  • Extracted data by writing complex SQL queries, and created meaningful data visualizations and dashboards using Tableau to improve the user engagement rate
  • Created a heat map in Tableau showing current customers by colour, broken into regions, allowing business users to see where we have the most and fewest users
  • Used Log4J for logging, tracing application flow and debugging.
  • Extensively followed Agile principles like Continuous Integration, Pair Programming and Test Driven Development.
  • Blended data from multiple databases into one report by selecting the primary key from each database for data validation
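
A minimal sketch of the customer-behaviour analysis referenced above, assuming a small hypothetical table of behaviour metrics: a Seaborn correlation heatmap plus a simple, illustrative segment rule:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical customer-behaviour metrics
customers = pd.DataFrame({
    "monthly_spend":   [120, 340, 80, 560, 210, 430],
    "visits_per_week": [1, 4, 1, 6, 2, 5],
    "support_tickets": [0, 1, 3, 0, 2, 1],
})

sns.heatmap(customers.corr(), annot=True, cmap="coolwarm")   # correlations between behaviours
plt.title("Customer behaviour correlations (illustrative)")
plt.tight_layout()
plt.show()

# A simple rule-based segment, purely for illustration
customers["segment"] = (customers["monthly_spend"] > 200).map({True: "high value", False: "standard"})
print(customers)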

Environment: Python 2.x/3.x, SQL (Structured Query Language), JavaScript, Tableau Desktop, SQL Server, MS PowerPoint, MS Excel, Anaconda, Jupyter Notebooks

Confidential

Data Analyst

Responsibilities:

  • Performed ETL data extraction, transformation, cleansing and preparation of analytical datasets from structured formats such as RDBMS tables, Excel and CSV files (see the sketch after this project's Environment line)
  • Analysed customer demographics and credit card transaction data to reveal factors influencing the customer churn rate
  • Performed extensive prototyping using mock-up screens and Tableau
  • Performed financial analysis to determine the credit worthiness of clients; shortened the pre-approval process and accelerated the application submission to banking institutions
  • Developed the database model and wrote the application from scratch
  • Developed segmentation, propensity, and look-alike modelling of individuals across known, identified and anonymous audiences
  • Evaluated provider-level data to assess provider data quality, profiling the data and building data quality reports using DQ Analyzer, SQL Server and MS Excel.
  • Reviewed test plans containing test scripts, test cases, test data and expected results for User Acceptance Testing.
  • Investigated and conducted studies on product forecasts, demand and capital
  • Strong knowledge of statistical methods (regression, hypothesis testing, randomized experiment), data structures and data infrastructure.
  • Performed A/B test analysis and simplified navigation on the bank's website, achieving a 70% improvement in conversions (see the sketch below)
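
A minimal sketch of the A/B conversion analysis mentioned in the last bullet, using a two-proportion z-test from statsmodels; the visitor and conversion counts are made up to illustrate a 70% lift, not the actual experiment data:

from statsmodels.stats.proportion import proportions_ztest

# Placeholder counts: converted visitors and total visitors for control vs. variant
conversions = [120, 204]
visitors = [4000, 4000]

stat, p_value = proportions_ztest(conversions, visitors)            # two-proportion z-test
lift = (conversions[1] / visitors[1]) / (conversions[0] / visitors[0]) - 1
print(f"z = {stat:.2f}, p = {p_value:.4f}, lift = {lift:.0%}")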

Environment: Java, Microsoft Excel, TFS, Eclipse IDE, Optimizer for A/B Testing, UI Prototype, Proto Fluid for UI Testing, SQL Server, HTML/CSS
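
A minimal sketch of the ETL preparation step from the first bullet of this project, assuming hypothetical file, table and column names for the RDBMS, Excel and CSV sources:

import sqlite3
import pandas as pd

# Hypothetical source files and table; the real sources were RDBMS tables, Excel and CSV extracts
transactions = pd.read_csv("transactions.csv")
demographics = pd.read_excel("customer_demographics.xlsx")
with sqlite3.connect("crm.db") as conn:                              # stand-in for the RDBMS source
    accounts = pd.read_sql("SELECT customer_id, churn_flag FROM accounts", conn)

# Join the sources and keep rows with a known churn label
dataset = (transactions
           .merge(demographics, on="customer_id", how="left")
           .merge(accounts, on="customer_id", how="left")
           .dropna(subset=["churn_flag"]))
dataset.to_csv("analytical_dataset.csv", index=False)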

Confidential

Data Analyst

Responsibilities:

  • Involved in technical design, development and end-to-end testing (Regression, System, UAT) for 5 different projects adopting Agile standards, achieving 90% quality compliance and on-time product delivery
  • Analysed customer data and created reports using Excel features such as pivot tables, VLOOKUP and charts
  • Performed database testing, validating database tables, stored procedures and triggers
  • Heavily used JSPs and HTML for designing the screens.
  • Involved with data analysis, primarily identifying data sets, source data, source metadata, data definitions and data formats
  • Created and modified database triggers, stored procedures and complex analytical queries, including multi-table joins, nested queries and correlated subqueries, and optimized their performance
  • Deployed the application on JBoss Application Server for efficient performance.
  • Extensively used SQL functions such as SUM, COUNT and CASE statements to create transformation logic for data mapping (see the sketch after this list)
  • Managed the planning and development of design and procedures for metric reports.
  • Optimized data collection and procedures to generate weekly, monthly and quarterly reports
  • Developed the application using Agile methodology and planned the scrum meetings.
  • Involved in exhaustive documentation for the technical phase of the project and in training materials for all data management functions
  • Led offshore efforts for code debugging and testing of 100+ critical/major functional tickets in an enhancement project
  • Conducted major stakeholder interviews involving SMEs, Business Analysts and other stakeholders
  • Used various SQL commands such as CREATE, DELETE and UPDATE, and INNER, OUTER, LEFT and RIGHT joins, to update the database and retrieve data for analysis and validation.
  • Created action plans to track identified open issues and action items related to the project
  • Prepared analytical and status reports and updated the project plan as required
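
A minimal sketch of the SUM/COUNT/CASE transformation logic and join pattern described above, with hypothetical table and column names, issued through pandas purely for illustration:

import sqlite3
import pandas as pd

# Hypothetical reporting query: aggregates per region with a CASE-based count
query = """
SELECT c.region,
       COUNT(*)                                               AS orders,
       SUM(o.amount)                                          AS total_amount,
       SUM(CASE WHEN o.status = 'RETURNED' THEN 1 ELSE 0 END) AS returned_orders
FROM   orders o
       INNER JOIN customers c ON c.customer_id = o.customer_id
GROUP BY c.region
"""

with sqlite3.connect("reporting.db") as conn:   # stand-in for the project's database
    report = pd.read_sql(query, conn)
print(report)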

Environment: PowerPoint, MS Visio, MS Excel, MySQL, T-SQL, Tableau, MS Access, VBA, PL/SQL, TOAD for Data Analysis
