Data Scientist Resume

Irving, TX

SUMMARY:

  • Over 8 years of experience in Data Science / Data Analysis, ETL Development, and Project Management.
  • Experience in all phases of diverse technology projects, specializing in Data Science and Machine Learning.
  • Proven expertise in employing Supervised and Unsupervised learning techniques (Clustering, Classification, PCA, Decision Trees, KNN, SVM), Predictive Analytics, Optimization Methods, Natural Language Processing (NLP), and Time Series Analysis.
  • Experienced in Machine Learning regression algorithms such as Simple, Multiple, and Polynomial Regression, SVR (Support Vector Regression), Decision Tree Regression, and Random Forest Regression.
  • Experienced in advanced statistical analysis and predictive modelling in structured and unstructured data environments.
  • Expertise in Hadoop ecosystem components HDFS, MapReduce, YARN, HBase, Pig, Sqoop, Spark, SparkSQL, Spark Streaming, and Hive for scalability, distributed computing, and high-performance computing.
  • Strong knowledge of NoSQL databases such as HBase, Cassandra, MongoDB, and MarkLogic, and their integration with the Hadoop cluster.
  • Strong expertise in Business and Data Analysis, Data Profiling, Data Migration, Data Conversion, Data Quality, Data Governance, Data Lineage, Data Integration, Master Data Management (MDM), Metadata Management Services, and Reference Data Management (RDM).
  • Hands-on experience with Data Science libraries in Python such as Pandas, NumPy, SciPy, Scikit-learn, Matplotlib, Seaborn, Beautiful Soup, Orange, Rpy2, LibSVM, neurolab, and NLTK.
  • Solid understanding of AWS (Amazon Web Services) S3, EC2, RDS, and IAM, as well as Azure ML, Apache Spark, and Scala processes and concepts.
  • Good understanding of Artificial Neural Networks and Deep Learning models using the Theano and TensorFlow packages in Python.
  • Experienced in Machine Learning classification algorithms such as Logistic Regression, K-NN, SVM, Kernel SVM, Naive Bayes, Decision Tree, and Random Forest classification.
  • Experience in various phases of the Software Development Life Cycle (Analysis, Requirements Gathering, Designing) with expertise in writing and documenting Technical Design Documents (TDD), Functional Specification Documents (FSD), Test Plans, GAP Analysis, and Source-to-Target mapping documents.
  • Strong understanding of the project life cycle and SDLC methodologies including RUP, RAD, Waterfall, and Agile.
  • Very good knowledge and understanding of Microsoft SQL Server, Oracle, Teradata, and Hadoop/Hive.
  • Strong expertise in ETL, Data Warehousing, Operational Data Store (ODS), Data Marts, OLAP, and OLTP technologies.
  • Analytical, performance-focused, and detail-oriented professional, offering in-depth knowledge of data analysis and statistics; utilized complex SQL queries for data manipulation.
  • Expertise in Linear and Logistic Regression, Classification Modelling, Decision Trees, Principal Component Analysis (PCA), and Cluster and Segmentation analyses; authored and co-authored several scholarly articles applying these techniques.
  • Assisted in determining the full domain of the MVP, created and implemented its data model for the app, and worked with app developers to integrate the MVP into the app and any backend domains.
  • Ensured that the REST-based API, including all CRUD operations, integrated with the app and other service domains.
  • Installed and configured additional services on appropriate AWS EC2, RDS, S3, and/or other AWS service instances.
  • Integrated these services with each other and ensured user access to data, data storage, and communication between the various services.

WORK EXPERIENCE:

Data Scientist

Confidential, Irving, TX

Responsibilities:

  • Design and develop state-of-the-art deep-learning / machine-learning algorithms for analyzing image and video data, among other data types.
  • Develop and implement innovative AI and Machine Learning tools that will be used in the Risk
  • Experience with TensorFlow, Caffe, and other Deep Learning frameworks.
  • Develop project requirements and deliverable timelines; execute efficiently to meet the plan timelines.
  • Analyze large data sets, apply Machine Learning techniques, and develop and enhance predictive and statistical models by leveraging best-in-class modeling techniques.
  • Developed MapReduce/Spark Python modules for Machine Learning and predictive analytics in Hadoop on AWS.
  • Worked with various Teradata 15 tools and utilities such as Teradata Viewpoint, MultiLoad, ARC, Teradata Administrator, BTEQ, and other Teradata utilities.
  • Utilized Spark, Scala, Hadoop, HBase, Kafka, Spark Streaming, Caffe, TensorFlow, MLlib, and Python with a broad variety of Machine Learning methods, including classification, regression, and dimensionality reduction.
  • Involved in data analysis, primarily identifying data sets, source data, source metadata, data definitions, and data formats.
  • Well experienced in Normalization and De-Normalization techniques for optimum performance in relational and dimensional database environments.
  • Understood requirements and the significance of weld point data and energy efficiency using large datasets.
  • Develop necessary connectors to plug ML software into wider data pipeline architectures.
  • Creating and supporting a data management workflow from data collection, storage, and analysis to training and validation.
  • Wrangled large datasets (acquired and cleaned the data) and analyzed trends through visualizations built with Matplotlib and Python.

Environment: R 9.0, RStudio, Machine Learning, Informatica 9.0, Scala, Spark, Cassandra, ML, DL, Scikit-learn, Shogun, Data Warehouse, MLlib, Cloudera Oryx, Apache.

Data Scientist

Confidential, Durham, NC

Responsibilities:

  • Involved in the design, development, and testing phases of the application using Agile methodology.
  • Designed and maintained databases using Python and developed a Python-based API (RESTful web service) using Flask, SQLAlchemy, and PostgreSQL.
  • Designed and developed the UI of the website using HTML, XHTML, AJAX, CSS and JavaScript.
  • Participated in requirement gathering and worked closely with the architect in designing and modeling.
  • Worked on RESTful web services that enforced a stateless client-server model and supported JSON, with a few changes from SOAP to RESTful technology; involved in detailed analysis based on the requirement documents.
  • Involved in writing SQL queries and implementing functions, triggers, cursors, object types, sequences, indexes, etc.
  • Created and managed all hosted or local repositories through SourceTree's simple Git client interface, and collaborated using the Git command line and Stash.
  • Responsible for setting up the Python REST API framework and Spring framework using Django.
  • Developed consumer-based features and applications using Python, Django, HTML, Behavior-Driven Development (BDD), and pair programming.
  • Designed and developed components using Python with the Django framework. Implemented code in Python to retrieve and manipulate data.
  • Involved in development of the enterprise social network application using Python, Twisted, and Cassandra.
  • Used Python and Django for creating graphics, XML processing of documents, data exchange, and business logic implementation between servers.
  • Worked closely with back-end developers to find ways to push the limits of existing web technology.
  • Designed and developed the UI for the website with HTML, XHTML, CSS, JavaScript, and AJAX.
  • Used AJAX and JSON communication to access RESTful web services data payloads.
  • Designed dynamic client-side JavaScript code to build web forms and performed simulations for web application pages.
  • Created and implemented SQL queries, stored procedures, functions, packages, and triggers in SQL Server.
  • Successfully implemented AutoComplete/AutoSuggest functionality using jQuery, Ajax, web services, and JSON.
  • Identified and added the report parameters and created the reports based on the requirements using SSRS 2008.
  • Tested and managed the SSIS 2005/2008 and SSIS 2007/8 packages and was responsible for their security.

Environment: Python, Java/J2EE, Django 1.0, HTML, CSS, Linux, Shell Scripting, JavaScript, Ajax, jQuery, JSON, XML, PostgreSQL, Jenkins, Ant, Maven, Subversion.

Data Scientist

Confidential, Fremont, CA

Responsibilities:

  • Used SSIS to create ETL packages to validate, extract, transform, and load data into the Data Warehouse and Data Marts.
  • Maintained and developed complex SQL queries, stored procedures, views, functions, and reports that meet customer requirements using Microsoft SQL Server 2008 R2.
  • Created Views, Table-valued Functions, Common Table Expressions (CTEs), joins, and complex subqueries to provide the reporting solutions.
  • Optimized query performance by modifying T-SQL queries, removing unnecessary columns and redundant data, normalizing tables, establishing joins, and creating indexes.
  • Created SSIS packages using Pivot Transformation, Fuzzy Lookup, Derived Column, Conditional Split, Aggregate, Execute SQL Task, Data Flow Task, and Execute Package Task.
  • Migrated data from the SAS environment to SQL Server 2008 via SQL Server Integration Services (SSIS).
  • Developed and implemented several types of Financial Reports (Income Statement, Profit & Loss Statement, EBIT, ROIC Reports) using SSRS.
  • Developed parameterized dynamic performance reports (Gross Margin, Revenue based on geographic regions, Profitability based on web sales and smartphone app sales), ran the reports every month, and distributed them to the respective departments through mailing server subscriptions and SharePoint server.
  • Designed and developed new reports and maintained existing reports using Microsoft SQL Server Reporting Services (SSRS) and Microsoft Excel to support the firm's strategy and management.
  • Created sub-reports, drill-down reports, summary reports, parameterized reports, and ad-hoc reports using SSRS.
  • Used SAS/SQL to pull data from databases and aggregate it to provide detailed reporting based on the user requirements.
  • Used SAS for pre-processing data, SQL queries, data analysis, generating reports, graphics, and statistical analyses.
  • Provided statistical research analyses and data modeling support for mortgage product.
  • Performed analyses such as regression analysis, logistic regression, discriminant analysis, and cluster analysis using SAS programming.

Environment: SQL Server 2008 R2, DB2, Oracle, SQL Server Management Studio, SAS/BASE, SAS/SQL, SAS/Enterprise Guide, MS BI Suite (SSIS/SSRS), T-SQL, SharePoint 2010, Visual Studio 2010, Agile/SCRUM.
