Data Scientist/ Machine Learning Resume
Cupertino, CA
PROFESSIONAL SUMMARY:
- Over 8 years of experience in Machine Learning and Data Mining with large datasets of structured and unstructured data, including Data Acquisition, Data Validation, Predictive Modeling, and Data Visualization.
- Designed Physical Data Architecture for new system engines.
- Extensive experience in Text Analytics, developing different Statistical Machine Learning, Data Mining solutions to various business problems and generating data visualizations using R, Python and Tableau.
- Good experience in NLP with Apache, Hadoop and Python.
- Hands-on experience with Spark MLlib utilities, including Classification, Regression, Clustering, Collaborative Filtering, and Dimensionality Reduction.
- Proficient in Statistical Modeling and Machine Learning techniques (Linear and Logistic Regression, Decision Trees, Random Forest, SVM, K-Nearest Neighbors, Bayesian methods, XGBoost) for Forecasting/Predictive Analytics, Segmentation Methodologies, Regression-based models, Hypothesis Testing, Factor Analysis/PCA, and Ensembles.
- Hands-on experience implementing LDA and Naïve Bayes; skilled in Random Forests, Decision Trees, Linear and Logistic Regression, SVM, Clustering, Neural Networks, and Principal Component Analysis, with good knowledge of Recommender Systems.
- Developed Logical Data Architecture in adherence to Enterprise Architecture.
- Expertise in transforming business requirements into analytical models, designing algorithms, building models, and developing data mining and reporting solutions that scale across massive volumes of structured and unstructured data.
- Adept in statistical programming languages such as R and Python, as well as Big Data technologies such as Hadoop and Hive.
- Strong experience in Software Development Life Cycle (SDLC) including Requirements Analysis, Design Specification and Testing as per Cycle in both Waterfall and Agile methodologies.
- Experience working with data modeling tools such as Erwin, Power Designer, and ER Studio.
- Skilled in using dplyr in R and pandas in Python for exploratory data analysis.
- Experience in designing stunning visualizations using Tableau software and publishing and presenting dashboards, Storyline on web and desktop platforms.
- Experience in designing Star schema and Snowflake schema for Data Warehouse and ODS architecture.
- Experience in designing and developing in Tableau, updating the existing desktop, developing ad-hoc reports, scheduling processes, and administering Tableau activities.
- Experienced in designing customized interactive dashboards in Tableau using Marks, Action, Filters, Parameter and Calculations.
- Good understanding of Teradata SQL Assistant, Teradata Administrator and data load/ export utilities like BTEQ, Fast Load, Multi Load, Fast Export.
- Experience and Technical proficiency in Designing, Data Modeling Online Applications, Solution Lead for Architecting Data Warehouse/Business Intelligence Applications.
- Experience in maintaining database architecture and metadata that support the Enterprise Data Warehouse.
- Prediction of numerical values using Regression or CART.
- Experience with Data Analytics, Data Reporting, Ad-hoc Reporting, Graphs, Scales, Pivot Tables, and OLAP reporting.
- Highly skilled in using visualization tools like Tableau, ggplot2 and d3.js for creating dashboards.
- Highly skilled in using Hadoop (Pig and Hive) for basic analysis and extraction of data in the infrastructure to provide data summarization.
TECHNICAL SKILLS
Languages: SQL, PL/SQL, ASP, Visual Basic, XML, SAS, Python, T-SQL, SQL Server, C, C++, Java, HTML, Shell Scripting, Perl, R, MATLAB, Scala.
Data Modeling Tools: Erwin r9.6/9.5, ER/Studio 9.7, Star-Schema Modeling, Snowflake-Schema Modeling, Fact and Dimension tables, Pivot Tables.
Databases: Oracle 11g/12c, MS Access, SQL Server 2012/2014, Sybase, DB2, Teradata 14/15, Hive.
Big Data Tools: Hadoop, Hive, Spark, Pig, HBase, Sqoop, Flume.
BI Tools: Tableau 7.0/8.2, Tableau Server 8.2, Tableau Reader 8.1, SAP Business Objects, Crystal Reports.
Packages: Microsoft Office 2010, Microsoft Project 2010, SAP, Microsoft Visio, SharePoint Portal Server.
Applications: Toad for Oracle, Oracle SQL Developer, MS Word, MS Excel, MS PowerPoint, Teradata, Designer 6i.
Methodologies: RAD, JAD, RUP, UML, System Development Life Cycle (SDLC), Waterfall Model.
Operating Systems: Microsoft Windows, Linux/UNIX.
PROFESSIONAL EXPERIENCE:
Confidential, Cupertino, CA
Data Scientist/ Machine Learning
Responsibilities:
- Built models using statistical techniques such as Bayesian HMM and Machine Learning classification models such as XGBoost, SVM, and Random Forest.
- Participated in a highly immersive Data Science program involving Data Manipulation & Visualization, Web Scraping, Machine Learning, Python programming, SQL, Git, Unix commands, NoSQL, MongoDB, and Hadoop.
- Designed and developed Tableau reports, documents, and dashboards to specified requirements and timelines.
- Set up storage and data analysis tools in Amazon Web Services cloud computing infrastructure.
- Used pandas, numpy, seaborn, scipy, matplotlib, scikit-learn, NLTK in Python for developing various machine learning algorithms.
- Installed and used Caffe Deep Learning Framework
- Worked on different data formats such as JSON and XML, and applied machine learning algorithms in Python.
- Worked with Data Architects and IT Architects to understand the movement of data and its storage, using ER Studio 9.7.
- Purchased, set up, and configured a Tableau Server and an MS SQL Server 2008 R2 instance for data warehouse purposes.
- Prepared dashboards using calculations and parameters in Tableau.
- Designed, developed and implemented Tableau Business Intelligence reports.
- Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization; also performed Gap analysis.
- Performed data manipulation and aggregation from different sources using Nexus, Toad, Business Objects, Power BI, and Smart View.
- Implemented Agile Methodology for building an internal application.
- Focused on integration overlap and Informatica's newer commitment to MDM with the acquisition of Identity Systems.
- Good knowledge of Hadoop Architecture and various components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, and MapReduce concepts.
- As Architect, delivered various complex OLAP databases/cubes, scorecards, dashboards, and reports.
- Programmed a utility in Python that used multiple packages (scipy, numpy, pandas).
- Implemented Classification using supervised algorithms like Logistic Regression, Decision trees, KNN, Naive Bayes.
- Used Teradata 15 utilities such as FastExport and MLOAD for various data migration/ETL tasks from OLTP source systems to OLAP target systems.
- Experience in Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, Flume including their installation and configuration.
- Updated Python scripts to match training data with our database stored in AWS Cloud Search, so that we could assign each document a response label for further classification.
- Performed data transformation from various resources, data organization, and feature extraction from raw and stored data.
- Validated the machine learning classifiers using ROC Curves and Lift Charts.
- Extracted data from HDFS and prepared data for exploratory analysis using data munging.
Environment: ER Studio 9.7, Tableau 9.03, AWS, Teradata 15, MDM, Git, Unix, Python 3.5.2, Machine Learning, MLlib, SAS, regression, logistic regression, Hadoop, NoSQL, Teradata, OLTP, random forest, OLAP, HDFS, ODS, NLTK, SVM, JSON, XML, MapReduce.
Confidential, Palo Alto, CA
Data Architecture/Data Modeler
Responsibilities:
- Coded R functions to interface with Caffe Deep Learning Framework.
- Worked in Amazon Web Services cloud computing environment.
- Used Tableau to automatically generate reports; worked with partially adjudicated insurance flat files, internal records, 3rd-party data sources, JSON, XML, and more.
- Interacted with business stakeholders, gathered requirements, and managed delivery, covering the entire Tableau development life cycle.
- Created interactive BI dashboards and published them to the server using Tableau Publisher.
- Worked with several R packages including knitr, dplyr, SparkR, CausalInfer, and spacetime.
- Implemented end-to-end systems for Data Analytics and Data Automation, integrated with custom visualization tools, using R, Mahout, Hadoop, and MongoDB.
- Gathered all required data from multiple data sources and created datasets for use in analysis.
- Performed Exploratory Data Analysis and Data Visualizations using R and Tableau.
- Performed thorough EDA, including univariate and bivariate analysis, to understand intrinsic and combined effects.
- Worked with Data Governance, Data Quality, data lineage, and Data Architect teams to design various models and processes.
- Independently coded new programs and designed tables to load and test the programs effectively for the given POCs using Big Data/Hadoop.
- Designed data models and data flow diagrams using Erwin and MS Visio.
- As an Architect, implemented an MDM hub to provide clean, consistent data for a SOA implementation.
- Developed, implemented, and maintained the Conceptual, Logical, and Physical Data Models using Erwin for Forward/Reverse Engineered Databases.
- Established Data Architecture strategy, best practices, standards, and roadmaps.
- Led the development and presentation of a data analytics data-hub prototype with the help of the other members of the emerging solutions team.
- Performed data cleaning and imputation of missing values using R.
- Worked with the Hadoop ecosystem, covering HDFS, HBase, YARN, and MapReduce.
- Handled ad-hoc requests from different departments and locations.
- Used Hive to store the data and perform data cleaning steps for huge datasets.
- Created dashboards and visualizations on a regular basis using ggplot2 and Tableau.
- Created customized business reports and shared insights with management.
- Worked with BTEQ to submit SQL statements, import and export data, and generate reports in Teradata.
- Interacted with other departments to understand and identify data needs and requirements, and worked with other members of the IT organization to deliver data visualization and reporting solutions to address those needs.
Environment: R 3.0, Erwin 9.5, Tableau 8.0, MDM, QlikView, MLlib, PL/SQL, HDFS, Teradata 14.1, JSON, Hadoop (HDFS), MapReduce, Pig, Spark, RStudio, Mahout, Java, Hive, AWS.
Confidential, Princeton, NJ
Java Developer
Responsibilities:
- Designed and developed the application using Spring Framework.
- Developed class diagrams, sequence diagrams, and use case diagrams using UML and Rational Rose.
- Designed the application with reusable J2EE design patterns.
- Developed test cases for Unit testing using JUnit and performed integration and system testing
- Involved in coding for the presentation layer using Struts Framework, JSP, AJAX, XML, XSLT, and JavaScript.
- Worked closely on and supported the creation of database schema objects (tables, stored procedures, and triggers) using Oracle SQL.
- Designed DAO objects for accessing RDBMS.
- Designed and developed Data Transfer Objects to carry the data between different layers.
- Developed web pages using JSP, HTML, DHTML and JSTL
- Designed and developed a web-based client using Servlets, JSP, Tag Libraries, JavaScript, HTML and XML using Struts Framework.
- Developed views and controllers for client and manager modules using Spring MVC and Spring Core.
- Used Spring Security for securing web-tier access.
- Implemented business logic using Hibernate.
- Developed and modified database objects as per the requirements.
- Involved in unit integration, bug fixing, acceptance testing with test cases, and code reviews.
- Interacted with customers, identified system requirements, and developed Software Requirement Specifications.
- Implemented Java design patterns wherever required.
- Implemented Multi-threading concepts.
Environment: Java, PL/SQL, SQL, HTML, CSS, JavaScript, Hibernate, Middleware Technologies, Ajax, Servlets, JSP, WebLogic, JBoss, WebSphere, XML, XHTML, Eclipse, JMS, Oracle 11g, EJB.
Confidential
Java Developer
Responsibilities:
- Participated in system design, planning, estimation, and implementation.
- Involved in developing use case diagrams, class diagrams, sequence diagrams, and process flow diagrams for the modules using UML and Rational Rose.
- Developed the presentation layer using JSP, AJAX, HTML, XHTML, and CSS, with client-side validations in JavaScript.
- Developed and implemented the MVC Architectural Pattern using Spring Framework.
- Made effective use of J2EE design patterns, namely Session Facade, Factory Method, Command, and Singleton, to develop various base framework components in the application.
- Developed various EJBs (session and entity beans) for handling business logic.
- Developed Session Beans and DAO classes for Accounts and other Modules.
- Worked on generating the web services classes using WSDL, UDDI, and SOAP.
- Consumed third-party Web Services using WSDL, SOAP, and UDDI for authorizing payments to/from customers.
- Designed and developed systems based on JEE specifications and used Spring Framework with MVC architecture.
- Used Spring Roo Framework Design/Enterprise Integration patterns and REST architecture compliance for design and development of applications.
- Involved in application development using Spring Core, Spring Roo, Spring JEE, and Spring Aspects modules, along with Java web-based technologies such as Web Services (REST/SOA/microservices), including microservice implementations, and Hibernate ORM.
- Used LDAP and Microsoft Active Directory services for authorization and authentication.
- Implemented different design patterns such as Singleton, Session Façade, Factory, MVC, Business Delegate, and DAO.
- Used JPA object mapping for the backend data persistence.
Environment: R 3.0, Erwin 9.5, Tableau 8.0, MDM, QlikView, MLlib, PL/SQL, HDFS, Teradata 14.1, JSON, Hadoop (HDFS), MapReduce, Pig, Spark, RStudio, Mahout, Java, Hive, AWS.
