Data Scientist Resume
San Antonio, TX
SUMMARY:
- 7+ years of IT experience (as a full-time employee), including 2 years in Machine Learning and Artificial Intelligence and 5 years of extensive development experience in J2EE applications in the Money Movement, Fraud Detection, Enterprise, Insurance, and Banking domains
- Currently associated with TATA Consultancy Services Ltd, USA.
- Machine Learning Engineer with 2+ years of experience interpreting and analyzing machine learning models to drive successful business solutions.
- Expert in writing SAS programs for rule-based scenarios
- Expertise in building high-accuracy models on large datasets using Linear Regression, Logistic Regression, Support Vector Machines (SVM), and Decision Trees, with proper null handling, outlier detection, and class imbalance techniques
- Proficient in Statistical and Exploratory Data Analysis (EDA) on large datasets
- Strong exposure to neural network techniques: LSTM, CNN, RNN, GRU, and 3D-CNN; implemented a few projects using CNNs
- Good exposure to NLP techniques: POS tagging using NumPy libraries, and HMM and CRF algorithms
- Expertise in conducting hypothesis testing on models
- Worked with and learned a great deal from Amazon Web Services (AWS) cloud services such as EC2, S3, EBS, RDS, VPC, and IAM
- Currently working in agile methodology.
- Proven ability to lead, manage project resources, interact with clients, and coordinate work to achieve high levels of productivity and efficiency in complex, dynamic, and challenging environments
- Strong experience in the design and development of applications based on Java frameworks like Wicket and Spring, using RESTful web services as the backend
- Experience in building web applications using Spring Framework features like MVC (Model View Controller), DAO (Data Access Object) and template classes
- Proficient in developing SOAP and RESTful web services
- Excellent Technical, interpersonal & business communication skills with strong Customer Orientation and Client Interfacing Skills. Able to work independently and supervise development teams.
- A self-motivated professional and natural communicator possessing good technical, problem-solving and leadership skills and proven to be a good team player.
- Extensive experience in production support and troubleshooting issues arising post-deployment
- Suggested new solutions for various projects and successfully implemented them.
TECHNICAL SKILLS:
Programming Languages: Java, Python, Pandas, R, SAS, ETL, SQL
Web Technologies: JavaScript, Web Services (REST, SOAP)
Machine Learning Model Designs: Linear Regression, Logistic Regression, SVM, Decision Tree, Random Forest, Neural Networks (Feedforward, Backpropagation), Natural Language Processing (NLP)
Integrated Development Environment (IDE): IBM RSA, Anaconda Navigator
Operating System: Windows, Mac
Database: SQL Server, Oracle, DB2, Netezza, Hadoop Clusters
Hypothesis Testing: P-Test, Z-Test, ANOVA, Chi-Square Test
PROFESSIONAL EXPERIENCE:
Confidential, San Antonio, TX
Data Scientist
Responsibilities:
- Performed EDA on demographic data (internal to the client) and customer personal data obtained from a third-party vendor.
- Performed feature engineering, created derived variables, and plotted a correlation matrix to find multicollinearity between the variables.
- Explored the Weight of Evidence (WOE) technique to impute values for the binned variables
- Created dummy variables for specific categorical variables in the dataframe created by joining the internal customer data with the third-party vendor data
- Handled the missing values in data using data imputation techniques and business understanding of the variables
- Calculated the optimal cutoff value for the model by plotting the ROC curve
- Applied VIF (Variance Inflation Factor) to detect multicollinearity between variables and p-values to identify significant variables
- Performed outlier detection on the data using box plots
- Handled the class imbalance in the data using the SMOTE technique
- Calculated precision, recall, accuracy, sensitivity, and specificity for the model
- Built a logistic regression model on the customer data with 83 percent accuracy; work to improve its performance using boosting techniques is in progress
Environment: Anaconda, scikit-learn, Python, seaborn, statsmodels, and matplotlib
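The ROC-based cutoff selection listed in the responsibilities above can be illustrated with a minimal sketch. It assumes the cutoff is chosen by maximizing Youden's J statistic (TPR minus FPR), which is one common criterion and not necessarily the one used on this project. The function names and the sample labels and scores are hypothetical.

```python
def roc_points(y_true, scores):
    """Return (threshold, tpr, fpr) for each distinct score used as a cutoff."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # A record is predicted positive when its score reaches the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((threshold, tp / positives, fp / negatives))
    return points


def optimal_cutoff(y_true, scores):
    """Pick the threshold that maximizes Youden's J = TPR - FPR (assumed criterion)."""
    best = max(roc_points(y_true, scores), key=lambda p: p[1] - p[2])
    return best[0]


# Hypothetical labels (1 = positive class) and model scores, for illustration only.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]
cutoff = optimal_cutoff(y_true, scores)
print(cutoff)
```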
Confidential, San Antonio, TX
Data Scientist
Responsibilities:
- Performed EDA on demographic data (internal to the client), MAAC data, and customer data obtained from a third-party vendor.
- Performed feature engineering, created derived variables, and plotted a correlation matrix to find multicollinearity between the variables.
- Calculated the optimal cutoff value for the model by plotting the ROC curve
- Handled the missing values in data using data imputation techniques and business understanding of the variables
- Applied the RFE technique to find the most significant variables in the data, because the variable count exceeded 50
- Performed chi-square hypothesis testing on the data
- Handled the class imbalance in the data using the SMOTE technique
- Calculated precision, recall, accuracy, sensitivity, and specificity for the model
- Built a logistic regression model on the data with 84 percent accuracy
Environment: Anaconda, scikit-learn, Python, seaborn, statsmodels, and matplotlib
Confidential, TX
Data Scientist
Responsibilities:
- Performed EDA on alert data and built a random forest model to predict whether an AML alert represents suspicious activity or a false positive
- Created a confusion matrix to find the false positive rate in the data and worked to increase sensitivity per the business requirement.
- Created the model for AML model tuning with 89 percent accuracy
Environment: Anaconda, scikit-learn, Python, seaborn, and matplotlib
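The confusion-matrix metrics named in the responsibilities above (false positive rate, sensitivity, and the related measures) can be sketched in plain Python. This is a minimal illustration under stated assumptions: label 1 stands for a suspicious alert and 0 for a non-suspicious one, and the function names and sample data are hypothetical, not taken from the project.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def ratio(numerator, denominator):
    """Safe division: return 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Derive the usual summary metrics from the confusion-matrix counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),  # also called recall
        "specificity": ratio(tn, tn + fp),
        "false_positive_rate": ratio(fp, fp + tn),
    }


# Hypothetical alert labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```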
Aspiring Data Scientist
Confidential
- Customer identification must be performed whenever a new member enters a financial relationship. This project replaces the existing non-IT-supported system with a new IT-supported system so that business users can be more productive and follow the business rules more easily.
Environment: Java, J2EE, JBoss, SOAP Web Services, Eclipse, Oracle, Hibernate, JSP, Servlets, Spring, Oracle WebLogic Server, AJAX, JavaScript, HTML, XML, JUnit, SQL, Apache Tomcat, Maven.
Confidential
Module Lead
Environment: Java, J2EE, JUnit, AJAX, JSP, JavaScript, Eclipse, PL/SQL, SQL Server, Web Services, SOAP, XSLT, Servlets, Struts, XML, Spring, Hibernate, Log4j, Apache Tomcat, Unix, Rational Rose
Responsibilities:
- Interacted with the client to understand the project and finalize its scope.
- Estimated, designed, and developed various modules using Agile and Scrum methodology.
- Prepared High Level Design and Low Level Design of the components as per the requirements
- Worked as team lead, assigning functionalities and facilitating communication between team members.
- Coordinated with the offshore team to communicate the business and system requirements
- Created functional and technical specification for the project.
- Reviewed the components developed by the offshore team and had them reviewed by the Project Team and M&P Team
- Created and ran JUnit test suites for unit testing the application.
- Deployed and ran the application using Quick Build.
- Used RTC to maintain the different versions of the code
- Supported the Quality Assurance Team and M&P Team during the project warranty and implementation periods
- Fixed production defects promptly as they arose
- Responsible for Configuration Management activities (code deployment in RTC)
- Deployed the application in Apache Tomcat Application Server.
- Took complete ownership of the project and end-to-end coordination with all interfacing teams.