
Software Engineer Resume


Little Elm, TX

SUMMARY:

  • 6 years of experience in Data Science/Machine Learning (including 1 year on graph platforms), with a demonstrated history of extracting actionable insights from massive amounts of data to drive decision making across various domains; experienced in setting up infrastructure on the Linux platform; skilled in data visualization and in building predictive models using Machine Learning techniques that perform well on unseen data.
  • Expertise in building data models using machine learning techniques like Clustering Analysis, Market Basket Analysis, Association Rules, Naïve Bayes, Recommendation Systems, Dimensionality Reduction, Principal Component Analysis (PCA), Neural Networks, and Natural Language Processing techniques.
  • Expertise in building predictive models like Decision Tree, Linear Regression, Logistic Regression, and Support Vector Machine.
  • Understanding hidden patterns and detecting outliers to build sophisticated models.
  • Using Big data technology to build Recommender Systems.
  • Applying deep learning models such as Convolutional Neural Networks (CNN) and Multi-Column CNNs (MCNN) to real-time implementations.
  • Expertise in working on Linux platforms and computing on HPC clusters, NVIDIA JETSON TX2.
  • Experience in Tableau, Power BI, Statistical Modeling & Graph Analytics.
  • Proficiency in operating and deploying models on Docker, Kubernetes.
  • Hands on experience with Azure data platform stack: Azure Databricks, Azure Data Factory, Azure Data Lake storage, Azure DevOps.
  • Expertise in data scraping, data labelling, data analysis, data migration, data cleansing, transformation, integration, data import, and data export.
  • Presented the technical project “Image-Based Head Counting Using Machine Learning Models” at an IEEE conference held at the DoubleTree by Hilton, San Jose, CA.
  • Expertise with data analysis languages such as C++, Python, SQL, SAS.
  • Participated in “Google Landmark Detection” Kaggle competition.
  • Expertise in Model Evaluation using Performance metrics like Precision, Recall, Accuracy, Confusion Matrix, K-fold Validation.
  • Proficiency in the Hadoop ecosystem (MapReduce, Pig, Hive, Apache Spark, YARN, HDFS, Flume).
  • Expertise in Amazon Web Services (AWS) and Google Cloud Platform (GCP).
  • Strong, dynamic, and dedicated team player with effective communication skills and ready to address corporate challenges.
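
For illustration, the evaluation metrics named above (Precision, Recall, Accuracy, Confusion Matrix) can be computed from scratch; the labels below are invented example data, not results from any project on this resume:

```python
# Illustrative sketch: binary-classification metrics computed by hand.
# The labels are made-up example data.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0

def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
```

In practice scikit-learn's `metrics` module computes these directly; the hand-rolled version just makes the definitions explicit.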

TECHNICAL SKILLS:

Big Data and Web: MapReduce, Pig, Hive, Apache Spark, YARN, HDFS, Flume, ZooKeeper; JavaScript, HTML, CSS.

Cloud Technologies: Azure, AWS, Google cloud, Docker, Kubernetes.

Databases and OS: Cassandra, SQL Server, MongoDB, TigerGraph, RAI, Neo4J; Windows, Linux.

Programming Languages and ML Libraries and Tools: Python, R, SQL; NumPy, Pandas, SciPy, Scikit-learn, NLTK, Keras, Seaborn, Matplotlib, TensorFlow, PyTorch, OpenCV, Tableau, Plotly; Anaconda Navigator, PyCharm, Apache Spark, Apache Hadoop, CentOS (Linux), Power BI.

Statistical Methods: Time Series, Regression models, Confidence intervals, Principal Component Analysis and Dimensionality Reduction.

WORK EXPERIENCE:

Data Scientist, RNR IT Solutions Inc, Little Elm, TX

Software Engineer

Responsibilities:

  • Employed programming languages such as Python and C++ to write software prototypes. Analyzed 50+ complex simulation datasets with logistic regression models.
  • Researched the software market for solutions to client needs.
  • Predicted product sales to an accuracy of 2% through predictive analytics algorithms. Translated business problems into deep learning models to produce results from data inputs.
  • Improved simulation accuracy by 15% through ML algorithms.
  • Managed the full SDLC (Software Development Lifecycle): reviewed requirements, coordinated with the application development team, distributed tasks, created test plans, coordinated production installation, and participated in support.
  • Involved in the Risk Management Plan (RMP), a standard part of every RNR IT Solutions Inc software project, which is documented in the overall Project Management Plan.
  • Created a customer service chatbot using NLP and fundamentals of machine translation with Python, TensorFlow, NLTK, NumPy, scikit-learn, spaCy, TextBlob, Word2Vec, and RNNs.
  • Used NLP for preprocessing and cleaning text data, including tokenization into sentences and words.
  • Implemented state-of-the-art sequence models such as LSTMs and RNNs.
  • Performing data visualization using Matplotlib, with NumPy and Pandas for data manipulation.
  • Building deep learning models using Keras, TensorFlow, and PyTorch for product recommendation and deploying them on Kubernetes (K8s) clusters.
  • Communicated weekly application status to business users and end users.
  • Performed unit testing upon completion of each development unit.
  • Utilized SQL skills to query database.
  • Implemented a new incident management system to provide real time SLA and KPI reports to key stakeholders.
  • Monitored corporate endpoints for malware and unapproved software. Advised on and executed approval methods.
  • Provided guidance on deployment strategy of a security solution for over 80,000 endpoints.
  • Implemented SQL queries for ad-hoc reporting at management and client request.
  • Building Ontology for 26 profiles, cleaning profiles and traversing through graph and designing object relational mapping for AT&T data.
  • Working on Atlantis feature store, performing semantic search using DevOps tools.
  • Deploying, managing, and operating scalable, highly available, and fault tolerant systems on MS Azure.
  • Strong presentation skills using MS PowerPoint and Power BI, with the ability to digest complex technical concepts and present them to technical teams; worked with monitoring, logging, and cost-management tools that integrate with Azure Databricks.
  • Develop and create test data, retrieve test data from servers by SQL queries.
  • Writing python scripts for loading data through Azure notebooks.
  • Creating Data Engines and using Rel programs to build graphs to further feed into ML pipelines.
  • Working on Postman and Norma to perform semantic search.
  • Working on automatic semantic data integration.
  • Semantic Search and Recommendation Systems to suggest mobile plans and various subscription types to different category customers.
  • Implemented Graph Neural Networks on Azure Databricks to classify networks by coverage, broadband type, subscriber type, account type, and churn rate.
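
A minimal sketch of the kind of semantic matching described above, using plain bag-of-words cosine similarity rather than the Azure/feature-store tooling actually used; the plan descriptions and query are invented:

```python
# Illustrative sketch: match a customer query to a mobile-plan description
# by cosine similarity over term-count vectors. All data is made up.
import math
import re
from collections import Counter

def tokenize(text):
    # lowercase and split on non-word characters, dropping empties
    return [t for t in re.split(r"\W+", text.lower()) if t]

def cosine(a, b):
    """Cosine similarity between two Counter term vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

plans = {
    "unlimited": "unlimited data plan with 5g streaming",
    "prepaid": "prepaid plan with limited data",
}
query = "plan with unlimited data"
qv = Counter(tokenize(query))
scores = {name: cosine(qv, Counter(tokenize(desc))) for name, desc in plans.items()}
best = max(scores, key=scores.get)
```

Real semantic search would use learned embeddings instead of raw term counts, but the ranking-by-similarity shape is the same.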

MACHINE LEARNING ENGINEER

Confidential, Addison, TX

Responsibilities:

  • Working closely with the Data Science team to help prioritize work, provide acceptance criteria, and build data analytics models for a substantial number of graphs.
  • Creating graph schemas using GSQL queries and installing the PageRank and Louvain algorithms to assist the fraud detection team in detecting fraud rings.
  • Working with application development and data modeling teams to design and develop platforms for loading large data volumes onto the Neo4j and TigerGraph platforms.
  • Predicting loan eligibility using machine learning models: Logistic Regression, Decision Tree, Random Forest, and XGBoost.
  • Credit card fraud detection using Logistic Regression, Linear Discriminant Analysis, K-Nearest Neighbors (KNN), Classification Trees, Support Vector Classifier, Random Forest Classifier, and XGBoost Classifier.
  • Configuring LDAP and HA clusters to provide secure connection on platform.
  • Building a credit card fraud detection system and detecting money laundering using customer service-usage data held in the TigerGraph and Neo4j graph databases.
  • Building Graph architecture using Neo4j for optimizing ML fraud detection model.
  • Performing Disaster recovery on Neo4j, setting up Autosys Jobs on POC, Dev, Prod servers and writing shell scripts for file watcher.
  • Backup and restoring data on Tiger Graph platform.
  • Designing and building Azure end-to-end data pipelines using Python, Azure data bricks and Azure data factory.
  • Working on Azure CI/CD Stories, Bugs, and Issue Management.
  • Raising ARM requests for access to Autosys, upgrading servers, ingesting data monthly through Kafka pipelines, exporting and importing multiple graphs, defining user privileges and authentication roles, and working on the Graph Studio user interface.
  • Provided continuous testing with Selenium, covering end-to-end, API, and UI frameworks.
  • Used the pytest framework to integrate several Python testing tools (xdist, mock, parallel runs, Selenium with Chrome and Firefox) and to produce HTML/XML reports.
  • Helped individual teams set up their repositories in Bitbucket, maintain their code, and configure jobs that use the CI/CD environment.
  • Maintained resources effectively using Ansible and VMware, monitoring their health daily.
  • Used Gradle and Maven for building applications and wrote structured POMs consumable by Jenkins.
  • Built custom tools in Python for generating email templates that consume large amounts of data and convey testing results in a simpler way.
  • Built custom dashboards with Smashing/dashing open-source widgets to display Operational tasks on displays.
  • Installed Helm charts from the rich library of existing charts and iterated on a Helm chart used for builds.
  • Used Tilt with Helm charts for deployment to Kubernetes clusters.
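
The PageRank installed via GSQL above runs inside TigerGraph; as a rough, platform-neutral illustration of the algorithm itself, here is a pure-Python power-iteration PageRank over a made-up account-transfer graph (not production data):

```python
# Illustrative sketch: power-iteration PageRank over an adjacency-list dict.
# In a fraud-ring setting, accounts that accumulate rank from many peers
# stand out for review. The graph below is invented.
def pagerank(graph, damping=0.85, iters=50):
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # start each node with its teleport share
        new = {v: (1.0 - damping) / n for v in nodes}
        for v, out in graph.items():
            if out:
                share = damping * rank[v] / len(out)
                for u in out:
                    new[u] += share
            else:  # dangling node: spread its rank evenly
                for u in nodes:
                    new[u] += damping * rank[v] / n
        rank = new
    return rank

# Hypothetical account-transfer graph; C receives from both A and B
graph = {"A": ["C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

The graph-platform version differs only in scale: the same iteration runs distributed over billions of edges.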

MACHINE LEARNING ENGINEER

Confidential

Responsibilities:

  • Designed and implemented a Capital Acceptance Model for the marketing team using LSTM, XGBoost, and K-Means Clustering for sales forecasting, increasing sales by 24%.
  • Implemented five different models (ARIMA, Linear Regression, Random Forest Regression, XGBoost, and LSTM) to predict product sales using Python.
  • Implemented Logistic regression model for flagging adopted users by targeting responsive customers based on service usage behavior.
  • Migrated data onto Hadoop (YARN) and wrote Pig queries for job scheduling.
  • Effectively scaled deep learning workloads on GPUs to increase throughput and reduce latency.
  • Monitored ML-based applications for performance issues with ML-centric capabilities like data drift analysis, model-specific metrics, and alerts using Data Robot MLOps.
  • Conducted Exploratory Data Analysis using Python and created reports using Tableau.
  • Analyzing and classifying customer reviews using spaCy NLP and training deep learning models on AWS SageMaker.
  • Wrote Ansible playbooks with a Python SSH wrapper to manage configurations of AWS nodes, and tested the playbooks on AWS instances using Python.
  • Created scripts in Python that integrated with the Amazon API to control instance operations.
  • Experienced in automating, configuring, and deploying instances in AWS, Azure, and data center environments; familiar with EC2, CloudWatch, CloudFormation, and managing security groups on AWS.
  • Manage the planning and development of design and procedures for metric reports.
  • Performed data collection, data preparation, feature engineering, hypothesis testing, data reduction, and data mining to help Data Scientists develop mathematical and statistical models, working on graph databases with TigerGraph and Neo4j.
  • Claim management: investigated and conducted studies on forecasts, demand, customer needs, and product capital.
  • Implemented machine learning algorithms for fraud detection along with SAS rules, filters, joins, and other processing logic required to complete the workflow; also performed visual mining using SAS.
  • Overcame the limitations in market research innovation, by researching current data on industrial trends, positioning, and customer needs.
  • Connected the software running on the user’s local server to the model running on a cloud HPC cluster of the required size.
  • Trained deep learning models on neural network frameworks such as TensorFlow and PyTorch.
  • Designing and implementing testing frameworks for NVIDIA Deep Learning software stack.
  • Working with BigQuery ML (Machine Learning) on the GCP platform to train and build models.
  • Collaborating with internal GPU library teams to analyze and optimize training and inference for deep learning.
  • Work in a distributed computing setting to optimize for both scale-up (multi-GPU) and scale-out (multi-node) systems
  • Design and implement new systems for parallel and GPU processing of Deep Learning networks and layers
  • Exploiting parallelism on GPUs to effectively scale deep learning workloads to increase throughput and reduce latency.
  • Performed Web scraping, Sentiment Analysis and Natural Language Processing.
  • Data wrangled and collected client data for further processing.
  • Performed ETL (Extract, Transform, Load) using DataStage and worked with on-prem Linux on AWS to train the models.
  • Built a predictive model for client stay duration that improved revenue by over 7%.
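
The adopted-user flagging described above used logistic regression; here is a minimal from-scratch sketch with a single invented feature (weekly usage hours) and plain batch gradient descent, rather than the production libraries actually used:

```python
# Illustrative sketch: one-feature logistic regression trained by
# batch gradient descent. Feature values and labels are invented.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weight w and bias b by minimizing logistic loss."""
    w, b = 0.0, 0.0
    n = len(X)
    for _ in range(epochs):
        gw = gb = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(w * xi + b) - yi  # prediction error
            gw += err * xi
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

# Hypothetical "service usage hours per week" vs. adopted (1) / not (0)
X = [0.5, 1.0, 1.5, 4.0, 5.0, 6.0]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)

def predict(x):
    return sigmoid(w * x + b) > 0.5
```

scikit-learn's `LogisticRegression` does the same fit with a better optimizer and regularization; this version only exposes the mechanics.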

MACHINE LEARNING ENGINEER

Confidential

Responsibilities:

  • Trained ML models on Kubernetes (K8s) with Nvidia NGC containers using docker to allocate the resources.
  • Worked with Linux threading architecture and OpenMP to train ML models.
  • Implemented natural language processing techniques and information search using retrieval tools like Solr, Lucene, NLTK, and OpenNLP to translate patient records into risk adjustment and hierarchical condition categories, while monitoring CI/CD pipelines.
  • Building Graph architecture to provide Real-time product recommendations.
  • Reduced runtime on GCP (Google Cloud Platform) by benchmarking a single-GPU solution against a multi-CPU solution.
  • Designed and implemented an email servicing model to generate automated suggestions on email inquiries.
  • Used Neo4j for Regulatory Compliance and keeping customer-employee tracks.
  • Maintained database performance and performed capacity planning for Cassandra.
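
The Solr/Lucene-style retrieval mentioned above rests on an inverted index; a toy sketch with invented, de-identified record snippets (real engines add ranking, stemming, and analyzers on top):

```python
# Illustrative sketch: an inverted index with AND-semantics search,
# the core structure behind Lucene-style retrieval. Data is invented.
from collections import defaultdict

def build_index(docs):
    """Map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = index.get(terms[0], set()).copy()
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical de-identified record snippets
docs = {
    1: "type 2 diabetes with chronic kidney disease",
    2: "hypertension and chronic heart failure",
    3: "diabetes without complications",
}
index = build_index(docs)
```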

JUNIOR DATA SCIENTIST

Confidential, San Jose, CA

Responsibilities:

  • Installation, configuration, integration, and management of HPC systems, clusters, operating systems, peripherals, and system interface.
  • Solved BAC (Binding Affinity Calculation) problem with enormous computing power on HPC nodes.
  • Collaborating closely with the architecture, research, libraries, tools, and system software teams to implement next-generation ML models.
  • Created a generalized data representation architecture that can be applied on different raw event company data.
  • Applying Classification Algorithms to segregate patients’ records based on summary of diagnosis.
  • Built predictive models, worked with clinical experts (e.g., doctors) to translate their subject matter expertise into models, extracted and manipulated data from multiple large data sources, and presented the work to technical and non-technical stakeholders, using the Jetson TX2 GPU.
  • Used the Anaconda Navigator IDE (Integrated Development Environment) to build models.
  • Used Shiny, ggplot2, and glue to visualize data and perform data wrangling.
  • Building software engineering pipelines to integrate with third party services to handle fraudulent activities.
  • Implemented ResNet, a computer vision-based pre-trained model to identify 12 different pathologies from chest X-ray using multiple GPUs.
  • Development and implementation of tools for continuous testing and for efficient GPU job scheduling.
  • Bypassed the kernel in HPC bare-metal environments in the standard way, achieving the highest bandwidth and lowest latency, which is critical for many MPI applications.
  • All the calculations necessary to train the ML model are performed inside the K8s cluster.
  • Worked on Kubernetes (K8s) with Nvidia NGC containers using docker.
  • Enabled K8s based workloads to allocate the resources, train models.
  • Working with Linux threading architecture and OpenMP.
  • Proficient at building robust machine learning and deep learning models, including Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and LSTMs, using TensorFlow and Keras. Adept at analyzing large datasets using Apache Spark, PySpark, Spark ML, and Amazon Web Services (AWS).
  • Ensured the model had a low false positive rate; performed text classification and sentiment analysis on unstructured and semi-structured data.
  • Create and design reports that will use gathered metrics to infer and draw logical conclusions from past and future behavior.
  • Used MLlib, Spark’s machine learning library, to build and evaluate different models.
  • Performed data cleaning, feature scaling, and feature engineering using the Pandas and NumPy packages in Python.
  • Created data quality scripts using SQL and Hive to validate successful data loads and data quality. Created various types of data visualizations using Python and Tableau.
  • Tools and Techniques used: Python, R, Multi-Class Logistics Regression Classifier, Boosted Regression Tree, Random Forest, Association Rules, Support Vector Machine, Clustering Analysis, Collaborative Recommended System, Time-series Analysis, Tableau, JETSON TX2, Linux, Docker, Kubernetes.
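
Model evaluation in this section relied on cross-validation; a minimal sketch of generating K-fold splits by hand (index logic only; the actual work used Spark MLlib and similar libraries):

```python
# Illustrative sketch: generate (train, test) index splits for
# K-fold cross-validation, handling folds of uneven size.
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs covering all n samples once."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

folds = list(k_fold_indices(10, 5))
```

Each sample lands in exactly one test fold, so averaging a metric over the folds estimates performance on unseen data.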

RESEARCH ASSISTANT

Confidential, San Jose, CA

Responsibilities:

  • Designed Real time Pothole Detection technique using Python following ML models like CNN, RNN, MCNN in Linux environment.
  • Set up the LabelMe tool, converted video to images, and performed the ETL (Extract, Transform, Load) process using Informatica.
  • Implemented Real time Wildfire Detection using Deep Learning models and fine-tuned the models using test results and statistical analysis.
  • Built Data pipeline using Kafka API to stream the data and store collected text in MongoDB server.
  • Performed Data Preprocessing (Missing Value Treatment, Outlier Detection, Data Exclusion, Feature Engineering) on data obtained from Yellow Pages.
  • Created reports and dashboards using Tableau to explain and communicate data insights, crucial features, model scores, and the performance of a new recommendation system for Yelp restaurant data, using Python, TensorFlow, Seaborn, Naïve Bayes, Random Forest, and regression models.
  • Implemented “Fraud Judger: Real-World Data Oriented Fraud Detection on Digital Payment Platforms” using Graph Adversarial Network and performed extensive graph mining on Neo4j.
  • Designed and implemented cross-validation and continuous statistical tests on ML model.
  • Improved the pre-existing loss forecasting model for the Capital Flex Loan product using ML modeling techniques (Linear Regression, Logistic Regression, Count Data Regression, K-Nearest Neighbors, K-Means Clustering).
  • Inferred customer ratings to build a proper model and uncovered latent features from customer datasets.
  • Improved the internal rate of return (IRR) forecasting model for clothing, electronic gadgets, and furniture using Naïve Bayes, Bayesian Data Analysis, Support Vector Machines, Random Forests, dimensionality reduction (PCA/stepAIC), Regularization, and Cross Validation.
  • Used matrix factorization approaches with the Alternating Least Squares algorithm and conducted hyperparameter tuning to return the best recommendations possible.
  • Assessed the model’s performance by cross validation.
  • Tools and Techniques used: Linux, Java, Angular JS, Kubernetes, CNN, RNN, MCNN, R, Python, MATLAB, Naïve Bayes, Random Forest, Regression Model, Tableau, NLTK, ggplot2.
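
The outlier-detection step in the preprocessing above can be sketched with Tukey's IQR fences; the readings below are invented example values:

```python
# Illustrative sketch: flag outliers outside [Q1 - k*IQR, Q3 + k*IQR]
# (Tukey's fences). The readings are made-up example data.
import statistics

def iqr_outliers(values, k=1.5):
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

readings = [10, 12, 11, 13, 12, 95, 11, 10]
outliers = iqr_outliers(readings)
```

Flagged points are then inspected before deciding between exclusion and correction, as in the data-exclusion step above.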

ASSOCIATE SOFTWARE ENGINEER

Confidential

Responsibilities:

  • Developed a web-based application for a client in the banking domain using HTML, CSS, JavaScript, and REST APIs.
  • Accomplished in Python, C++, SQL, and web technologies.
  • Worked on Graph database, Neo4j to build Recommendation systems.
  • Produced monthly database reports, including user reports and systems information, to spot problems and ensure that all databases and support systems were working at peak levels on MySQL Server.
  • Developed software solutions by studying user needs, existing system flows, data usage, and work processes, along with creating custom dashboards.
  • Worked on Python OpenStack APIs and used NumPy for numerical analysis.
  • Implemented the Rally OpenStack benchmarking tool across the entire cloud environment.
  • Reviewed Python files in the OpenStack environment and made necessary changes as needed.
  • Worked as a SQL developer, establishing strong relationships with end users and technical resources throughout established projects.
  • Worked closely with the development team to provide support for database objects (stored procedures, triggers, views) and performed code reviews.
  • Worked closely with application developers to ensure proper design and implementation of database systems.
  • Performed database administration, including performance monitoring, access management, and storage management.
  • Partnered with development teams and vendors to provide reporting on in-flight software development projects.
  • Created, developed, and maintained SQL processes for channel performance data warehousing.
  • Worked closely with the development manager and onsite team to deliver solutions.

DATA ANALYST

Confidential

Responsibilities:

  • Created reports and dashboards to explain and communicatedatainsights, significant features, models scores and performance of new recommendation system to both technical and business teams.
  • Implemented Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Bidirectional LSTM, Gated Recurrent Unit (GRU), Autoencoder, and Recurrent Neural Network (RNN) classifiers to categorize tweets, along with emojis, into positive and negative comments.
  • Built a spam detection algorithm to remove bots and fake accounts.
  • Integrated a real-time Twitter review classifier web application using a Keras model served with Flask; the Bidirectional Long Short-Term Memory (LSTM) model yielded 99.58% accuracy after 5 epochs with a batch size of 128.
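
The recommendation-system work described on this resume relies on matrix factorization; here is a rank-1 alternating least squares sketch on a tiny, fully observed, invented ratings matrix (real ALS uses higher rank, regularization, and handles missing entries):

```python
# Illustrative sketch: rank-1 ALS factorizing R ~ u * v^T by alternately
# solving each least-squares step in closed form. Ratings are invented.
def als_rank1(R, iters=20):
    n_rows, n_cols = len(R), len(R[0])
    u = [1.0] * n_rows  # user factors
    v = [1.0] * n_cols  # item factors
    for _ in range(iters):
        for i in range(n_rows):  # solve for u with v fixed
            u[i] = sum(R[i][j] * v[j] for j in range(n_cols)) / sum(x * x for x in v)
        for j in range(n_cols):  # solve for v with u fixed
            v[j] = sum(R[i][j] * u[i] for i in range(n_rows)) / sum(x * x for x in u)
    return u, v

# Hypothetical 3x3 ratings matrix that is exactly rank one
R = [[2, 4, 6], [1, 2, 3], [3, 6, 9]]
u, v = als_rank1(R)
approx = [[u[i] * v[j] for j in range(3)] for i in range(3)]
```

With a rank-one input the alternation recovers the factorization exactly; on real sparse ratings the same alternation minimizes reconstruction error over observed entries only.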
