Senior Software Engineer Resume
Sunnyvale
OBJECTIVE:
To become a versatile professional in the field of Computer Science (Software Engineering/Data Science) and to exhibit and enhance my problem-solving, decision-making, and analytical skills.
PROFESSIONAL SUMMARY:
- 5+ years of IT experience as a Developer and Data Scientist using Hadoop, Apache Spark, Kafka, scikit-learn, Pandas, and NumPy.
- Experience in feature engineering for machine learning models, building features from raw datasets and performing dimensionality reduction.
- Experience in building machine learning and natural language processing data pipelines.
- Excellent understanding of machine learning algorithms such as Naive Bayes, Decision Trees, Logistic Regression, Perceptron, Neural Networks, SVM, and K-Nearest Neighbors.
- Excellent understanding of classification and regression, and of supervised and unsupervised techniques such as k-means and hierarchical clustering.
- Excellent understanding of Hadoop architecture and its components: HDFS, NameNode, DataNode, JobTracker, TaskTracker, YARN, and MapReduce.
- Experience in writing Pig Latin scripts.
- Experience in working with Apache Sqoop to import and export data to and from HDFS and Hive.
- Hands on experience in setting up workflow using Apache Oozie workflow engine for managing and scheduling Hadoop jobs.
- Good experience in configuring and working with Flume to load data from multiple sources into HDFS.
- Proficient in programming with Resilient Distributed Datasets (RDDs).
- Experience in tuning and debugging Spark application running on both standalone and YARN cluster mode.
- Knowledge of Spark, YARN, and PySpark.
- Working experience building and supporting large-scale Hadoop environments, including planning, designing, installing, configuring, performance tuning, and monitoring.
- Domain knowledge and relevant experience in Banking, Retail, and HR.
- Conversant with the UNIX operating system.
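
The unsupervised techniques mentioned above can be illustrated with a minimal k-means sketch in pure Python (the data points and function name are hypothetical examples, not taken from any project described here):

```python
# Minimal k-means sketch: assign 2-D points to the nearest of k centroids,
# then recompute each centroid as the mean of its assigned points.

def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for x, y in points:
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            clusters[distances.index(min(distances))].append((x, y))
        # Update step: move each centroid to the mean of its cluster
        # (keep the old centroid if the cluster is empty).
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

points = [(1, 1), (1.5, 2), (8, 8), (9, 9)]
print(kmeans(points, centroids=[(0, 0), (10, 10)]))
# → [(1.25, 1.5), (8.5, 8.5)]
```

In practice a library implementation (e.g. scikit-learn's) with random restarts would be used; the sketch only shows the assign/update loop.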
TECHNICAL QUALIFICATIONS:
Programming Languages: C, C++, Java, Python, R, Unix Scripting, Scala
Databases: SQL Server, DB2, MySQL, Teradata
Big Data Technologies: Hadoop HDFS, MapReduce, YARN, Oozie, Apache Kafka, Flume, ZooKeeper, HCatalog, Pig, Hive, Mahout, Apache Spark, Sqoop
NoSQL Databases: HBase, Cassandra
Web Technologies: Spring MVC, AngularJS, React.js
WORK EXPERIENCE:
Senior Software Engineer
Confidential, Sunnyvale
Responsibilities:
- Developed a PySpark application over Hive that generates daily log reports.
- Developed Apache Flume jobs to stream data from Java Messaging Service and Kafka to HDFS.
- Developed a Kafka sampler that replicates and samples data between Kafka clusters.
- Developed machine learning models for personalization of customer segments.
- Developed custom feature engineering for personalization models.
- Designed, developed, and deployed the software components in Confidential Cloud.
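
The daily log-report work above ran as PySpark over Hive; the core aggregation can be sketched in pure Python (record format, field names, and values here are hypothetical):

```python
from collections import Counter
from datetime import datetime

# Illustrative daily log-report logic: count log records per (day, level).
# A PySpark version would express the same groupBy/count over a Hive table;
# this sketch shows the aggregation on an in-memory list of records.

def daily_report(records):
    counts = Counter()
    for timestamp, level in records:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        counts[(day, level)] += 1
    return dict(counts)

records = [
    ("2023-05-01T10:00:00", "ERROR"),
    ("2023-05-01T11:30:00", "INFO"),
    ("2023-05-01T12:00:00", "ERROR"),
    ("2023-05-02T09:15:00", "INFO"),
]
print(daily_report(records))
# → {('2023-05-01', 'ERROR'): 2, ('2023-05-01', 'INFO'): 1, ('2023-05-02', 'INFO'): 1}
```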
Software Engineer (Java/Big Data/ML/NLP)
Confidential, Los Angeles
Responsibilities:
- Designed and implemented RESTful web services, web components, and microservices.
- Performed data migration ETL jobs using Spark and exploratory data analysis using Pig and Impala.
- Used Spark to implement scalable machine learning topic-modeling algorithms on large datasets.
- Performed information extraction using Elasticsearch and the Stanford NLP Toolkit, and generated feature vectors for statistical and machine learning models.
- Predicted future black-box warnings from semi-structured and unstructured datasets by correlating signals from scientific literature and adverse events (Life Sciences).
- Detected harms from boxed warnings using a combination of semantic, rule-based, and natural language processing techniques such as POS tagging and NP chunking.
- Constructed a data pipeline that detects entities in scientific articles, and modified the entity extraction process (C++) to prepare training data for classifiers built with state-of-the-art NLP techniques such as NER.
- Improved the nomination system that proposes links between entities of interest such as companies, chemicals, and harms.
- Built a PDF reader that detects headers and body text and efficiently extracts information from both scanned and non-scanned PDFs.
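
The feature-vector generation mentioned above can be illustrated with a minimal bag-of-words sketch (the vocabulary and example documents are hypothetical, not from the project's actual data):

```python
# Illustrative bag-of-words feature vectors of the kind fed to statistical
# and machine learning models: each document becomes a count vector over a
# fixed vocabulary built from a corpus.

def build_vocab(docs):
    # Map each distinct whitespace-split token to a stable column index.
    vocab = sorted({token for doc in docs for token in doc.lower().split()})
    return {token: i for i, token in enumerate(vocab)}

def featurize(doc, vocab):
    # Count occurrences of in-vocabulary tokens; unknown tokens are dropped.
    vector = [0] * len(vocab)
    for token in doc.lower().split():
        if token in vocab:
            vector[vocab[token]] += 1
    return vector

docs = ["drug causes headache", "drug treats headache"]
vocab = build_vocab(docs)
print(featurize("drug causes severe headache", vocab))
# → [1, 1, 1, 0]  (columns: causes, drug, headache, treats)
```

A production pipeline would add tokenization, stop-word filtering, and weighting (e.g. TF-IDF); the sketch shows only the vectorization step.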
Data Science Intern
Confidential, Kansas City
Responsibilities:
- Worked on Confidential's behavioral and demographic datasets to provide insights for targeted advertising, and was involved in the design and implementation of an A/B testing framework.
- Performed data preparation, exploratory data analysis, and missing-value imputation using external data sources such as census and social security data.
- Built a recommender system based on user behavioral data.
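
A recommender over user behavioral data can be sketched with minimal item-based collaborative filtering (user names, item names, and ratings below are hypothetical):

```python
from math import sqrt

# Minimal item-based recommendation sketch: for a target user, score each
# unrated item by the cosine similarity of its rating column to the columns
# of items the user has already rated, and return the best-scoring item.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend(ratings, user):
    items = sorted({i for r in ratings.values() for i in r})
    users = sorted(ratings)
    # Column vector of each item's ratings across all users (0 if unrated).
    col = {i: [ratings[u].get(i, 0) for u in users] for i in items}
    rated = set(ratings[user])
    scores = {
        candidate: sum(cosine(col[candidate], col[i]) for i in rated)
        for candidate in items if candidate not in rated
    }
    return max(scores, key=scores.get)

ratings = {
    "u1": {"a": 5, "b": 4},
    "u2": {"a": 4, "c": 5},
    "u3": {"b": 1, "d": 5},
}
print(recommend(ratings, "u1"))
# → "c"  (c co-occurs with a, which u1 rated highly)
```

A real system would use implicit-feedback signals and a scalable factorization method rather than a dense in-memory matrix; the sketch shows only the similarity-scoring idea.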
Operations Research Intern - Java, Hadoop
Confidential, Dallas, TX
Responsibilities:
- Developed a 2-tier client-server tool in Java for creating, maintaining, and forecasting set counts that track and measure freight activity.
- Worked on a track-safety analytics project that predicts track geometry degradation from historical inspection logs, freight information, and other data.
Internship - PHP, HTML5, CSS3
Confidential, Kansas City
Responsibilities:
- Provided technical expertise in evaluating and creating web solutions that meet user requirements and business needs.
- Ensured the website remained consistent with organizational goals; implemented server-side scripting to develop a responsive web page as well as client-side user interface enhancements.
