Data Scientist Resume, Alpharetta, GA

SUMMARY

Data Science Professional with 5+ years of industry experience (3.5 years at a Silicon Valley-based MNC and 1.5 years in the current role). Data science capability and domain expertise in Machine Learning, Big Data, Research & Development, End-to-End Data Architecture, and Predictive Analytics.

TECHNICAL SKILLS

Programming Languages / Data Wrangling: Python, R, Java.

Artificial Intelligence: Cortical Learning Algorithm.

Machine Learning: Random Forests, Clustering (K-Means / Hierarchical), Regression Analysis, Text Mining & Unstructured-Data Analytics, Spark MLlib.

Cloud and Big Data: Apache Spark, Spark Streaming, Spark SQL, Cassandra, PySpark, Spark (Scala).

Data Visualization: Tableau, Adobe Illustrator, RAW.

PROFESSIONAL EXPERIENCE

Confidential, Alpharetta, GA

Data Scientist

Responsibilities:

Conceptualized and designed a real-life-equivalent business use case (healthcare data analytics): based on real-time sensor (vitals) readings (human heart rate, activity, temperature, orientation, acceleration, etc.), the person's current physical activity is predicted with the Apache Spark machine learning library (Spark MLlib, using Decision Tree / Random Forest models), and the corresponding anomalous heart-rate behavior is detected with the Numenta Cortical Learning Algorithm. Spark code was developed in both Scala and PySpark.
Major contributor to the design, development and implementation of the backend architecture of a scalable real-time streaming Big Data / analytics platform. Data is aggregated from IoT devices as event streams and ingested via Kafka message brokers (Data Bus Layer). The data is then processed (Compute Layer), where Spark Streaming jobs calculate anomaly detection metrics and activity prediction features. The results are then stored in Cassandra, from where the services read them (Service Layer), serialize the data to JSON, and return it through a REST endpoint that the UI code consumes. In this framework, cluster management is handled by Mesos and the base environment is Python. The entire framework is hosted on a combination of AWS and Heroku.
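
The layered flow described above (event stream, compute step, store, JSON service response) can be illustrated with a minimal single-process sketch. It is not the platform's code. Plain Python lists stand in for Kafka and Cassandra, and a rolling z-score stands in for the Numenta Cortical Learning Algorithm, purely for illustration. The field names (device_id, ts, heart_rate), the window size and the threshold are all assumptions.

```python
import json
from collections import deque
from statistics import mean, pstdev

WINDOW = 5       # hypothetical rolling-window size
THRESHOLD = 3.0  # hypothetical z-score cutoff for flagging a reading


def anomaly_scores(readings, window=WINDOW):
    """Compute layer stand-in: score each reading against the preceding window.

    A rolling z-score is used here only as a simple substitute for the
    anomaly detector; it is NOT the Cortical Learning Algorithm.
    """
    history = deque(maxlen=window)
    for event in readings:
        hr = event["heart_rate"]
        score = 0.0  # no score until the window is full
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0:
                score = abs(hr - mu) / sigma
        yield {**event, "score": round(score, 2), "anomalous": score > THRESHOLD}
        history.append(hr)


def run_pipeline(event_stream):
    """Data bus -> compute -> store -> service, all in memory."""
    store = []  # stand-in for a Cassandra table
    for row in anomaly_scores(event_stream):
        store.append(row)
    return json.dumps(store)  # service layer: serialize the stored rows to JSON


# Hypothetical event stream from one device; the 140 bpm reading is the outlier.
events = [
    {"device_id": "d1", "ts": t, "heart_rate": hr}
    for t, hr in enumerate([70, 72, 71, 69, 70, 71, 140, 72])
]
response = json.loads(run_pipeline(events))
flagged = [row["ts"] for row in response if row["anomalous"]]
print(flagged)
```

In a deployment like the one described, each stand-in would be replaced by the corresponding component (Kafka consumer, Spark Streaming job, Cassandra table, REST service); the sketch only shows how data moves between the layers.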

Confidential

Software Engineer

Responsibilities:

Mentored, managed and led a group of 8 in-house engineers (San Jose, CA / Bangalore, India) to develop a dashboard based on K-Means and Hierarchical Clustering, implemented in Java, to track and identify potential Network Profile Key Performance Indicators and User Profile Key Performance Indicators.
Designed and programmed a Machine Learning platform based on Random Forests in R that predicted brand impact and helped accelerate sponsor investment by quantifying and communicating the value and effectiveness of sponsor spending.
Coded a Deep Learning component (neural networks) using Python and libraries such as Theano to correlate image data and identify potential advertising opportunities.
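
As an illustration of the K-Means step behind the KPI dashboard mentioned above, here is a minimal sketch. The original work was implemented in Java. This version is written in Python only to keep the example short, and the KPI features (latency in ms, packet loss in percent) and sample values are hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20, seed=0):
    """Plain K-Means (Lloyd's algorithm) on a list of numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k distinct data points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical network-profile KPIs: (latency_ms, packet_loss_pct)
kpis = [
    (20.0, 0.1), (22.0, 0.2), (21.0, 0.1),
    (95.0, 2.5), (100.0, 3.0), (98.0, 2.8),
]
centroids, clusters = kmeans(kpis, k=2)
print(sorted(len(c) for c in clusters))
```

Features on very different scales (as here) would normally be standardized before clustering so that no single KPI dominates the distance.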