
Text Analytics And Software Analyst Resume


SUMMARY

  • Self-driven software developer with a public trust clearance and 2+ years of experience developing and productionizing J2EE and Hadoop-/Spark-based cloud applications. Recipient of the 2017 COBRA Hero Award for proactive R&D and prototyping of application features, resulting in 20% overall savings in time and budget.
  • 2+ years developing/engineering big data/cloud/Hadoop (Hortonworks) applications: HDP 2.5-2.7, YARN, Spark 1.6.3-2.2.x, ZooKeeper, Hive
  • 2+ years real-time/streaming applications & batch processing: NiFi 0.6-3.0, Kafka
  • 2+ years document search/database: Apache Solr (Lucene) 5.5-7.1
  • 2+ years NLP-based data science: Stanford NLP, TF-IDF
  • 2+ years machine learning algorithms & workflows: Support Vector Machine (SVM), Random Forest, Logistic Regression, Artificial Neural Networks (Deep Learning)
  • 2+ years on SQL and NoSQL databases: HBase, PostgreSQL
  • 2+ years on Windows 8 & 10, and 2+ years on Red Hat Enterprise Linux 6.x
  • 2+ years JVM languages & frameworks: Java 8, Scala 2.10-2.12, Spring
  • 2+ years developing RESTful web services: Spring REST, Tomcat 8.5
  • 2+ years Agile development with the Atlassian suite
  • 2+ years Git source control
  • < 1 year developing AWS applications

PROFESSIONAL EXPERIENCE

Text Analytics and Software Analyst

Confidential

Responsibilities:

  • Productionized Support Vector Machine (SVM), Random Forest, Logistic Regression, Artificial Neural Network (deep learning), and Genetic Algorithm models for document classification on Spark 1.6.3-2.2.x, using both Java 8 lambda syntax and Scala 2.10-2.12
  • Designed and constructed a document ingestion and cleaning system using Stanford NLP techniques, Apache NiFi, and Apache Solr (Lucene) 5.x-7.1
  • Created Spring-based REST APIs exposing Apache Spark’s machine learning libraries (ML and MLlib) to end users while hiding the workflow details of Apache Kafka, NiFi, and YARN
  • Architected system-wide features for auditing document changes and user actions via ETL pipelines through NiFi and Kafka into HBase (NoSQL) and PostgreSQL databases
  • Managed and distributed back-end service and machine learning tasks across a sub-team of four, playing to members’ strengths, shoring up weaknesses, and guiding them toward their career goals through Agile methodologies and Atlassian (JIRA) tools
  • Developed back-end systems for Technology Assisted Review (TAR) and Continuous Active Learning (CAL) in predictive analytics, resulting in a 75% increase in document-analysis performance
  • Boosted binary-classifier accuracy by over 25% by analyzing and tuning machine learning classifiers in Apache Zeppelin notebooks with Python libraries
  • Engineered ETL and analytics workflows for email-threading within Neo4j graph databases
  • Administered multiple Linux- and Windows-based work environments on Red Hat Enterprise Linux 6 (RHEL), with OS virtualization via Hyper-V Manager
  • Researched and developed preliminary pipelines for launching Apache Mahout computation via Spark execution on distributed NVIDIA CUDA GPUs through YARN
  • Maintained and managed Git branch merges for machine learning and REST services
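The TF-IDF weighting used in the document-classification work above can be sketched as follows. This is a minimal standalone illustration, not the production Spark code; the class name and toy corpus are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal TF-IDF sketch: term frequency within a document times
// inverse document frequency across a toy corpus.
public class TfIdf {

    // tf(t, d) = count of t in d / total terms in d
    static Map<String, Double> termFreq(List<String> doc) {
        Map<String, Double> tf = new HashMap<>();
        for (String t : doc) tf.merge(t, 1.0, Double::sum);
        tf.replaceAll((t, count) -> count / doc.size());
        return tf;
    }

    // idf(t) = ln(N / number of documents containing t)
    static double idf(String term, List<List<String>> corpus) {
        long df = corpus.stream().filter(d -> d.contains(term)).count();
        return Math.log((double) corpus.size() / df);
    }

    public static void main(String[] args) {
        List<List<String>> corpus = List.of(
            List.of("spark", "machine", "learning"),
            List.of("spark", "streaming"),
            List.of("document", "classification"));
        Map<String, Double> tf = termFreq(corpus.get(0));
        // "learning" appears in 1 of 3 docs: tf = 1/3, idf = ln(3)
        double score = tf.get("learning") * idf("learning", corpus);
        System.out.printf("tf-idf(learning) = %.4f%n", score);
    }
}
```

Terms common to every document (like "spark" in two of three documents here) score low, while distinctive terms score high, which is what makes TF-IDF vectors useful features for the classifiers listed above.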

Back-end Developer

Confidential

Responsibilities:

  • Engineered ETL pipelines for digesting medical dictionaries and patient documentation for analytic work
  • Developed a workflow for parsing stored medical information and comparing medical terms to physician diagnoses
  • Designed a semantic search system that would provide a best-guess approach to similarities between medical terms using Random Index Vectoring (RIV)
  • Deployed and administered Apache NiFi and RapidMiner on AWS servers to build automated data ingestion workflows into MongoDB
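The Random Index Vectoring (RIV) approach behind the semantic search above can be sketched as follows. This is a simplified illustration under assumed parameters (dimensionality, sparsity, hash-seeded randomness), not the production system: each term gets a sparse random ternary index vector, a term's meaning vector is the sum of the index vectors of its co-occurring terms, and similarity is the cosine between meaning vectors.

```java
import java.util.List;
import java.util.Random;

// Sketch of Random Index Vectoring (RIV) semantic similarity.
public class RivDemo {
    static final int DIM = 512;    // assumed dimensionality
    static final int NON_ZERO = 8; // assumed number of +/-1 entries per index vector

    // Deterministic sparse index vector per term, seeded by the term's hash.
    static double[] indexVector(String term) {
        double[] v = new double[DIM];
        Random rnd = new Random(term.hashCode());
        for (int i = 0; i < NON_ZERO; i++)
            v[rnd.nextInt(DIM)] += rnd.nextBoolean() ? 1 : -1;
        return v;
    }

    // Meaning vector: sum of the index vectors of the context terms.
    static double[] meaningVector(List<String> context) {
        double[] m = new double[DIM];
        for (String t : context) {
            double[] iv = indexVector(t);
            for (int i = 0; i < DIM; i++) m[i] += iv[i];
        }
        return m;
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        // Terms that share context get high similarity; unrelated contexts do not.
        double[] flu = meaningVector(List.of("fever", "cough", "virus"));
        double[] influenza = meaningVector(List.of("fever", "cough", "infection"));
        double[] fracture = meaningVector(List.of("bone", "cast", "xray"));
        System.out.printf("flu ~ influenza: %.3f%n", cosine(flu, influenza));
        System.out.printf("flu ~ fracture:  %.3f%n", cosine(flu, fracture));
    }
}
```

Because the index vectors are sparse and high-dimensional, vectors for unrelated terms are near-orthogonal with high probability, so overlapping contexts dominate the cosine score; this gives the "best-guess" similarity between medical terms without training a full embedding model.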
