Senior Big Data Architect Resume

SUMMARY:

I have sound theoretical and practical experience in Big Data, high-performance computing, and distributed computing.
I am interested in projects spanning proof-of-concept, development, and production in Big Data and related areas.
6 years of experience in Big Data, the open-source Hadoop family, and Hortonworks Data Platform.
Led a team that took a healthcare solution from POC to production.
Architect of a Big Data solution using Kafka, Storm, and Hadoop for semi-real-time processing of EMR HL7 transactions, including validation, transformation, registration, and integration with legacy healthcare services.
1 year of Apache Spark development for graph optimization analytics. More than 10 years of experience in Java development. Extensive experience in C# and C++. Moderate experience in Scala and Clojure.
Cloud engineer: benchmarked Apache Accumulo on an EMC cluster with InfiniBand and on an AWS cluster of 1000 nodes.
Built a real-time market data pipeline using Apache Hadoop, integrated with a legacy Teradata server.
Hadoop MapReduce-based processing of mobile and sensor data.
20+ years as a Computer Science professor at various universities in the US and at IIT, India.

EMPLOYMENT:

Senior Big Data Architect

Confidential

Responsibilities:

Lead architect for an Apache Kafka and Storm based healthcare data ingestion and processing solution, including integration with legacy applications such as Mirth, ICW MPI, and ICW PXS.

Associate (Cloud Computing)

Confidential, Seattle

Responsibilities:

Apache Storm based streaming analytics. Benchmarked Apache Accumulo on 1000-node EMC/AWB and Amazon EC2 clusters.
Planned the entire benchmark and developed the configuration and management scripts used to set up the cluster.
Developed reference architectures for OFR, HHS, and other government clients using Hadoop, HBase, Accumulo, Pig, Hive, and Sqoop.

Hadoop Architect

Confidential, New York City

Responsibilities:

Led the design and development of an ETL pipeline using Hadoop, Pig, and Sqoop as the project architect. Developed the integration of Teradata and MySQL databases with HDFS using Hadoop MapReduce.
Developed Hadoop MapReduce code to parallelize the processing of continuous moving data objects (CMDO), using sensor and GPS data as the spatiotemporal input.

High Performance Computing (HPC) Developer

Confidential, Bellevue, Washington

Responsibilities:

Optimized and performance-tuned the MPI version of an epidemiological simulation code on a 1000-node in-house Windows cluster using C++/MPI2.

Sr. Solution Architect

Confidential, Redmond, WA

Responsibilities:

Performance and reliability: incorporated features to improve both.
Investigated OS resource-related issues.
Reverse-engineered Netcraft server uptime analysis (in Perl).

Senior Software Engineer

Confidential

Responsibilities:

Responsible for geographic mirroring for HACMP clusters of RS/6000 systems running AIX 4.1 and AIX 4.2.
Specified features and maintained Confidential for AIX 4.1 and 4.2.