
Sr. Big Data Architect DevOps/Cloud Resume

New York City, New York


Big Data Ecosystem/Programming Languages: Hadoop, HBase, HDFS, Flume, Sqoop, Oozie, YARN 2.0, Mesos, Zookeeper, Storm, Kafka, Spark 1.5 (SQL/Core/GraphX/MLlib), Neo4j, MongoDB, Cassandra 2.0, Riak, Aerospike, Talend Open Studio 5.6, Pig, Hive, lambda architecture - batch versus real-time, RDBMS ETL, R, Java 7/8/Eclipse - Helios/Juno, Python 2.6/3.3; Avro/Parquet/JSON data compression, Kerberos 5, Java cryptography (AES-256 bit), Java and RESTful API; Elastic Beanstalk with Docker; Chaos Monkey

Cloud Technologies (AWS): EC2/EC2 Container Service (Docker), Elastic Beanstalk, Lambda (event-driven architecture); Storage - S3, Elastic File System, Direct Connect, Route 53, CloudWatch (autoscaling), CloudFormation, CloudTrail, Config, OpsWorks, Identity and Access Management, Inspector, EMR, Kinesis, Machine Learning algorithms, API Gateway, AppStream, enterprise AWS cloud design patterns, AWS IoT

Dev Ops: Splunk, Nagios, Ganglia, JVM tuning, Memcached, Tachyon, Project Tungsten, scripting - bash, awk, sed, shells, Linux kernel/IO device modifications, Puppet/Chef/Shellshock, Ansible Tower, Linux - Red Hat Enterprise Linux (RHEL) 6, CentOS, Ubuntu 12.3, Fedora 15/16/17; RANCID, Cacti (Graph), lldpd, iPerf, MultiHost SSHWrapper, Jenkins, Hudson CI, Bluepill, Capistrano, Bcfg2, Supervisor, Graylog, runit, Squid, Snort, system, netstat, iostat, vmstat, ltrace, strace, ftrace, perf, tcpdump, sar, man, Takapi, GitHub, ssh, Cygwin, WinDiff, PuTTY, Maven, Ant, DTraceToolkit (Netflix)


Sr. Big Data Architect DevOps/Cloud

Confidential, New York City, New York


  • Created a variation of the lambda architecture with a near real-time layer using Spark SQL; Spark 1.4 cluster of 25 nodes running with 200 GB RAM/24 TB, holding about 1 PB of market data spanning 2000+ stocks (market ticks, number of shares traded, stock price) over a 10-year period; Apache open source version with the Mesos job scheduler; designed, developed, and tested Spark SQL clients in Scala, PySpark, and Java; selected best of breed in terms of time-to-deliver; created Spark contexts and DataFrames for the Cassandra backend and HDFS clusters; designed multi-cluster JVM tuning techniques with JProbe and Nagios/Ganglia for node and cluster tuning; tested Azul Technology Zing versus the native JVM concurrent mark and sweep algorithms; collaborated with and advised data scientists on optimum in-memory algorithms using Spark MLlib cluster/interval analysis, pattern recognition, and normal versus binomial distribution analysis; probability density and confidence experiments of DJ 30 versus S&P 500; custom experiments with S&P 500 indices and short-term S&P 500 futures; designed custom Spark applications with accumulators and broadcast variables to reduce network “chatter” by 4-5%
  • Benchmarked the Spark cluster in the dev environment with the Google PageRank algorithm; set up a benchmark based on the Daytona sort as reported by the University of California, Berkeley, using 1 and 5 TB; algorithmic comparisons of GraphX versus Neo4j of company ownership of Fortune 500 board of directors - business relationship connectivity analysis; tested the Zeppelin (Spark UI) and Tachyon (off-JVM memory management) options of Spark; utilized Spark Scala/Java API/GitHub; custom design and verification of Spark machine learning algorithms - feature extraction, pipelining, regression analysis, dimensionality reduction (PCA and SVD), k-means clustering
  • Comprehensive design, discovery, and analysis of the S&P Capital IQ software, infrastructure, analytics, and hardware in conjunction with the internal architecture review board - concerns of duplicate service calls, improvement and enhancement of existing SLAs to determine and document inaccurate stock quotes, and improvements in real-time calculations from the legacy Solaris 9 Unix servers (200+); established a comprehensive migration plan to a Red Hat Linux (100+) server infrastructure, incorporating a complete software stack redesign; collaborated with the EA review board to establish an IQSF (Intelligence Quotient Service Framework) covering all mutual fund, bond, and equity instruments, including corporate, muni, and government fixed income instruments, via a SOA REST API
  • Weekly meetings with the SOA governance board utilizing the WebSphere 7.0 SOA repository (WSRR); detailed service call documentation for input/output message passing, sample service call usage, and an error code dictionary (systemic and application based; 4000 different financial quote services with an integrated algorithmic dictionary) based upon the landmark four-volume treatise Encyclopedia of Quantitative Finance; established, in collaboration with the EAB, a comprehensive data dictionary of financial calculation artifacts based upon puts, calls, spreads, and European, Asian, and American style options, cross-correlated with the type of risk algorithm used (binomial, Black-Scholes); applied enterprise architecture “best of breed” methodologies of discrete modularization, separation of business versus system logic, and detailed verification and documentation of 800 existing application modules by operating system, programming language, frequency of operations runs, relational databases, and feeds into data warehouses; established near-term milestones and an accountability matrix of market data applications; collaborated with the security architect on state-of-the-art development of a custom AES 256 cipher key for corporate-wide Confidential of securing customer services for market quotes; comprehensive review, modification, and enhancement of over 500 SOAP service calls to REST API service calls; established and created an online SOA service call directory for bid/ask/rate spreads for commodities - gold, silver, platinum, palladium futures; assisted a peer architect in identifying use cases for Riak and MongoDB - annual reports, 1.2 M pages in Adobe text, searchable by financial keyword - asset, liability, receivable, payable, shares of stock
  • Successful integration of a Cassandra 2.0 distributed logger; very high volume - supports the S&P 500/Dow Jones; configured cassandra.yaml to support 200 virtual nodes with the default MD5 hashing algorithm; installed the Cassandra ring with automated page scripts, CQL 3 with Python 2.6+, plus clustering (200-page installation and maintenance guide for the offshore team); created various POCs for Windows C# and Java clients for creating and altering keyspaces and column families; utilized nodetool and cql to rebalance SSTables, memtables, and commit logs; diagnosed Cassandra problems by setting Log4J debug mode for detailed tracing and analyzing Cassandra deferred reads and writes; designed and performed various benchmarks that utilized Linux system commands to analyze the Cassandra Java daemon - sar, iostat, vmstat - with Python 2.6 and Perl 5 scripts; designed and ported CSV batch files between Cassandra keyspaces and MySQL with the Hadoop module (MapReduce facility); set up a multi-data-center Cassandra ring topology for fault tolerance between South Brunswick and 55 Water Street with Gigabit Ethernet via Ciena Network Service Ethernet 3190 optical switches; adjusted replication factor(s) for rack affinity topology; worked through numerous issues involving the JVM, JDK 1.7, and Cassandra operational parameters; the Cassandra production ring of 60 nodes can absorb 3000 writes per second from 20+ aggregate market data feeds - domestic and international (Reuters, Nikkei, Paris Bourse, German DAX 100, Hang Seng Indices); recommended and delivered various technical strategies for file system performance with key and row caching; adjusted write consistencies to gain optimal low-latency write operations; utilized nodetool to analyze Cassandra performance and adjusted the cassandra.yaml file for optimal performance and load balancing; responsible for technical review/factoring of schema design, API client deployment, administration (SSH keys), and integration between Cassandra and Hadoop (Cloudera Brisk) for the POC
  • Customized Map-Reduce jobs consisting of multiple HBase tables using InputFormat Java classes; optimized M-R jobs by using partitioners for one-to-many joins, saving execution time; designed and tested the reliability of M-R jobs using unit testing in the HBase/HDFS dev/qa platforms - unit testing of Mappers and Reducers and integration testing of Mappers, Partitioners, and Reducers; designed reporting metrics with job counters across the distributed HDFS logs; instituted best practices for defensive programming
  • Set up Oozie automated job tasks/streams for ETL imports from Oracle batch files into HDFS data artifacts for market data (bonds, equities); 20 GB nightly Oozie job, followed by M-R jobs using the Oozie coordinator, bundle, and EL (expression language) for parameters - stock symbols, S&P 500; designed custom Oozie job control options with the Oozie Java API
  • Managed, configured, tuned, and continuously deployed 80 Hadoop nodes on Red Hat Enterprise Linux 5; configured via the AWS console 2 medium-scale AMI instances for the Name Nodes and 78 large-scale Data Nodes with 8 Intel i5 cores, 3.5 TB of disk, and 350 MB for the JVM per Data Node; automated deployment and Linux system configuration via Chef; utilized 25 different DevOps tools to log, debug, discern, and diagnose performance problems at the database level, Linux daemon level, and networking level; set up real-time alerts with custom scripting via awk/fgrep/grep for kernel thread utilization; JVM tuning and garbage collection of short- versus long-lived Java objects on different generation heap spaces with due diligence on “stop the world” gc() algorithms and “mark and sweep”; Chef automated deployment on the qa Hadoop cluster of 80 nodes (mirror of the prod Hadoop cluster); deployment and configuration of 20 Hadoop nodes on AWS AMI Linux instances of medium size with 40 TB of market data with 3X replication factor; installed, configured, and bootstrapped the Nagios plug-ins (/usr/local/nagios) - SNMP, CPU, memory, disk, Check_MK, Nagios sensors; downloaded the Zookeeper tarball and configured a Zookeeper ensemble of 3 nodes in standalone and multi-node cluster modes; established a Java-based shell; reconfigured Zookeeper znodes - ephemeral, sequential, and persistent nodes; implemented custom logs for ZAB (Zookeeper Atomic Broadcast); implemented a Zookeeper Watcher interface (Java API)
  • Installed and configured Ganglia - gmond, gmetad, gweb; set up multicast/UDP topologies and designed RRD files for high IO demand; set up the Web interface for grid/cluster/physical/host and node views; set up Ganglia advanced metric monitoring and debugging
  • Spearheaded the POCs for the AWS ecosystem via the AWS Management Console: S3 buckets; security - multi-factor authentication, access keys, X.509 certificates, Eclipse IDE plug-in; ephemeral/persistent storage options - Linux and Windows AMI instances, private subnets; designed and deployed Amazon CloudWatch, IAM, Elastic Beanstalk, AWS Simple Notification; architected various cloud computing and service design patterns – snapshot, Vagrant, high availability – multi-server, floating IP; processing static data – private data delivery, direct storage hosting; patterns for uploading data – write proxy pattern, state sharing, cache proxy pattern; cloud patterns for operation and maintenance - bootstrap, cloud dependency, stack deployment, weighted transition, hybrid pattern
  • Analyzed tradeoffs for high availability of zones for fault tolerance versus high availability; set up CloudWatch alarms for recovery of a failed Linux server, and auto-scaling for guaranteed SLAs for Linux servers for real-time streaming analytics via Kinesis; analyzed RTO/RPO availabilities for virtual servers for time-lapse of recovery scenarios; established a common network host naming convention with Route 53 with Class C address/VPC subnets; obtained Chaos Monkey (Netflix) from GitHub for arbitrary host/network high-latency performance problem injections into a custom dev Hadoop/HDFS cluster (10 nodes), with subsequent enterprise engineering efforts to monitor HA via Ganglia; collaborated with senior Web developers on custom Web applications – AWS Elastic Beanstalk with multi-container Docker financial applications – custom high-performing stocks in high technology and health care; deployed Docker applications via custom Chef scripts onto dev/qa Linux AMI instances on the AWS cloud

Corporate Security Expert

Confidential, Harrisonburg, Virginia


  • Comprehensive review and analysis with a complete top-down assessment of corporate records retention, storage, and destruction policies; complete review of all infrastructure artifacts – databases, middleware, firewalls, DMZ, network routers, subnets, honeypots, SSO/LDAP configurations, hardening and rotation policies for corporate and external users of rosettastone.com, Web/Apache server/Ubuntu 11/12 kernel hardening/patch reinforcements; top-down review, design, and rollout of state-of-the-art encryption strategies for 3 million customer Visa and Mastercard numbers – two-key Triple DES, Skipjack, Confidential advanced encryption standards and recommendations; review of all corporate email systems for virus and spam control; revised strategies and techniques for external and internal pen testing; comprehensive review and determination of “need to know” corporate employee access to various company applications for accounts payable, corporate treasury, IT, HR/payroll, building maintenance and physical plant access, corporate IP renewal and sunsetting, marketing and sales, global operations in 50 different countries for currency exchange, foreign payments and auditing, and field activity reporting; instituted quarterly ethical hacking procedures, reporting, analysis, and follow-up IT engineering endeavors, including establishing a corporate security lab to test the latest pen tests for Windows, Linux, MacOS, and Android/Apple smartphones and tablets; instituted a corporate-wide systems responsibility and charter for hardening 4000 company laptops with common encryption/decryption procedures to prevent internal software program theft; instituted and rolled out Kerberos 5 for internal security/ticketing for all J2EE applications running on JBoss 6/6.1 cluster servers for QA and production environments

Enterprise Design Architect

Confidential, Minneapolis, MN


  • Launched and promulgated a custom business rules engine framework; consolidated and interviewed key SMEs on pharmaceutical rules and medical conditions based on the National Drug database and PDRs for traditional, HMO, and PPO membership based on medical history, lab tests, co-pay criteria, and Medicare Part C and D; over 5000 rules created; researched the IEEE/ACM IT repository for state-of-the-art algorithms based on the modified Rete II and Rete III algorithms; initiated and set up an RFP to IBM, Oracle, and open source Drools to determine the “best of breed” technology for the rules engine, based upon dashboard capability, increased parallelism of rule/decision-making capabilities, and ease of use of transformations from business use cases; refined syntactical and business rule exception handling and reporting; set up point selection criteria for which rules engine integrates seamlessly with the existing ESB/messaging backbone; created and designed the business rules “request and response” asynchronous message flows; created and established key performance indicators and benchmarking criteria for handling cascading-style decision-making graphs and exposing duplicate and redundant logic; architected the logical extensibility for decision-making logic in a meta-language business rule repository for SMEs and business analysts to research and modify from a team-level perspective; spearheaded the initial and subsequent POCs for Drools as well as IBM ILOG JRules; compared and contrasted the Java APIs for creating the logical business request and response payload and error handling
  • Comprehensive review of all retail insurance process artifacts, rules engines, message buses, business transformation models, and security enforcement of HIPAA/HL7 relating to scrubbing patient data; due diligence review of 5000+ insurance policies for health and sickness criteria; developed the Aetna Comprehensive Insurance Screening Framework (ACISF) based upon the precursor of the Affordable Health Care Act; integration of the Kerberos 5 authentication and adjudication policy audit server tracking 3 mil+ inquiries into PPO/HMO/Medicare customers; ACISF built according to the TOGAF 9 methodology; bi-weekly meetings with key executives and stakeholders from the Aetna Enterprise Architecture Review Board for reporting on software and infrastructure component resilience, security, fault tolerance, performance metrics, and SLAs (4-month effort with business constituencies), resulting in 250 pages of schematics with a 10 EAF steering committee; successful integration into the Tibco and WebSphere SOA orchestration server; extensive utilization of best practices from various enterprise integration design patterns for message proxy and a modified “spoke and wheel” topology for QA and production messaging frameworks across the corporate messaging bus; integrated 400+ REST API service calls for insurance policy lookups and claim processing; special APIs created for high-speed lookups of insurance actuary tables (Gigaspaces XA in-memory cache)

Credit Default Swaps Trading Architect

Confidential, Pennington, NJ


  • FIX 4.5, credit default swaps, fixed income trading, Dodd-Frank compliance; Monte Carlo risk analysis/payoff matrix scenarios; HFT custom algorithm analysis and design; custom Java/C++ software for multiple-precision routines (up to several hundred places) - interest calculations, factorial/Fibonacci series
  • Spring MVC/Acegi, WebLogic 10, custom Java/C++ v 1/Boost/STL software; Red Hat Linux/Solaris 10; Hudson CI/Jenkins/Maven; DB2 UDB 8.0; custom design pattern(s); PVCS; JProbe; Python/Jython scripting; GemFire data caching; SAML 2.0/SSO/Java private/public key cryptography/X.509 digital certificate/passkey/passphrase generation/management/“honeypot” DMZs
