Big Data Architect/Consultant Resume
TN
SUMMARY
- 17+ years of professional and entrepreneurial experience with start-ups and medium and large organizations in professional services engagements, pre-sales, solutions architecture, data science, technical leadership, and product definition/management/strategy.
- Scale technology, engineering, and process.
- Designed, managed, tuned, and implemented enterprise-scale, highly transactional, complex, strategically important, business-critical databases and applications.
- Hands-on experience with common machine learning algorithms on real-time streaming data. Extract data, analyze information, and communicate insights to non-technical stakeholders.
- Hands-on programming experience in multiple technologies, frameworks, and methodologies across varied platforms and technology stacks.
- Advanced knowledge of and experience with all aspects of the Hadoop ecosystem.
- Designed multiple Big Data systems using Sqoop, Pig, Hive, Oozie, Kafka, Hue, Spark, Zeppelin, Atlas, Solr, LLAP, etc., and the messaging technologies NiFi and RabbitMQ.
- Articulate and adopt latest in technology. Refactor messy code/config and vastly improve key algorithms. Multi-paradigm Programming (Java, Python, SQL, R). API. Services. Concurrent Programming. Self-taught Continuous Learner. Design for High Performance & Scalability with 100Ks of concurrent usage. Lead and Resolve 'War Rooms' during major crises.
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS (6+); ZooKeeper (6+); Oozie (6+); Kafka (4); Spark (4); HBase (6); Pig (6+); Hive (6+); Spark Streaming; Apache Sentry; Impala; NiFi; Presto; Pentaho (8+).
Machine Learning Algorithms: Convolutional Neural Networks (CNN/ConvNet); Recurrent Neural Networks (RNN: LSTM, GRU); GAN (Generative Adversarial Network); SVM (Support Vector Machine); MLP (Multi-Layer Perceptron); k-NN; Linear Regression; Decision Tree; Clustering; AdaBoost; Reinforcement Learning; Babble Labble; Cfa.
Libraries and Frameworks: OpenCV; LibSVM; Keras; NLTK (Natural Language Toolkit); TensorFlow; scikit-learn; CUDA.
Deep Learning Networks: Inception; ResNet; SSD; YOLO.
Database: Oracle DB (15+); Oracle RAC (10+); MySQL (8); PostgreSQL (6+); Microsoft SQL Server (10+);
NoSQL: Cassandra (4); MongoDB (5);
Continuous Integration Systems: Jenkins (7); CruiseControl (5);
Config/Orchestration: ZooKeeper (6); Puppet (5); Salt; Ansible; Oozie; Pig;
Build Systems: Maven; Ant; Perl;
Source Control Systems: StarTeam; SVN; VSS; Git;
Database Tools (15+): Data Pump; SQL*Loader; RMAN; Data Guard; Grid Control (OEM); Oracle Streams; oradebug; Explain Plan; SharePlex; Hibernate; Swing;
Application Servers: Tomcat (8); JBoss (5); WebLogic (8); Oracle Fusion Middleware;
Reporting Tools (10+): Oracle Reports; Business Objects; Informatica; Pentaho 6.1; SQL Server Reporting.
OS: Red Hat Linux 7; Oracle Linux 7; Windows 2012/14;
CRM: Siebel 7.7 (3); Assignment Manager (3); Siebel EIM; Workflow Administration;
Monitoring Tools: Nagios (7); Ganglia (5); SiteScope (4); PagerDuty; Dynatrace;
PROFESSIONAL EXPERIENCE
Confidential, TN
Big Data Architect/Consultant
Responsibilities:
- Performed object detection, classification, and segmentation by implementing a new mathematical algorithm (patent in process).
- Developed a data pipeline to perform pre-processing and NLP on collected data for sentiment analysis.
- Developed a real-time data pipeline using Kafka, Spark Streaming, and HBase.
- Designed and built a data pipeline that streams data from client apps to a server, through a Kafka consumer, and into HDFS. Wrote Spark jobs that read the HDFS data using Spark SQL for stream and batch processing.
- Designed data queries against data in the HDFS environment (Hive, HBase).
- Used Apache Kafka with Storm connectors to consume live streaming data and create a data lake for building an RWI (Real World Intelligence) application, with HDFS also used as a consumer.
- Used Sqoop to import data from Oracle, PostgreSQL, and SQL Server databases into HDFS, and Flume to load sensor data and web logs into HDFS for malware analysis.
Confidential, Nashville, TN
Principal Big Data Architect
Environment: Oracle 11g/12c/RAC, ASM, Pentaho, Kettle, RMAN, SolarWinds, PL/SQL, MySQL 5.6, MS SQL Server 2012/2014, Shell Scripting, Windows 2012/2014, Java, AWS, Hibernate, Spring, REST, PostgreSQL 9.2/9.6, Elasticsearch, Dynatrace, OpenCV, PyMC, Hadoop, Kafka, Spark, HBase, NiFi, Mahout, Solr, Cassandra, MongoDB, In-memory, MapReduce, Pig Script, Hive queries, Nagios, Python, Perl, Hosted Graphite.
Responsibilities:
- Applied machine learning and deep learning methods for predictions.
- Demonstrated understanding of statistical inference, model comparison, and feature extraction.
- Managed the horizontal Architecture and Principal Engineers team. Built a Product Data Mart and implemented a consolidated analytic cloud on Hadoop. Provided performance, availability, and scalability leadership, along with data capacity planning and node forecasting.
- Installed and built a Hadoop cluster from scratch. Configured and tuned the Hadoop environment to ensure high throughput and availability.
- Implemented Multi-tenancy, Integrated Security, and Authentication with Kerberos and LDAP integration via PAM and ACL.
- Created and deployed applications on Amazon (AWS) managed instances in our VPC, where each deployable microservice has an application containing multiple environments, each deployed in the appropriate subnets for auto scaling.
- Designed and developed data ingestion, data processing, data export, and visualization frameworks.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Installed and configured HBase, including the HBase Master and HBase RegionServers.
- Benchmarked the Hadoop cluster using several benchmarking methods.
- Tuned the cluster by commissioning and decommissioning DataNodes.
- Deployed high availability on the Hadoop cluster using quorum JournalNodes.
- Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
- Data ingestion: imported and exported data between local/external file systems, RDBMS, and HDFS using Sqoop. Wrote Python scripts for slicing and dicing data and automating processes.
- Designed and built a data pipeline that streams data from client apps to a server, through a Kafka consumer, and into HDFS.
- Wrote Spark jobs that read HDFS data using Spark SQL for stream and batch processing.
- Used Apache Kafka with Storm connectors to consume live streaming data and create a data lake for building an RWI (Real World Intelligence) application, with HDFS also used as a consumer.
- Installed Apache SolrCloud on the cluster and configured it with ZooKeeper; indexed documents using the Hive-Solr storage handler to import different datasets including XML, CSV, and JSON.
- Built a Kafka consumer feeding Spark Streaming for business transformations of application log files.
- Designed and built a logging solution using Elasticsearch, Logstash, and Kibana.
- Designed, developed, maintained, and tuned a highly available data warehouse using Postgres, Pgpool, and Kettle (ETL) to execute delta refreshes of 90,000 view objects for day-to-day reporting.
- Loaded data from different servers to S3 buckets and set appropriate bucket permissions.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
Confidential
Performance Architect
Environment: Oracle 11g/10g, RAC, Oracle Apps 11i/12, SAP, SQL, PL/SQL, Red Hat Linux, TOAD 9.2.5, MySQL 5.1, Streams, Shell Scripting, AIX, ANT 1.6, CruiseControl, StarTeam, Java, JSP, Hibernate, Spring, Struts, JUnit, GWT, Instant up.
Responsibilities:
- Data strategy, modeling, code profiling, and optimization; worked directly with engineers, customers, and other stakeholders (PMs, Marketing, Pre-sales, etc.). Led many teams, including hands-on coding to improve installer performance from 18 hours to minutes (Java, Oracle).
- Owned overall performance and data/database architecture; developed code (Java, SQL, PL/SQL) for key customer integrations (GE, Fidelity, Travelers, etc.).
- Responsible for architecting and delivering new high-profile features for the flagship product.
- Designed and implemented a real-time reporting solution for small and medium-sized customers who cannot afford to run the Enterprise Edition of the database.
- Drove team innovation by prototyping new ideas.
- Evangelized the agile development process.
- Took features from early planning stages through user stories and led the team to successful implementation.
Confidential, San Francisco
Sr. Architect
Environment: Oracle 10g/9i, RAC, Java, SQL, PL/SQL, TOAD, WebLogic, Oracle Applications 11i, Sun Solaris 8/9/10, Shell Scripting, SQL*Loader.
Responsibilities:
- Designed, implemented, and delivered enterprise-level, full life-cycle Business Intelligence platforms and solutions for various banking and financial organizations, covering Commercial Cards, Mortgage, Commodities, and Derivative Trading data infrastructure, security, applications, architecture, design, and delivery.
- Led BI development using Business Objects XI R2; built KPIs and dashboards.
- Consolidated various reporting tools inherited from Mergers and Acquisitions.
- Led the implementation of enterprise wide taxonomy and data dictionary.
- Actively participated and contributed to the data governance and architecture council.
- Delivered Data Warehouse data models and designs for Master Data and Metadata Management.
- Delivered data visualization techniques and other intuitive approaches to convey business insights.
Confidential, Cupertino, CA
Development Architect/Team Lead
Environment: Siebel 7.5.3, Tools, Siebel EIM, Siebel Call Center, Oracle 10g/9i, SQL, PL/SQL, Java, ETL, Shell Scripting, Windows 2000, SQL*Loader.
Responsibilities:
- Led a 20+ member team sustaining Confidential's Siebel-based CRM solution with a 45k user base and global usage. Designed and developed real-time interfaces between Siebel and Confidential's legacy systems. Created customized Update Strategy, Transforms, and Lookups to massage data and prevent unqualified data from being loaded into Siebel EIM tables. Refactored batch interfaces to support high-volume interfaces.
- Built a Java-based application to support backend business data transformation for partners.
- Performed ETL of data from the legacy system to staging tables for Informatica and Crystal Reports.
Confidential
Developer
Environment: Oracle 9i/8i, SQL, PL/SQL, Forms (4.5), SQL*Plus, Windows 2000, Shell Scripts, UNIX, Reports (2.5), Sun Solaris 9.
Responsibilities:
- Developed backend PL/SQL code for a forms-based application.
- Developed and implemented utility solutions for the user community (data cleansing, parsing, and loading).
