- 20 years of software development experience specializing in Big Data, Data Warehousing, Java/J2EE, and Internet technologies, with a wide range of applications including eCommerce, Content Management, Portal Applications, Recommender Systems, and Data Analytics.
- Extensive knowledge of Big Data management platforms using technologies in the Hadoop ecosystem and the AWS Cloud environment.
- Experience migrating a Big Data platform from an on-premises environment to the AWS Cloud.
Big Data: Hadoop, YARN, HDFS, MapReduce, Hive, Spark, Impala, Avro
Cloud: AWS EMR, S3, EMRFS, DynamoDB, CloudWatch, CloudFormation, Redshift, Presto
Java/Web Platform: Java/J2EE, EJB, Servlets, JSP, Swing, JMS, Web Services, JAXB
Search: Apache Solr, Apache Lucene, Elasticsearch
NoSQL: Membase/Couchbase, MongoDB, MapDB
Application Servers: WebLogic, Oracle Portal Server, JBoss, Tomcat
Database: Oracle, SQL Server, MS Access, FoxPro
Build & Configuration: Maven, CVS, SVN, GitHub
Project Management: Rally, PVCS Tracker, JIRA
- Key contributor to the cloud migration effort. Involved in data migration activities including data preparation, data export, and validation. Fine-tuned various aspects of the migration process.
- Led the performance optimization team responsible for achieving the SLAs.
- Evaluated various AWS EC2 instance types (M4/C5/I3) to determine a cost-effective, optimal configuration with the right number of cores, memory, and network bandwidth to improve processing times.
- Provided Confidential on the optimal number of core nodes and task nodes to use in the transient cluster, resulting in cost savings.
- Benchmarked the existing system using various file formats (RCFile, ORC, Parquet), file systems (S3/HDFS), compression codecs (Snappy/Gzip/Bzip2), and configuration settings (Hadoop/HDFS/Hive/Spark) to provide program-level Confidential.
- Optimized the bi-directional data transfer process between S3 and HDFS, improving processing time by 3x.
- Fine-tuned Presto server configuration settings, resulting in a 10x improvement for queries.
- Provided solutions to AWS-specific challenges such as s3-dist-cp 503 Slow Down errors, EMRFS sync errors, Hive metastore inconsistencies, RDS MySQL timeout errors, and queries with long initialization times.
- Evaluated the Databricks platform, including Controller features and Notebooks, to provide Confidential to the program and mentor the development team.
- The migration was a big success, with the program going live in January 2019, meeting and beating the SLAs with 100% coverage.
Big Data Engineer/Solutions Architect
- Developed MapReduce programs supporting data ingestion, analysis, and investigation activities.
- Automated the creation of a Data Dictionary that assists Business Analysts, Data Modelers, and Developers in building data models, saving man-hours.
- Developed utilities based on the HDFS library to perform data validation and integrity checks across Hadoop clusters.
- Developed tools to automate the deployment process, validate code and schema integrity, and perform metadata analysis, assisting developers and data modelers in data analysis tasks.
- Analyzed and optimized the performance of MapReduce jobs.
- Developed a Data Translator Engine to produce XML documents from Hive Tables.
- Participated actively in Hadoop migration efforts, supporting the operations and development teams.
- Worked with the Impala engineering team to resolve performance issues.
- Led the development effort to fix Hadoop’s “Small Files” problem, which involved redesigning existing MapReduce jobs to output files close to the HDFS block size. Developed a custom Partitioner to enable even distribution of the data output by the Reducers.
- Worked as a liaison with vendor support teams to troubleshoot program-level issues.
- Actively participated in POC efforts undertaken by the project to validate new technologies under consideration.
Lead Hadoop Engineer
- Played an active role in architecture design and application development
- Wrote Java MapReduce programs involving Mapper, Reducer, and Combiner classes, sorting and grouping comparators, custom Partitioner classes, and the distributed cache
- Extended the MapDB implementation for caching metadata
- Optimized Hadoop pipeline jobs using compression techniques, sequence file formats with custom Writable classes, and code optimization
- Developed MRUnit test cases for testing MapReduce jobs
- Implemented a Solr-based Confidential engine, “DocSim”, to generate document similarities
- Designed the Solr schema to index metadata using custom field types with Analyzers/Tokenizers
- Analyzed metadata to measure the coverage and distribution of metadata tags
- Designed the framework for Archetype Testing to evaluate Confidential quality
- Developed automated tools in Java to assist the RQA team with A/B testing
- Created well-documented user stories with acceptance criteria, tasks, and test cases
- Developed a JSP-based search interface that interacts with internal SOAP Web Services and matching rules.
- Used JAX-WS APIs and WSClient tool
- Developed batch programs using the Spring framework’s Quartz scheduler to update lookup and business data from the master database.
- Configured CronTriggerBean to schedule the batch programs
- Configured Spring AOP pointcuts to apply transaction advice across the application
- Developed an Export feature to export Confidential data in Excel format using the Java Excel API
- Developed JUnit test cases to validate the business logic with externalized XML data
- Designed and developed an Agent in Java, extending server-side classes to automate the user-setup process, using JNDI, the Content Management SDK, and Oracle PL/SQL stored procedures to synchronize the OID and the application
- Developed a multi-threaded Java module for dispatching faxes that accesses RightFax DLLs via JNI
- Developed the user interface for the Calendar module in JavaServer Faces
- Developed an Email Signature feature using the JavaMail API to associate users' contact information with their internal messages/emails
- Developed an Audit Trail module in Java and JSP to record all the transactions on a document
- Developed tools based on the .NET Framework for military service members and other eligible civilians within the military community, Services Officers, and academic counselors
- Designed and developed My Compass, a personalized one-stop resource for military service members exploring higher education options, scholarships, and careers
- Designed and developed a user interface in ASP.NET using C#, Web Services, SOAP, and XSLT transformations to render live employment and labor market data
- Developed an XML Import and Export utility allowing schools to upload/download their latest information on the system, with support for XML validation through XML Schema
- Designed Portal pages and specified Portlet & Provider definitions using Portlet parameters and events
- Designed and developed a Universal Inbox in JSP/Java providing access to internal messages, emails, faxes, drafts, tasks, and scanned documents in one place
- Developed a multi-threaded Email Filing module in Java to download Exchange Server emails to desired workspace locations
- Developed a desktop application in Java Swing to set up data structures and layouts for Business Objects used in the system
- Developed the architecture for application-level security and for implementing ‘Internet Fraud Protection’
- Developed an Online Payment processor module to process customers’ credit card authorizations and transactions using an XML interface with EPX and Paradata.
- Developed the Confidential wallet architecture using HTML, JSP, Signed Applets and EJB