
Sr. Data Engineer Resume



  • Over 20 years of experience in all phases of design, development, and implementation of enterprise applications.
  • Worked for clients within the financial, insurance, manufacturing, car dealership, energy, research laboratories and gaming, food, technology, hotel, sports and services industries.
  • Well versed in various programming languages and scripting and skilled in writing complex but highly optimized SQL queries.
  • Experienced as administrator/technical lead on multiple databases like Snowflake, Vertica, Redshift, Netezza, Oracle and operating systems like Unix/Linux, Open VMS and Windows.
  • Excellent technical skills combined with business knowledge make him a highly dependable and sought-after consultant.
  • Extensive experience in architecting and implementing HP PPM enterprise applications.
  • Project management on various client engagements.


Programming: Scala, Java, Python, C/C++, Servlets, Portlets, JSP, JSTL, SQL, PL/SQL, HTML5, JavaScript, JQuery, Cobol, RPG II, Basic

Database: Snowflake, Netezza, Vertica, Impala/Hive, Oracle, Redshift

Build Tools: Gradle, SBT, Activator, Maven, Ant

Enterprise Deployment Tool: HP Deployment Management (aka Kintana)

Framework/ORM: Play Framework, Spring Frameworks, JPA/Hibernate, iBatis

Operating Systems: Unix/Linux, Open VMS, Windows

Scripting: Bash, Korn, C Shell, Perl, DCL

Version Control: GIT, SVN, Perforce, CVS, PVCS

IDE: Eclipse, IntelliJ, Spring STS, NetBeans, JDeveloper

Others: HDFS, Apache Spark, Akka, Hive2, Sqoop, R Studio, JSR-168/286, JDBC, Java Mail, SOA AXIS/REST, Tomcat, Apache, Jboss, Oracle Forms/Reports, Postgres, H2, Informatica, Derby, MySql, Graph/Neo4j, Kafka, Protegrity, CyberArk


Confidential - Arizona

Sr. Data Engineer

Technologies Used: Scala, Akka, Play, Slick, Python, Bash, Javascript/JQuery, HTML5, CSS, AWS/Snowflake, Netezza, Informatica, Control-M, Protegrity, CyberArk


  • Led the design and development of a UI and a cluster-aware data file ingestor application that moves data from on-prem to AWS/Snowflake.
  • Developed a CLI tool for moving data from Netezza to Snowflake.
  • Led the team that handled the historical data load effort for the Insurance Policy domain.
  • Led an offshore team on the refactoring of incremental data load cycles.
  • Wrote a custom CLI tool that displays the relationships between Informatica/PowerCenter mappings and their various inputs and outputs, with the Control-M job as context.
  • Integrated with third party tools such as Protegrity, CyberArk, and others

Confidential - Arizona

Sr. Data Engineer

Technologies Used: Vertica, Java, Scala, Neo4j, Akka, Python, Bash, Hadoop, Apache Spark, Impala, Hive, Amazon Cloud, Amazon S3 buckets, Redshift, Cloudera, Hue, Parquet, JSON, HTML5, JQuery, Spring Framework, JPA/Hibernate, SQL


  • Perform administration activities for a 6-node Vertica database, ensuring smooth operations while maintaining overall performance at an acceptable level.
  • Implemented database backup and restore procedures, including specific object replication processes
  • Led extraction development effort on data sources requiring more intense business logic
  • Developed internal tools and utilities such as data replication across clusters, dynamic Omniture Clickstream URL parsing, etc.
  • Workload analysis and performance tuning, including running Database Designer, creating custom projections, partitioning tables, running statistics/histograms, etc.
  • Maintain database clusters, ensuring data is well balanced among the participating nodes.
  • Provide support to developers and analysts on their access and data needs.
  • Administer users - providing right access and privileges.
  • Enforce coding standards, code deployment and other Release Management practices.
  • Conduct proof-of-concept development work before embarking on full product development.
  • Developed various monitoring scripts to maintain the overall health of the database, covering disk usage, active monitoring of long-running SQL statements, health of the Vertica cluster, etc.
  • Developed a Vertica license utilization monitoring tool in Scala/Akka, sending notifications to table owners and auto-revocation of some privileges of repeat offenders.
  • Wrote various UDxs (in Java, Scala, and C++) for dynamic table privilege grants, on-demand URL parsing, programmatic invocation of Database Designer, etc.
  • Architected and developed a lightweight extension that enables a user to initiate DBD on a table level via client tool (vsql, Db Visualizer) in one go using Scala.
  • Automated Vertica weekly/daily full and incremental backups using bash/vbr.
  • Built an automated Linux system performance data gathering tool using Scala, Akka, and Activator.
  • Built an automated Vertica load test in Java that simulates SQL execution with a varying number of virtual users in a multi-threaded fashion.
  • Developed a custom object replication utility (using Java) that allows end users to clone tables to other Vertica environments from within a client tool such as Db Visualizer or via vsql.
  • Led migration effort of entire Vertica cluster to a new set of servers.
  • Developed a Vertica to Hive/Impala ingestor using Java, Spring Framework, Apache Spark, and Hive technologies.
  • Wrote a converter that exports Vertica data as JSON files into Amazon S3 storage, using Scala and Apache Spark.
  • Developed a tool in Scala and Apache Spark to load S3 JSON files into Hive tables in Parquet format.
  • Wrote a tool in Scala and Akka that scrubs numerous files in Amazon S3, removing unwanted characters and performing other housekeeping activities.
  • Wrote a Vertica to Impala data loader using Java, Spring, and Spark.
  • Wrote a CSV/TXT to Parquet data loader utility (with automatic Hive table creation) in Scala, Akka, and Spark, whose tables are housed in Amazon S3 buckets. This solution involves heavy AWS API calls.
  • Built a Vertica license monitoring tool that reports usage to the offenders and other concerned parties, and allows users to easily drop the tables that they own.
  • Led investigation effort in determining appropriate tools to be utilized for the BI team - including R Studio, Hql/sql, Squirrel, Kafka, Sqoop, Hue, Impala, Trifacta and other related technologies.
  • Developed an encryption and decryption utility using the ElGamal algorithm in Java.
  • Designed and developed a generic SFTP/File Handler solution that allows easy data loads from SFTP servers into an Amazon S3 landing zone.
  • Wrote a daily feed process to Medallia, combining LDAP user information and other hotel-specific data, using Apache Spark on Java and Spring Framework.
  • Developed a prototype generic table profiler using Scala on Impala tables housed by Amazon S3 storage.
  • Architected the migration process of BI group from Vertica environment to Cloudera/Amazon cloud infrastructure.
  • Conducted PoC on Neo4j to host Component Lineage apps.
  • Designed and developed a generic Impala/Amazon S3 bulk loader using Scala/Spark; this involves heavy Amazon cloud/AWS API invocations.
  • Wrote a custom, lightweight, Amazon cloud-ready SFTP utility in Scala with more flexible parameters and without passing plain-text passwords.
  • Conducted PoC on Anaconda Project - a Python Data Science Platform.
  • Wrote an Impala Metastore sync-up tool in Scala for two separate clusters sharing the same metadata.
  • Conducted PoC on Apache Airflow.
  • Act as manager on special projects requiring deeper technical know-how. Activities include assigning and following up on tasks across the entire team and reporting to higher management.
  • Rewrote the SQL Load Simulator in Scala and Akka; it was used to test the performance of Vertica vs. Redshift on Amazon Cloud/AWS.
  • Currently developing a generic database data extractor and replicator that moves data across several database types, initially catering to Hive/Impala, Redshift, Vertica and Amazon S3 sources and/or destinations. The tool is being written in Scala and Apache Spark.


Technical Solutions Architect/DBA/Lead Consultant/Developer

Technologies Used: Java, J2EE, Servlets, JSON, basic HTML, JQuery, Spring, iBatis, Hibernate, JSR-168, PL/SQL, SQL, Bash, JSP, JSTL, Oracle, Vertica, PostgreSQL, Derby, C/C++, Load Runner.


  • Managed client engagements, ensuring deliverables are met in a timely manner. Activities include assignment of tasks to team members, follow-ups, quality assurance and reporting project status/progress to customer stakeholders and sponsors.
  • Installation, configuration and support of Vertica databases including development of custom UDx functions using C++.
  • Developed a custom ELT (Extract Load Transform) process (written in Java) that loads data from Oracle and SQL Server into an HP Vertica database in batch mode and in near real time.
  • Assisted the BI team by supporting Vertica and Oracle databases in their solution development efforts.
  • Scripted backup and restore procedures for Oracle and Vertica databases.
  • Performed administration functions such as user management, performance monitoring, query optimization, etc.
  • Database administration and development, including porting SQL scripts and building custom functions and procedures in Vertica.
  • Created and fine-tuned projections, views and other related objects.
  • Installation and configuration of Oracle databases housing different customers' software application needs.
  • Installation, configuration and support of Business Objects for customers' reporting requirements.
  • Application performance tuning, including adjustments to Linux kernel parameters, Oracle init parameters, JVM settings, and SQL-specific tuning.
  • Architected a major customization that sits on top of the HP PPM Deployment application. This custom module enables developers to launch the creation of deployment packages via a custom web UI that auto-populates each deployment package with the components or objects associated with a change request. The solution proved to be extremely useful, time-saving, and much easier to use, especially since the customer has over 100 developers.
  • Led the design and development of a Capacity-Based Resource planning module on top of HP PPM Project Management. One of its primary features is automatic scheduling of resources on assigned tasks based on current and future assignments, taking into consideration weekends, holidays and each resource's regional and personal calendars. It also includes a simulation module that lets managers preview overall resource capacity and assignments over a given future period without necessarily persisting the results back to the database.
  • Built a General Ledger integration component with the HP PPM Financial Management module using Java/Axis web services technologies. Incoming GL data is fed daily into a staging area, which the Java/Axis SOA client processes and posts to the PPM Financials System.
  • Designed and developed custom components within HP PPM's Resource Management module for a better and more flexible user interface. The custom UI components allow managers to execute the resource allocation process en masse.
  • Led the implementation, configuration and integration of HP PPM enterprise application at various customer sites. Provided level 3 customer support following go-live.
  • Led the development of PoC components, including integration work that the customers require.
  • Provided product demonstrations to various prospects/customers, engaging in active discussions on product functionality and capabilities.
  • Led meetings and discussions among various customer players, including DBAs, developers, and stakeholders, covering change processes, approvals, security and other related topics.
  • Prepared statement of work (SOW) for various customer professional engagements for areas relating to HP Deployment Management implementations.
  • Re-engineered HP PPM built-in deployment object types to implement strict SOX requirements - ensuring tight security is embedded in any form of server to server interactions.
