Senior Data Engineer Resume
Mountain View, CA
PROFESSIONAL SUMMARY
- Over 10 years of IT experience in the design and development of advanced software solutions using Java/J2EE technologies in the financial, logistics, and insurance domains.
- More than 4 years of working experience with HDFS and the MapReduce programming model using Java 8, Scala, and Python.
- Expertise in Hadoop ecosystem tools such as Hive, Sqoop, Impala, Kafka, Oozie, and ZooKeeper.
- Hands-on experience with microservices architecture, Docker containers, and service registries.
- Hands-on experience processing data in batch and streaming modes using Apache Spark with the R, MLlib, and SQL libraries.
- Hands-on experience with NoSQL databases such as MongoDB, HBase, and Cassandra.
- Solid experience in data modeling and ETL/ELT processing.
- Hands-on experience working with large volumes of structured and unstructured data.
- Working experience with 100+ node on-premises and AWS clusters in production environments.
- Working experience in Hadoop batch job configuration, batch job scheduling, and performance tuning.
- Experience parsing data from different document formats such as XML, Excel, JSON, and CSV.
- Working experience writing complex SQL queries, tuning query performance, and applying analytical SQL functions to large, real-time data sets.
- Experience setting up big data environments as both single-node and multi-node clusters with the Cloudera distribution.
- 6+ years of experience in Core Java and J2EE technologies such as Spring, Struts, EJB, and Hibernate.
- Solid working knowledge of Object-Oriented Analysis and Design, Java/J2EE design patterns, UML diagrams, and Enterprise Application Integration (EAI).
- Hands-on experience using Agile methodology for software development.
- Good aptitude in multi-threading and concurrency concepts.
- Good experience in UML design and in preparing functional and technical design documentation.
- Strong analytical and presentation skills, communicating complex quantitative analysis in a clear, precise, and actionable manner.
TECHNICAL SKILLS
Big Data Skills/Tools: MapReduce, HDFS, Sqoop, Impala, Hive, Pig, Oozie, ZooKeeper, Apache Solr, Flume, Kafka, Spark, Python, Scala, R, MongoDB, HBase, Cassandra, YARN, S3, Redshift, AWS
Java Skills/Tools: Struts, Spring, Servlets, JSP, Hibernate, EJB, JavaBeans, RMI, JDBC, JMS, web services (RESTful APIs, SOAP, WSDL, JAX-WS), microservices, WebLogic and WebSphere application servers
JavaScript Skills/Tools: jQuery, AngularJS, Bootstrap, Node.js, Knockout.js, JavaScript
Database Skills/Tools: SQL/PL-SQL, analytic functions, Oracle, SQL Server, MySQL
PROFESSIONAL EXPERIENCE
Confidential, Mountain View, CA
Senior Data Engineer
Responsibilities:
- Translated complex functional and technical requirements into detailed design.
- Responsible for technical design and review of the data dictionary.
- Worked with the marketing team to understand the business process and provided the best-fit solution.
- Used AWS S3 buckets for the click-stream data ingestion pipeline.
- Configured and developed an integrated event-streaming data pipeline with Kafka feeding a Spark sink (see the sketch at the end of this section).
- Generated model data functions for various customer segments using Spark with R.
- Generated product ranking functions for promotions using Spark with R.
- Used Python, Hive, Impala, and UNIX shell for business logic scripting and deployment.
- Optimized and tuned Hive configuration parameters to process large data volumes.
- Developed a data validation framework to generate quality data.
- Developed UDFs, UDAFs, and UDTFs in Java and Scala to enrich the data transformation functions.
- Proposed best practices and standards for agile delivery.
Environment: HDFS, Java 8, Python, Scala, Hive, Impala, Sqoop, CDH 5.x, YARN, Spark 2.1, Kafka, shell scripting, Oozie, cron, AWS S3, Cassandra, RESTful services
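A minimal sketch of the Kafka-to-Spark streaming ingestion described above; the broker address, topic name, event schema, and S3 paths are all hypothetical, and the exact pipeline details are assumptions:

```python
# Minimal PySpark Structured Streaming sketch: read click-stream events
# from Kafka and land them on S3. Broker, topic, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("clickstream-ingest").getOrCreate()

# Assumed JSON layout of a click-stream event.
event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("page", StringType()),
    StructField("event_time", TimestampType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical broker
       .option("subscribe", "clickstream")                 # hypothetical topic
       .load())

events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(F.from_json("json", event_schema).alias("e"))
          .select("e.*"))

# Write the parsed stream to S3 as Parquet, checkpointing for recovery.
query = (events.writeStream
         .format("parquet")
         .option("path", "s3a://example-bucket/clickstream/")        # hypothetical
         .option("checkpointLocation", "s3a://example-bucket/ckpt/")  # hypothetical
         .trigger(processingTime="1 minute")
         .start())
query.awaitTermination()
```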
Confidential
Delivery Engineer
Responsibilities:
- Involved in discussions and guided regional and offshore teams on the big data platform.
- Translated complex functional and technical requirements into detailed design.
- Developed data products for various use cases and deployed them in the client's cluster.
- Responsible for technical design and review of the data dictionary.
- Worked with the client on cluster environment optimization and capacity planning.
- Developed data products for user-based and location-based segmentation (see the sketch at the end of this section).
- Designed a data lake architecture that can be adapted to different business domains.
- Developed data pipelines to ingest data streams from various sources.
- Used Python with Spark for scientific calculations and algorithms, and Scala with Spark libraries for data ingestion, transformation, and extraction functions.
- Proposed best practices and standards for agile delivery.
Environment: HDFS, Python, Scala, Hive, Impala, Sqoop, HBase, CDH 5.x, YARN, Spark 1.6, Kafka, shell scripting, Oozie, crontab, PostgreSQL, microservices, Kubernetes, Apache Solr.
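A rough sketch of the kind of user-based segmentation product mentioned above; the input path, column names, and segment thresholds are all invented for illustration:

```python
# Sketch of a user-based segmentation job in PySpark; the input path,
# column names, and segment thresholds are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("user-segmentation").getOrCreate()

# Hypothetical per-event activity table in the data lake.
activity = spark.read.parquet("/data/lake/user_activity")

# Aggregate per-user activity, then bucket users into coarse segments.
segments = (activity
            .groupBy("user_id")
            .agg(F.count("*").alias("events"),
                 F.countDistinct("session_id").alias("sessions"))
            .withColumn("segment",
                        F.when(F.col("sessions") >= 30, "power")
                         .when(F.col("sessions") >= 5, "regular")
                         .otherwise("casual")))

segments.write.mode("overwrite").parquet("/data/lake/user_segments")
```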
Confidential
Senior Hadoop Developer
Responsibilities:
- Involved in discussions and guided regional and offshore teams on the Citi big data platform.
- Translated complex functional and technical requirements into detailed design.
- Migrated data from an Oracle data warehouse into HDFS using Sqoop.
- Responsible for technical design and review of the data dictionary.
- Integrated the Hive warehouse with HBase.
- Designed and created Hive external tables on a shared metastore (instead of the embedded Derby one), using partitioning, dynamic partitioning, and bucketing (see the sketch at the end of this section).
- Wrote customized Hive UDFs in Java for data formatting.
- Used Apache Pig for ETL functions.
- Used Tableau to visualize the transaction data report by fraud category.
- Defined and scheduled job workflows using Oozie.
- Used Spark libraries with Python, R, and Scala for data processing and transformation functions.
- Involved in various POCs to choose the right big data tools for business use cases.
- Proposed best practices and standards.
- Used MRUnit for MapReduce code testing.
- Used Python iterators and generators to write optimized code.
Environment: HDFS, Python, Scala, Hive, Impala, Pig, Sqoop, HBase, CDH 5.x, YARN, Spark 1.4, Kafka, shell scripting, Oozie, Autosys, MQ Server, Oracle
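The partitioned, bucketed Hive external tables described above might look like the following sketch. The table, columns, and staging source are hypothetical, and the DDL is issued through PySpark's Hive support (a newer API than the Spark 1.4 listed) purely to keep the example self-contained; the same statements could run in the Hive CLI against the shared metastore:

```python
# Illustrative Hive DDL for an external, partitioned, bucketed table;
# table, column, and staging names are hypothetical.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-ddl")
         .enableHiveSupport()   # connect to the shared Hive metastore
         .getOrCreate())

spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS txn (
        txn_id STRING,
        account_id STRING,
        amount DOUBLE
    )
    PARTITIONED BY (txn_date STRING)
    CLUSTERED BY (account_id) INTO 32 BUCKETS
    STORED AS ORC
    LOCATION '/data/warehouse/txn'
""")

# Dynamic partitioning lets one INSERT populate many date partitions.
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
spark.sql("""
    INSERT OVERWRITE TABLE txn PARTITION (txn_date)
    SELECT txn_id, account_id, amount, txn_date
    FROM staging_txn  -- hypothetical staging table
""")
```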
Confidential
Confidential
Responsibilities:
- Developed and configured Kafka brokers to pipe server log data into Spark Streaming.
- Transformed the log data into a data model using Apache Pig.
- Wrote UDFs to format the log data.
- Used a Hive schema to define the data.
- Imported user data from Oracle using Sqoop.
- Used Apache Solr to build indexes for the documents.
- Responsible for technical design and review of the data dictionary.
- Used a Python MapReduce package along with shell scripts for code logic development (see the sketch at the end of this section).
Environment: HDFS, Python, MapReduce, Hive, Pig, Sqoop, CDH 4, Spark, Solr, Kafka, Oracle, shell scripting, Oozie, Autosys.
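A minimal sketch of Python MapReduce-style log processing along the lines described above, in the shape Hadoop Streaming expects; the access-log layout and field positions are assumptions:

```python
# Sketch of a Hadoop Streaming-style mapper and reducer in Python that
# counts log lines per HTTP status code; the log format is assumed.
import sys
from itertools import groupby

def mapper(lines):
    """Emit (status_code, 1) for each access-log line."""
    for line in lines:
        parts = line.split()
        if len(parts) > 8:        # crude guard for the assumed layout
            yield parts[8], 1     # field 9 holds the status code here

def reducer(pairs):
    """Sum counts per status code; input must be sorted by key."""
    for status, group in groupby(pairs, key=lambda kv: kv[0]):
        yield status, sum(count for _, count in group)

if __name__ == "__main__":
    # In Hadoop Streaming the mapper and reducer run as separate scripts
    # with a sort in between; they are chained here only to keep the
    # sketch self-contained and runnable.
    for status, total in reducer(sorted(mapper(sys.stdin))):
        print("{}\t{}".format(status, total))
```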
Confidential
Technology Lead
Responsibilities:
- Involved in application migration from WPF to the Spring MVC framework.
- Developed logging and session tracking components using MongoDB and AngularJS.
- Developed sample functions using the JavaScript framework to guide other integrated component teams.
- Led the team in an onsite/offshore model.
- Used Jenkins for release management and guided other teams on managing their code with the tool.
- Used advanced JavaScript technologies such as JSRender, JSViewer, and StealJS for UI design.
- Used Node.js to communicate with integrated backend components and XML response messages for UI display.
- Used HTML5 and CSS3 for theme design.
- Used SOAP web services and MQ as messaging components to send and receive information from other source systems.
- Developed use cases, class diagrams, sequence diagrams, and data models using Microsoft Visio.
Environment: WPF, Dojo, Spring, JavaScript MVC (Angular.js, Backbone.js, Node.js, Steal.js, jQuery, JSRender, JSViewer), WebSphere Server, BIRT, MongoDB, Oracle, z/Linux
Confidential
J2EE Consultant
Responsibilities:
- Applied J2EE design patterns including Singleton, Facade, DAO, DTO, and DI.
- Involved in business meetings to understand the business use cases, and created functional and technical design documents.
- Provided best-fit solutions to the business problems.
- Designed the system architecture by analyzing different tools and components.
- Integrated Spring with the Struts 2 framework.
- Designed and implemented the DAO layer using Spring and Hibernate for online processing.
- Designed and implemented batch processing components using Spring Batch with EJB.
- Involved in development of RESTful web services.
- Designed UI components using JSP, JavaScript, and Ajax.
- Used Agile methodology for project development.
- Led the team and delivered the solution on time with zero defects.
- Designed database components.
- Wrote ANT scripts for the application build and used Log4j for debugging.
- Used the JUnit framework for unit testing of all the Java components.
Environment: Spring 2.5, Struts 2, JavaScript, Ajax, EJB, Spring Batch, Hibernate, JBoss, Oracle, SQL Server 2005
Confidential
J2EE Consultant
Responsibilities:
- Involved in development of the Service Order Management, Purchase Order Management, and fulfillment logic modules.
- Developed code logic to print shipment and pick-list documents using document printing.
- Developed code logic to print labels with packing and delivery details.
- Involved in application and data server migration activities during the new data center setup.
- Developed batch processing logic for feed files in XML format, and data archival jobs for data backup.
- Developed entity beans for transaction management.
- Wrote PL/SQL stored procedures, functions, packages, and sequences for business logic implementation.
- Involved in integrating the product with other systems to convert message queue information into XML documents.
- Responsible for creating customized business reports using Crystal Reports.
- Developed the product using the Front Controller design pattern and integrated JPF with EJB and JULP components.
- Implemented fine-tuning of the database objects through efficient schema design.
Environment: J2EE APIs (JPF, JSTL, JMS, JNDI), JULP, NetUI, EJB, XML, JavaScript, SQL/PL-SQL, Domain Object, Value List Handler, BEA WebLogic, Oracle.
Confidential
Java Developer
Responsibilities:
- Involved in development of the question administration, candidate administration, and testing modules.
- Involved in implementation of candidate test report generation and bulk question uploading.
- Developed the logic to generate e-certificates for candidates using iText libraries.
- Developed a reporting dashboard covering each candidate's domain knowledge from test records, using Crystal Reports.
- Involved in design and code logic to generate questions in random order for online tests.
- Developed the export/import logic for mass question upload and download during question setup, using the Apache POI API.
Environment: J2EE APIs (JSP, JSTL, Struts, JNDI), Hibernate, DAO, XML, JavaScript, Ajax, Microsoft SQL Server, JBoss 4.0.1
Confidential
IT System Analyst
Responsibilities:
- Handled batch processing of insurance policy certification based on the insurance business policy rules.
- Developed the product to generate e-certificates for policies using iText libraries.
- Developed a reporting dashboard using Jasper Reports to generate insurance claim reports for insurance agents.
- Designed and coded programs to select data from external sources (flat files and DB2) and load it into DB2.
- Performed load testing to make the product reliable under heavy load.
Environment: JSP, Servlets, JavaScript, EJB, WSAD, Jasper Reports, WebSphere Application Server, DB2, MQ Server