Big Data And Talend Developer / Team Lead Resume

Hartford, CT


  • Confidential has almost 8 years of experience spread across Hadoop, Java and ETL. He has extensive experience in Big Data technologies and in development of standalone and web applications in multi-tiered environments using Java, Hadoop, Hive, HBase, Impala, Pig, Sqoop, J2EE technologies (Spring, Hibernate), Oracle, HTML, and JavaScript. He has 3 years of comprehensive experience as a Hadoop Developer. He has strong experience working on Talend Integration Suite and Talend Open Studio, including designing Talend jobs using various Talend components. Confidential has very good communication, interpersonal and analytical skills and works well as a team member or independently.
  • Passionate about working in Big Data and Analytics environment.
  • Experience in installation, configuration, management and deployment of Hadoop Cluster, HDFS, MapReduce, Cloudera, Pig, Hive, Sqoop, Flume, Oozie, HBase, and ZooKeeper.
  • Experience in Extraction, Transformation, and Loading (ETL) of data from multiple sources like flat files, XML files, and databases. Used Informatica for ETL processing based on business requirements.
  • Experience with Talend DI Installation, Administration and development for data warehouse and application integration.
  • Expertise in writing Hadoop Jobs for analyzing data using Hive and Pig.
  • Experience in working with MapReduce programs using Apache Hadoop for working with Big Data.
  • Experience in data transformation, file processing, and identifying user behavior by running Pig Latin scripts; expertise in creating Hive internal/external tables and views using a shared metastore.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Good experience with ETL and hands-on experience with Informatica ETL.
  • Diverse knowledge of Spark and Scala.
  • Experience in NoSQL databases like HBase, Cassandra and MongoDB for data extraction and storing huge volumes of data.
  • Experience in analyzing data using Hive QL, Pig Latin, and custom MapReduce programs in Java.
  • Extensive experience with SQL, PL/SQL and database concepts.
  • Good understanding of Storm and Kafka for monitoring and managing Hadoop jobs.
  • Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
  • Experience with databases including DB2, Oracle 11g, MySQL, SQL Server and MS Access.
  • Strong programming skills in Core Java, J2EE and Python technologies.
  • Experience in Data Warehouse life cycle, methodologies and its tools for reporting and data analysis.
  • Extensive use of Open Source Software such as Web/Application Servers like Apache Tomcat 6.0 and Eclipse 3.x IDE.
  • Experience on Apache, Cloudera, Hortonworks Hadoop distributions.
  • Installation, configuration and administration experience in Big Data platforms Cloudera CDH.
  • Expert-level skills in Java Multithreading, Object Oriented Design Patterns, Exception Handling, Servlets, Garbage Collection, JSP, HTML, Struts, Hibernate, Enterprise Java Beans, JDBC, RMI, JNDI and XML-related technologies.
  • Experienced in Documenting the Software Requirements Specifications including Functional Requirements, Data Requirements and Performance Requirements.
  • Highly organized with the ability to manage multiple projects and meet deadlines.


Languages: C, C++, Java, J2EE, Python, PL/SQL

Big Data Ecosystem: Hadoop/Big Data, HDFS, HBase, Pig, Hive, Sqoop, ZooKeeper, Oozie, Spark, Apache Hadoop, Kafka, Cloudera, Hortonworks, Talend.

NoSQL Technologies: MongoDB, Cassandra, Neo4J

Databases: Oracle 11g/10g/9i/8.x, MySQL, MS SQL Server 2000

Web technologies: Core Java, J2EE, JSP, Servlets, EJB, JNDI, JDBC, XML, HTML, JavaScript, Web Services

Frameworks: Spring 3.2/3.0/2.5/2.0, Struts 2.0/1.0, Hibernate 4.0/3.0, Groovy, Camel

App Server: WebLogic 12c/11g/10.1/9.0, WebSphere 8.0/7.0

Web Server: Apache Tomcat 7.0/6.0/5.5

IDE: Eclipse 3.7/3.5/3.2/3.1, NetBeans, EditPlus 2, Eclipse Kepler

Tools: Teradata, SQL Developer, SoapUI

Testing: JUnit, JMock

Operating System: Linux, UNIX and Windows 2000/NT/XP/Vista/7/8/10

Methodologies: Agile, Unified Modeling Language (UML), Design Patterns (Core Java and J2EE)

System Design & Dev: Requirement gathering and analysis, design, development, testing, delivery


Confidential, Hartford CT

Big Data and Talend Developer / Team Lead


  • Worked on Military and Veterans Project, handled health care files from different sources such as PGBA, CMS, DOD, etc., and from different domains such as Access DB, Website files, Flat Files, Fixed Length files and delimited files.
  • Created the schema files (Meta) and control files for all the different sources.
  • Ran all the source files through pre-processing to perform the data quality checks (DQ checks).
  • Ingested the files into the Data Management System (DMS: HBase, Hive) by performing checks such as Schema Validation, Record Count Check, and File Duplicate Check using Talend.
  • Created Trade Partner Profiles at both the Source level and the Entity Partner level and updated the trade partner profile in the Hive table using Talend components.
  • Created Pig UDFs to perform both simple and complex enrichments according to the business requirements, and executed the Pig UDFs using Talend components such as tPigLoad, tPigMap, etc.
  • Developed UDFs for Hive and wrote complex queries in Hive for data analysis.
  • Performed enrichment and stored the files back into the Hadoop Distributed File System (HDFS).
  • Created Hive Tables and Hive views on top of the enriched as well as the ingested files.
  • Created the SQL Tables to provision the data to SQL for reference to the business objects.
  • Wrote SQL scripts for modifying the existing schemas and introducing the new features in the existing schemas of SQL.
  • Converted existing SQL queries into Hive QL queries.
  • Designed and developed multiple MapReduce jobs in Java for complex analysis.
  • Imported and exported data using Sqoop from HDFS to relational database systems and vice versa.
  • Used HiveQL queries to provide ad hoc reports for data in Hive tables in HDFS.
  • Implemented a POC to migrate MapReduce jobs into Spark RDD transformations using Scala.
  • Developed Spark jobs using Scala in test environment for faster data processing and used Spark SQL for querying.
  • Used Talend as the ETL Tool for data loading, cleansing and transformation.
  • Implemented unit tests and test scripts to support test-driven development and continuous integration.
  • Troubleshot, debugged, and fixed Talend-specific issues while maintaining the health and performance of the ETL environment.
  • Followed agile Scrum methodology with Sprint planning and Sprint delivery of the developed software, and participated in Kaizen (continuous improvement processes) through the Sprint Retrospective.

Environment: Hadoop, Talend, MapReduce, MapR 5.x, HBase, Hive, Pig, Sqoop, Teradata, SQL Server, Java 7.0, Eclipse, Maven, ZooKeeper, UNIX shell scripting.

Confidential, New Jersey, NJ

Hadoop Developer


  • Created Hive tables and loaded retail transactional data from Teradata using Sqoop.
  • Loaded home mortgage data from the existing DWH tables (SQL Server) to HDFS using Sqoop.
  • Responsible for Operating system and Hadoop Cluster monitoring using tools like Nagios, Ganglia, Cloudera Manager.
  • Served as Talend administrator with hands-on Big Data (Hadoop) experience on the Cloudera framework.
  • Proactively managed Oracle/SQL Server backups, performance tuning, and general maintenance with capacity planning of the Talend complex.
  • Troubleshot, debugged, and fixed Talend-specific issues while maintaining the health and performance of the ETL environment.
  • Documented the Installation, Deployment, administration and operational processes of Talend MDM Platform (production, Pre-Prod, test30, test 90 and development) environments for ETL project.
  • Worked on POC and implementation & integration of Cloudera & Hortonworks for multiple clients.
  • Developed and designed ETL Jobs using Talend Integration Suite (TIS) in Talend 5.2.2.
  • Created complex jobs in Talend 5.2.2 using tMap, tJoin, tReplicate, tParallelize, tJava, tJavaFlex, tAggregateRow, tDie, tWarn, tLogCatcher, etc.
  • Used tStatsCatcher, tDie, tLogRow to create a generic joblet to store processing stats.
  • Created Talend jobs to populate the data into dimensions and fact tables.
  • Created a Talend ETL job to receive attachment files from POP e-mail using tPOP, tFileList, and tFileInputMail, then loaded the data from the attachments into the database and archived the files.
  • Used Talend joblets and various commonly used Talend transformation components such as tMap, tDie, tConvertType, tFlowMeter, tLogCatcher, tRowGenerator, tSetGlobalVar, tHashInput, tHashOutput and many more.
  • Created Talend jobs to load data into various Oracle tables. Utilized Oracle stored procedures and wrote some Java code to capture globalMap variables and use them in the job.
  • Created Talend jobs to copy the files from one server to another and utilized Talend FTP components.
  • Involved in Hadoop administration on Cloudera, Hortonworks and Apache Hadoop 1.x & 2.x for multiple projects.
  • Built and maintained a bill forecasting product that will help in reducing electricity consumption by leveraging the features and functionality of Cloudera Hadoop.
  • Created ETL jobs to load Twitter JSON data into MongoDB and jobs to load data from MongoDB into Data warehouse.
  • Worked on analyzing Hadoop cluster using different big data analytic tools including Kafka, Pig, Hive and Map Reduce.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
  • Streamed data in real time using Spark with Kafka.
  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala.
  • Imported and exported data into HDFS using Sqoop and Kafka.
  • Wrote Hive Queries to have a consolidated view of the mortgage and retail data.
  • Loaded data back into Teradata for BASEL reporting and for the business users to analyze and visualize the data using Datameer.
  • Orchestrated hundreds of Sqoop scripts, Pig scripts, and Hive queries using Oozie workflows and sub-workflows.
  • Loaded the load-ready files from mainframes to Hadoop and converted the files to ASCII format.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Responsible for software installation, configuration, software upgrades, backup and recovery, commissioning and decommissioning data nodes, cluster setup, daily cluster performance monitoring, and keeping clusters healthy on different Hadoop distributions (Hortonworks & Cloudera).
  • Developed Pig scripts to replace the existing home loans legacy process with Hadoop; the data is fed back to the retail legacy mainframe systems.
  • Wrote Hive and Pig scripts as an ETL tool to do transformations, event joins, bot traffic filtering and some pre-aggregations before storing the data in HDFS.
  • Developed MapReduce programs to write data with headers and footers and Shell scripts to convert the data to fixed-length format suitable for Mainframes CICS consumption.
  • Used Maven for continuous build integration and deployment.
  • Agile methodology was used for development using XP Practices (TDD, Continuous Integration).
  • Participated in daily scrum meetings and iterative development.
  • Supported team using Talend as ETL tool to transform and load the data from different databases.
  • Exposure to burn-up, burn-down charts, dashboards, velocity reporting of sprint and release progress.

Environment: Hadoop, Talend, MapReduce, Cloudera, Hive, Pig, Kafka, Sqoop, Avro, ETL, Hortonworks, Datameer, Teradata, SQL Server, IBM Mainframes, Java 7.0, Log4J, JUnit, MRUnit, SVN, JIRA.

Confidential, Redmond, WA

Hadoop Developer


  • Worked on Distributed/Cloud Computing (MapReduce/Hadoop, Pig, HBase, ZooKeeper, etc.), Amazon Web Services (S3, EC2, EMR, etc.), Oracle SQL Performance Tuning, ETL, Java 2 Enterprise, Web Development, Mobile Application Development (Objective-C, Java Native Mobile Apps, Mobile Web Apps), and Agile Software Development.
  • Designed and wrote Java programs and Linux shell scripts to migrate large volumes of data, including text and byte data, to NoSQL stores using various data-parsing techniques in addition to MapReduce jobs.
  • Tuned and monitored the Hadoop clusters for memory management and MapReduce job health, enabling MapReduce jobs to push data from SQL to the NoSQL store.
  • Designed ETL jobs/packages using Talend Open Studio (TOS).
  • Used Talend components (tMap, tDie, tConvertType, tFlowMeter, tLogCatcher).
  • Worked hands-on with the ETL process. Imported data from various data sources and performed transformations.
  • Managed work with the Solr search engine, including indexing data, tuning relevance, developing custom tokenizers and filters, and adding functionality such as playlists, custom sorting and regionalization.
  • Imported data from a Kafka consumer into HBase using Spark Streaming.
  • Provided continuous monitoring and management of the Hadoop cluster through Cloudera Manager.
  • Involved in Installing, Configuring Hadoop Eco System, Cloudera Manager using CDH4 Distribution.
  • Used Pig as ETL tool to do transformations, event joins, filter bot traffic and some pre-aggregations before storing the data onto HDFS.
  • Designed the Prototypes of the Cloud Architecture in Amazon EC2 environment and delivered end to end prototype software within the aggressive schedule.
  • Delivered working widget software using ExtJS 4, HTML5, RESTful web services, JSON Store, Linux, Hadoop, ZooKeeper, NoSQL databases, Java, Spring Security, and JBoss Application Server for Big Data analytics.
  • Designed and developed Business Intelligence and reporting software using BIRT, with automated reporting capabilities to extract data from the SQL stores.
  • Developed Java-based programs using both RESTful (RESTEasy) services, following the resource-oriented model, and SOAP-based web services to give graphical user interfaces such as ExtJS 4 access to the resources.
  • Integrated Central Authentication Services (CAS) to Web applications.
  • Designed, developed and delivered JIRA Graphical User Interface Screens to enable integration with the web applications.
  • Implemented the Agile Development and Scrum practices with Sprint planning, Sprint delivery of the developed software and participated in the continuous improvement processes with the Sprint Retrospective.
  • Built data platforms, pipelines, and storage systems using Apache Kafka, Apache Storm and search technologies such as Elasticsearch.
  • Reviewed ETL application use cases before onboarding to Hadoop.
  • Developed use cases and technical prototypes for implementing Pig, Hive and HBase.
  • Analyzed alternative NoSQL data stores and produced detailed documentation comparing the HBase and Accumulo data stores.

Environment: Hadoop, HDFS, Talend 5.2.2, Cloudera, Hive, Pig, ZooKeeper, Kafka, Oozie, Spark, Flume, HBase, ETL, Java (JDK 1.6), Spring, RESTful, HTML5, ExtJS 4, JSON, Linux Shell Scripting, SQL.

Confidential, Columbus, IN

Hadoop Developer


  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Worked on analyzing the Hadoop cluster using different big data analytic tools including Hive, MapReduce, Pig and Kafka.
  • Involved in analyzing system failures, identifying root causes, and recommending courses of action.
  • Managed Hadoop clusters using Cloudera.
  • Extracted, transformed, and loaded (ETL) data from multiple sources like flat files, XML files, and databases.
  • Migrated ETL jobs to Pig scripts to do transformations, event joins and some pre-aggregations before storing the data in HDFS.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Installed, configured and optimized Hadoop infrastructure using Apache Hadoop and Cloudera Hadoop distributions.
  • Managed and scheduled Jobs on a Hadoop cluster.
  • Designed a data warehouse using Hive.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Developed Hive queries for the analysts.
  • Developed workflows in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Provided cluster coordination services through ZooKeeper.
  • Collected log data from web servers and integrated it into HDFS using Flume.
  • Implemented Fair schedulers on the Job tracker to share the resources of the Cluster for the Map Reduce jobs given by the users.
  • Provided NoSQL solutions in MongoDB and Cassandra for data extraction and storing huge amounts of data.
  • Managed and reviewed Hadoop log files.
  • Used Spark to analyze point-of-sale data and coupon usage.

Environment: Hadoop, HBase, Cloudera, Kafka, ETL, HDFS, Hive, Java (JDK 1.6), Pig, ZooKeeper, Oozie, Spark, Flume.

Confidential, Chicago, IL

Java/J2EE Developer


  • Involved in Software Development Life Cycle (SDLC) which includes requirement Gathering, Design, Coding and Testing.
  • Designed and developed Business Services using Spring Framework (Dependency Injection) and DAO Patterns.
  • Implemented the business layer by using Hibernate with Spring DAO and also developed mapping files, POJO java classes using ORM tool.
  • Implemented the web services and the integration of associated business modules.
  • Worked on generating the web service classes using Service Oriented Architecture (SOA) and SOAP.
  • Developed and implemented the MVC architectural pattern using JSP, Servlets and Action classes.
  • Specified the default initialization file through the log4j.configuration system property and loaded log4j.properties from the WebLogic classpath.
  • Used SoapUI to test the different methods in the web service.
  • Effective usage of J2EE Design Patterns namely Session Facade, Factory Method, Command and Singleton to develop various base framework components in the application.
  • Involved in unit integration, bug fixing, and user acceptance testing with test cases.
  • Used Stateless Session Bean to implement Business Process and interact with DA layer for DB Access.
  • Developed the presentation layer using JSP, HTML, XHTML, CSS and client validations using JavaScript.
  • Used Spring MVC and the AngularJS framework for configuring the application.
  • Used SQL and PL/SQL Programming extensively to talk to Oracle database.
  • Responsible as CVS administrator and for deploying web application in the Oracle App Server.
  • ANT was used as a build tool. Also worked in an Agile work environment.
  • Used Log4j for logging errors, messages and performance logs.

Environment: Windows XP, JDK 1.6, Oracle 10g, WebSphere, CVS, Rational ClearQuest, Servlets 3.0, JSP 2.2, HTML, XHTML, XSLT, JDBC, JMS, EJB, SOAP, WSDL, Web Services, Eclipse 3.2, Ant 1.6.5, Maven, Agile development process, PL/SQL, JUnit, JMock, and Log4j.

Confidential, OH

Java Developer


  • Designed, configured and developed the web application using JSP, JasperReports, Barbecue barcode scanner, JavaScript, and HTML.
  • Developed Session Beans for JSP clients.
  • Configured and Deployed EAR & WAR files on WebSphere Application Server.
  • Defined and designed the layers and modules of the project using OOAD methodologies and standard J2EE design patterns & guidelines.
  • Designed and developed all the user interfaces using JSP, Servlets and Spring framework.
  • Developed the DAO layer using Hibernate and used caching system for real time performance.
  • Designed the application to allow all users to utilize core functionality, as well as business specific functionality based on log on ID.
  • Developed Web Service provider methods (bottom up approach) using WSDL, XML and SOAP for transferring data between the Applications.
  • Configured Java Message Service (JMS) on WebSphere Server using Eclipse IDE.
  • Used AJAX for developing asynchronous web applications on client side.
  • Used JDBC for accessing database to track all credit aspects of accounts, which include financial review details, security held, actuarial exposure data and receivables.
  • Designed various applications using multi-threading concepts, mostly used to perform time consuming tasks in the background.
  • Wrote JSP & Servlets classes to generate dynamic HTML pages.
  • Designed class and sequence diagrams for Modify and Add modules.
  • Designed and developed XML processing components for dynamic menus on the application.
  • Adopted Spring framework for the development of the project.
  • Developed the user interface presentation screens using HTML.
  • Coordinated with the QA lead on development of the test plan, test cases, test code, and actual testing; responsible for allocating and resolving defects.
  • Performed all coding and testing using Eclipse.
  • Maintained the existing code base developed in the Spring and Hibernate frameworks by incorporating new features and fixing bugs.
  • Involved in fixing bugs and unit testing with test cases using JUnit framework.
  • Developed build and deployment scripts using Apache ANT to customize WAR and EAR files.
  • Developed stored procedures and triggers using PL/SQL in order to calculate and update the tables to implement business logic using Oracle database.
  • Used Spring ORM module for integration with Hibernate for persistence layer.
  • Involved in writing Hibernate Query Language (HQL) for persistence layer.
  • Used Log4j for application logging and debugging.

Environment: Java SE 7, Java EE 6, JSP 2.1, Servlets 3.0, HTML, JDBC 4.0, IBM WebSphere 8.0, PL/SQL, XML, Spring 3.0, Hibernate 4.0, Oracle 12c, ANT, JavaScript & jQuery, JUnit, Windows 7 and Eclipse 3.7.


Java Developer


  • Worked in an Agile development environment. Participated in scrum meetings.
  • Developed web pages using the JSF framework, establishing communication between various pages in the application.
  • Designed and developed JSP pages using Struts framework.
  • Utilized the Tiles framework for page layouts.
  • Involved in writing client-side validations using JavaScript.
  • Used Hibernate framework to persist the employer work hours to the database.
  • Made extensive use of Spring framework AOP features.
  • Followed the Use Case Design Specification and developed class and sequence diagrams using RAD and MS Visio.
  • Used JavaScript and AJAX to call Controllers that fetch a file from the server and display it in a popup without losing the attributes of the page.
  • Coded Test Cases and created Mock Objects using JMock and used JUnit to run tests.
  • Configured Hudson and integrated it with CVS to automatically run test cases with every build and generate code coverage report.
  • Configured Data Source on WebLogic Application server for connecting to Oracle, DB2 Databases.
  • Wrote complex SQL statements and used PL/SQL for performing database operations with the help of TOAD.
  • Created a user interface for the testing team that helped them test executables efficiently.
  • Mentored co-developers on new technologies. Participated in code reviews.
  • Worked on a DataStage project that generates automated daily reports after performing various validations.

Environment: UNIX, RAD 6.0, WebLogic, Oracle, Maven, JavaScript, JSF, JSP, Servlets, Log4J, Spring, pureQuery, JMock, JUnit, TOAD, MS Visio, DataStage, CVS, SVN, UML and SoapUI.
