
Talend + Hadoop Developer Resume

San Diego, CA

SUMMARY:

  • Overall 10+ years of experience in IT development with data warehouse ETL tools (Informatica, Talend 6.0/6.3, DataStage) along with Big Data.
  • Around 3 years of working experience in Talend (ETL tool), developing and leading end-to-end implementations of Big Data projects; comprehensive experience as a Hadoop developer across the Hadoop ecosystem: MapReduce, Hadoop Distributed File System (HDFS), Hive, Impala, YARN, Oozie, Hue and Spark.
  • Worked on importing and exporting data between databases such as Oracle and Teradata and HDFS/Hive using Sqoop.
  • Worked on data integration projects in Greenplum/Exadata/Vertica/Oracle/SQL Server/Informatica/Talend/BigQuery environments.
  • 3 years of experience working with Java to write MapReduce programs.
  • Working experience using Java to implement packages, classes, inheritance, polymorphism and encapsulation.
  • Experience designing, reviewing, implementing and optimizing data transformation processes in the Hadoop and Talend/Informatica ecosystems. Able to consolidate, validate and cleanse data from a vast range of sources - from applications and databases to files and Web services.
  • Capable of extracting data from an existing database, Web sources or APIs. Experience designing and implementing fast and efficient data acquisition using Big Data processing techniques and tools.
  • Involved in creating tables, partitioning and bucketing tables, and creating UDFs in Hive.
  • Strong troubleshooting and problem-solving skills with a logical and pragmatic attitude.
  • Team player with strong oral communication and interpersonal skills.
  • Work with the business to gather requirements and define Data Quality solutions for data profiling, standardization, cleansing, etc.
  • Define and contribute to the development of standards, guidelines, design patterns and common development frameworks and components.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
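The custom MapReduce analysis mentioned above follows the standard map-then-reduce pattern; a minimal sketch in Python (mirroring a Hadoop Streaming mapper/reducer, with a hypothetical word-count example not taken from any project listed here):

```python
from itertools import groupby

def mapper(lines):
    """Emit (word, 1) pairs, as a streaming mapper would write to stdout."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    """Sum counts per key; Hadoop delivers pairs to the reducer sorted by key."""
    for key, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield key, sum(count for _, count in group)

if __name__ == "__main__":
    sample = ["big data on Hadoop", "big data pipelines"]
    print(dict(reducer(mapper(sample))))
```

In a real job the framework handles the sort-and-shuffle between the two functions; the sketch reproduces it with `sorted`.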

TECHNICAL SKILLS:

Hadoop ecosystem: MapReduce, Sqoop, Hive, Impala, Oozie, Hue, Pig, HBase, HDFS, ZooKeeper, YARN, Spark

ETL: Informatica PowerCenter (6.2/7.1/8.1/8.6/9.1/9.5/9.6.1), IDQ, IDE, Talend Big Data Studio 6.0/6.3, DataStage (7.5/8.1)

Databases: Teradata (13.0/14.10), Exadata, Oracle 9i/10g/11g (SQL, PL/SQL basics), SQL Server 2005, DB2, Greenplum, Vertica

Tools & Utilities: Toad, Microsoft Visio, WinSCP, Appworx, Control-M, Remedy, AutoSys, Jenkins, GitHub, Jira, Talend Administration Center.

Languages: C, C++, Java, SQL, PL/SQL, Pig Latin, HiveQL, UNIX shell scripting, Python.

Operating Systems: Windows 98/2000/XP/7, UNIX, Linux

Domain Knowledge: Finance, Banking, Telecom, Healthcare, Insurance, Manufacturing, PlayStation

Methodologies: Agile, Waterfall.

PROFESSIONAL EXPERIENCE:

Confidential, San Diego,CA

Talend+Hadoop Developer

Responsibilities:

  • Understand the current system (DPE/GFM); analyze high-level system specifications, business requirements and/or use cases.
  • Attend daily meetings with the customer to find out the exact requirements and provide technical solutions that meet them.
  • Migrate Exadata/Informatica projects to Talend.
  • Create Hive and Vertica queries that help market analysts spot emerging trends by comparing fresh data with reference tables; use different Talend components (tOracleInput, tOracleOutput, tHiveInput, tHiveOutput, tHiveRow, tVerticaInput, tVerticaOutput, tVerticaRow, tUniqRow, tAggregateRow, tRunJob, tPreJob, tPostJob, tMap, tJavaRow, tJavaFlex, tFilterRow, etc.) to develop standard jobs.
  • Use batch jobs to create Spark jobs that load data onto HDFS.
  • Use Job Conductor to deploy the job (.zar) files and to schedule and monitor jobs.
  • Implement partitioning, dynamic partitions and bucketing in Hive.
  • Migrate data from relational databases (Oracle Exadata) or external data to HDFS.
  • Implement Hive UDFs for evaluation, filtering, loading and storing of data.
  • Design managed and external tables in Hive to optimize performance.
  • To improve performance, use auto map join, avoid skew joins, optimize the LIMIT operator, enable parallel execution, enable MapReduce strict mode and use a single reducer for multi-GROUP BY functions.
  • Load data from different sources (databases and files) into Hive using Talend (standard, MapReduce and Spark jobs); monitor system health and logs and respond to any warning or failure conditions.
  • Use the Hue web interface to query data and manage tables.
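The deduplicate-then-aggregate flow built from tUniqRow and tAggregateRow in the jobs above can be sketched in plain Python (component behavior approximated; the row fields "region" and "amount" are hypothetical examples, not project data):

```python
from collections import OrderedDict

def t_uniq_row(rows, key_fields):
    """Keep the first row seen per key, like tUniqRow's 'unique' output flow."""
    seen = OrderedDict()
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        seen.setdefault(key, row)
    return list(seen.values())

def t_aggregate_row(rows, group_by, sum_field):
    """Group rows and sum one column, like tAggregateRow with a 'sum' operation."""
    totals = {}
    for row in rows:
        key = row[group_by]
        totals[key] = totals.get(key, 0) + row[sum_field]
    return totals

rows = [
    {"id": 1, "region": "west", "amount": 10},
    {"id": 1, "region": "west", "amount": 10},  # duplicate id, dropped by dedupe
    {"id": 2, "region": "east", "amount": 5},
]
deduped = t_uniq_row(rows, ["id"])
print(t_aggregate_row(deduped, "region", "amount"))  # {'west': 10, 'east': 5}
```

In Talend the same logic is wired graphically between input and output components; the sketch only illustrates the row-level semantics.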

Environment: Hadoop (Cloudera), HDFS, Vertica, Hive, Impala, Sqoop, Spark, Oracle (Exadata/HotMPP), UNIX, Talend Big Data Studio 6.3

Confidential, CA

Talend+Hadoop Developer

Responsibilities:

  • Provides expertise during the initial phases of the project; analyzes high-level system specifications, business requirements and/or use cases; converts information into appropriate-level specifications and a system design plan for development.
  • Provides appropriate documentation for design decisions, estimating assumptions, code modules and performance metrics as required by organization standards.
  • Uses comprehensive application and/or technical knowledge to provide guidance and technical leadership to project or maintenance resources; maintains an awareness of other projects and their possible effects on ongoing projects.
  • Build data systems and data pipelines that extract, classify, merge, and deliver new insights.
  • Data ingestion, aggregation, loading and transformation of large sets of structured, semi-structured and unstructured data into Hadoop (data lake).
  • Developed Spark code and Spark SQL for faster testing and processing of real-time structured and unstructured data.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
  • Implemented Hive UDF's for evaluation, filtering, loading and storing of data.
  • Partitioning, dynamic partitions and buckets in Hive.
  • Migrated data from relational databases (Oracle, Teradata) or external data to HDFS using Sqoop.
  • Designed both managed and external tables in Hive to optimize performance; to improve performance, used auto map join, avoided skew joins, optimized the LIMIT operator, enabled parallel execution, enabled MapReduce strict mode and used a single reducer for multi-GROUP BY.
  • Regularly monitored the Hadoop cluster to ensure installed applications are free from errors and warnings.
  • Developed MapReduce programs using combiners, sequence files, compression techniques, chained jobs, and the multiple input and output APIs.
  • Loaded data from different sources (databases and files) into Hive using Talend.
  • Monitored system health and logs and responded accordingly to any warning or failure conditions.
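The combiner technique named in the MapReduce work above pre-aggregates each mapper's output before the shuffle, cutting network traffic. A minimal Python sketch (keys and counts are hypothetical log-level tallies, not project data):

```python
def combine(mapper_output):
    """Local, per-mapper aggregation; applies the reduce logic early so fewer
    (key, value) records are shuffled to the reducers."""
    partial = {}
    for key, value in mapper_output:
        partial[key] = partial.get(key, 0) + value
    return list(partial.items())

# One mapper emitted 4 records; after combining, only 2 cross the network.
mapper_output = [("error", 1), ("warn", 1), ("error", 1), ("error", 1)]
print(combine(mapper_output))  # [('error', 3), ('warn', 1)]
```

A combiner is only safe when the reduce operation is associative and commutative (as summation is), which is why Hadoop treats it as an optional optimization.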

Environment: Hadoop (Cloudera), HDFS, MapReduce, Hive, Sqoop, Spark, DB2, Oracle, Teradata, Eclipse, UNIX, Talend Big Data Studio 6.0

Confidential, Northbrook, IL

Talend+Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Handled importing of data from multiple data sources (Oracle, SQL Server) using Sqoop; performed cleaning, transformations and joins using Pig.
  • Pushed data as delimited files into HDFS using Talend Big Data Studio.
  • Involved in writing MapReduce programs using Java.
  • Loaded and transformed data into HDFS from large sets of structured data in Oracle/SQL Server using Talend Big Data Studio.
  • Exported analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Experience in providing support to data analyst in running Hive queries.
  • Continuous monitoring and managing the Hadoop cluster using Cloudera Manager.
  • Created Hive tables and partitions to store different data formats.
  • Involved in loading data from UNIX file system to HDFS.
  • Experience in managing and reviewing Hadoop log files.
  • Consolidated all defects, reported them to the PM/leads for prompt fixes by development teams and drove them to closure.
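The clean-then-join step done with Pig above (FILTER/FOREACH for cleansing, JOIN for combining sources) can be sketched in plain Python; the tables, field names and values are hypothetical illustrations:

```python
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
orders = [
    {"cust_id": 1, "total": " 25 "},  # dirty whitespace to clean
    {"cust_id": 1, "total": "10"},
    {"cust_id": 3, "total": "7"},     # no matching customer: dropped by inner join
]

# FOREACH-style cleaning: trim and cast the total column.
cleaned = [{**o, "total": int(o["total"].strip())} for o in orders]

# JOIN orders BY cust_id, customers BY cust_id (inner join).
by_id = {c["cust_id"]: c for c in customers}
joined = [{**o, "name": by_id[o["cust_id"]]["name"]}
          for o in cleaned if o["cust_id"] in by_id]
print(joined)
```

Pig executes the same relational operations as distributed MapReduce jobs over HDFS; the sketch shows only the per-row semantics.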

Environment: Apache Hadoop x.2, MapReduce, Hive, Sqoop, Spark, SQL, Eclipse, UNIX scripts, Oracle, SQL Server, Talend Big Data Studio 6.0.

Confidential, Dublin, OH

ETL developer

Responsibilities:

  • Attend daily meetings with the customer to find out the exact requirements and provide technical solutions that meet them.
  • Gather and analyze business and technical requirements.
  • Work with business to gather requirements and define the Data Quality solutions for data profiling, standardization and cleansing etc.
  • Define and contribute to development of standards, guidelines, design patterns and common development frameworks & components
  • Working effectively in a distributed global team environment
  • Prepare the unit test plan/design (with direction from the customer).
  • Experience preparing HLD, LLD, UTC and tech design documents.

Environment: Informatica 9.6.1, Oracle 11g, SQL Server, DB2, UNIX, Windows 7.

Confidential

ETL Module Lead

Responsibilities:

  • Performed the data profiling and analysis making use of Informatica Data Explorer (IDE) and Informatica Data Quality (IDQ).
  • Provide solutions for data quality operations and Informatica ETL Processes to support Data Integration and Reporting requirements.
  • Data profiling, cleansing and standardization using IDQ, integrated with the Informatica suite of tools.
  • Develop and contribute to the strategic vision for Data Quality and Data Archive.
  • Perform hands-on development with the Data Quality tools (Informatica Developer, Analyzer).
  • Work with business to gather requirements and define the Data Quality solutions for data profiling, standardization and cleansing etc.
  • Define and contribute to development of standards, guidelines, design patterns and common development frameworks & components
  • Working effectively in a distributed global team environment
  • Informatica administration activities (via the Admin Console): create folders and user accounts; view and manage folder permissions and privileges; start and stop services; view service status and logs.
  • Informatica application support and maintenance.
  • Act as tertiary escalation contact for issue resolution and problem management.

Environment: Informatica (8.6/9.1), IDQ, IDE, BusinessObjects XI R3, Teradata, Oracle 10g, Appworx, UNIX, Remedy, Windows XP.

Confidential

ETL Module Lead

Responsibilities:

  • Gathered requirements by attending daily calls with the onsite team.
  • As a module lead, involved in mentoring the team to achieve goals on time.
  • Involved in preparing the HLD, LLD and UTC documents.
  • Proper understanding and analysis of the requirements.
  • Used the DataStage client tools (DataStage Designer, DataStage Director, DataStage Manager) extensively.
  • Involved in defining and preparing the unit test plan.
  • Involved in reviewing code before delivering the objects.
  • Integration Testing.

Environment: Informatica (8.6), DataStage (7.5/8.1), Oracle 10g, Linux, AutoSys, Windows XP.

Confidential

ETL developer

Responsibilities:

  • Successfully handled team members for the BENCAP and DBP modules.
  • Proper understanding and analysis of the requirements.
  • Designed according to the requirements (high level to low level).
  • Implementation of mapping.
  • Unit test Plan preparation.
  • Code review and Unit Testing.
  • Product Testing, Integration Testing.
  • Code migration from the development to the production environment.
  • High level ETL Production Support documentation.

Environment: Informatica (6.1/7.1), Oracle 9i, UNIX, Windows XP.
