Talend - Hadoop Lead Resume
Harrisburg, PA
SUMMARY
- Over 14 years of experience in analysis, design, development, implementation, and testing of web-based distributed applications
- 4+ years' experience in Big Data technologies as a Hadoop Developer/Lead with strong expertise in HDFS, Hive, Impala, Sqoop, Cassandra, ParAccel, Pig, MapReduce, HBase, Flume, Kafka, Spark, Oozie, Bedrock Workflow, and Talend 6.2 Big Data Platform; hands-on experience designing optimized solutions using Hadoop components such as MapReduce, Hive, Sqoop, Pig, HDFS, and Flume; domain expertise in Media & Entertainment, Manufacturing, and Healthcare
- Proficient in Hadoop, its ecosystem, and Java/J2EE-related technologies
- Involved in development and enhancement projects; worked on the Hortonworks HDP 1.3 and 2.1.4, Cloudera, and MapR distributions and on Hadoop ecosystem components such as HDFS, MapReduce, Hive, Impala, Sqoop, Flume, Oozie, NoSQL databases (Cassandra, HBase), the analytical database ParAccel, and data lakes
- Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Good experience extracting data and generating statistical analyses with the Business Intelligence tool Tableau for better analysis of data.
- Experience in creating complex SQL queries and in SQL tuning.
- Very good knowledge of and hands-on experience in data warehousing
- Exposure to Cloudera development environment and management using Cloudera Manager.
- Worked on Ambari for cluster management (cluster health checks)
- Expertise in all major phases of the SDLC, including design, development, deployment, implementation, and support.
- Working experience in Agile and Waterfall models.
- Expertise in preparing test cases, documenting, and performing unit and integration testing.
- Expertise in cross-platform (PC/Mac, desktop, laptop, tablet) and cross-browser (IE, Chrome, Firefox, Safari) development.
- Skilled in problem solving and troubleshooting, strong organizational and interpersonal skills.
- Professional and cooperative attitude with an adaptable approach to problem analysis and solution definition.
- Good team player with strong analytical and communication skills.
TECHNICAL SKILLS
Languages: Core Java, Scala
Programming Architecture: MapReduce, Pig
Databases: Cassandra, Paraccel, HBase, Hive, Impala
File Systems: HDFS
Tools & Utilities: Apache Spark SQL, Sqoop, Flume, Ambari, Jira, PuTTY, WinSCP, Squirrel SQL, Talend 6.2, Tableau 8.2, Oozie
Domain Knowledge: Media & Entertainment, Manufacturing, Healthcare
PROFESSIONAL EXPERIENCE
Talend - Hadoop Lead
Confidential, Harrisburg, PA
Environment: CDH 5.5, HDFS, Hive, Impala, Talend 6.2 Big Data Platform, TAC, CONTROL-M, Git, Shell Scripts, Linux
Responsibilities:
- Responsible for requirements gathering and preparation of design documents
- Involved in low-level design for Talend components, Hive, Impala, and shell scripts to process data.
- Worked on Talend ETL scripts to pull data from TSV files and the Oracle database into HDFS.
- Developed Hive tables to load data from different sources.
- Involved in metadata schema design.
- Developed a strategy for full and incremental loads using Talend and Hive.
- Mainly worked on Hive/Impala queries to categorize parts and sales data.
- Implemented partitioning and dynamic partitions in Hive (see the sketch after this list).
- Created context files and log files to capture warning or failure conditions.
- Participated in Sprint Planning and Sprint Retrospective meetings
- Worked in an Agile development approach.
- Created the estimates and defined the sprint stages.
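A minimal sketch of the dynamic-partition load in Hive referenced above, written in core Java against the HiveServer2 JDBC driver. The host, credentials, database, and table names (sales_db, sales_staging, sales_partitioned) are illustrative assumptions, not the actual project objects.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveDynamicPartitionLoad {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC driver; host, port, and database are placeholders
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hive-host:10000/sales_db", "etl_user", "");
             Statement stmt = conn.createStatement()) {

            // Let Hive derive partition values from the data itself
            stmt.execute("SET hive.exec.dynamic.partition = true");
            stmt.execute("SET hive.exec.dynamic.partition.mode = nonstrict");

            // Move staged parts/sales data into a table partitioned by sale_date;
            // the partition column must come last in the SELECT list
            stmt.execute("INSERT INTO TABLE sales_partitioned PARTITION (sale_date) "
                       + "SELECT part_no, qty, amount, sale_date FROM sales_staging");
        }
    }
}
```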
Sr. Hadoop Developer
Confidential, Peoria, IL
Environment: CDH 5, HDFS, Hive, Pig, Impala, Sqoop, Tableau, Oozie Workflows, Shell Scripts, IntelliJ, Gradle, Core Java, Junit, Remedy, Apache SPARK SQL (POC), Linux
Responsibilities:
- Responsible for requirements gathering and preparation of design documents
- Involved in low-level design for MapReduce, Hive, Impala, and shell scripts to process data.
- Worked on ETL scripts to pull data from DB2 and Oracle databases into HDFS.
- Developed Hive tables to load data from different sources.
- Involved in database schema design.
- Participated in Sprint Planning and Sprint Retrospective meetings
- Attended daily Scrum status meetings.
- Proposed an automated system using shell scripts to run the Sqoop jobs.
- Worked in an Agile development approach.
- Created the estimates and defined the sprint stages.
- Developed a strategy for full and incremental loads using Sqoop.
- Mainly worked on Hive/Impala queries to categorize data for different claims.
- Developed Apache Spark scripts to connect to Hive (POC); see the sketch after this list.
- Implemented partitioning and dynamic partitions in Hive.
- Generated final reporting data using Tableau for testing by connecting to the corresponding Hive tables using the Hive ODBC connector.
- Monitored system health and logs and responded to any warning or failure conditions.
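A minimal sketch of the Spark-to-Hive connectivity POC mentioned above, written in core Java with the Spark SQL API. The application name and the claims table and column names are illustrative assumptions.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkHivePoc {
    public static void main(String[] args) {
        // enableHiveSupport() points Spark SQL at the existing Hive metastore
        SparkSession spark = SparkSession.builder()
                .appName("SparkHivePoc")
                .enableHiveSupport()
                .getOrCreate();

        // Categorize claims directly from the Hive warehouse (table/column names are placeholders)
        Dataset<Row> claimsByType = spark.sql(
                "SELECT claim_type, COUNT(*) AS claim_count FROM claims GROUP BY claim_type");
        claimsByType.show();

        spark.stop();
    }
}
```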
Sr. Hadoop Lead Developer
Confidential
Environment: Apache Hadoop, HDFS, Hive, Pig, MapR, Data lake, Tableau, Bedrock Workflows, Shell Scripts, Linux, Rally, AnthillPro, Talend
Responsibilities:
- Responsible for requirements gathering and preparation of design documents
- Involved in low-level design for MapReduce, Hive, and shell scripts to process data.
- Worked on ETL scripts to pull data from DB2, Oracle, and MS SQL databases into HDFS.
- Developed Hive tables to load data from different sources.
- Involved in database schema design.
- Participated in Sprint Planning and Sprint Retrospective meetings
- Attended daily Scrum status meetings.
- Delivered sprint demos to the Product Owner for each sprint
- Developed Bedrock workflows for the CDB, ECODS, BOSS, and SMART data sources.
- Set up the 64-node cluster and configured the entire Hadoop platform.
- Migrated the required data from DB2, Oracle, and MySQL into HDFS using Sqoop and imported flat files of various formats into the HDFS data lake.
- Proposed an automated system using shell scripts to run the Sqoop jobs.
- Ingested and stored the data in the data lake
- Worked in an Agile development approach.
- Created the estimates and defined the sprint stages.
- Developed a strategy for full and incremental loads using Sqoop.
- Mainly worked on Hive queries to categorize data for different claims.
- Integrated the Hive warehouse with HBase
- Wrote customized Hive UDFs in Java where the required functionality was too complex for built-in functions (see the sketch after this list).
- Implemented partitioning and dynamic partitions in Hive.
- Generated final reporting data using Tableau for testing by connecting to the corresponding Hive tables using the Hive ODBC connector.
- Maintained system integrity of all sub-components (primarily HDFS, MapReduce, HBase, and Hive).
- Monitored system health and logs and responded to any warning or failure conditions.
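A minimal sketch of the kind of customized Hive UDF written in Java when built-in functions could not express the logic. The class name, the registered function name, and the normalization rule are illustrative assumptions.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Registered in Hive with, for example:
//   ADD JAR normalize-udf.jar;
//   CREATE TEMPORARY FUNCTION normalize_claim AS 'NormalizeClaimCode';
public class NormalizeClaimCode extends UDF {
    // Hive calls evaluate() once per row; returning null propagates NULLs
    public Text evaluate(Text rawCode) {
        if (rawCode == null) {
            return null;
        }
        // Example rule: trim whitespace and upper-case the claim code
        return new Text(rawCode.toString().trim().toUpperCase());
    }
}
```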
Sr. Hadoop Developer
Confidential
Environment: Apache Hadoop, HDFS, Hive, Pig, Sqoop, ParAccel, Cassandra, Shell Scripts, MapReduce, Hortonworks, CRON Jobs, Oracle, MySQL, Jira, Linux
Responsibilities:
- Responsible for requirements gathering, analyzing data sources such as Omniture, iTunes, and Spotify, and preparation of design documents
- Responsible for designing and creating Hive tables to load data
- Developed shell scripts for data flow automation and for uploading data into the ParAccel server.
- Worked on Sqoop scripts to pull data from the Oracle database into HDFS
- Wrote MapReduce programs and Hive UDFs in Java (see the sketch after this list).
- Developed Hive queries for the analysts.
- Created an e-mail notification service that alerted the requesting team upon job completion.
- Defined job workflows according to their dependencies in crontab.
- Played a key role in productionizing the application after testing by BI analysts.
- Maintained system integrity of all Hadoop-related sub-components.
- Involved in orchestrating delta generation for time-series data and developed ETL from the ParAccel database
- Involved in Cassandra database schema design
- Pushed data to Cassandra databases using the bulk load utility
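A minimal sketch of one of the MapReduce programs written in Java: counting plays per track from the ingested feed data. The input layout (tab-separated records with the track ID in the second field) and the class names are illustrative assumptions.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class PlayCountJob {

    // Mapper: emit (trackId, 1) for each tab-separated record; field position is assumed
    public static class PlayMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text trackId = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 1) {
                trackId.set(fields[1]);
                context.write(trackId, ONE);
            }
        }
    }

    // Reducer: sum the play counts per track
    public static class PlayReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "play-count");
        job.setJarByClass(PlayCountJob.class);
        job.setMapperClass(PlayMapper.class);
        job.setCombinerClass(PlayReducer.class);
        job.setReducerClass(PlayReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```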
Software Engineer
Confidential
Environment: Java, J2EE, JCL, DB2, CICS
Responsibilities:
- Conducted requirements gathering sessions with the business users to collect business requirements (BRDs), data requirements, and user interface requirements.
- Responsible for the initiation, planning, execution, control and completion of the project
- Worked alongside the Development team in solving critical issues during the development.
- Responsible for developing management reporting using Cognos reporting tool.
- Conducted user interviews and documented reconciliation workflows.
- Conducted detailed analysis of current processes and developed new process flow, data flow, and workflow models and use cases using Rational Rose and MS Visio
- Prepared use cases, business process models, data flow diagrams, and user interface models.
- Gathered & analyzed requirements for EAuto, designed process flow diagrams.
- Defined business processes related to the project and provided technical direction to development workgroup.
- Analyzed the legacy system and the Financial Data Warehouse.
- Participated in database design sessions and database normalization meetings.
- Managed the change request and defect management processes.
- Managed UAT testing and developed test strategies, test plans, reviewed QA test plans for appropriate test coverage.
- Coordinated with the build team to deploy the application through the integration, functional, regression, and production environments.
- Prepared a skill matrix for all team members.
Software Engineer
Confidential
Environment: JCL, DB2, IMS DB/DC, Stored Procedures
Responsibilities:
- Gathered business requirements and developed BRD & FSD
- Prepared project deliverables: Business Workflow analysis, process analysis, user requirement documents (Use Cases & Use case diagrams) and managed the requirements using Rational Requisite Pro.
- Assisted the team in extracting the rules behind the core Coverages program.
- Used UML methodology to prepare Use Cases from the gathered requirements and created diagrams using MS Visio.
- Involved in gathering data requirements, performing data analysis, and defining the data mapping.
- Effectively implemented the change control processes and updated the impacted business process documentation.
- Extensively used SQL and Excel for Data Analysis.
- Planned & implemented the Testing cycle schedules, timelines, test plan, Test cases & test data sets for system testing.
- Independently prepared UAT test plans, test conditions, test cases and developed user training manuals.
- Conducted User Acceptance Tests and maintained all defects in Test Director.
- Developed the training schedules, documents and trained the users.
- Designed the Java GUI in accordance with the CICS Screens.
Software Engineer
Confidential
Environment: JCL, DB2, IMS DB/DC, Teradata, Stored Procedures
Responsibilities:
- Responsible for the initiation, planning, execution, control and completion of the project
- Worked alongside the Development team in solving critical issues during the development.
- Conducted user interviews and documented reconciliation workflows.
- Conducted detailed analysis of current processes and developed new process flow, data flow, and workflow models and use cases using Rational Rose and MS Visio
- Prepared use cases, business process models, data flow diagrams, and user interface models.
- Gathered & analyzed requirements for EAuto, designed process flow diagrams.
- Defined business processes related to the project and provided technical direction to development workgroup.
- Analyzed the legacy system and the Financial Data Warehouse.
- Participated in database design sessions and database normalization meetings.
- Managed the change request and defect management processes.
- Managed UAT testing and developed test strategies, test plans, reviewed QA test plans for appropriate test coverage.
- Coordinated with the build team to deploy the application through the integration, functional, regression, and production environments.