Senior Hadoop Developer Resume
Chicago, IL
SUMMARY:
- 9+ years of IT experience in gathering and analyzing customers' technical requirements and in developing, managing and maintaining projects on platforms such as Hadoop, QlikView and Talend.
- Expertise in processing 25 PB of data on 700 nodes across Dev and Prod clusters.
- Excellent understanding of Hadoop architecture and the different daemons of a Hadoop cluster, including Job Tracker, Task Tracker, Name Node and Data Node.
- Working experience with the Snowflake elastic data warehouse, a cloud-based data warehouse for storing and analyzing data.
- Designed and developed an automation framework to reduce the cost and effort of development tasks.
- Working experience with Talend components such as tJavaRow, tMap, tHMap, tJDBC, context variables, and the HBase, Hive and Pig components, as well as Big Data batch-processing and streaming components.
- Experience working with Hadoop in standalone, pseudo-distributed and fully distributed modes.
- Hands-on experience with big data tools such as Facebook Presto, Apache Drill and Snowflake.
- Hands-on experience with Kafka streaming and Spark Streaming in Scala.
- Efficient in writing MapReduce programs and using the Apache Hadoop API to analyze structured and unstructured data.
- Experienced in handling different file formats such as text files, sequence files and JSON files.
- Expertise in implementing ad-hoc queries using HiveQL.
- Performed extensive data validation using Hive dynamic partitioning and bucketing (a brief sketch follows this summary).
- Working experience with Apache Phoenix, a massively parallel relational query layer for analyzing data on Apache Hadoop.
- Implemented data quality and price-gap rules in the Talend ETL tool.
- Expertise in developing Hive generic UDFs that implement complex business logic and incorporate it into HiveQL.
- Experienced in using aggregate functions and table-generating functions, and in implementing UDFs to handle complex objects.
- Experienced in join optimizations such as map joins and sorted bucketed map joins, as well as Hive Merge, Update and Delete operations and the Hue interface.
- Worked with QlikView Extensions like SVG Maps, HTML Content.
- Developed Set Analysis to provide custom functionality in QlikView application.
- Used Binary, Resident, Preceding and Incremental loads during data modeling.
- Experienced in performance-tuning Hive queries using configurable Hive parameters.
- Developed Pig Latin scripts using Pig operators to extract data from data files and load it into HDFS.
- Experience in using Apache Sqoop to import and export data between different sources and HDFS/Hive.
- Worked with different file formats such as RCFile, SequenceFile, ORC and Avro.
- Hands-on experience in configuring and working with Flume to load data from multiple sources directly into HDFS.
- Good at working on low-level design documents and System Specifications.
- Experience working with BI teams to translate big data requirements into Hadoop-centric solutions.
- Good experience in Hadoop administration, including cluster configuration, single-node and multi-node setup, and installation in distributed environments.
- Trained around 50 associates on Hadoop, QlikView and its relative components.
- Exposure to shell scripting for the build and deployment process.
- Worked as part of an agile team serving as a developer to customize, maintain and enhance a variety of applications for Hadoop.
- Comprehensive knowledge of Software Development Life Cycle coupled with excellent communication skills.
- Strong technical and interpersonal skills combined with great commitment towards meeting deadlines.
- Experience working in both team and individual environments. Always eager to learn new technologies and implement them in challenging environment.
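The Hive dynamic-partitioning and bucketing work noted above can be illustrated with a minimal sketch. This is only an illustration: the database, table, column names and bucket count are hypothetical, and the SET statements shown are the standard switches for enabling dynamic partitions (the bucketing flag matters on older Hive versions).

```sh
#!/usr/bin/env bash
# Minimal sketch: create a bucketed, partitioned Hive table and populate it
# with dynamic partitioning. Database, table and column names are hypothetical.
set -euo pipefail

hive -e "
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.enforce.bucketing=true;   -- required on older Hive releases

CREATE TABLE IF NOT EXISTS sales_db.orders_part (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DECIMAL(10,2)
)
PARTITIONED BY (order_date STRING)
CLUSTERED BY (customer_id) INTO 32 BUCKETS
STORED AS ORC;

-- The dynamic partition value is taken from the last column of the SELECT list.
INSERT OVERWRITE TABLE sales_db.orders_part PARTITION (order_date)
SELECT order_id, customer_id, amount, order_date
FROM sales_db.orders_staging;
"
```

Bucketing on a high-cardinality key such as the customer id also keeps the data evenly distributed and enables bucketed map joins downstream.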
TECHNICAL SKILLS:
Hadoop Technologies and Distributions: Apache Hadoop, Cloudera Distribution (CDH3, CDH4, CDH5), IBM BigInsights, Hortonworks
Hadoop Ecosystem: HDFS, MapReduce, Spark, Kafka, Hive, Pig, Sqoop, Oozie, Flume, Hue
NoSQL Database: HBase
Databases: Oracle, MySQL, Greenplum, Snowflake, Phoenix
Operating Systems: Linux (Red Hat, CentOS), Windows XP/7/8
ETL+Reporting: Talend
Cluster Management Tools: Cloudera Manager
BI Tools: QlikView
PROFESSIONAL EXPERIENCE:
Confidential, Chicago, IL
Senior Hadoop Developer
Responsibilities:
- Designed and developed the data lake enterprise-layer gold conformed process, which is available to the consumption team and business users for analytics.
- Responsible for designing and developing an automated framework that automates the development process in the data lake.
- Integrated Talend with HBase, storing the processed enterprise data in separate column families and column qualifiers.
- Used crontab and Zena to schedule trigger jobs in production.
- Worked with cross-functional consulting teams within the data science and analytics team to design, develop and execute solutions that derive business insights and solve clients' operational and strategic problems.
- Involved in migrating Teradata queries to Snowflake data warehouse queries.
- Worked in Agile Scrum model and involved in sprint activities.
- Gathered and analyzed business requirements.
- Worked on various Talend integrations with HBase and Avro format, Hive, Phoenix and Pig components.
- Worked with GitHub, Zena, Jira and Jenkins, and deployed projects into production environments.
- Involved in Cluster coordination services through Zookeeper.
- Worked on integration with Phoenix thick and thin clients, and was involved in installing and developing Phoenix-Hive and Hive-HBase integrations.
- Wrote automated UNIX shell scripts and developed an automation framework with Talend and UNIX.
- Created Merge, Update and Delete scripts in Hive and worked on performance-tuning Hive joins (see the sketch after this section).
Environment: Hadoop, HDFS, Hive, QlikView, UNIX shell scripting, Hue, HBase, Avro format, Phoenix, Talend, Snowflake.
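The cron-triggered Hive Merge/Update/Delete scripts mentioned above follow a pattern along these lines. This is a sketch, not the actual production job: the schedule, HiveServer2 URL, script path and table names are placeholders, and the target table is assumed to be a transactional (ACID) ORC table.

```sh
#!/usr/bin/env bash
# run_customer_merge.sh -- upsert daily changes into an ACID Hive table.
# Illustrative only: connection details, schedule and table names are placeholders.
# Example crontab entry to trigger the job nightly at 2 AM:
#   0 2 * * * /opt/jobs/run_customer_merge.sh >> /var/log/jobs/customer_merge.log 2>&1
set -euo pipefail

beeline -u "jdbc:hive2://hiveserver2:10000/edw" -e "
MERGE INTO edw.customer AS t
USING edw.customer_changes AS s
ON t.customer_id = s.customer_id
WHEN MATCHED AND s.op = 'D' THEN DELETE
WHEN MATCHED THEN UPDATE SET name = s.name, email = s.email
WHEN NOT MATCHED THEN INSERT VALUES (s.customer_id, s.name, s.email);
"
```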
Confidential, Deerfield, IL
Hadoop Developer
Responsibilities:
- Responsible for designing and implementing ETL process to load data from different sources, perform data mining and analyze data using visualization/reporting.
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Involved in migrating Teradata queries, including updates, inserts and deletes, to Hive queries.
- Developed Pig scripts for noise reduction and Sqoop scripts for historical data from large tables exceeding 4 TB.
- Worked in Agile Scrum model and involved in sprint activities.
- Gathered and analyzed business requirements.
- Involved in optimizing query performance and data load times in Pig, Hive and MapReduce applications.
- Expert in optimizing Hive performance using partitioning and bucketing concepts.
- Interacted with data scientists to implement ad-hoc queries using HiveQL, partitioning, bucketing and custom Hive UDFs.
- Experienced in optimizing Hive queries and joins, and in using different data files with custom SerDes.
- Designed the process to do historical/incremental load.
- Involved in Sqooping more than 20 TB of data from Teradata to Hadoop (see the Sqoop sketch after this section).
Environment: Hadoop, HDFS, Hive, QlikView, UNIX shell scripting, Hue, HBase, Pig, Sqoop, Talend.
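The Teradata-to-Hadoop loads referenced above can be sketched with a single Sqoop import. The JDBC URL, credentials, table names and mapper count below are placeholders, and the Teradata JDBC driver is assumed to be available on the Sqoop classpath.

```sh
#!/usr/bin/env bash
# Sketch of a Sqoop import from Teradata into a Hive staging table.
# All connection details and names are illustrative placeholders.
set -euo pipefail

sqoop import \
  --connect "jdbc:teradata://td-prod/DATABASE=edw" \
  --driver com.teradata.jdbc.TeraDriver \
  --username "$TD_USER" \
  --password-file /user/etl/.td_password \
  --table ORDERS_HISTORY \
  --split-by ORDER_ID \
  --num-mappers 16 \
  --hive-import \
  --hive-table staging.orders_history
```

Splitting on a numeric, evenly distributed key and tuning the mapper count are the main levers for moving multi-terabyte tables in a reasonable window.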
Confidential, Bentonville, AR
Hadoop Developer
Responsibilities:
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Analyzed the assigned user stories in JIRA (Agile software) and created design documents.
- Attended daily stand-ups and updated burned-down hours in JIRA.
- Worked in Agile Scrum model and involved in sprint activities.
- Gathered and analyzed business requirements.
- Created and implemented business, validation, coverage and price-gap rules in Talend against Hive and Greenplum databases.
- Involved in development of Talend components to validate the data quality across different data sources.
- Involved in analyzing business validation rules and finding options for implementing them in Talend.
- Exceptions raised during data validation rule execution were sent back to the business users to remediate the data and ensure clean data across data sources.
- Worked on the Global ID tool to apply the business rules.
- Automated and scheduled the rules on a weekly and monthly basis in TAC (Talend Administration Center).
- Created a weekly scheduling process using cron.
- Created and maintained the Hive and Greenplum tables on a weekly basis.
- Collected data from an FTP server and loaded it into Hive tables.
- Partitioned the collected logs by date/timestamp and host name (see the FTP-to-Hive sketch after this section).
- Developed Data Quality Rules on top of the External Vendors Data.
- Imported data frequently from MySQL to HDFS using Sqoop.
- Supported operations team in Hadoop cluster maintenance activities including commissioning and decommissioning nodes and upgrades.
- Used QlikView for visualizing and to generate reports.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Managing and scheduling Jobs using Oozie on a Hadoop cluster.
- Involved in data modeling using QlikView, integrating data sources (ETL) with QlikView reports.
- Involved in defining job flows, managing and reviewing log files.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Installed Oozie workflow engine to run multiple Map Reduce, Hive and Pig jobs.
- Responsible for loading and transforming large sets of structured, semi structured and unstructured data.
- Responsible to manage data coming from different sources.
- Implemented pushing the data from Hadoop to Greenplum.
- Worked on pre-processing the data using regular expressions in Pig.
- Gained experience with NoSQL databases.
- Worked on scheduling the jobs through Resource Manager.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Talend jobs.
Environment: Hadoop, HDFS, Hive, QlikView, UNIX shell scripting, Hue, Greenplum, Talend.
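The FTP collection and date/host partitioning described above roughly follow the pattern below. The FTP host, directory layout, file-naming convention (one log file per host) and table names are all assumptions made for illustration.

```sh
#!/usr/bin/env bash
# Sketch: pull vendor log files from an FTP server, stage them in HDFS and
# load them into a Hive table partitioned by date and host.
# Server, paths and table names are hypothetical.
set -euo pipefail

LOAD_DATE=$(date +%Y-%m-%d)
LOCAL_DIR=/data/landing/vendor_logs/${LOAD_DATE}
HDFS_DIR=/user/etl/staging/vendor_logs/${LOAD_DATE}

mkdir -p "${LOCAL_DIR}"

# 1. Collect the day's files from the FTP server (credentials via ~/.netrc).
wget --no-verbose --directory-prefix="${LOCAL_DIR}" \
     "ftp://ftp.vendor.example.com/logs/${LOAD_DATE}/*.log"

# 2. Stage the files in HDFS.
hdfs dfs -mkdir -p "${HDFS_DIR}"
hdfs dfs -put -f "${LOCAL_DIR}"/*.log "${HDFS_DIR}/"

# 3. Load into a partitioned Hive table (one partition per date and host).
for f in "${LOCAL_DIR}"/*.log; do
  host=$(basename "${f}" .log)   # assumes files are named <host>.log
  hive -e "
    LOAD DATA INPATH '${HDFS_DIR}/${host}.log'
    INTO TABLE logs_db.vendor_logs
    PARTITION (log_date='${LOAD_DATE}', host='${host}');
  "
done
```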
Confidential
QlikView Developer
Responsibilities:
- Worked in Agile Scrum model and involved in sprint activities.
- Analyzed business requirements and implemented customer-friendly dashboards.
- Implemented Section Access for security.
- Involved in data modeling using QlikView, integrating data sources (ETL) with QlikView reports.
- Identify and improve weak areas in the applications, performance reviews and code walk through to ensure quality.
- Created QVD's and Designed QlikView Dashboards using different types of QlikView Objects.
- Modified ETL scripts while loading the data, resolving loops and ambiguous joins.
- Wrote complex expressions using the Aggregation functions to match the logic with the business SQL.
- Performance tuning by analyzing and comparing the turnaround times between SQL and QlikView.
- Worked with QlikView Extensions like SVG Maps, HTML Content.
- Developed Set Analysis to provide custom functionality in QlikView application.
- Used Binary, Resident, Preceding and Incremental loads during data modeling.
Environment: Hadoop, HDFS, Hive, SQL and QlikView.
Confidential
Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster using different big data analytics tools, including Pig, Hive and MapReduce.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis (see the Flume sketch after this section).
- Worked on debugging, performance tuning of Hive & Pig Jobs.
- Created HBase tables to store various formats of PII data coming from different portfolios.
- Implemented test scripts to support test driven development and continuous integration.
- Worked on performance tuning of Pig queries.
- Provided cluster coordination services through ZooKeeper.
- Experience in managing development time, bug tracking, project releases, development speed, release forecast, scheduling and many more.
- Involved in loading data from LINUX file system to HDFS.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Developed Java program to extract the values from XML using XPaths.
- Experience working on processing unstructured data using Pig and Hive.
- Supported MapReduce programs running on the cluster.
- Gained experience in managing and reviewing Hadoop log files.
- End-to-end performance tuning of Hadoop clusters and Hadoop Map/Reduce routines against very large data sets.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs (see the Oozie sketch after this section).
- Assisted in monitoring Hadoop cluster using tools like Cloudera Manager.
- Experience in optimizing MapReduce jobs using combiners and partitioners to deliver the best results, and worked on application performance optimization over HDFS.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
Environment: Hadoop (CDH4), MapReduce, HBase, Hive, Sqoop, Oozie.
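The Flume-based log aggregation mentioned above can be sketched as a single agent that reads from a spooling directory and writes to HDFS. The agent name, directories, channel sizing and HDFS path are assumptions made for illustration.

```sh
#!/usr/bin/env bash
# Sketch: write a minimal Flume agent config (spooling-directory source ->
# memory channel -> HDFS sink) and start the agent. Names and paths are
# illustrative only.
set -euo pipefail

cat > /etc/flume/conf/log-agent.conf <<'EOF'
log-agent.sources  = spool-src
log-agent.channels = mem-ch
log-agent.sinks    = hdfs-sink

log-agent.sources.spool-src.type     = spooldir
log-agent.sources.spool-src.spoolDir = /var/log/app/spool
log-agent.sources.spool-src.channels = mem-ch

log-agent.channels.mem-ch.type     = memory
log-agent.channels.mem-ch.capacity = 10000

log-agent.sinks.hdfs-sink.type          = hdfs
log-agent.sinks.hdfs-sink.channel       = mem-ch
log-agent.sinks.hdfs-sink.hdfs.path     = /user/etl/raw/applogs/%Y-%m-%d
log-agent.sinks.hdfs-sink.hdfs.fileType = DataStream
log-agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
EOF

flume-ng agent --name log-agent \
  --conf /etc/flume/conf \
  --conf-file /etc/flume/conf/log-agent.conf
```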
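The Oozie scheduling noted above is typically driven by a small job.properties file and the Oozie CLI. In this sketch the NameNode/ResourceManager addresses, Oozie URL and application path are placeholders, and the workflow.xml containing the Hive and Pig actions is assumed to already be deployed in HDFS.

```sh
#!/usr/bin/env bash
# Sketch: submit an Oozie workflow that chains Hive and Pig actions.
# All hosts, paths and names below are placeholders.
set -euo pipefail

cat > job.properties <<'EOF'
nameNode=hdfs://namenode:8020
jobTracker=resourcemanager:8032
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/etl/apps/daily-etl
EOF

# Submit and start the workflow, then check its status.
JOB_ID=$(oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run | awk -F': ' '{print $2}')
oozie job -oozie http://oozie-host:11000/oozie -info "${JOB_ID}"
```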
Confidential
Jr Java Hadoop Developer
Responsibilities:
- Analyzed the requirements.
- Developed MapReduce programs.
- Developed components to interact with the web layer, HDFS and reports.
- Developed statistical charts using PrimeFaces.
Environment: Hadoop (CDH4), MapReduce, HBase, Hive, Sqoop, Oozie.