Senior Hadoop Developer Resume Bellevue, WA - Hire IT People

SUMMARY:

5 years of professional IT experience and technical proficiency in the Big Data space, with hands-on expertise in development on the Hadoop platform and Java.
Extensive working experience with Hadoop ecosystem components such as MapReduce (MRv1, YARN), Hive, Pig, Sqoop, and Oozie.
Proficient in writing MapReduce programs and using the Apache Hadoop Java API to analyze structured and unstructured data.
Good understanding of Hadoop architecture and hands-on experience with Hadoop components such as NameNode, DataNode, MapReduce concepts, and the HDFS framework.
Experience working with cloud infrastructure such as Amazon Web Services (AWS).
Experience in launching EMR clusters, Redshift clusters, EC2 instances, S3 buckets, and Amazon Data Pipeline and Simple Workflow Service instances.
Experience in ingesting streaming data into Hadoop using Spark, the Storm framework, and Scala.
Expert in working with the Hive data warehouse tool: creating tables, distributing data by implementing partitioning and bucketing, and writing and optimizing HiveQL queries.
Experience in writing Pig Latin scripts to sort, group, join, and filter data.
Experience in writing UDFs in Java for Hive and Pig.
Worked on UNIX shell scripts as part of the ETL process for implementing business logic, and scheduled jobs using the CA7 and Oozie schedulers.
Experience in writing customized input formats for MapReduce and working with various file formats such as Avro, XML, JSON, and log data.
Worked with different Hive file formats such as RCFile, SequenceFile, ORC, and Parquet.
Experience in using Apache Sqoop to import and export data to and from HDFS and Hive.
Hands-on experience in setting up workflows using the Apache Oozie workflow engine for managing and scheduling Hadoop jobs.
Good knowledge of NoSQL databases: HBase, Cassandra, and MongoDB.
Working experience with Pentaho Report Designer and Tableau visualization.
Experience in developing applications using Core Java, JSP, HTML, and CSS.
Worked on customizing log4j.properties to redirect Hive/HBase logs to databases.
Good experience working with AWS, Cloudera, and Pivotal HD distributions.
Knowledge of Kafka, Mahout machine learning, and R.
Comprehensive knowledge of the Software Development Life Cycle and Agile methodology, coupled with excellent communication skills.
Experience working in both team and individual environments. Always eager to learn new technologies and implement them in challenging environments.
Strong analytical and problem-solving skills.
Team player with good interpersonal, communication, and presentation skills. Exceptional ability to learn and master new technologies and to deliver outputs on short deadlines.

TECHNICAL SKILLS:

Hadoop Technologies and Distributions: Apache Hadoop, HDP, Cloudera Hadoop Distribution (CDH3, CDH4, CDH5), AWS, Pivotal HD (2.0)

Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Flume, Kafka, ZooKeeper, HCatalog, Spark, Storm

NoSQL Databases: Cassandra, MongoDB, HBase

Programming: C, Core Java 7/8, Advanced Java, PL/SQL, Shell Scripting

AWS Hadoop Services: S3, EMR, Simple Workflow, Data Pipeline, Redshift Database

RDBMS: Oracle, MySQL, SQL Server

Operating Systems: Linux (RedHat, CentOS), Windows XP/7/8

Web Servers: Apache Tomcat

ETL: Pentaho Report Designer

BI Tools: Tableau

PROFESSIONAL EXPERIENCE:

Confidential, Bellevue, WA

Senior Hadoop Developer

Responsibilities:

Involved in ingesting data into IDW staging directly from BEAM (an inbuilt component for ingesting real-time data into Hadoop), using Apache Storm to push data into HDFS.
Used Oozie Operational Services for batch processing and for dynamically scheduling workflows to run multiple Hive, shell script, and Pig jobs that run independently based on time and data availability.
Part of the design team for various generic components such as SCD and Data Validation.
Developed solutions for several data ingestion channels and patterns; also involved in handling production issues.
Worked extensively on creating end-to-end data pipeline orchestration using Oozie.
Used shell scripting for automation.
Worked on QA support activities, test data creation, and unit testing.
Used HBase in conjunction with Hive/Pig as requirements dictated.
Worked on Pig joins and join optimization, and on processing incremental data using Hadoop.
Created Oozie jobs using Sqoop to export data from Hadoop to Teradata development.
Involved in developing a customized inbuilt tool, the Data Movement Framework (DMF), for ingesting data from external and internal sources into Hadoop using Sqoop and shell scripts.
Proposed an automated system using shell scripts to implement imports with Sqoop.
Worked in an Agile development approach and managed the Hadoop teams across various sprints.

Environment: Hortonworks Data Platform, HDFS, HBase, Hive, Java, Sqoop, Oracle, MySQL, Storm.

Confidential, Bentonville, AR

Senior Hadoop Developer

Responsibilities:

Developed Big Data solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them with business solutions.
Worked on automating delta feeds from Teradata using Sqoop, and from FTP servers, into Hive.
Involved in exporting data from Hadoop to Greenplum using the gpload utility.
Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into the Hive schema for analysis.
Used Sqoop to import data from RDBMS to the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components.
Developed custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
Performed various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
Involved in loading and transforming large sets of structured, semi-structured, and unstructured data, and analyzed them by running Hive queries and Pig scripts.
Participated in requirement gathering from the experts and business partners and converted the requirements into technical specifications.
Implemented daily workflow for extraction, processing and analysis of data with Oozie.
Involved in loading data from the Linux file system to HDFS.

Environment: Hadoop, Pig, Hive, Sqoop, Flume, MapReduce, HDFS, Linux, Oozie.

Confidential

Hadoop Developer

Responsibilities:

Worked on analyzing the Hadoop cluster using different big data analytic tools including Pig, Hive, and MapReduce.
Launched and set up a Hadoop cluster on AWS, including configuring the different Hadoop components.
Managed the Hive database, which involved ingesting and indexing data.
Launched the EMR cluster and Redshift cluster.
Implemented an Amazon EMR (Elastic MapReduce) job to process data in zip format and convert it to gzip format.
Involved in customizing the input format for zip files (ZipInputFormat).
Cleansed and processed the zip file data in MapReduce.
Created the JAR file and uploaded it to an S3 bucket.
Adjusted delimiters in the data using EMR.
Created Data Pipeline jobs to automate the process.
Scheduled the data load process into the Redshift database.
Monitored the EMR jobs.
Implemented and ran queries on the Redshift cluster.
Implemented autoscaling for the Redshift database.
Worked on debugging and performance tuning of Hive and Pig jobs.
Worked on tuning the performance of Pig queries.
Processed unstructured data using Pig and Hive.
Worked on evaluating complex business metrics in Pig and MapReduce.

Environment: Amazon EMR, Data Pipeline, MapReduce (Java), S3, Redshift, Java, Hive, Pig, SWF Java API

Confidential

Hadoop Developer

Responsibilities:

Responsible for designing and implementing ETL processes to load data from different sources, perform data mining, and analyze data using visualization/reporting tools to improve system performance.
Collected logs from the physical machines and integrated them into HDFS using Flume.
Developed custom MapReduce programs to extract the required data from the logs.
Tuned and troubleshot MapReduce jobs by analyzing and reviewing Hadoop log files.
Imported data frequently from Teradata to HDFS using Sqoop.
Used Tableau for visualization and report generation.
Managed and scheduled jobs using Oozie on a Hadoop cluster.
Gained experience with the Hadoop stack, cluster architecture, and cluster monitoring.
Involved in defining job flows, managing and reviewing log files.
Installed the Oozie workflow engine to run multiple MapReduce, Hive, and Pig jobs.
Responsible for loading and transforming large sets of structured, semi-structured, and unstructured data.
Extracted files from different sources such as Teradata and DB2, placed them into HDFS using Sqoop, and preprocessed the data for analysis.
Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.

Environment: JDK 1.5, Hadoop, HDFS, Pig, Hive, MapReduce, HBase, Sqoop, Oozie, Flume, Tableau.
