Hadoop Developer Resume
SUMMARY
- Over 7 years of professional IT experience, including Big Data and Hadoop ecosystem technologies in the banking, retail, insurance, and communications sectors.
- Experience in deploying and managing multi-node Hadoop clusters with different Hadoop components (HDFS, YARN, HIVE, PIG, SQOOP, OOZIE, FLUME, ZOOKEEPER, SPARK) using Cloudera Manager and Hortonworks Ambari.
- Hands-on experience with various ecosystem tools such as HDFS, MapReduce, Hive, Pig, Oozie, Flume, Zookeeper, and Spark.
- Good knowledge of message-streaming technologies such as Kafka and Flume.
- Hands-on experience in writing MapReduce programs from scratch according to requirements.
- Experience in writing joins and sorting algorithms in MapReduce using Java.
- Experience in writing MRUnit test cases and running them on the Hadoop development cluster.
- Familiar with writing Oozie workflows and job controllers for job automation.
- Good Understanding of Distributed Systems and Parallel Processing architecture.
- Experience in deploying Hadoop 2.0 (YARN).
- Good knowledge of administering Linux systems to deploy a Hadoop cluster and of monitoring the cluster using Nagios and Ganglia.
- Familiar with importing and exporting data using Sqoop into HDFS and Hive.
- Experience in using Flume to ingest the data from web servers into HDFS.
- Good knowledge of benchmarking the cluster and of backing up and recovering data residing on the cluster.
- Hands-on experience in upgrading and applying patches to the Cloudera distribution.
- Implemented and configured a quorum-based high-availability Hadoop cluster for both HDFS and MapReduce using ZooKeeper and JournalNodes.
- Hands-on experience in provisioning and managing multi-tenant Hadoop clusters on a public cloud (Amazon Web Services) and on private cloud infrastructure (the OpenStack cloud platform).
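The Sqoop import/export work mentioned above typically follows this pattern; the connection string, table names, and HDFS paths below are placeholders, not details from any actual engagement:

```shell
# Import a table from an RDBMS into HDFS (all names here are hypothetical)
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table transactions \
  --target-dir /data/raw/transactions \
  --num-mappers 4

# Export aggregated results from HDFS back to the RDBMS
sqoop export \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table daily_summary \
  --export-dir /data/out/daily_summary
```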
TECHNICAL SKILLS
- Java
- Scala
- HDFS
- MapReduce
- Pig
- Hive
- Spark
- Sqoop
- Flume
- Kafka
- Oozie
- Zookeeper
- Servlets
- MySQL
- Oracle 10g
- HTML
- XML
- CSS
- jQuery
PROFESSIONAL EXPERIENCE
Confidential, WI
Hadoop Developer
Responsibilities:
- Involved in Big Data Hadoop cluster capacity planning, deployment, tuning, and benchmarking with the Operations team.
- Developed MapReduce programs to extract and transform data sets; results were exported back to the RDBMS using Sqoop.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Involved in extracting the data from various sources into Hadoop HDFS for processing.
- Effectively used Sqoop to transfer data between databases and HDFS.
- Worked on streaming data into HDFS from web servers using Flume.
- Implemented custom Flume interceptors to filter data and defined channel selectors to multiplex the data into different sinks.
- Developed Map-Reduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
- Implemented complex MapReduce programs to perform map-side joins using the distributed cache.
- Used the Hive data warehouse tool to analyze unified historic data in HDFS to identify issues and behavioral patterns.
- Created internal and external Hive tables as required, defined with appropriate partitions for efficiency.
- Created Hive external tables, loaded data into them, and queried the data using HQL.
- Experience in administering Linux systems to deploy the Hadoop cluster and in monitoring the cluster using Ganglia.
- Used Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive and Sqoop as well as system specific jobs.
- Performed major and minor cluster upgrades as part of migration.
Environment: Java, MapReduce, HDFS, Hive, Pig, Flume, Sqoop, MySQL, Shell Scripting.
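A Flume agent of the kind described in this role (a custom interceptor plus a multiplexing channel selector routing events to different sinks) might be configured along these lines; the agent, channel, header, and interceptor class names are illustrative assumptions:

```properties
# Agent, channel, and class names here are hypothetical
agent1.sources = websrv
agent1.channels = ch_orders ch_default
agent1.sinks = sink_orders sink_default

# Source with a custom interceptor that filters events and stamps a 'type' header
agent1.sources.websrv.type = exec
agent1.sources.websrv.command = tail -F /var/log/httpd/access_log
agent1.sources.websrv.channels = ch_orders ch_default
agent1.sources.websrv.interceptors = i1
agent1.sources.websrv.interceptors.i1.type = com.example.flume.FilteringInterceptor$Builder

# Multiplexing channel selector routes events by the 'type' header value
agent1.sources.websrv.selector.type = multiplexing
agent1.sources.websrv.selector.header = type
agent1.sources.websrv.selector.mapping.order = ch_orders
agent1.sources.websrv.selector.default = ch_default

agent1.channels.ch_orders.type = memory
agent1.channels.ch_default.type = memory

# Each channel drains to its own HDFS sink
agent1.sinks.sink_orders.type = hdfs
agent1.sinks.sink_orders.channel = ch_orders
agent1.sinks.sink_orders.hdfs.path = /data/flume/orders
agent1.sinks.sink_default.type = hdfs
agent1.sinks.sink_default.channel = ch_default
agent1.sinks.sink_default.hdfs.path = /data/flume/other
```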
Confidential, Richmond, VA
Hadoop Developer
Responsibilities:
- Responsible for designing, developing and deploying automation of various tasks in the operations environment.
- Streamlined various workflows and deployments in dev, testing, pre-prod, and prod environments.
- Built servers: installed operating systems and file systems, set up file and directory permissions, and partitioned hard disks.
- Mounted file systems, implemented security features and installed monitoring tools.
- Implemented security and authentication using Kerberos.
- Installed Cloudera tools on production, QA, and dev clusters, including Cloudera Manager, Hive, Pig, Sqoop, Oozie, Impala, and Spark.
- Participated in integration of existing cluster with Cloudera Manager.
- Authored shell scripts for cron jobs and for automating various installations.
- Debugged Java MapReduce applications and worked closely with developers on troubleshooting.
Environment: RHEL, CDH3, CDH4, Sqoop, Flume, Pig, Hive, MapReduce, Nagios, Ganglia, Tableau.
Confidential, Madison, WI
Hadoop Developer
Responsibilities:
- Involved in review of functional and non-functional requirements.
- Facilitated knowledge transfer sessions.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Experienced in managing and reviewing Hadoop log files.
- Extracted files from CouchDB through Sqoop, placed them in HDFS, and processed them.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Gained good experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Gained very good business knowledge on health insurance, claim processing, fraud suspect identification, appeals process etc.
- Developed a custom file system plug-in for Hadoop so it can access files on the Data Platform; the plug-in allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
- Wrote a recommendation engine using Mahout.
Environment: Java 6, Eclipse, Oracle 10g, Subversion, Hadoop, Hive, HBase, Linux, MapReduce.
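The Hive table creation and querying described in this role can be sketched as follows; the table, column, and path names are hypothetical, chosen only to illustrate a partitioned external table whose queries Hive compiles into MapReduce jobs:

```sql
-- External table over raw data already in HDFS; dropping the table keeps the files
CREATE EXTERNAL TABLE IF NOT EXISTS claims_raw (
  claim_id  STRING,
  member_id STRING,
  amount    DOUBLE
)
PARTITIONED BY (claim_date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/claims/raw';

-- Register an existing directory as one partition
ALTER TABLE claims_raw ADD PARTITION (claim_date = '2014-01-01')
  LOCATION '/data/claims/raw/2014-01-01';

-- HQL query that Hive runs internally as MapReduce
SELECT claim_date, COUNT(*) AS claims, SUM(amount) AS total
FROM claims_raw
GROUP BY claim_date;
```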
Confidential, Oklahoma City, OK
Software Engineer
Responsibilities:
- Involved in gathering business requirements, analyzing the project, and creating use cases.
- Coordinated with the design team, business analysts, and end users of the system.
- Designed and developed the front end using JSP, JavaScript, and HTML.
- Worked with Solr for indexing the data and used JSP for the Web application.
- Implemented caching techniques using Singleton Pattern, wrote POJO classes for storing data.
- Used JAXP (DOM, XSLT), XSD for XML data generation and presentation.
- Wrote JUnit test classes for the services and prepared documentation.
Environment: Java, JDBC, JSP, Servlets, HTML, JUnit, Java APIs, Design Patterns, MySQL, Eclipse IDE.
Confidential
Associate Software Engineer
Responsibilities:
- Responsible for developing regression test cases and test plans.
- Used tools such as Firebug and FirePath to inspect the HTML of websites under test.
- Responsible for creating absolute XPaths to locate elements on the page.
- Performed end-to-end and ad hoc testing on multiple OS platforms.
- Executed smoke tests on the e-commerce website, reader application, and reader devices whenever a new build was released.
- Performed unit, functional, performance, stress, regression, and cross-browser testing.
- Performed business acceptance testing to ensure the application met the business requirements, and made sure all test cases were included in the test plan.
- Worked closely with developers to triage the defects.
Environment: Android, iOS, Java, CentOS, mAutomate, JavaScript, Agile.