
Hadoop Developer Resume


Rockville, MD

SUMMARY

  • Over 8 years of IT experience, including around 4 years of experience with Hadoop and the Hadoop ecosystem.
  • Experience working on the Cloudera distributions CDH4 and CDH5 and on Hortonworks HDP.
  • Expertise in building business processes for the banking and finance industries.
  • Strong expertise in big data modeling techniques with Hive and HBase.
  • Experience in developing Pig Latin scripts and using Hive Query Language.
  • Expertise in Hive SerDe parsing for unstructured data analysis.
  • Experience in developing custom UDFs in Java to extend Hive and Pig Latin functionality.
  • Experience working with the NoSQL database HBase.
  • Experience using Sqoop to import data from RDBMSs into HDFS and vice versa.
  • Proficiency working with various data sources: RDBMSs, web services, and others.
  • Experience in developing MapReduce programs using Apache Hadoop to analyze big data as per requirements.
  • Good knowledge of Java topics such as generics, collections, and multithreading.
  • Experience in database development using SQL and PL/SQL, and in working with databases such as Oracle 9i/10g and MySQL.
  • Advanced experience with HDFS, MapReduce, Hive, HBase, ZooKeeper, Impala, Pig, Flume, and Oozie.
  • Good understanding of workload management, schedulers, scalability, and distributed platform architectures.
  • Prepared test case scenarios and internal documentation for validation and reporting.
  • Ability to perform at a high level, meet deadlines, and adapt to ever-changing priorities.
  • Knowledge of Hadoop 2 features: HDFS Federation, High Availability, and the YARN architecture.
  • Strong expertise in the design and implementation of extract-transform-load (ETL) processes.
  • Experienced in working with global teams (offshore/onsite model).
  • Experienced in coding SQL and PL/SQL procedures, functions, triggers, and packages on relational databases (RDBMSs) such as Oracle.
  • Experience in gathering business requirements, writing functional requirements and test cases, and creating technical design documents with UML: use case, class, sequence, and collaboration diagrams.

TECHNICAL SKILLS

Languages: Java, PL/SQL, Shell Scripting, Pig Latin, HiveQL

Microsoft Tools: MS Office Suite, MS Outlook, MS Project, MS Visio

Databases: Oracle 10g/11g, MySQL

Operating Systems: Windows XP, Unix, Linux

Hadoop: Hadoop 0.20.2 (CDH3u3), HDFS 0.20.2, MapReduce 0.20.2, HBase 0.90.4, Pig 0.8.1, Hive 0.7.1, Impala 1.2, Sqoop 1.3.0, Flume 0.9.4, Cassandra, Oozie 2.3.2, Hue 1.2.0.0, ZooKeeper 3.3.3, YARN, cluster builds, CDH 4.8.2

Web Technologies: HTML, DHTML, JSP, Servlets, JavaScript

PROFESSIONAL EXPERIENCE

Confidential, Rockville, MD

Hadoop Developer

Responsibilities:

  • Conducted interviews with subject matter experts and documented the features to be included in the system.
  • Prepared a vendor questionnaire to capture vendor product features and advantages with respect to the Hadoop cluster.
  • Involved in the design and implementation of a proof of concept for the system, built on Hadoop with HBase, Hive, Pig, and Flume.
  • Used HBase for real-time search on log data, and Pig, Hive, and MapReduce for analysis.
  • Managed and reviewed Hadoop log files.
  • Used Flume to publish logs to Hadoop in real time: collected, aggregated, and stored web log data from sources such as web servers, mobile devices, and network devices, and pushed it to HDFS.
  • Worked with business teams and created Hive queries for ad hoc access.
  • Involved in loading data from the UNIX file system to HDFS; automated the steps to load log files into Hive.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Used Hue to save Hive queries for each required report and to download query results as CSV or Excel.
  • Used Pig for data cleansing.
  • Created partitioned tables in Hive.
  • Developed a Hive SerDe for parsing send-email logs.
  • Wrote a Hive UDF to extract the date from a time in seconds (illustrative sketch below).
  • Involved in installing Hive, HBase, Pig, Flume, and other Hadoop ecosystem software.
  • Involved in moving all log files generated from various sources to HDFS for further processing.
  • Involved in HDFS maintenance and in loading structured and unstructured data.
  • Used Sqoop on a regular basis to import data from MySQL into HDFS, and to import and export data between HDFS/Hive and relational database systems.
  • Involved in setting up Hadoop along with MapReduce, Hive, and Pig.
  • Worked with HiveQL on big data logs to perform trend analysis of user behavior across various online modules.
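Below is a minimal sketch of the kind of Hive UDF referenced above, assuming the classic org.apache.hadoop.hive.ql.exec.UDF API from the Hive 0.7 era; the class name SecondsToDate and the yyyy-MM-dd output format are illustrative, not the original code.

    import java.text.SimpleDateFormat;
    import java.util.Date;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;

    // Illustrative UDF: converts an epoch time in seconds to a yyyy-MM-dd date string.
    public final class SecondsToDate extends UDF {
        public Text evaluate(LongWritable seconds) {
            if (seconds == null) {
                return null; // propagate SQL NULLs, as Hive expects
            }
            // The input is in seconds; java.util.Date expects milliseconds.
            Date date = new Date(seconds.get() * 1000L);
            return new Text(new SimpleDateFormat("yyyy-MM-dd").format(date));
        }
    }

Once packaged into a JAR, a UDF like this is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.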

Environment: Java 6, Eclipse, Linux 5.x, CDH3, CDH4.x, Sqoop, Pig, Hive 0.7.1, Flume, UNIX shell scripting, Hue, WinSCP, MySQL 5.5

Confidential, Santa Clara, CA

Hadoop Developer

Responsibilities:

  • Responsible for installing, configuring, and managing a Hadoop cluster spanning multiple racks.
  • Developed Pig Latin scripts to analyze large data sets in areas where extensive hand-written code would otherwise be needed.
  • Configured Sqoop and developed scripts to extract data from MySQL into HDFS.
  • Worked with business teams and created Hive queries for ad hoc access.
  • Involved in the review of functional and non-functional requirements.
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
  • Used the Sqoop tool to extract data from a relational database into Hadoop.
  • Involved in performance enhancement and optimization of the code by writing custom comparators and combiner logic (illustrative combiner sketch below).
  • Worked closely with the data warehouse architect and business intelligence analysts to develop solutions.
  • Good understanding of job schedulers, such as the Fair Scheduler, which assigns resources to jobs so that all jobs get, on average, an equal share of resources over time, as well as the Capacity Scheduler.
  • Responsible for performing peer code reviews, troubleshooting issues, and maintaining status reports.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries, which invoke MapReduce jobs in the backend.
  • Prepared daily and weekly project status reports and shared them with the client.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig, Hive, and Sqoop.
  • Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
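Below is a minimal sketch of the combiner pattern mentioned above, in the shape of a count-aggregation job; the class name is illustrative. A combiner applies reduce-style aggregation to map-side output before the shuffle, so far less intermediate data crosses the network.

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // Illustrative combiner: pre-aggregates per-key counts on the map side.
    public class CountCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get(); // sum the partial counts emitted by the mappers
            }
            result.set(sum);
            context.write(key, result);
        }
    }

It would be wired in with job.setCombinerClass(CountCombiner.class); because summing is associative and commutative, the same class can often double as the reducer.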

Environment: Apache Hadoop, Java (JDK 1.6), Oracle, MySQL, Hive, Pig, Sqoop, Linux, CentOS, Cloudera

Confidential, Lafayette, LA

Developer/Admin

Responsibilities:

  • Installed and configured MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Monitored the Hadoop cluster with the Ambari GUI to ensure the health of Hadoop services in the cluster.
  • Developed workflows using custom MapReduce, Pig, Hive, and Sqoop.
  • Tuned the cluster for optimal performance in processing these large data sets.
  • Built reusable Hive UDF libraries for business requirements, enabling users to apply these UDFs in Hive queries.
  • Preprocessed logs and semi-structured content stored on HDFS using Pig and imported the processed data into the Hive warehouse, enabling business analysts to write Hive queries.
  • Developed Hadoop streaming MapReduce jobs using Java.
  • Configured big data workflows to run on top of Hadoop using Control-M; these workflows comprise heterogeneous jobs such as Pig, Hive, Sqoop, and MapReduce.
  • Developed a suite of unit test cases for Mapper, Reducer, and Driver classes using the MRUnit testing library (illustrative test sketch below).
  • Developed workflows in Control-M to automate the tasks of loading data into HDFS and preprocessing it with Pig.
  • Used Maven extensively to build JAR files of MapReduce programs and deployed them to the cluster.
  • Provided bug fixes and 24/7 production support.
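Below is a minimal sketch of an MRUnit test of the kind referenced above; it assumes a hypothetical TokenizerMapper that emits a (word, 1) pair per token, and uses MRUnit's MapDriver for the org.apache.hadoop.mapreduce API.

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Before;
    import org.junit.Test;

    // Illustrative MRUnit test: checks a mapper's output without a running cluster.
    public class TokenizerMapperTest {
        private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

        @Before
        public void setUp() {
            // TokenizerMapper (hypothetical) splits a line into (word, 1) pairs.
            mapDriver = MapDriver.newMapDriver(new TokenizerMapper());
        }

        @Test
        public void emitsOneCountPerToken() throws Exception {
            mapDriver.withInput(new LongWritable(0), new Text("hadoop hive"))
                     .withOutput(new Text("hadoop"), new IntWritable(1))
                     .withOutput(new Text("hive"), new IntWritable(1))
                     .runTest(); // fails the test if the actual output differs
        }
    }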

Environment: MapReduce, Pig, Hive, Sqoop, Oozie, Flume, Java, HBase, Apache Hadoop distribution, Ambari GUI, VM Player, UNIX.

Confidential

Software Engineer

Responsibilities:

  • Participated in designing the application using the MVC design pattern.
  • Developed front-end user interface modules using HTML, XML, Java AWT, and Swing.
  • Carried out front-end validation of user requests using JavaScript.
  • Designed and developed the interacting JSPs and servlets for modules such as user authentication and summary display.
  • Designed and developed entity and session EJB components for the primary modules.
  • Used JavaMail to notify the user of the status and completion of the request.
  • Developed stored procedures on Oracle 8i.
  • Implemented queries using SQL, along with database triggers and functions.
  • Used JDBC to interface the web-tier components on the J2EE server with the relational database (illustrative sketch below).
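Below is a minimal sketch of that JDBC access pattern, written in the pre-generics style of the Java 1.3 era; the driver class is Oracle's standard thin driver, while the URL, credentials, table, and column names are illustrative placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // Illustrative web-tier lookup over JDBC; connection details are placeholders.
    public class UserSummaryDao {
        public String fetchStatus(String userId) throws Exception {
            Class.forName("oracle.jdbc.driver.OracleDriver"); // register the Oracle driver
            Connection con = DriverManager.getConnection(
                    "jdbc:oracle:thin:@dbhost:1521:ORCL", "appuser", "secret");
            try {
                PreparedStatement ps = con.prepareStatement(
                        "SELECT status FROM user_summary WHERE user_id = ?");
                ps.setString(1, userId);
                ResultSet rs = ps.executeQuery();
                return rs.next() ? rs.getString("status") : null;
            } finally {
                con.close(); // always release the connection
            }
        }
    }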

Environment: Java 1.3, EJB, JavaScript, HTML, XML, Rational Rose, Microsoft Visio, Swing, JSP, Servlets, JNDI, JDBC, SQL, Oracle 8i, Tomcat 3.1.

Confidential

Developer/Officer

Responsibilities:

  • Gathered requirements for the project and was involved in the analysis phase.
  • Worked on minor enhancements using core Java.
  • Involved in writing SQL queries.
  • Used stored procedures, triggers, cursors, packages, and anonymous PL/SQL blocks to store, retrieve, delete, and update database tables.
  • Used technologies such as JDBC to access related data from the database (illustrative sketch below).
  • Created UML class and sequence diagrams using Rational Rose.
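Below is a minimal sketch of invoking a PL/SQL stored procedure through JDBC, in line with the stored-procedure work above; the procedure UPDATE_ACCOUNT and its parameters are hypothetical.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.Types;

    // Illustrative call to a hypothetical PL/SQL procedure:
    //   UPDATE_ACCOUNT(p_account_id IN NUMBER, p_status OUT VARCHAR2)
    public class AccountUpdater {
        public String updateAccount(Connection con, int accountId) throws Exception {
            CallableStatement cs = con.prepareCall("{call UPDATE_ACCOUNT(?, ?)}");
            try {
                cs.setInt(1, accountId);                   // IN parameter
                cs.registerOutParameter(2, Types.VARCHAR); // OUT parameter
                cs.execute();
                return cs.getString(2); // status string set by the procedure
            } finally {
                cs.close();
            }
        }
    }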

Environment: Java, Oracle, PL/SQL.
