
Senior Hadoop Developer Resume


Louisville, KY

SUMMARY:

  • 7+ years of experience in the IT industry, including 3+ years of experience in Java and 4+ years of experience in Big Data (HDFS, Hive, Pig, Sqoop, Flume and others)
  • Strong experience with and knowledge of Hadoop, HDFS, MapReduce and Hadoop ecosystem components such as Hive, Pig, Sqoop, Oozie, NoSQL and Python.
  • Excellent understanding of Hadoop architecture and its components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, Resource Manager and MapReduce
  • Hands-on experience in installing, configuring and using ecosystem components such as Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, HDP, Sqoop, Pig and Flume
  • Hands-on experience in writing MapReduce programs in Java and Python (a minimal Java sketch follows this summary)
  • Worked in multi-cluster environments and set up the Cloudera Hadoop ecosystem.
  • Implemented workflows in Oozie using Sqoop, MapReduce, Hive and other Java and shell actions.
  • Expertise in developing solutions around SQL and NoSQL databases such as HBase.
  • Strong experience with Hadoop distributions such as Cloudera and Hortonworks
  • In-depth understanding of data structures and algorithms
  • Hands-on experience in application development using Java, RDBMS and Unix shell scripting.
  • Experience in validating files loaded into HDFS.
  • Validated that there is no data loss by comparing Hive table data against RDBMS data.
  • Experience in Java, Python and Scala programming
  • Experience and good knowledge in streaming real-time data using Apache Kafka and Apache Storm
  • Experience in Java, CSS, XML and JUnit
  • Hands-on experience with Tableau for viewing transformed data graphically
  • Good knowledge of Talend
  • Knowledge and experience in data masking and data discovery
  • Strong written, verbal and interpersonal communication skills.
  • Exceptional ability to quickly master new concepts.
  • Able to adapt to any situation and a great team player.
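
As an illustration of the MapReduce-in-Java work referenced in the summary, a minimal job might look like the sketch below. The class name, log layout and field positions are hypothetical and are not taken from any specific project on this resume.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical job: counts occurrences of each HTTP status code in web server logs.
public class StatusCodeCount {

    public static class LogMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text statusCode = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumes a space-delimited line in common log format, status code in field 8.
            String[] fields = value.toString().split(" ");
            if (fields.length > 8) {
                statusCode.set(fields[8]);
                context.write(statusCode, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "status-code-count");
        job.setJarByClass(StatusCodeCount.class);
        job.setMapperClass(LogMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}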

TECHNICAL SKILLS:

Big Data Technologies: Hadoop, HDFS, Pig, Hive, HBase, Flume, Cassandra, MongoDB, Sqoop, Kafka, Apache Storm, Oozie, Apache Spark, Mahout

Java/J2EE Technologies: Java 6.0, Servlets, JSP, JDBC, XML, AJAX, SOAP, WSDL

Web Technologies: HTML, XML, JavaScript, CSS

Databases: Data warehousing, PL/SQL, Oracle, DB2, MS SQL Server, NoSQL, MySQL, MS Access

Operating Systems: Windows 98/XP/Vista/7, UNIX, Linux, Mac

BI Tools: Tableau, Talend

Other Tools: Eclipse, Visual Studio 2008/2010, NetBeans

Other Languages: Scala, Python

NoSQL: HBase, Cassandra

PROFESSIONAL EXPERIENCE:

Senior Hadoop Developer

Confidential, Louisville, KY

Responsibilities:

  • Designed and developed a Hadoop system to analyze SIEM (Security Information and Event Management) data using MapReduce, HBase, Hive, Sqoop and Flume.
  • Involved in the design and implementation of HBase.
  • Coordinated with business customers to gather business requirements and interacted with technical peers to derive technical requirements.
  • Extensively involved in the design phase and delivered design documents.
  • Developed custom Writable MapReduce Java programs to load web server logs into HBase using Flume.
  • Processed and analyzed log data stored in HBase and imported it into the Hive warehouse, enabling business analysts to write HiveQL queries.
  • Built reusable Hive UDF libraries that business analysts could use in their Hive queries (a minimal UDF sketch follows this list).
  • Developed various workflows using custom MapReduce, Pig and Hive, and scheduled them using Oozie.
  • Extensive knowledge in troubleshooting code-related issues.
  • Developed a suite of unit test cases for Mapper, Reducer and Driver classes using MRUnit (a test sketch also follows this list).
  • Used Apache Kafka and Apache Storm to gather log data and feed it into HDFS.
  • Configured a Flume agent with the Flume syslog source to receive data from syslog servers.
  • Auto-populated HBase tables with data coming from the Kafka sink.
  • Designed and coded application components in an agile environment using a test-driven development approach.
  • POC (Proof of Concept):
  • Set up an agent app interaction server to capture keystrokes and screen recordings using the NICE application suite.
  • Set up a Flume agent on the server to capture the server logs.
  • Used Avro to move the file into HDFS.
  • Used Pentaho with SQL.
  • Designed and built unit tests using HBase.
  • Executed operational queries on HBase.
  • Analyzed day-to-day application usage on a sample of machine log data using Spark, Hive and Pig.
  • Used Sqoop to move the analyzed data to the Oracle DB for report generation.
  • Developed Scala programs for data extraction and Spark Streaming.
  • Developed MapReduce programs using Scala.
  • Knowledge of ETL tools such as Informatica.
  • Worked on data governance implementation.
  • Worked on data modeling during application software design.
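
The reusable Hive UDF libraries mentioned above could contain functions along the following lines. This is a minimal sketch; the class name and severity values are hypothetical.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical reusable UDF: normalizes free-form severity strings in SIEM log records.
public final class NormalizeSeverity extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String s = input.toString().trim().toUpperCase();
        if (s.startsWith("CRIT")) {
            return new Text("CRITICAL");
        } else if (s.startsWith("WARN")) {
            return new Text("WARNING");
        }
        return new Text(s);
    }
}

Such a UDF is typically packaged in a JAR, added to the Hive session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION so that analysts can call it directly in HiveQL.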
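
The MRUnit test cases mentioned above follow a pattern like the sketch below, which exercises the hypothetical mapper and reducer from the StatusCodeCount example shown after the summary; the input line and expected outputs are illustrative.

import java.util.Arrays;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Before;
import org.junit.Test;

// Hypothetical MRUnit tests for a mapper/reducer pair that counts HTTP status codes.
public class StatusCodeCountTest {

    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;
    private ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new StatusCodeCount.LogMapper());
        reduceDriver = ReduceDriver.newReduceDriver(new StatusCodeCount.SumReducer());
    }

    @Test
    public void mapperEmitsStatusCode() throws Exception {
        mapDriver.withInput(new LongWritable(0),
                new Text("127.0.0.1 - - [10/Oct/2015:13:55:36 -0700] \"GET /index.html HTTP/1.0\" 200 2326"));
        mapDriver.withOutput(new Text("200"), new IntWritable(1));
        mapDriver.runTest();
    }

    @Test
    public void reducerSumsCounts() throws Exception {
        reduceDriver.withInput(new Text("200"),
                Arrays.asList(new IntWritable(1), new IntWritable(1)));
        reduceDriver.withOutput(new Text("200"), new IntWritable(2));
        reduceDriver.runTest();
    }
}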

Environment: Hortonworks, MapReduce, Kafka, Storm, YARN 2.0, HBase, Hive, Java, Pig, Oozie, Flume, Sqoop, Tableau, Pentaho, PL/SQL, Data Masking, Data Modeling, Scala, Python

Senior Hadoop Developer

Confidential, Tampa, FL

Responsibilities:

  • Launched Cloudera instances using Cloudera images (Linux/Ubuntu) and configured the launched instances for specific applications.
  • Launched and set up the Hadoop/HBase cluster, including configuring the different components of Hadoop and HBase.
  • Hands-on experience in loading data from the UNIX file system to HDFS.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data from HBase through Sqoop and placed them in HDFS for further processing.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS using Sqoop.
  • Wrote Hive queries and HDFS commands to validate the data loaded into Hive external tables against the source HDFS files (a minimal validation sketch follows this list).
  • Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
  • Managed and scheduled jobs on the Hadoop cluster using Oozie.
  • Involved in creating Hive tables, loading data and running Hive queries on that data.
  • Configured Oozie workflows to run multiple Hive and Pig jobs which run independently based on time and data availability.
  • Handled NoSQL databases such as HBase.
  • Extensive working knowledge of partitioned tables, UDFs, performance tuning, compression-related properties and the Thrift server in Hive.
  • Wrote optimized Pig scripts and was involved in developing and testing Pig Latin scripts.
  • Working knowledge of writing Pig Load and Store functions.
  • Experience in migrating data to and from RDBMS and unstructured sources into HDFS using Sqoop and Flume.
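
A validation check of the kind described above can be run against HiveServer2 over JDBC. This minimal sketch assumes a hypothetical connection URL and table name, and compares the Hive row count with an expected count obtained separately (for example, from the source HDFS files or the RDBMS).

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Hypothetical check: does the Hive external table hold the expected number of rows?
public class HiveRowCountCheck {

    public static boolean matches(String table, long expectedRows) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "hive", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + table)) {
            rs.next();
            return rs.getLong(1) == expectedRows;
        }
    }
}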

Environment: Cloudera, Apache Hadoop 1.0.1, MapReduce, HDFS, CentOS 6.4, Red Hat Linux, HBase, Hive, Pig, Oozie, Flume, Java (JDK 1.6), Eclipse, Tableau, PL/SQL, Scala, Python

Hadoop Developer

Confidential, Phoenix, Arizona

Responsibilities:

  • Worked with business partners to gather business requirements.
  • Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Implemented Spark applications from the existing MapReduce framework for better performance.
  • Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing.
  • Moved RDBMS data, exported as flat files from various channels, into HDFS for further processing.
  • Developed job workflows in Oozie to automate the tasks of loading data into HDFS.
  • Responsible for creating Hive tables, loading data and writing Hive queries.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS using Sqoop.
  • Wrote script files for processing data and loading it into HDFS.
  • Worked extensively with Sqoop for importing metadata from RDBMS.
  • Wrote CLI commands for HDFS.
  • Responsible for running Hadoop Streaming jobs to process terabytes of XML data.
  • Responsible for ensuring NFS was configured for the Name Node.
  • Set up cron jobs to delete Hadoop logs, old local job files and cluster temp files.
  • Set up Hive with MySQL as a remote metastore.
  • Created connections through JDBC and used JDBC statements to call stored procedures.
  • Implemented a nine-node CDH3 Hadoop cluster.
  • Moved all log/text files generated by various products into the cluster location on top of HDFS.
  • Tracked customer support tickets through JIRA.
  • Used Apache Kafka to gather log data and feed it into HDFS (a minimal producer sketch follows this list).
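
A minimal sketch of publishing log data to Kafka, as mentioned in the last bullet; the broker address, topic name and message are assumptions, and landing the topic in HDFS would be handled by a separate consumer or connector.

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Hypothetical producer that publishes application log lines to a Kafka topic.
public class LogLineProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Key is the source host, value is the raw log line (illustrative values).
            producer.send(new ProducerRecord<>("app-logs", "host-01",
                    "2015-06-01 12:00:00 INFO request served in 42 ms"));
        }
    }
}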

Environment: Hadoop 1.x, HDFS, MapReduce, Hive 0.10, Pig, Sqoop, HBase, Kafka, Shell Scripting, Oozie, Oracle 10g, SQL Server 2008, Ubuntu 13.04, Cloudera, Java 6.0, JDBC, Apache Tomcat

Hadoop Developer

Confidential, New York, NY

Responsibilities:

  • Expertise in designing and deploying Hadoop clusters and various big data analytic tools, including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Kafka, Spark and Cassandra, on Hortonworks and Cloudera distributions.
  • Installed Hadoop, MapReduce, HDFS and AWS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Understood business needs, analyzed functional specifications and mapped them to the design and development of MapReduce programs and algorithms.
  • Wrote Pig and Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data. Hands-on experience with Pig and Hive user-defined functions (UDFs).
  • Executed Hadoop ecosystem applications through Apache Hue.
  • Optimized Hadoop MapReduce code and Hive/Pig scripts for better scalability, reliability and performance.
  • Developed Oozie workflows for application execution (a minimal submission sketch follows this list).
  • Feasibility analysis (for the deliverables): evaluated the feasibility of the requirements against complexity and timelines.
  • Performed data migration from legacy RDBMS databases to HDFS using Sqoop.
  • Wrote Pig scripts for data processing.
  • Implemented Hive tables and HQL queries for reports. Wrote and used complex data types in Hive. Stored and retrieved data using HQL in Hive. Developed Hive queries to analyze reducer output data.
  • Highly involved in designing the next-generation data architecture for unstructured data.
  • Managed a 4-node Hadoop cluster for a client conducting a Hadoop proof of concept. The cluster had 12 cores and 3 TB of installed storage.
  • Developed Pig Latin scripts to extract data from source systems.
  • Involved in extracting data from Hive and loading it into an RDBMS using Sqoop.
  • Integrated the four square monitoring and production system with Kafka.
  • Designed and documented operational problems following standards and procedures, using JIRA as the software reporting tool.
  • Worked on data modeling during application software design.
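
The Oozie workflows mentioned above can be submitted programmatically through the Oozie Java client API. In this minimal sketch the Oozie URL, HDFS application path and cluster endpoints are assumptions.

import java.util.Properties;

import org.apache.oozie.client.OozieClient;

// Hypothetical submission of a workflow application to the Oozie server.
public class SubmitWorkflow {

    public static void main(String[] args) throws Exception {
        OozieClient oozie = new OozieClient("http://localhost:11000/oozie");

        Properties conf = oozie.createConfiguration();
        conf.setProperty(OozieClient.APP_PATH, "hdfs://namenode:8020/user/etl/workflows/daily-load");
        conf.setProperty("nameNode", "hdfs://namenode:8020");
        conf.setProperty("jobTracker", "jobtracker:8021");

        // run() submits and starts the workflow, returning its job id.
        String jobId = oozie.run(conf);
        System.out.println("Submitted workflow " + jobId);
    }
}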

Environment: HDFS, MapReduce, Hive, Oozie, Java, Pig, Shell Scripting, Kafka, Linux, Hue, Sqoop, Flume, DB2, Oracle 11g, Data Modeling

Java Developer

Confidential

Responsibilities:

  • Involved in developing solutions to requirements, enhancements and defects.
  • Involved in requirements design, development and system testing.
  • Implemented Action classes to encapsulate the business logic.
  • Used frameworks for developing applications.
  • Applied various design patterns using core Java techniques.
  • Used Object-Oriented Analysis and Design (OOA/D) for deriving objects and classes.
  • Used stored procedures and database triggers at all levels (a minimal JDBC sketch follows this list).
  • Communicated across the team about processes, goals, guidelines and delivery of items.
  • Developed the Java code using Eclipse as the IDE.
  • Configured Tomcat 4.1 for the application on a Windows NT server.
  • Used JavaScript to validate page data in the JSP pages.
  • Responsible for code version management and unit test plans.
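
Stored procedures of the kind mentioned above are typically invoked from Java through a JDBC CallableStatement. This minimal sketch uses a hypothetical connection URL, procedure name and parameter list.

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

// Hypothetical DAO method that calls a stored procedure and reads its OUT parameter.
public class OrderStatusDao {

    public String lookupStatus(long orderId) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@dbhost:1521:orcl", "app_user", "secret");
        try {
            CallableStatement cs = conn.prepareCall("{call get_order_status(?, ?)}");
            cs.setLong(1, orderId);
            cs.registerOutParameter(2, Types.VARCHAR);
            cs.execute();
            return cs.getString(2);
        } finally {
            conn.close();
        }
    }
}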

Environment: Java 1.3, Tomcat, Eclipse, SQL and Windows.

Java Developer

Confidential

Responsibilities:

  • Gained knowledge of the test suite from the onsite coordinator.
  • Performed dry runs of the suite while knowledge transfer with the onsite coordinator was in progress.
  • Set up the environment for the port, identified the major issues with the target operating system and resolved them.
  • Customized the test suite by modifying various Java options and shell scripts based on the JVM/OS combination.
  • Understood the functionality of the WebLogic component to be certified.
  • Tested the WebLogic component on a given operating system.
  • Analyzed problems found with the component while running the test suite and determined whether they lay with the product, OS or database.
  • Fixed the problems or passed them to the concerned teams.
  • Involved in building and deploying the applications using build tools such as Ant and Maven.
  • Trained new team members and shared knowledge of core Java components.

Environment: Linux, Windows, Core Java, SQL
