Hadoop Lead/Architect Resume

Addison, TX

PROFESSIONAL SUMMARY:

  • Around 10 years of experience in analysis, design and development of software applications using various technologies.
  • 4+ years of strong experience with Big Data and Hadoop Ecosystems.
  • Hands-on experience with Apache Hadoop ecosystem components and tools such as HDFS, MapReduce, Oozie, Zookeeper, Hive, Sqoop, HBase, Flume, Pig, Spark, Kafka, Scala, Hue and Impala.
  • Experience in importing and exporting data between HDFS and relational database systems (RDBMS) using Sqoop.
  • Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
  • Extended Hive and Pig core functionality with custom UDFs.
  • Experience in NoSQL databases such as HBase and Cassandra.
  • Experience in coding MapReduce programs; knowledge of job workflow scheduling, coordination and monitoring tools such as Oozie and Zookeeper.
  • Developed Pig Latin scripts for handling business transformations.
  • Experience in using Flume and Kafka to load the log data from multiple sources into HDFS.
  • Hands-on experience in virtualization, including work with VMware Virtual Center.
  • Good knowledge of Python and R.
  • Extensive experience in Requirements gathering, Analysis, Design, Reviews, Coding and Code Reviews, Unit and Integration Testing.
  • Adequate knowledge and working experience with Agile methodology.
  • Good knowledge of single-node and multi-node cluster configurations.
  • In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce.
  • Experience in setting up Hive, Pig, HBase and Sqoop on the Ubuntu operating system.
  • Proficient in object-oriented programming using Java technologies and web technologies such as HTML, XML, JSP and JavaScript.
  • Good experience with and knowledge of SQL queries for manipulating data.
  • Good experience in developing Pig scripts, Pig UDFs, Hive scripts and Hive UDFs to load data files.
  • Experience with UNIX commands and deploying applications on servers.
  • Experienced in interacting with business users and technical consultants to analyze requirements and business processes, transform requirements into technical specifications, design databases, document, and roll out deliverables.
  • Good experience developing Java and mainframe applications.
  • Design, development and testing of mainframe applications.
  • Able to work effectively both independently and as a team member on group projects.

PROFESSIONAL EXPERIENCE:

Confidential, Addison, TX

Hadoop Lead/Architect

Responsibilities:

  • Developed managed, external and partitioned tables as per requirements.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Ingested structured data into appropriate schemas and tables to support rules and analytics.
  • Developed custom user-defined functions (UDFs) in Hive to transform large volumes of data according to business requirements.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in loading data from edge node to HDFS using shell scripting.
  • Implemented scripts for loading data from UNIX file system to HDFS.
  • Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
  • Automated workflow using Shell Scripts.
  • Good experience in Hive partitioning and bucketing, performing different types of joins on Hive tables, and implementing Hive SerDes such as Regex, JSON and Avro.
  • Developed Pig Scripts, Pig UDFs and Hive Scripts, Hive UDFs to load data files.
  • Used Kafka for messaging services in place of a traditional message broker.
  • Experience in Hadoop 2.x with Spark and Scala.
  • Managed Hadoop jobs using the Oozie workflow scheduler for MapReduce, Hive, Pig and Sqoop actions.
  • Good knowledge of data ingestion and data processing.
  • Sound knowledge of Python and R.
  • Experience in managing and reviewing Hadoop log files.
  • Used the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Responsible for managing test data coming from different sources.
  • Responsible for developing batch processes using Unix shell scripting.

Environment: Apache Hadoop, HDFS, Hive, Pig, Sqoop, HBase, Unix, Shell Scripting, Spark, Scala, Kafka, Oozie, Zookeeper, CDH5.

Confidential, Somerset, NJ

Hadoop Developer / Analyst

Responsibilities:

  • Set up scripts to fetch data from various FTP server locations and copy it into the HDFS folder corresponding to each client.
  • Defined client-agnostic formats for the different kinds of data received from clients.
  • Wrote Pig UDFs to pre-process the data received from various clients and transform it to the required formats.
  • Specified numerous Pig relations to map various fields in the data set.
  • Developed various Pig Latin scripts to join and group different kinds of data to construct relevant records according to the functional requirements.
  • Developed MapReduce programs for analyzing the data in cases where Pig script performance was not satisfactory.
  • Utilized HCatalog to access Hive table metadata from Pig scripts and MapReduce jobs.
  • Implemented test scripts to support test driven development and continuous integration.
  • Automated the jobs that pull data from FTP servers to HDFS using Oozie workflows, and enabled email alerts for notification in case of any failure.
  • Performed unit testing of MapReduce jobs using MRUnit.
  • Worked closely with the Data Analyst to identify the business aspects for analysis.
  • Took part in managing and reviewing log files.
  • Involved in setting up Oracle R Connector for Hadoop so that data analysts could use data in HDFS to perform analytics.
  • Actively took part in scrum meetings to discuss the progress of the deliverables.

Environment: CDH4, HDFS, Cloudera Manager, MapReduce, Linux, PuTTY, Pig, Hive, Oozie, MRUnit, Shell scripting, Eclipse Luna, Java, VersionOne.

Confidential, MI

Hadoop Developer

Responsibilities:

  • Analyzed the Hadoop stack and various big data analytics tools, including Pig, Hive, HBase and Sqoop.
  • Involved in requirement gathering, architecture development, design, development and deployment of solutions built on the Hadoop platform.
  • Involved in loading and transforming large sets of structured, semi-structured and unstructured data, and analyzed them by running Hive queries and Pig scripts.
  • Imported data from various sources, performed transformations using Pig, loaded the data into HDFS, and extracted data from Teradata to HDFS using Sqoop.
  • Used different file formats such as text files, Sequence Files and Avro.
  • Developed MapReduce programs for applying business rules to the data.
  • Played a key role in mentoring the team on developing MR jobs and custom UDFs.
  • Created Hive tables and worked on them using HiveQL.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Developed scripts to schedule the batch jobs.
  • Helped the team in optimizing Hive queries.
  • Utilized Agile Scrum Methodology to help manage and organize a team of 4 developers with regular code review sessions.
  • Held weekly meetings with technical collaborators and actively participated in code review sessions with junior developers.

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Linux, XML, MySQL, HBase, Ubuntu.

Confidential, Webster, NY

Hadoop Developer

Responsibilities:

  • Installed and configured Apache Hadoop to test the maintenance of log files in a Hadoop cluster.
  • Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Set up and benchmarked Hadoop/HBase clusters for internal use.
  • Developed Java MapReduce programs for the analysis of sample log files stored in the cluster.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Developed MapReduce programs for data analysis and data cleaning.
  • Developed Pig Latin scripts for the analysis of semi-structured data.
  • Developed industry-specific user-defined functions (UDFs).
  • Used Hive to create Hive tables; involved in data loading and writing Hive UDFs.
  • Used Sqoop to import data into HDFS and Hive from other data systems.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Migrated ETL processes from Oracle to Hive to test ease of data manipulation.
  • Developed Hive queries to process the data for visualization.

Environment: Apache Hadoop, HDFS, Cloudera Manager, Java, MapReduce, Eclipse Indigo, Hive, Pig, Sqoop, Oozie and SQL.

Confidential

Java/J2EE Consultant

Responsibilities:

  • Developed using the Struts MVC model with J2EE standards.
  • Designed and developed the front end using JSP, Struts, XML, JavaScript and HTML.
  • Designed and developed Action and Form objects as part of the Struts framework.
  • Involved in the development and deployment of stateless session beans.
  • Generated deployment descriptors for EJBs using XML.
  • Worked with JSP and JavaScript libraries such as AngularJS and jQuery to develop the application.
  • Developed shell scripts for inventory management.
  • Assisted in troubleshooting JSP and Java code (EJBs and Servlets).
  • Ported the application to WebSphere.

Environment: JDK 1.4, IBM WebLogic 7.1, WSAD 5.0, Oracle 9i, Ant, CVS, JUnit, Struts 2.0, JavaScript 1.1, HTML, Log4j, Rational Rose, Unix.

Confidential

Junior Programmer

Responsibilities:

  • Gathered requirements and worked according to the CR.
  • Generated data validation and reconciliation reports.
  • Developed code as per client requirements.
  • Involved in the development of backend code; altered tables to add new columns, constraints, sequences and indexes as per business requirements.
  • Performed DML and DDL operations as per business requirements.
  • Created views and prepared business reports.
  • Resolved production issues by modifying backend code as and when required.
  • Used various joins, subqueries and nested queries in SQL.
  • Involved in the creation of sequences for automatic generation of Product IDs.
  • Created database objects such as tables, views, sequences, synonyms, stored procedures, functions, packages, cursors, ref cursors and triggers.
  • Tested code functionality in the testing environment.
  • Worked under senior-level guidance.

Environment: MySQL, Windows, MS Excel, Reports, Java.