Hadoop Lead/Architect Resume Addison, TX - Hire IT People

PROFESSIONAL SUMMARY:

Around 10 years of experience in analysis, design and development of software applications using various technologies.
4+ years of strong experience with Big Data and Hadoop Ecosystems.
Hands-on experience with Apache Hadoop ecosystem components such as HDFS, MapReduce, Oozie, ZooKeeper, Hive, Sqoop, HBase, Flume, Pig, Spark, Kafka, Scala, Hue and Impala.
Experience in importing and exporting data between HDFS and relational database systems (RDBMS) using Sqoop.
Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
Extended Hive and Pig core functionality with custom UDFs.
Experience in NoSQL databases such as HBase and Cassandra.
Experience in coding MapReduce programs; knowledge of workflow scheduling and coordination tools such as Oozie and ZooKeeper.
Developed Pig Latin scripts for handling business transformations.
Experience in using Flume and Kafka to load the log data from multiple sources into HDFS.
Hands-on experience in virtualization, including work with VMware Virtual Center.
Good knowledge of Python and R.
Extensive experience in Requirements gathering, Analysis, Design, Reviews, Coding and Code Reviews, Unit and Integration Testing.
Adequate knowledge and working experience with Agile methodology.
Good knowledge of single-node and multi-node cluster configurations.
In-depth understanding of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce.
Experience in setting up Hive, Pig, HBase and Sqoop on the Ubuntu operating system.
Proficient in object-oriented programming with Java and in web technologies such as HTML, XML, JSP and JavaScript.
Good experience with and knowledge of SQL queries for manipulating data.
Good experience in developing Pig scripts, Pig UDFs, Hive scripts and Hive UDFs to load data files.
Experience with UNIX commands and deploying applications to servers.
Experienced in interacting with business users and technical consultants to analyze requirements and business processes, transform requirements into technical specifications, design databases, write documentation and roll out deliverables.
Good experience developing Java and mainframe applications.
Design, development and testing of mainframe applications.
Able to work effectively both independently and as a team member on group projects.

PROFESSIONAL EXPERIENCE:

Confidential, Addison, TX

Hadoop Lead/Architect

Responsibilities:

Developed managed, external and partitioned tables as per requirements.
Loaded and transformed large sets of structured, semi-structured and unstructured data.
Ingested structured data into appropriate schemas and tables to support rules and analytics.
Developed custom user-defined functions (UDFs) in Hive to transform large volumes of data according to business requirements.
Responsible for building scalable distributed data solutions using Hadoop.
Involved in loading data from the edge node to HDFS using shell scripting.
Implemented scripts for loading data from the UNIX file system to HDFS.
Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
Automated workflow using Shell Scripts.
Good experience with Hive partitioning and bucketing, performing different types of joins on Hive tables, and implementing Hive SerDes such as Regex, JSON and Avro.
Developed Pig Scripts, Pig UDFs and Hive Scripts, Hive UDFs to load data files.
Used Kafka for messaging services in place of a traditional message broker.
Experience in Hadoop 2.x with Spark and Scala.
Managed Hadoop jobs using the Oozie workflow scheduler for MapReduce, Hive, Pig and Sqoop actions.
Good knowledge of data ingestion and data processing.
Sound knowledge of Python and R.
Experience in managing and reviewing Hadoop log files.
Used the Oozie workflow engine to run multiple Hive and Pig jobs.
Analyzed large data sets to determine the optimal way to aggregate and report on them.
Responsible for managing the test data coming from different sources.
Responsible for developing batch processes using UNIX shell scripting.

Environment: Apache Hadoop, HDFS, Hive, Pig, Sqoop, HBase, Unix, Shell Scripting, Spark, Scala, Kafka, Oozie, Zookeeper, CDH5.

Confidential, Somerset, NJ

Hadoop Developer / Analyst

Responsibilities:

Set up scripts to fetch data from various FTP server locations and copy it into the HDFS folder corresponding to each client.
Defined client-agnostic formats for the different kinds of data received from clients.
Wrote Pig UDFs to pre-process the data received from various clients and transform it to the required formats.
Specified numerous Pig relations to map various fields in the data set.
Developed various Pig Latin scripts to join and group different kinds of data to construct relevant records according to functional requirements.
Developed MapReduce programs for analyzing the data in cases where Pig script performance was not satisfactory.
Utilized HCatalog to access Hive table metadata from Pig scripts and MapReduce jobs.
Implemented test scripts to support test driven development and continuous integration.
Automated the jobs that pull data from FTP servers to HDFS using Oozie workflows and enabled email alerts for notification in case of failure.
Performed unit testing of MapReduce jobs using MRUnit.
Worked closely with the Data Analyst to identify the business aspects for analysis.
Took part in managing and reviewing log files.
Involved in setting up Oracle R Connector for Hadoop so that data analysts could use data in HDFS to perform analytics.
Actively took part in scrum meetings to discuss the progress of the deliverables.

Environment: CDH4, HDFS, Cloudera Manager, MapReduce, Linux, PuTTY, Pig, Hive, Oozie, MRUnit, Shell scripting, Eclipse Luna, Java, VersionOne.

Confidential, MI

Hadoop Developer

Responsibilities:

Analyzed the Hadoop stack and various big data analytics tools, including Pig, Hive, HBase and Sqoop.
Involved in requirement gathering, architecture development, design, development and deployment of solutions built on the Hadoop platform.
Involved in loading and transforming large sets of structured, semi-structured and unstructured data and analyzed them by running Hive queries and Pig scripts.
Imported data from various sources, performed transformations using Pig, loaded the data into HDFS, and extracted data from Teradata to HDFS using Sqoop.
Used different file formats such as text files, SequenceFiles and Avro.
Developed MapReduce programs for applying business rules to the data.
Played a key role in mentoring the team on developing MapReduce jobs and custom UDFs.
Created Hive tables and worked with them using HiveQL.
Analyzed large data sets to determine the optimal way to aggregate and report on them.
Developed scripts to schedule batch jobs.
Helped the team in optimizing Hive queries.
Utilized Agile Scrum Methodology to help manage and organize a team of 4 developers with regular code review sessions.
Held weekly meetings with technical collaborators and actively participated in code review sessions with junior developers.

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Linux, XML, MySQL, HBase, Ubuntu.

Confidential, Webster, NY

Hadoop Developer

Responsibilities:

Installed and configured Apache Hadoop to test the maintenance of log files in the Hadoop cluster.
Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
Set up and benchmarked Hadoop/HBase clusters for internal use.
Developed Java MapReduce programs for the analysis of sample log files stored in the cluster.
Developed simple to complex MapReduce jobs using Hive and Pig.
Developed MapReduce programs for data analysis and data cleaning.
Developed Pig Latin scripts for the analysis of semi-structured data.
Developed industry-specific user-defined functions (UDFs).
Used Hive to create Hive tables; involved in data loading and writing Hive UDFs.
Used Sqoop to import data into HDFS and Hive from other data systems.
Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
Migrated ETL processes from Oracle to Hive to test ease of data manipulation.
Developed Hive queries to process the data for visualization.

Environment: Apache Hadoop, HDFS, Cloudera Manager, Java, MapReduce, Eclipse Indigo, Hive, Pig, Sqoop, Oozie and SQL.

Confidential

JAVA/J2EE Consultant

Responsibilities:

Development using Struts MVC model with J2EE standards.
Design and development of the front end using JSPs, Struts, XML, JavaScript and HTML.
Design and development of Action and Form objects as part of the Struts framework.
Involved in the Development and Deployment of Stateless Session beans.
Generated deployment descriptors for EJBs using XML.
Worked with JSP and JavaScript libraries such as AngularJS and jQuery to develop the application.
Developed shell scripts for Inventory Management.
Assisted in troubleshooting JSP and Java code (EJBs and Servlets).
Ported the application to WebSphere.

Environment: JDK 1.4, IBM WebLogic 7.1, WSAD 5.0, Oracle 9i, Ant, CVS, JUnit, Struts 2.0, JavaScript 1.1, HTML, Log4j, Rational Rose, Unix.

Confidential

Junior Programmer

Responsibilities:

Gathered requirements and worked according to the CR.
Data validation/Reconciliation report generation.
Code Development as per the client requirements.
Involved in the development of backend code; altered tables to add new columns, constraints, sequences and indexes as per business requirements.
Performed DML and DDL operations as per business requirements.
Created views and prepared business reports.
Resolved production issues by modifying backend code as and when required.
Used different joins, subqueries and nested queries in SQL.
Involved in creation of sequences for automatic generation of Product ID.
Created database objects such as tables, views, sequences, synonyms, stored procedures, functions, packages, cursors, ref cursors and triggers.
Tested code functionality in the testing environment.
Worked under senior-level guidance.

Environment: MySQL, Windows, MS Excel, Reports, Java.
