Hadoop Developer Resume

Denver, CO

SUMMARY

  • 7+ years of IT experience in architecture, analysis, design, development, implementation, maintenance and support, with experience in developing strategic methods for deploying big data technologies to efficiently solve Big Data processing requirements.
  • Around 3 years of experience in Big Data using the Hadoop framework and related technologies such as HDFS, HBase, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, and ZooKeeper.
  • Experience in data analysis using Hive, Pig Latin, HBase and custom MapReduce programs in Java.
  • Experience in training people on Big Data and cloud technologies.
  • Experience in writing custom UDFs in Java for Hive and Pig to extend their functionality.
  • Experience in writing MapReduce programs in Java for data cleansing and preprocessing.
  • Excellent understanding/knowledge of Hadoop (Gen-1 and Gen-2) and its various components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and ResourceManager (YARN).
  • Experience in managing and reviewing Hadoop log files.
  • Experience in working with Flume to load log data from multiple sources directly into HDFS.
  • Excellent understanding and knowledge of NOSQL database HBase.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and from RDBMS to HDFS.
  • Good knowledge on other big data technologies like Apache Kafka and Spark.
  • Worked extensively with dimensional modeling, data migration, data cleansing, data profiling, and ETL processes for data warehouses.
  • Implemented Hadoop-based data warehouses and integrated Hadoop with enterprise data warehouse systems.
  • Built real-time Big Data solutions using HBase, handling billions of records.
  • Good experience working with Hortonworks and Cloudera distributions.
  • Experience in Object-Oriented Analysis and Design (OOAD) and development of software using UML methodology; good knowledge of J2EE and Core Java design patterns.
  • Experience in designing both time driven and data driven automated workflows using Oozie.
  • Experience in writing UNIX shell scripts.
  • Experience working with Java, J2EE, JDBC, ODBC, JSP, Java Eclipse, Java Beans, EJB, Servlets and MS SQL Server.
  • Experience in all stages of SDLC (Agile, Waterfall), writing Technical Design document, Development, Testing and Implementation of Enterprise level Data mart and Data warehouses.
  • Extensive experience working with Oracle, DB2, SQL Server and MySQL databases.
  • Experience in J2EE technologies like Struts, JSP/Servlets, and Spring.
  • Good exposure to scripting and markup languages like JavaScript, AngularJS, jQuery and XML.
  • Delivery Assurance - Quality Focused & Process Oriented:
  • Ability to work in high-pressure environments, delivering to and managing stakeholder expectations.
  • Application of structured methods to project scoping and planning, risks, issues, schedules and deliverables.
  • Strong analytical and problem-solving skills.
  • Good interpersonal skills and ability to work as part of a team. Exceptional ability to learn and master new technologies and to deliver outputs on short deadlines.

TECHNICAL SKILLS

Technology: Hadoop Ecosystem, J2SE/J2EE, Databases

Operating Systems: Windows Vista/XP/NT/2000, Linux (Ubuntu, CentOS), UNIX

DBMS/Databases: DB2, MySQL, PL/SQL

Programming Languages: C, C++, Core Java, XML, JSP/Servlets, Struts, Spring, HTML, JavaScript, jQuery, Web services.

Big Data Ecosystem: HDFS, MapReduce, Oozie, Hive, Pig, Sqoop, Flume, ZooKeeper, Kafka and HBase.

Methodologies: Agile, Waterfall

NoSQL Databases: HBase

Version Control Tools: SVN, CVS, VSS, PVCS

ETL Tools: IBM DataStage 8.1, Informatica

PROFESSIONAL EXPERIENCE

Hadoop Developer

Confidential, Denver CO

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop Ecosystem.
  • Responsible for writing MapReduce jobs to handle files in multiple formats (JSON, text, XML, etc.).
  • Developed Pig UDFs in Java to perform data cleansing and transformation for ETL activities (a minimal sketch follows this list).
  • Developed data pipelines using Flume, Sqoop, Pig and Java MapReduce to ingest data into HDFS for analysis.
  • Worked extensively with combiners, partitioning and the distributed cache to improve the performance of MapReduce jobs.
  • Worked on creating MapReduce jobs to parse raw web log data into delimited records.
  • Used Pig for data transformations, event joins and some pre-aggregations before storing the data in HDFS.
  • Developed Sqoop scripts to import and export data from and to relational sources by handling incremental data loading on the customer transaction data by date.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Involved in loading and transforming large sets of structured, semi-structured and unstructured data from relational databases into HDFS using Sqoop imports.
  • Responsible for creating complex tables using Hive.
  • Created partitioned tables in Hive for better performance and faster querying.
  • Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig. Developed Pig scripts to pull data from HDFS.
  • Developed Java APIs for invocation in Pig scripts to solve complex problems.
  • Developed Shell scripts to automate and provide Control flow to Pig scripts.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Performed extensive data analysis using Hive and Pig.
  • Performed data scrubbing and processing with Oozie.
  • Responsible for managing data coming from different sources.
  • Worked on data serialization formats for converting complex objects into sequences of bits using Avro, JSON and CSV formats.
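
A minimal sketch of the kind of Java data-cleansing UDF for Pig described above; the package, class name and cleansing rules are illustrative assumptions rather than the actual project code:

    package com.example.pig;

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Trims whitespace and strips control characters from a single chararray field
    public class CleanseField extends EvalFunc<String> {
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;                       // pass nulls through unchanged
            }
            String value = (String) input.get(0);
            return value.trim().replaceAll("\\p{Cntrl}", "");
        }
    }

In a Pig script, such a UDF would be registered with REGISTER and aliased with DEFINE before being applied in a FOREACH ... GENERATE statement.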

Environment: Hadoop Framework, MapReduce, Hive, Sqoop, Pig, HBase, Flume, Oozie, Java (JDK 1.6), UNIX Shell Scripting, Oracle 11g/12c, Windows NT, IBM DataStage 8.1, TOAD 9.6, Teradata

Hadoop Developer

Confidential, Minneapolis MN

Responsibilities:

  • Involved in configuring the Hadoop environment through Amazon Web Services in the cloud.
  • Designed, planned and delivered a proof of concept and a business function/division-based implementation of a Big Data roadmap and strategy project (Apache Hadoop stack with Tableau).
  • Developed MapReduce jobs in Java for data cleaning and preprocessing (a minimal sketch follows this list).
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Used Bash shell scripting, Sqoop, Avro, Hive, Pig, Java and MapReduce daily to develop ETL, batch processing, and data storage functionality.
  • Responsible for developing data pipelines using Flume, Sqoop and Pig to extract data from web logs and store it in HDFS.
  • Worked on NoSQL databases like HBase.
  • Used data stores including Accumulo/Hadoop and a graph database.
  • Used the Hadoop MySQL connector to store MapReduce results in an RDBMS.
  • Worked on loading all tables from the reference source database schema through Sqoop.
  • Designed, coded and configured server-side J2EE components such as JSPs and Java services on AWS.
  • Collected data from different databases (i.e. Oracle, MySQL) into Hadoop.
  • Used Oozie and ZooKeeper for workflow scheduling and monitoring.
  • Worked on designing and developing ETL workflows in Java for processing data in HDFS/HBase, orchestrated with Oozie.
  • Experienced in managing and reviewing Hadoop log files.
  • Worked on extracting files from MongoDB through Sqoop, placing them in HDFS and processing them.
  • Supported MapReduce programs running on the cluster.
  • Provided cluster coordination services through ZooKeeper.
  • Involved in loading data from UNIX file system to HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Created several Hive tables, loaded them with data and wrote Hive queries that run internally as MapReduce jobs.
  • Worked on setting up Pig, Hive, Redshift and HBase on multiple nodes and developed using Pig, Hive, HBase and MapReduce.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
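
A minimal sketch of the kind of data-cleansing mapper described above; the tab delimiter, expected field count and counter names are illustrative assumptions rather than the actual project code:

    package com.example.mr;

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Keeps only records with the expected number of tab-delimited fields
    public class CleanseMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        private static final int EXPECTED_FIELDS = 12;   // assumed record width

        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\t", -1);
            if (fields.length == EXPECTED_FIELDS) {
                context.write(value, NullWritable.get());
            } else {
                context.getCounter("cleanse", "malformed_records").increment(1);
            }
        }
    }

A driver would typically run this mapper with zero reducers so that only well-formed records are written back to HDFS.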

Environment: Apache Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), SQL, Pig, ZooKeeper, Flat files, Oracle 11g/10g, MySQL, Windows NT, UNIX, Sqoop, Oozie, HBase.

Hadoop Developer

Confidential, Denver CO

Responsibilities:

  • Experience in defining, designing and developing Java applications, especially using Hadoop MapReduce, by leveraging frameworks such as Cascading and Hive.
  • Experience in architecting and building Turn's multi-petabyte-scale big data Hadoop infrastructure.
  • Developed workflow using Oozie for running Map Reduce jobs and Hive Queries.
  • Worked on loading log data directly into HDFS using Flume.
  • Involved in loading data from LINUX file system to HDFS.
  • Responsible for managing data from multiple sources.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Created and maintained technical documentation for launching Cloudera Hadoop clusters and for executing Hive queries and Pig scripts.
  • Experience in working with various kinds of data sources such as Oracle and MS SQL Server.
  • Successfully loaded files to Hive and HDFS by using Sqoop.
  • Experience in managing CVS and migrating to Subversion.
  • Experience in managing development time, bug tracking, project releases, development speed, release forecasts, scheduling and more. Used a custom framework of Nodes to handle back-end calls at high speed, and intensive object-oriented JavaScript, jQuery and plug-ins to build dynamic user interfaces.
  • Experience in partnering with Hadoop developers to build best practices for warehouse and analytics environments.
  • Extracted files from MySQL through Sqoop, placed them in HDFS and processed them.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
  • Worked on debugging and performance tuning of Hive and Pig jobs.
  • Created HBase tables to store various formats of PII data coming from different portfolios (a minimal sketch follows this list).
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Experience working on processing unstructured data using Pig and Hive.
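
A minimal sketch of writing one record into an HBase table of the kind described above, using the classic Java HBase client API; the table name, column family, qualifier and row key are illustrative assumptions:

    package com.example.hbase;

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PiiRecordWriter {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
            HTable table = new HTable(conf, "customer_pii");     // hypothetical table name
            try {
                Put put = new Put(Bytes.toBytes("cust#0001"));   // row key
                // one column family ("d"), one qualifier per PII attribute
                put.add(Bytes.toBytes("d"), Bytes.toBytes("email"),
                        Bytes.toBytes("user@example.com"));
                table.put(put);
            } finally {
                table.close();
            }
        }
    }

In practice such puts would come from a MapReduce or bulk-load job rather than a standalone main method.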

Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Maven, Hudson, Ubuntu, Red Hat Linux, Hive, Java (JDK 1.6), Cloudera Hadoop Distribution, MapReduce, PL/SQL, UNIX Shell Scripting.

Java Developer

Confidential, Maryland

Responsibilities:

  • Developed Servlets and JSPs based on the MVC pattern using the Struts Action framework (a minimal Action sketch follows this list).
  • Used Tiles for setting the header, footer and navigation, and the Apache Validator Framework for form validation.
  • Used resource and properties files for i18n support.
  • Translated high-level design specs into simple ETL coding and mapping standards.
  • Involved in writing Hibernate queries and Hibernate specific configuration and mapping files.
  • Used the Log4J logging framework to write log messages with various levels.
  • Involved in fixing bugs and minor enhancements for the front-end modules.
  • Used JUnit framework for writing Test Classes.
  • Coded various classes for Business Logic Implementation.
  • Developed and tested code according to the requirements.
  • Prepared and executed unit test cases.
  • Performed functional and technical reviews.
  • Supported the testing team for system testing/integration/UAT.
  • Assured quality in the deliverables.
  • Conducted design reviews and technical reviews with other project stakeholders.
  • Implemented Services using Core Java.
  • Developed and deployed UI-layer logic of sites using JSP.
  • Used Struts (MVC) for implementation of business model logic.
  • Worked with Struts MVC objects like Action Servlets, Controllers, Validators, Web Application Context, Handler Mappings, Message Resource Bundles, and JNDI lookups for J2EE components.
  • Involved in the complete life cycle of the project from the requirements to the production support.
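
A minimal sketch of a Struts 1 Action of the kind described above; the class name, request parameter and forward names are illustrative assumptions rather than the actual project code:

    package com.example.web;

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    public class LoginAction extends Action {
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                                     HttpServletRequest request,
                                     HttpServletResponse response) throws Exception {
            // Read the submitted user name (a real Action would bind it through an ActionForm bean)
            String user = request.getParameter("username");
            // Pick the next view; "success" and "failure" map to JSPs in struts-config.xml
            return mapping.findForward(user != null && user.length() > 0 ? "success" : "failure");
        }
    }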

Environment: J2EE, JDBC, Java 1.4, Servlets, JSP, Struts, Hibernate, Web services, SOAP, WSDL, Design Patterns, MVC, HTML, JavaScript 1.2, WebLogic 8.0, XML, JUnit, Oracle 10g, MyEclipse.

UNIX Admin

Confidential

Responsibilities:

  • Proficient in UNIX shell commands and network-related concepts like DNS server setup and TCP/IP.
  • Constructing and tuning system and network parameters for optimum performance.
  • Knowledgeable in troubleshooting and problem solving, including application- and network-level assessment.
  • Strong understanding of and experience in writing shell scripts to automate tasks.
  • Analyzed and triaged outages; monitored and tuned system and network performance.
  • Extended tools to automate the distribution, administration and oversight of a large-scale Linux environment.
  • Performed server tuning and system upgrades.
  • Participated in the planning phase for system requirements on different projects for deployment of business functions.
  • Participating in 24x7 on-call rotation and maintenance windows.
  • Communicated and coordinated with internal/external groups and operations.

Environment: RHEL/CentOS
