
Hadoop Developer Resume


Peachtree City, GA

SUMMARY

  • Versatile, dynamic and technically competent problem solver with over 7 years of experience in Hadoop & Big Data (4 years) and Java/J2EE (3+ years) technologies.
  • Expertise in the design and implementation of Big Data solutions in the Retail, Finance and E-commerce domains.
  • Hands-on experience in the installation, configuration, support and management of Cloudera's Hadoop platform, along with CDH3 and CDH4 clusters.
  • Sound knowledge of Hadoop architecture, administration, HDFS Federation & High Availability and the Streaming API, along with data warehousing concepts.
  • Experienced in understanding complex Big Data processing needs and developing MapReduce jobs (in Java) and Scala modules to address those needs (a minimal sketch follows this list).
  • Experience with handling data accuracy, scalability and integrity on Hadoop platforms.
  • Experience with complex data processing pipelines, including ETL and data ingestion dealing with unstructured and semi-structured data.
  • Knowledgeable in Apache Spark and Scala, mainly through framework exploration for the transition from Hadoop/MapReduce to Spark.
  • Knowledge of designing and implementing ETL processes to load data from various sources into HDFS using Flume and Sqoop, performing transformation logic using Hive and Pig, and integrating with BI tools for visualization/reporting.
  • Solid understanding of NoSQL databases like MongoDB, HBase and Cassandra.
  • Expertise in performing large-scale web crawling with Apache Nutch using a Hadoop/HBase cluster.
  • Knowledge of job workflow scheduling and monitoring tools like Oozie and Zookeeper.
  • Experience working in the AWS cloud environment.
  • Excellent Java development skills using J2EE, Spring, J2SE, Servlets, JUnit, MRUnit, JSP and JDBC.
  • Experienced in working with various frameworks like Struts, Spring, Hibernate, EJB and JSF.
  • Professional knowledge of UNIX, Shell and Perl scripting.
  • Knowledge of data warehousing and ETL tools like Informatica and Pentaho.
  • Hands-on experience writing code in Scala.
  • Experience in Object-Oriented Analysis and Design (OOAD) and development of software using the UML methodology.
  • Experienced in Agile Scrum, RUP and TDD software development methodologies.
  • Possess a strong commitment to team dynamics, with the ability to lead, contribute expertise and follow leadership directives at appropriate times.
  • Effectively used Oozie to develop automated workflows for Sqoop, MapReduce and Hive jobs.
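
As an illustration of the MapReduce development mentioned above, the following is a minimal sketch of a Java MapReduce job with a combiner, assuming Hadoop 2.x; the job, class names and the tab-separated input layout (EventCountJob, event type in the first field) are illustrative assumptions rather than details from any specific project.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Hypothetical event-count job: emits (eventType, 1) per input line
    // and sums the counts per event type in the reducer.
    public class EventCountJob {

        public static class EventMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text eventType = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Assumption: tab-separated log lines, event type in field 0.
                String[] fields = value.toString().split("\t");
                if (fields.length > 0) {
                    eventType.set(fields[0]);
                    context.write(eventType, ONE);
                }
            }
        }

        public static class SumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values,
                    Context context) throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "event-count");
            job.setJarByClass(EventCountJob.class);
            job.setMapperClass(EventMapper.class);
            job.setCombinerClass(SumReducer.class); // combiner cuts shuffle traffic
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Reusing the reducer as the combiner is valid here because summation is associative and commutative, which reduces the data shuffled between map and reduce stages.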

TECHNICAL SKILLS

Big Data Ecosystem: HDFS, HBase, MapReduce, Hive, Pig, Sqoop, Spark, Splunk, Impala, Kafka, Talend, Oozie, Zookeeper, Flume, Storm, AWS, EC2, EMR.

Programming Languages: Java, Scala, Python, C/C++, PL/SQL.

Scripting Languages: PHP, jQuery, JavaScript, XML, HTML, Bash, Ajax and CSS.

UNIX Tools: Apache, YUM, RPM.

J2EE Technologies: Servlets, JSP, JDBC, EJB & JMS.

Databases: NoSQL (MongoDB & Cassandra), Oracle.

Data Integration Tools: Informatica, Pentaho.

Methodologies: Agile, Scrum, SDLC, UML, Design Patterns.

IDEs: Eclipse, NetBeans, WSAD, RAD.

Platforms: Windows, Linux, Solaris, AIX, HP-UX, CentOS.

Application Servers: Apache Tomcat, WebLogic, WebSphere, JBoss 4.0.

Frameworks: Spring, MVC, Hibernate, Struts, Log4J, JUnit, Web Services.

PROFESSIONAL EXPERIENCE

Confidential, Peachtree City, GA

Hadoop Developer

Responsibilities:

  • Working extensively on creating MapReduce jobs for search and analytics to identify various trends across the data for the Infotainment product line.
  • Working on data analytics using Pig and Hive; Hive made it easier to extract information from very old data.
  • Designing the adaptive ecosystem to ensure that the archived data is accessible using third-party BI tools.
  • Using Oozie for workflow orchestration in the automation of MapReduce, Pig and Hive jobs.
  • Installing and configuring the Hadoop cluster and developing multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Analyzing information from the vehicle-mounted Dedicated Short Range Communication (DSRC) unit using Pig and Hive, making it easier to monitor vehicles and road status.
  • Responsible for optimizing data transfer across the network using Combiners, joining multiple-schema datasets using Joins, and organizing data using Partitions and Buckets.
  • Writing jobs in Scala for the company's parallel data processing center located in the vicinity.
  • Moving large datasets hourly in the Avro file format and running Hive and Impala queries.
  • Working on importing data into HBase using the HBase Shell and the HBase Client API (see the sketch after this list).
  • Capturing archived data from an existing relational database into HDFS using Sqoop.
  • Installing and configuring a remote Hive Metastore for both development and production jobs as required.
  • Coordinating the cluster services using Zookeeper.
  • Improving system performance by working with the development team to analyze, identify and resolve issues quickly.
  • Storing the geographically pre-distributed datasets in Cassandra.
  • Capturing data logs from the web server into HDFS using Flume for analysis.
  • Writing Pig scripts and implementing business logic with Pig UDFs to pre-process the data for analysis.
  • Managing and reviewing Hadoop log files, thereby keeping track of node health.
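
A minimal sketch of the HBase Client API usage referenced in the list above, assuming the CDH4-era client API; the table name, column family and row-key layout (vehicle_status, d, vehicle id plus timestamp) are hypothetical placeholders, not details from the project.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    // Hypothetical example: writing and reading back a vehicle-status row.
    public class HBaseClientExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "vehicle_status");
            try {
                // Row key: vehicle id + timestamp, a common HBase key design.
                Put put = new Put(Bytes.toBytes("VIN123|20140101120000"));
                put.add(Bytes.toBytes("d"), Bytes.toBytes("speed"), Bytes.toBytes("65"));
                put.add(Bytes.toBytes("d"), Bytes.toBytes("road"), Bytes.toBytes("I-85"));
                table.put(put);

                // Read the row back and extract one column value.
                Get get = new Get(Bytes.toBytes("VIN123|20140101120000"));
                Result result = table.get(get);
                String speed = Bytes.toString(
                        result.getValue(Bytes.toBytes("d"), Bytes.toBytes("speed")));
                System.out.println("speed=" + speed);
            } finally {
                table.close();
            }
        }
    }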

Environment: CDH4 with Hadoop 2.x, HDFS, MapReduce, Pig, Hive, Oozie, Sqoop, Scala, Zookeeper, HBase, Cassandra, Flume, Servlets, JSPs, JSTL, HTML, JavaScript.

Confidential, Cupertino, CA

Hadoop Consultant

Responsibilities:

  • Used Cloudera Manager for Hadoop cluster administration, including adding and removing cluster nodes, cluster capacity planning, performance tuning, cluster monitoring and troubleshooting.
  • Developed efficient MapReduce programs for data cleaning and structuring using Java and Python.
  • Supported the team in code/design analysis, strategy development and project planning.
  • Modeled the data and made it queryable using a unified query service.
  • Developed Hive queries for data sampling and pre-analysis before submitting the data to the analysts.
  • Implemented Kafka and Storm topologies capable of handling and channeling high-volume data streams, and integrated the Storm topologies with Esper to filter and process that data across multiple clusters for complex event processing.
  • Registered, ingested, validated, stored and archived the data in its native form.
  • Used Oozie to automate and schedule business workflows invoking Sqoop, MapReduce and Pig jobs as per the requirements.
  • Used Cassandra to store the majority of the data that needed to be divided regionally.
  • Worked on Splunk to leverage the archived data and specialized analytics of Hadoop.
  • Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
  • Cleansed, enriched, transformed and analyzed the data through hosted compute engines.
  • Used Apache Spark for performance optimization and parallel data processing.
  • Developed Sqoop scripts to import and export data from relational sources and handled incremental loading of the customer and transaction data by date.
  • Implemented business logic by writing Pig Latin UDFs in Java and used various UDFs from Piggybank and other sources; also developed Pig UDFs for pre-processing (a minimal UDF sketch follows this list).
  • Created HBase tables to load large, disparate datasets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
  • Responsible for automating the addition of data nodes as needed.
  • Worked with various HDFS file formats like Avro and SequenceFile, and various compression formats like Snappy and bzip2.
  • Identified several PL/SQL batch applications in General Ledger processing and conducted a performance comparison to demonstrate the benefits of migrating to Hadoop.
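
A minimal sketch of a Pig Latin UDF written in Java, as referenced in the list above; the class name, package and normalization logic (NormalizeId) are hypothetical examples, not the project's actual UDFs.

    package com.example;

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical Pig UDF: normalizes a raw customer id field before analysis.
    // Usage in a Pig script (after packaging into a jar):
    //   REGISTER my-udfs.jar;
    //   DEFINE NormalizeId com.example.NormalizeId();
    //   cleaned = FOREACH raw GENERATE NormalizeId(cust_id) AS cust_id;
    public class NormalizeId extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            // Trim whitespace and upper-case so joins on the id field line up.
            return input.get(0).toString().trim().toUpperCase();
        }
    }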

Environment: Hadoop 2.0, MapReduce, HDFS, Hive, Java, Cloudera, Pig, HBase, Kafka, Storm, Splunk, MySQL Workbench, Eclipse, Oracle 10g, PL/SQL, SQL*Plus.

Confidential, Bloomington, IL

Hadoop/Big data Developer

Responsibilities:

  • Designed, developed and supported a MapReduce-based data processing pipeline to process a growing number of events from log files and messages per day.
  • Worked closely with the client's development staff to perform ad-hoc queries and data analysis on newly created cross-platform datasets using Apache Hive and Pig.
  • Used Pig as an ETL tool to perform transformations and event joins, filter bot traffic and run some pre-aggregations before storing the data on HDFS.
  • Used Hive partitioning and bucketing to segregate and analyze the data.
  • Incorporated various job-flow mechanisms in Oozie to automate the workflow for extracting data from warehouses and weblogs (see the Oozie client sketch after this list).
  • Implemented the open-source monitoring tool Ganglia to monitor the various services across the cluster.
  • Collaborated with the administration team to set up a monitoring infrastructure for supporting and optimizing the Hadoop infrastructure.
  • Responsible for writing complex SQL queries involving multiple inner and outer joins.
  • Developed and supported a Scala-based data processing pipeline for one of the processing centers located in Sacramento.
  • Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and purchase histories into HDFS for analysis.
  • Worked with the applications team to install operating systems, Hadoop updates, patches and version upgrades as required.
  • Personally dealt with the stakeholders in order to closely understand the business needs.
  • Computed various metrics and loaded the aggregated data into DB2 for reporting on the dashboard.
  • Designed shell scripts for backing up important metadata and rotating the logs on a monthly basis.
  • Greatly sharpened business acumen with knowledge of health insurance, claim processing, fraud-suspect identification, the appeals process and other domains.
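
A minimal sketch of driving an Oozie workflow from Java, as referenced in the list above, using the standard OozieClient API; the server URL, HDFS paths and property values are placeholders.

    import java.util.Properties;

    import org.apache.oozie.client.OozieClient;

    // Hypothetical example of submitting a workflow through the Oozie Java client.
    public class SubmitWorkflow {
        public static void main(String[] args) throws Exception {
            OozieClient client = new OozieClient("http://oozie-host:11000/oozie");

            Properties conf = client.createConfiguration();
            // Points at the HDFS directory that holds workflow.xml.
            conf.setProperty(OozieClient.APP_PATH,
                    "hdfs://namenode:8020/user/etl/workflows/weblog-extract");
            conf.setProperty("nameNode", "hdfs://namenode:8020");
            conf.setProperty("jobTracker", "jobtracker-host:8021");

            String jobId = client.run(conf); // submit and start the workflow
            System.out.println("Workflow job submitted: " + jobId);
        }
    }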

Environment: CDH4 with Hadoop 1.x, HDFS, MapReduce, Pig, Hive, Oozie, Sqoop, Flume, Servlets, JSPs, JSTL, HTML, JavaScript, jQuery, CSS.

Confidential

Sr. Java/J2EE developer

Responsibilities:

  • Designed and developed a Struts-like MVC 2 web framework using the front-controller design pattern, which has been used successfully in a number of production systems (a minimal sketch follows this list).
  • Spearheaded the "Quick Wins" project, working very closely with the business and end users to improve the website's ranking from 23rd to 6th in just 3 months.
  • Normalized the Oracle database, conforming to design concepts and best practices.
  • Resolved product complications at customer sites and funneled the insights to the development and deployment teams to adopt a long-term product development strategy with minimal roadblocks.
  • Built the front-end UI using JSP, Servlets, HTML and JavaScript to create a user-friendly and appealing interface.
  • Used JSTL and built custom tags whenever necessary.
  • Used Expression Language to tie beans to UI components.
  • Convinced business users and analysts of alternative solutions that are more robust and simpler to implement from a technical perspective while satisfying the functional requirements from the business perspective.
  • Applied design patterns and OO design concepts to improve the existing Java/JEE-based code base.
  • Identified and fixed transactional issues due to incorrect exception handling, and concurrency issues due to unsynchronized blocks of code.
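
A minimal sketch of the front-controller pattern referenced in the first bullet: a single servlet (mapped, for example, to /app/* in web.xml) receives every request, resolves an action from the path and forwards to a view. The action names and JSP paths are illustrative assumptions.

    import java.io.IOException;

    import javax.servlet.RequestDispatcher;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // One entry point for all requests; routing logic lives in one place
    // instead of being scattered across many servlets.
    public class FrontControllerServlet extends HttpServlet {
        @Override
        protected void service(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            String action = req.getPathInfo(); // e.g. "/orders" or "/customers"
            String view;
            if ("/orders".equals(action)) {
                view = "/WEB-INF/jsp/orders.jsp";
            } else if ("/customers".equals(action)) {
                view = "/WEB-INF/jsp/customers.jsp";
            } else {
                view = "/WEB-INF/jsp/home.jsp";
            }
            RequestDispatcher dispatcher = req.getRequestDispatcher(view);
            dispatcher.forward(req, resp);
        }
    }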

Environment: Java 1.2/1.3, Swing, Applets, Servlets, JSP, custom tags, JNDI, JDBC, XML, XSL, DTD, HTML, CSS, JavaScript, Oracle, DB2, PL/SQL, WebLogic, JUnit, Log4J and CVS.

Confidential

Java/J2EE developer

Responsibilities:

  • Developed the user interface using JSP, HTML, CSS and JavaScript.
  • Responsible for gathering and analyzing the requirements for the project.
  • Created various Unified Modeling Language (UML) diagrams, such as use-case and ER diagrams, for the project.
  • Used dependency injection in Spring for the service layer and the DAO layer.
  • The J2EE architecture was implemented using Struts, based on the MVC 2 pattern.
  • Wrote Servlets and deployed them on WebSphere Application Server.
  • Created user validations on the client side as well as the server side.
  • Developed the Java classes to be used in JSP and Servlets.
  • Extensively used JavaScript for client-side validations.
  • Improved coding standards and code reuse, and participated in code reviews.
  • Worked with PL/SQL scripts to gather data and perform data manipulations.
  • Used JDBC for database transactions (a minimal sketch follows this list).
  • Involved in unit testing of the application.
  • Developed stored procedures in Oracle.
  • Used a Test-Driven Development approach and wrote many unit and integration tests.
  • Involved in analyzing how the requirements related to and depended on each other.
  • Provided onsite coordination for developing various modules.
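
A minimal sketch of a JDBC transaction of the kind referenced above; the Oracle connection URL, credentials and table schema are placeholders, not details from the project.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // Two updates grouped into one transaction: both succeed or neither does.
    public class JdbcTransactionExample {
        public static void transfer(String fromAcct, String toAcct, double amount)
                throws SQLException {
            Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@dbhost:1521:orcl", "user", "password");
            try {
                conn.setAutoCommit(false); // take manual control of the transaction

                PreparedStatement debit = conn.prepareStatement(
                        "UPDATE accounts SET balance = balance - ? WHERE acct_id = ?");
                debit.setDouble(1, amount);
                debit.setString(2, fromAcct);
                debit.executeUpdate();

                PreparedStatement credit = conn.prepareStatement(
                        "UPDATE accounts SET balance = balance + ? WHERE acct_id = ?");
                credit.setDouble(1, amount);
                credit.setString(2, toAcct);
                credit.executeUpdate();

                conn.commit(); // make both updates visible atomically
            } catch (SQLException e) {
                conn.rollback(); // undo the partial work on any failure
                throw e;
            } finally {
                conn.close();
            }
        }
    }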

Environment: Java 1.4, JSP 2.0, Servlets 2.4, JDBC, HTML, CSS, JavaScript, WebSphere 3.5.6, Eclipse, Oracle 9i.
