Hadoop Developer Resume

Kansas City, MO

PROFESSIONAL SUMMARY:

  • Over 8 years of IT experience, including more than 2.5 years developing Big Data/Hadoop applications and more than 5.5 years in data warehousing.
  • Good knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, DataNode, NameNode and MapReduce. Hands-on experience with Flume, HDFS, Sqoop and Hive, and fair knowledge of HBase.
  • Hands-on experience in Spark programming and good exposure to Scala.
  • Involved in setting up standards and processes for Hadoop-based application design and implementation.
  • Experience in installation, configuration, performance tuning and deployment of Big Data solutions.
  • Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, HBase, Impala, YARN, Flume, Oozie and ZooKeeper.
  • Experience working with NoSQL databases such as HBase and Cassandra, including CRUD operations, sharding, indexing and replication.
  • Experience in developing Pig scripts and Hive Query Language.
  • Managing and scheduling batch jobs on a Hadoop cluster using Oozie.
  • Experience in managing and reviewing Hadoop log files.
  • Used Zookeeper to provide coordination services to the cluster.
  • Experienced using Sqoop to import data into HDFS from RDBMS and vice-versa.
  • Sound knowledge of Linux, the platform on which Hadoop runs.
  • Good knowledge of Core Java and debugging.
  • Familiarity with Data as a Service (DaaS) and working experience with Apache Solr.
  • Experience in requirement analysis, system design, development and testing of various software applications.
  • Hands on experience in application development using Java, RDBMS and Linux Shell Scripting.
  • Detailed understanding of the Software Development Life Cycle (SDLC).
  • Strong data warehousing experience in application development using Informatica PowerCenter (Designer, Workflow Manager, Workflow Monitor, Repository Manager).
  • Experience in creating complex mappings using various transformations and well versed in debugging an Informatica mapping.
  • Strong SQL experience in coding and Query Tuning.
  • Tuned the complete ETL system by collecting performance data, optimizing default session parameters and applying performance tuning methods at the source, target, mapping and session levels.

TECHNICAL SKILLS:

Operating Systems: Windows 8/7/XP, Unix, Linux CentOS-6.5, Ubuntu 13.X, Mac OSX.

Hadoop Ecosystem: Hadoop 1.x/2.x (YARN), HDFS, MapReduce, HBase, Hive, Pig, ZooKeeper, Sqoop, Flume, Oozie, Impala, Cloudera-desktop.

Development Tools: Eclipse, MySQL, SQL Developer, TOAD, Microsoft Suite (Word, Excel, PowerPoint, Access), Open Office Suite (Editor, Calc, etc.), VMware, Informatica PowerCenter.

NoSQL Databases: HBase, MongoDB.

Databases: MySQL, Oracle 12c/11g/10g, SQL Server.

PROFESSIONAL EXPERIENCE:

Confidential, Kansas City, MO

Hadoop Developer

Responsibilities:

  • Analyzed the Hadoop cluster using big data analytics tools including Flume, Pig, Hive and MapReduce.
  • Collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Worked on debugging and performance tuning of Hive and Pig jobs.
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Involved in loading data from the Linux file system to HDFS.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Experience working on processing unstructured data using Pig and Hive.
  • Implemented Partitioning, Dynamic Partitions, Buckets in Hive.
  • Supported MapReduce programs running on the cluster.
  • Gained experience in managing and reviewing Hadoop log files.
  • Involved in scheduling Oozie workflows to run multiple Hive and Pig jobs.
  • Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
  • Extensively used Pig for data cleansing.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Strong experience in Apache server configuration.
  • Used HBase as the NoSQL database.
  • Exported the result set from Hive to MySQL using shell scripts.
  • Actively involved in code review and bug fixing for improving the performance.

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Linux, Cloudera, Big Data, Java APIs, SQL, NoSQL, HBase, MySQL, Apache Solr.

Confidential, Bentonville, Arkansas

Hadoop Developer

Responsibilities:

  • Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest data.
  • Worked on the data feed to the pricing algorithm, staging data in the specified format for the pricing engine to consume.
  • Involved in writing MapReduce jobs.
  • Used Pig to do transformations, event joins, filter traffic and some pre-aggregations before storing the data onto HDFS.
  • Involved in developing UDFs for both Pig and Hive in Java.
  • Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
  • Involved in developing Hive DDLs to create, alter and drop Hive tables.
  • Involved in using HCatalog to access Hive table metadata from MapReduce or Pig code.
  • Used Java MapReduce to compute metrics that define user experience, revenue, etc.
  • Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract data from web logs and store it in HDFS.
  • Proficient with the HBase NoSQL database.
  • Used Eclipse to build the application.
  • Involved in using Sqoop to import data into and export data from HDFS.
  • Involved in processing ingested raw data using MapReduce, Apache Pig and Hive.
  • Involved in developing Pig Scripts for CDC and delta record processing between newly arrived data and already existing data in HDFS.
  • Involved in developing Shell scripts to orchestrate execution of all other scripts (Pig, Hive, and MapReduce) and move the data files within and outside of HDFS.
  • Worked on HBase by using CRUD (Create, Read, Update and Delete), Indexing, Replication and Sharding features.

Environment: Hadoop, MapReduce, Yarn, Hive, Pig, HBase, Oozie, Sqoop, Flume, Oracle 11g, Core Java, HDFS, Eclipse.

Confidential, Atlanta, GA

Informatica Developer

Responsibilities:

  • Design of Informatica Servers and repositories.
  • Design of repository architecture.
  • Creation of UNIX Level folders for installing latest versions.
  • Installing, testing and analyzing the latest versions.
  • Assigning properties to Integration Service and Repository Services as per the environment.
  • Creating/joining domains, nodes and grids.
  • Understanding new features by working with Informatica Vendor.
  • Working with the Informatica vendor to resolve configuration and environment issues.
  • Sharing knowledge of the latest Informatica versions with developers and administrators, and explaining the tasks involved in working with the new versions.
  • Design reviews of new code developed in various projects.
  • Ensuring that all new development follows TSG standards.
  • Helping developers in resolving any issues they face during the development of new mappings or during enhancements.
  • Supporting Informatica administrators in resolving complex tickets and breakdowns.
  • Resolving tickets raised with Informatica design teams on design and configuration issues.
  • Participated in implementing the Informatica PowerCenter Data Validation Option (DVO) for complete data validation and testing in the production environment.
  • Documentation & Presentation of the project and new architectures.
  • Determining the best property settings of various components for maximum performance.

Environment: Oracle 11g, Unix, Informatica 9.1, 8.x, EditPlus, PuTTY, SQL*Plus, Toad, AutoSys.

Confidential, Atlanta, GA

ETL Developer

Responsibilities:

  • Extensively used all the features of Informatica 8.6, including Designer, Workflow Manager, Repository Manager and Workflow Monitor.
  • Responsible for development, support and maintenance of the ETL (Extract, Transform and Load) processes using Informatica PowerCenter.
  • Implemented various workflows using transformations such as Normalizer, Lookup, Aggregator and Stored Procedure, and scheduled jobs for different sessions.
  • Working closely with Source System teams to analyze source data and resolve source data issues.
  • Interacting with other project teams for seamless integration.
  • Understanding existing business model and customer requirements with the Functional team.
  • Attending the technical review meetings for Data Modeling.
  • Understanding Data warehouse concepts to implement helpful techniques in the project.
  • Analyzing data model to document the requirements and to document the pseudo code.
  • Responsible for creating Technical specification and Mapping design documents.
  • Designing technical specification documents for all the developed mappings.
  • Ensuring business rules are translated and documented from business to technical requirements for developers.
  • Development of ~50 Informatica mappings to build business rules to load data.
  • Used Informatica PowerCenter to extract, transform and load data from multiple input sources, such as relational sources, legacy sources and flat files, into the target systems.
  • Created mappings using Filter, Expression, Lookup, Source Qualifier and Sequence Generator transformations.
  • Coordinated multiple architects responsible for development, integration, administration and evolution of the data warehouse.
  • Developed UNIX shell scripts that run pmcmd to start and stop sessions.
  • Automated job processing and set up automatic email notifications to the concerned persons.
  • Involved in unit testing and system integration testing and preparing Unit Test Plan (UTP) and System Test Plan (STP) documents.
  • Worked on HPSD Change Requests (CR) while integration testing and UAT are in progress.
  • Worked with Informatica Design and Administrator team for code migration and participated in project Go-live.
  • Worked extensively with the reports development team to help them understand the data warehouse and generate the required reports using BO and Cognos tools.
  • Transferred knowledge to maintenance team after production go-live.

Environment: Informatica 8.1, Windows XP, Oracle 9i/11g, UNIX, Erwin 4.x, SQL Developer, BO 6.5, Cognos.

Confidential

Jr. ETL Developer

Responsibilities:

  • Analyzing data issues and recommending solutions.
  • Resolving Load Failure Issues in Production.
  • Analyzing ETL mappings and making changes for correction of errors and inclusion of required new logic.
  • Analyzing and modifying Perl Scripts.
  • Running Test Loads in Development and Test Environments.
  • Addition of new Source Systems.

Environment: Informatica 8.x, UNIX, Shell Scripting, Oracle 9i.
