
Hadoop Developer Resume


MI

SUMMARY

  • 11+ years of experience with emphasis on Big Data technologies and the development and design of Java-based enterprise applications.
  • Hands-on experience with major components in the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, HBase, Flume, Sqoop, Oozie, Kafka, Storm, Cassandra, HBase-Hive integration, and Hive-Pig integration, with knowledge of Spark.
  • Involved in extracting customers' Big Data from various data sources into Hadoop HDFS.
  • Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
  • Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
  • Good experience writing scripts and performing analysis using Pig and Hive, with an understanding of Sqoop and the integration of these components.
  • Hands-on experience with NoSQL databases including HBase and Cassandra.
  • Experience with Hortonworks and Cloudera Hadoop environments; worked on the Cloudera Hadoop upgrade from CDH 4.x to CDH 5.x.
  • Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data from HBase through Sqoop into HDFS for further processing.
  • Experienced in installing, configuring, and administering Hadoop clusters of the major Hadoop distributions.
  • Expertise in Web technologies using Core Java, J2EE, Servlets, EJB, JSP, JDBC, Java Beans, and Design Patterns.
  • Familiarity and experience with data warehousing and ETL tools.
  • Experience in creating databases, users, tables, triggers, macros, views, stored procedures, functions, packages, joins, and hash indexes in Teradata.
  • Hands-on experience with Informatica and Teradata.
  • Major strengths include familiarity with multiple software systems and the ability to learn new technologies quickly and adapt to new environments; a self-motivated, focused, and adaptive team player and quick learner with excellent interpersonal, technical, and communication skills.

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, MapReduce, YARN, Hive, HBase, Pig, Sqoop, Flume, Oozie, Tez, ZooKeeper, Storm, Kafka, Cassandra, Spark, Spark Streaming, and Ambari

NoSQL Databases: HBase, Cassandra, MongoDB

Languages: C, Java, Shell Scripting, JavaScript, SQL, PL/SQL, Informatica, Teradata

Java Technologies: Core Java, J2EE, Struts 2.1/2.2, Spring 3.x/4.x, Servlets 2.3/3.0, JSP, JDBC, Hibernate 3.x/4.x, JUnit, REST/SOAP Web services

Scripting/Query: SQL, Shell Scripting, HiveQL

IDEs: Eclipse, MyEclipse, RAD, IntelliJ IDEA

Frameworks: Spring, Hibernate

Servers: Apache Tomcat, WebLogic and JBoss

Database: Oracle 10g/9i/8i, MySQL, SQL Server, Teradata, DB2

Operating Systems: Windows, Linux/UNIX, Mac OS X

PROFESSIONAL EXPERIENCE

Confidential, MI

Hadoop Developer

Responsibilities:

  • Development and ETL design in Hadoop.
  • Developed a custom MapReduce InputFormat to read a specific data format (an illustrative sketch follows this list).
  • Performance tuning of Hive queries written by data analysts.
  • Developed Hive queries and UDFs as per requirements (a sample UDF sketch follows this list).
  • Involved in extracting customers' Big Data from various data sources into Hadoop HDFS. This included data from mainframes, databases, and log data from servers.
  • Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers.
  • Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
  • Created Hive tables as per requirements, as managed or external tables defined with appropriate static and dynamic partitions for efficiency.
  • Implemented partitioning and bucketing in Hive for better organization of the data (a DDL sketch follows this list).
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, and Sqoop, as well as system-specific jobs.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
  • Used Sqoop to transfer data from external sources to HDFS.
  • Designed and developed Oozie workflows with Pig integration.
  • Documented ETL best practices to be implemented with Hadoop.
  • Monitored and debugged Hadoop jobs/applications running in production.
  • Worked on the Cloudera Hadoop upgrade from CDH 4.x to CDH 5.x.
  • Provided user support and application support on the Hadoop infrastructure.
  • Evaluated and compared different tools for test data management with Hadoop.
  • Helped the testing team with Hadoop application testing.
  • Worked on installing a 20-node UAT Hadoop cluster.
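
As an illustration of the custom InputFormat work above, the following is a minimal sketch; the class name, the LongWritable/Text key and value types, and the decision to keep each source file unsplittable are assumptions for the example, not the exact production code.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.LineRecordReader;

// Input format for a hypothetical fixed-layout feed whose files must not be
// split across mappers; each file is handled by one map task and read line
// by line with the stock LineRecordReader.
public class UnsplittableLineInputFormat extends FileInputFormat<LongWritable, Text> {

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false; // keep each source file intact
    }

    @Override
    public RecordReader<LongWritable, Text> createRecordReader(InputSplit split,
                                                               TaskAttemptContext context) {
        return new LineRecordReader();
    }
}

A driver would wire this in with job.setInputFormatClass(UnsplittableLineInputFormat.class).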
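
The Hive UDFs mentioned above could look like the following minimal sketch based on the classic org.apache.hadoop.hive.ql.exec.UDF API; the normalization logic and the class name are illustrative assumptions.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Simple Hive UDF that strips everything but digits from a phone number.
public final class NormalizePhoneUdf extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().replaceAll("[^0-9]", ""));
    }
}

After packaging the class into a jar, it would be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_phone AS 'NormalizePhoneUdf'.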
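
For the partitioning and bucketing work, a minimal sketch of the kind of table definition involved is shown below, issued here over Hive JDBC; the table name, columns, bucket count, HiveServer2 URL, and HDFS location are all placeholder assumptions.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class HivePartitionedTableSetup {
    public static void main(String[] args) throws ClassNotFoundException, SQLException {
        Class.forName("org.apache.hive.jdbc.HiveDriver"); // HiveServer2 JDBC driver
        String url = "jdbc:hive2://hiveserver2-host:10000/default"; // placeholder host
        try (Connection conn = DriverManager.getConnection(url, "hive", "");
             Statement stmt = conn.createStatement()) {
            // External table over cleansed HDFS data, partitioned by load date
            // and bucketed by customer_id for more efficient joins and sampling.
            stmt.execute(
                "CREATE EXTERNAL TABLE IF NOT EXISTS customer_events ("
                + " customer_id BIGINT,"
                + " event_type STRING,"
                + " event_value DOUBLE)"
                + " PARTITIONED BY (load_date STRING)"
                + " CLUSTERED BY (customer_id) INTO 32 BUCKETS"
                + " STORED AS ORC"
                + " LOCATION '/data/cleansed/customer_events'");
        }
    }
}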

Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, Pig, DB2, Oracle, XML, Cloudera Manager.

Confidential, NJ

Hadoop Developer

Responsibilities:

  • Responsible for designing and implementing ETL process to load data from different sources, perform data mining and analyze data using visualization/reporting tools to leverage the performance of OpenStack.
  • Collected the logs from the physical machines and the OpenStack controller and integrated into HDFS using Flume.
  • Partitioned the collected logs by date/timestamps and host names.
  • Worked on HBase for data optimization (an illustrative client-side sketch follows this list).
  • Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning, and buckets.
  • Loaded data from the UNIX file system into HDFS.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data from HBase through Sqoop into HDFS for further processing.
  • Responsible for managing the entire Hive warehouse.
  • Developed custom MapReduce programs to extract the required data from the logs (a mapper sketch follows this list).
  • Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
  • Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into those tables, and writing Hive queries to further analyze the logs to identify issues and behavioral patterns.
  • Imported data frequently from MySQL to HDFS using Sqoop.
  • Supported the operations team in Hadoop cluster maintenance activities, including commissioning and decommissioning nodes and performing upgrades.
  • Used Tableau for visualization and report generation.
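
For the HBase work, a minimal client-side sketch using the HBase 1.x Java API is shown below; the table name log_events, the column family d, and the row-key scheme are placeholder assumptions.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class LogEventStore {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("log_events"))) {

            // Row key combines host and timestamp so related events sort together.
            byte[] rowKey = Bytes.toBytes("web01#2015-06-01T12:00:00");

            Put put = new Put(rowKey);
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("message"),
                          Bytes.toBytes("service restarted"));
            table.put(put);

            Result result = table.get(new Get(rowKey));
            System.out.println(Bytes.toString(
                    result.getValue(Bytes.toBytes("d"), Bytes.toBytes("message"))));
        }
    }
}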
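
The custom log-extraction MapReduce programs mentioned above could be sketched roughly as follows; the assumed log layout (date, time, host, message), the regular expression, and the choice to key output by host and date are illustrative assumptions.

import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Extracts (host + date, message) pairs from raw log lines collected by Flume,
// dropping lines that do not match the expected layout.
public class LogFieldExtractorMapper extends Mapper<LongWritable, Text, Text, Text> {

    // Assumed layout: "<date> <time> <host> <message...>"
    private static final Pattern LOG_LINE =
            Pattern.compile("^(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(.*)$");

    private final Text outKey = new Text();
    private final Text outValue = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        Matcher m = LOG_LINE.matcher(line.toString());
        if (!m.matches()) {
            context.getCounter("logs", "malformed").increment(1);
            return; // skip malformed records
        }
        // Key by host and date so output lines up with the date/host partitioning above.
        outKey.set(m.group(3) + "\t" + m.group(1));
        outValue.set(m.group(4));
        context.write(outKey, outValue);
    }
}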

Environment: Hadoop, MapReduce, Hive, HBase, Sqoop, Kafka, Impala, Tableau

Confidential, CA

Java Developer

Responsibilities:

  • Developed GUI-related changes using JSP and HTML, and client-side validations using JavaScript.
  • Designed and developed the front end using HTML, JSP, and Servlets.
  • Implemented client-side validation using JavaScript.
  • Developed the application using the Struts framework to implement an MVC design approach.
  • Validated all forms using the Struts validation framework.
  • Used Hibernate in the persistence layer of the application.
  • Responsible for the design and implementation of various modules of the application using a Struts-Spring-Hibernate architecture.
  • Implemented action classes, form beans, and the JSP pages that interact with these components.
  • Developed user interface using JSP, Struts Tag Libraries to simplify the complexities of the application.
  • Developed Web Interface using Servlets, Java Server Pages, HTML and CSS.
  • Involved in coding SQL queries.
  • Created Java classes to communicate with the database through JDBC (a DAO sketch follows this list).
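
A minimal sketch of the kind of JDBC data-access class referenced above is shown below; the connection URL, table, and column names are placeholder assumptions.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Simple DAO that reads customer names for a region over plain JDBC.
public class CustomerDao {
    private final String url;      // e.g. a placeholder like "jdbc:oracle:thin:@//dbhost:1521/APP"
    private final String user;
    private final String password;

    public CustomerDao(String url, String user, String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    public List<String> findCustomerNamesByRegion(String region) throws SQLException {
        String sql = "SELECT customer_name FROM customers WHERE region = ?";
        List<String> names = new ArrayList<String>();
        try (Connection conn = DriverManager.getConnection(url, user, password);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, region);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("customer_name"));
                }
            }
        }
        return names;
    }
}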

Environment: Java, Servlets, JSP, EJB, J2EE, Struts, XML, XSLT, JavaScript, SQL, PL/SQL, MS Visio, Eclipse, JDBC, WinCVS, Windows XP.

Confidential

ETL Developer

Responsibilities:

  • Developed mappings in Informatica PowerCenter 6.1 that catered to the extraction, transformation, and loading of data from various source systems to target systems.
  • Created mappings and workflows.
  • Used Informatica to handle complex mappings and extensively used various transformations such as Source Qualifier, Aggregator, Lookup, Filter, Update Strategy, Expression, Sequence Generator, and Sorter.
  • Extensively used Workflow Manager to create tasks and workflows.
  • Rule-based data cleansing, data conversion, and process implementation.
  • Used PL/SQL with Informatica PowerCenter 6.1 for the Oracle 8i database.
  • Developed mappings using Informatica PowerCenter Designer to bulk-load data from Oracle and flat-file source systems into the target database.

Environment: Teradata, Informatica PowerCenter 5.2, flat files, Oracle 8i, BO 5, Windows 2000

Confidential

ETL Developer/Informatica Lead

Responsibilities:

  • Migrated Informatica flows to IBM DataStage.
  • Created migration documents.
  • Reviewed existing data dictionaries and source-to-target mappings.
  • Participated in data modeling discussions, exchanging ideas to arrive at the best model to efficiently serve both the ETL and reporting teams.
  • Designed complex DataStage mappings and reusable transformations to facilitate initial, incremental, and CDC data loads, parameterized to source from multiple systems.
  • Performed ETL migration and administration.
  • Wrote SQL and PL/SQL queries, stored procedures, functions, packages, database triggers, and exception handlers.

Environment: Informatica PowerCenter 8.6, IBM InfoSphere Information Server DataStage, UNIX, shell scripting, Control-M, Erwin, Oracle, PL/SQL.
