
Hadoop Developer Resume

Omaha, NE

SUMMARY

  • 7+ years of IT experience, including 2+ years with Big Data ecosystem technologies.
  • Experience in the installation, configuration, management, and deployment of Big Data solutions and the underlying infrastructure of Hadoop clusters.
  • Good understanding/knowledge of Hadoop Architecture.
  • Hands-on experience installing, configuring, and using ecosystem components such as Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Flume, Sqoop, Pig, and Hive on CDH3, CDH4, and CDH5 clusters.
  • Experience in managing Hadoop clusters using Cloudera Manager.
  • Set up standards and processes for Hadoop-based application design and implementation.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Extended Hive and Pig core functionality by writing custom UDFs, UDAFs, and UDTFs (a minimal sketch follows this list).
  • Good experience in analysis using Pig and Hive, and a working understanding of Sqoop.
  • Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
  • Experience in designing, developing, and implementing connectivity products that allow efficient exchange of data between a core database engine and the Hadoop ecosystem.
  • Worked on NoSQL databases including HBase.
  • Experience in database development using SQL and PL/SQL and experience working on databases like Oracle 9i/10g and SQL Server.
  • Performed data analysis using MySQL, SQL Server Management Studio, and Oracle.
  • Strong DataStage administration skills in UNIX and Linux environments, report creation using OLAP data sources, and knowledge of OLAP universes.
  • Profound knowledge of data warehousing principles, including fact tables, dimension tables, star schema modeling, and snowflake schema modeling.
  • Excellent experience in ETL analysis, design, development, testing, and implementation, including performance tuning and query optimization of databases.
  • Excellent experience in extracting source data from Sequential files, XML files, Excel files, transforming and loading it into the Confidential data warehouse.
  • Experience in importing and exporting the data using Sqoop from HDFS to Relational Database systems/mainframe and vice-versa.
  • Experience in using Apache Flume for collecting, aggregating and moving large amounts of data from application servers.
  • Experience in using Zookeeper and Oozie Operational Services for coordinating the cluster and scheduling workflows.
  • Experience in setting up automated monitoring and escalation infrastructure for Hadoop Cluster using Ganglia and Nagios.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Diverse experience utilizing Java tools in business, web, and client-server environments, including the Java Platform, J2EE, EJB, JSP, Java Servlets, Struts, and Java Database Connectivity (JDBC) technologies.
  • Solid background in object-oriented analysis and design (OOAD); well versed in design patterns, UML, and Enterprise Application Integration (EAI).
  • Major strengths include familiarity with multiple software systems, the ability to learn new technologies quickly and adapt to new environments, and a self-motivated, focused, team-oriented approach, with excellent interpersonal, technical, and communication skills.
  • Good communication skills, strong work ethic, and the ability to work efficiently in a team, with good leadership skills.
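
Below is a minimal, illustrative sketch of the kind of custom Hive UDF work mentioned in the summary above. It is written in Java; the package, class name, and masking rule are assumptions for illustration, not details from an actual project.

    // Hypothetical Hive UDF: masks all but the last four characters of a value.
    package com.example.hive.udf;   // illustrative package name

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public final class MaskAccountUDF extends UDF {
        // Hive invokes evaluate() once per row via reflection.
        public Text evaluate(final Text input) {
            if (input == null) {
                return null;
            }
            String value = input.toString();
            int visible = Math.min(4, value.length());
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < value.length() - visible; i++) {
                masked.append('*');
            }
            masked.append(value.substring(value.length() - visible));
            return new Text(masked.toString());
        }
    }

Once packaged into a jar, a UDF like this would typically be made available to Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION, and then called from HiveQL like any built-in function.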

TECHNICAL SKILLS

Big Data/Hadoop: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, ZooKeeper.

Java Technologies: Core Java, I18N, JFC, Swing, Beans, Log4j, Reflection.

J2EE Technologies: Servlets, JSP, JDBC, JNDI, Java Beans.

Methodologies: Agile, UML, Design Patterns (Core Java and J2EE).

Monitoring and Reporting: Ganglia, Nagios, Custom Shell scripts.

Frameworks: MVC, Struts, Hibernate, Spring.

Programming Languages: C, C++, Java, Python, Ant scripts, Linux shell scripts.

Database: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server.

Web Servers: WebLogic, WebSphere, Apache Tomcat.

Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL.

Network Protocols: SSH, TCP/IP, UDP, HTTP, DNS, DHCP.

PROFESSIONAL EXPERIENCE

Confidential, Omaha, NE

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Provide mentorship and guidance to other architects to help them become independent.
  • Provide review and feedback for existing physical architecture, data architecture and individual code.
  • Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
  • Involved in Hadoop cluster tasks such as commissioning and decommissioning nodes without any effect on running jobs or data.
  • Wrote MapReduce jobs to discover trends in data usage by users (see the sketch after this list).
  • Involved in running Hadoop streaming jobs to process terabytes of text data.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Helped the team to increase the Cluster size from 22 to 30 Nodes.
  • Managed jobs using the Fair Scheduler.
  • Developed a core Hadoop-based framework to migrate an existing RDBMS ETL solution.
  • Wrote Pig scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
  • Developed a deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment.
  • Responsible for smooth error-free configuration of DWH-ETL solution and Integration with Hadoop.
  • Worked extensively with Sqoop for importing metadata from Oracle.
  • Involved in creating Hive tables, and loading and analyzing data using hive queries.
  • Responsible for managing data coming from multiple sources.
  • Designed, developed, and maintained data integration programs in a Hadoop and RDBMS environment, with both traditional and non-traditional source systems as well as RDBMS and NoSQL data stores, for data access and analysis.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Assisted in exporting analyzed data to relational databases using Sqoop.
  • Expert knowledge of developing and debugging in Java/J2EE.
  • Wrote Hive queries and UDFs.
  • Developed Hive queries to process the data and generate the data cubes for visualization.
  • Extracted feeds from social media sites such as Facebook and Twitter.
  • Created Pig Latin scripts to sort, group, join, and filter the enterprise-wide data.
  • Implemented partitioning, dynamic partitions, and bucketing in Hive.
  • Gained experience in managing and reviewing Hadoop log files.
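
The MapReduce and compression work described above might look roughly like the following. This is an illustrative sketch only; the input layout (tab-delimited logs with the user id in the first field), class names, and command-line paths are assumptions.

    // Hypothetical usage-trend job: counts records per user and enables compression.
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.compress.GzipCodec;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class UsageTrendJob {

        public static class UsageMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text user = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Assumes tab-delimited log lines with the user id in the first field.
                String[] fields = value.toString().split("\t");
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    user.set(fields[0]);
                    context.write(user, ONE);
                }
            }
        }

        public static class UsageReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int total = 0;
                for (IntWritable v : values) {
                    total += v.get();
                }
                context.write(key, new IntWritable(total));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Compress intermediate map output to reduce shuffle traffic
            // (the property is mapred.compress.map.output on older Hadoop releases).
            conf.setBoolean("mapreduce.map.output.compress", true);

            Job job = Job.getInstance(conf, "usage-trend");
            job.setJarByClass(UsageTrendJob.class);
            job.setMapperClass(UsageMapper.class);
            job.setCombinerClass(UsageReducer.class);
            job.setReducerClass(UsageReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            // Compress the final job output stored in HDFS as well.
            FileOutputFormat.setCompressOutput(job, true);
            FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }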

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Java, Oracle 10g, MySQL, Ubuntu.

Confidential, Atlanta, GA

Hadoop Developer

Responsibilities:

  • Developed shell scripts to automate the cluster installation.
  • Played a major role in choosing the right configurations for Hadoop.
  • Involved in the end-to-end process of Hadoop cluster installation, configuration, and monitoring.
  • Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Setup and benchmarked Hadoop/HBase clusters for internal use.
  • Developed Pig Latin scripts to extract and filter relevant data from the web server output files to load into HDFS.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Upgraded the Hadoop cluster to CDH2, set up a high-availability cluster, and integrated Hive with existing applications.
  • Brought a deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
  • Familiar with ETL standards and processes; developed ETL logic per standards for the Source-to-Flat File, Flat File-to-Stage, Stage-to-Work, Work-to-Work Interim, and Work Interim-to-Confidential table loads.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Created HBase tables to store variable data formats of data coming from different portfolios.
  • Involved in transforming data from Mainframe tables to HDFS, and HBASE tables using Sqoop.
  • Implemented test scripts to support test driven development and continuous integration.
  • Specified the cluster size, allocated resource pools, and configured the Hadoop distribution by writing specification texts in JSON format.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
  • Extracted the data from MySQL into HDFS using Sqoop.
  • Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
  • Used UDFs to implement business logic in Hadoop.
  • Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources (see the sketch after this list).
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
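
A Pig UDF of the kind mentioned above could look roughly like the sketch below; the class name and the normalization rule are illustrative assumptions, not details from a real project.

    // Hypothetical Pig EvalFunc UDF written in Java: trims and upper-cases a field.
    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    public class NormalizeField extends EvalFunc<String> {
        // Pig calls exec() once per input tuple; returning null skips bad rows instead of failing.
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return input.get(0).toString().trim().toUpperCase();
        }
    }

In a Pig script the jar would be loaded with REGISTER and the function invoked directly, or aliased with DEFINE, alongside ready-made UDFs from Piggybank.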

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Cloudera Manager, Sqoop, Flume, Oozie, Java (jdk 1.6), Eclipse

Confidential, Bolingbrook, IL

Java/Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing (see the sketch after this list).
  • Analyzed Hadoop clusters and other Big Data analytical tools, including Hive, Pig, and HBase.
  • Used Hadoop to build scalable distributed data solutions.
  • Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
  • Devised procedures that solve complex business problems with due considerations for hardware/software capacity and limitations, operating times and desired results.
  • Worked hands-on with the ETL process.
  • Extracted feeds from social media sites such as Facebook and Twitter using Flume.
  • Involved in loading data from LINUX file system to HDFS.
  • Used Sqoop extensively to ingest data from various source systems into HDFS.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Created Hive tables and worked on them using HiveQL.
  • Installed the cluster and worked on commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
  • Assisted in managing and reviewing Hadoop log files.
  • Assisted in loading large sets of data (structured, semi-structured, and unstructured).
  • Implemented a Hadoop cluster on Ubuntu Linux.
  • Provided cluster coordination services through ZooKeeper.
  • Installed and configured Flume, Sqoop, Pig, Hive, HBase on Hadoop clusters.
  • Managed Hadoop clusters, including commissioning and decommissioning cluster nodes for maintenance and capacity needs.
  • Wrote test cases in JUnit for unit testing of classes.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
  • Involved in developing templates and screens in HTML and JavaScript.
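
The data-cleansing MapReduce work referenced in the first bullet might be sketched as below; the delimiter, expected field count, and counter names are assumptions used only for illustration.

    // Hypothetical map-only cleansing step: drops malformed records and counts them.
    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CleansingMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 8;   // assumed record width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString().trim();
            String[] fields = line.split(",", -1);

            // Reject blank or short records and track them in a job counter
            // so data quality is visible when reviewing the job afterwards.
            if (line.isEmpty() || fields.length < EXPECTED_FIELDS) {
                context.getCounter("cleansing", "malformed").increment(1);
                return;
            }
            // Pass cleaned records through unchanged for downstream Hive/Pig processing.
            context.write(NullWritable.get(), new Text(line));
        }
    }

A driver would wire this mapper into a map-only job (zero reducers) reading from and writing back to HDFS.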

Environment: Eclipse IDE, Linux, Hadoop MapReduce, Pig Latin, Sqoop, Java, Hive, HBase, Unix shell scripting.

Confidential, Minneapolis, MN

Sr. Java / J2EE Developer

Responsibilities:

  • Requirement analysis of business specifications, development of program specifications, system testing, internal code reviews for quality, and client interaction.
  • Actively involved in development and provided support for implementation.
  • Partly worked as a technical lead in the design and development of the enterprise service platform.
  • Used JMS messages to log audit messages (success and failure) on to GAL (General Audit Logging).
  • Used Hibernate for persisting customer and billing information, and EHCache for second-level caching (see the sketch after this list).
  • MDBs were used for nightly runs of the auto-billing and payment services.
  • Used WS-Security for authenticating the SOAP messages along with encryption and decryption.
  • Actively involved in resolving the design bottlenecks and optimized queries depending on the service calls with respect to cost and time spent.
  • Performance tuning to identify and resolve possible bottlenecks in the application.
  • Ensured code quality using tools like FindBugs and Hudson.
  • Parameterized different JVM settings to obtain optimal values for the application.
  • Automated the deployment to each environment.
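
The Hibernate/EHCache caching mentioned above is typically applied at the entity level, roughly as sketched below; the entity, table, and column names here are assumptions for illustration only.

    // Hypothetical Hibernate entity marked for the EHCache-backed second-level cache.
    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.Table;

    import org.hibernate.annotations.Cache;
    import org.hibernate.annotations.CacheConcurrencyStrategy;

    @Entity
    @Table(name = "BILLING_ACCOUNT")
    @Cache(usage = CacheConcurrencyStrategy.READ_WRITE)   // cached across sessions
    public class BillingAccount {

        @Id
        @Column(name = "ACCOUNT_ID")
        private Long id;

        @Column(name = "CUSTOMER_NAME")
        private String customerName;

        @Column(name = "BALANCE")
        private Double balance;

        public Long getId() { return id; }
        public void setId(Long id) { this.id = id; }

        public String getCustomerName() { return customerName; }
        public void setCustomerName(String customerName) { this.customerName = customerName; }

        public Double getBalance() { return balance; }
        public void setBalance(Double balance) { this.balance = balance; }
    }

For this to take effect, the Hibernate configuration also needs the second-level cache enabled (hibernate.cache.use_second_level_cache=true) and an EHCache region factory or provider configured; the exact class name depends on the Hibernate and EHCache versions in use.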

Environment: Eclipse, IBM RAD, JAX-WS, XML, XSD, Java, J2EE, Struts, Spring, Hibernate, Ajax, CVS, log4j, JUnit, Oracle, Linux, Weblogic, and Load Runner.

Confidential

Java/JEE Developer

Responsibilities:

  • Created a Java-based 24x7 web application.
  • Designing of the logical and physical data model, generation of DDL scripts, and writing of DML scripts for an Oracle 9i database.
  • Tuning of SQL statements to improve performance and consequently meet the SLAs.
  • Gathering business requirements and writing functional specifications and detailed design documents.
  • Building and deployment of Java applications into multiple Unix-based environments, and producing both unit and functional test results along with release notes.
  • Documentation of the process involved in the software development.

Environment: Java 1.5, XML, XSL, XHTML, Oracle 9i, PL/SQL.
