
Hadoop Developer Resume


Columbus, Ohio

SUMMARY

  • 8+ years of IT experience in analysis, design, and development using Hadoop, Java, and J2EE, including 3+ years of experience with Hadoop, HDFS, MapReduce, and the Hadoop ecosystem, including Pig and Hive.
  • Excellent understanding/knowledge of Hadoop architecture and its various components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Knowledge of Data Analytics and Business Analytics processes.
  • Hands-on experience installing, configuring, and deploying Hadoop ecosystem components such as MapReduce, YARN, HDFS, HBase, Oozie, Hive, Pig, Impala, Spark, Storm, Kafka, Tableau, Sqoop, HCatalog, ZooKeeper, Amazon Web Services, and Flume.
  • Experience building and maintaining multiple Hadoop clusters of different sizes and configurations, including rack topology setup for large clusters, and working in Hadoop administration, architecture, and development roles across distributions such as Hortonworks and Cloudera.
  • Experienced with test frameworks for Hadoop using MRUnit.
  • Experience with cloud-managed databases and services, including Amazon EC2, Amazon RDS, CloudWatch, S3, and SQS/SNS.
  • Performed data analytics using Pig, Hive, and R for data scientists within the team.
  • Worked extensively with the data visualization tool Tableau and the graph database Neo4j.
  • Supported a warehouse database of 300+ terabytes; implemented Exadata X2 and X3 features on the warehouse server and performed the X2-to-X3 migration.
  • Experience creating and monitoring Hadoop clusters on EC2 and VMs using CDH3/CDH4 and Cloudera Manager on Linux and Ubuntu; good knowledge of single-node and multi-node cluster configurations.
  • Configured Splunk to perform web analytics.
  • Good technical skills in Oracle 11i, SQL Server, and ETL development using Informatica, QlikView, Cognos, and SAS.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Responsible for smooth, error-free configuration of the DWH-ETL solution and its integration with Hadoop.
  • Extended Hive and Pig core functionality by writing custom UDFs (a minimal Java sketch follows this summary list).
  • Worked on multiple stages of Software Development Life Cycle including Development, Component Integration, Performance Testing, Deployment and Support Maintenance.
  • Worked extensively with data migration, data cleansing, data profiling, and ETL processes for data warehouses.
  • Extensive experience in middle-tier development using J2EE technologies such as JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate, and Web Services.
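
A minimal sketch of the kind of custom Hive UDF referenced above, written against the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name, package, and normalization logic are illustrative assumptions, not taken from any project listed here:

    package com.example.hive.udf;                        // hypothetical package

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: trims whitespace and lower-cases a string column.
    // Registered in Hive with, for example:
    //   ADD JAR normalize-text-udf.jar;
    //   CREATE TEMPORARY FUNCTION normalize_text AS 'com.example.hive.udf.NormalizeText';
    public class NormalizeText extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;                             // pass Hive NULLs through unchanged
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }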

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Cassandra, Impala, Oozie, ZooKeeper, MapR, Amazon Web Services, EMR, MRUnit, Spark, Storm, Greenplum, Datameer, R, Ignite.

Java & J2EE Technologies: Core Java, JDBC, Servlets, JSP, JNDI, Struts, Spring, Hibernate and Web Services (SOAP and Restful)

IDEs: Eclipse, NetBeans, MyEclipse, IntelliJ

Frameworks: MVC, Struts, Hibernate, Spring

Programming languages: C, C++, Java, Python, Ant scripts, Linux shell scripts, R

Databases: Oracle 11g/10g/9i, MySQL, DB2, MS SQL Server, MongoDB, CouchDB, Graph DB (Neo4j)

Web Servers: WebLogic, WebSphere, Apache Tomcat

Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL, RESTful WS

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP

ETL Tools: Informatica, IBM InfoSphere, QlikView, and Cognos

PROFESSIONAL EXPERIENCE

Confidential, Columbus, Ohio

Hadoop Developer

Responsibilities:

  • Worked on evaluation and analysis of the Hadoop cluster and different big data analytics tools, including Pig, the HBase database, and Sqoop.
  • Conducted information-sharing and teaching sessions to raise awareness of industry trends and upcoming initiatives, ensuring alignment between business strategies and goals and solution architecture designs.
  • Performance-tuned the application at various layers: MapReduce, Hive, CDH, and Oracle.
  • Used Spark Streaming for real-time processing of data from HDFS.
  • Used QlikView to create a visual interface for the real-time data processing.
  • Implemented partitioning, dynamic partitioning, and bucketing in Hive.
  • Imported and exported data between HDFS and various databases, including Netezza, Oracle, and MySQL.
  • Implemented a pub/sub model using Apache Kafka to load real-time transactions into HDFS (see the producer sketch after this list).
  • Implemented Java code for HDFS and for job types processed and submitted via the Hadoop plugin to the edge/gateway node for execution on the Cloudera cluster.
  • Automated the process of pulling data from source systems into Hadoop and exporting it as JSON files to a specified location.
  • Migrated Hive queries to Impala.
  • Worked with various file formats (Avro, Parquet, and text) and SerDes, using Snappy compression.
  • Created analysis batch job prototypes using Hadoop, Pig, Oozie, Hue, and Hive.
  • Clear understanding of Cloudera Manager Enterprise Edition.
  • Designed and documented resolutions for operational problems by following standards and procedures, using the software reporting tool JIRA.
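
As a rough illustration of the Kafka pub/sub bullet above, the sketch below shows a minimal Java producer publishing transaction records to a topic; the broker address, topic name, and payload are assumptions for the example, and a separate consumer (for instance Flume or a Spark Streaming job) would land the messages in HDFS:

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TransactionProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");   // assumed broker address
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            // Publish one sample transaction as JSON; real code would loop over a live feed.
            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                String txnJson = "{\"txnId\":\"1001\",\"amount\":42.50}";
                producer.send(new ProducerRecord<>("transactions", "1001", txnJson));
            }
        }
    }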

Environment: Hadoop, HDFS, MapReduce, Spark, Hive, Impala, Pig, Sqoop, Java, UNIX shell scripting, Oracle, Netezza, MySQL, QlikView

Confidential, Grove Heights, Minnesota

BigData/Hadoop Developer

Responsibilities:

  • Worked on analyzing the Hadoop cluster and different big data analytics tools, including Pig, the HBase database, and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Implemented a nine-node CDH3 Hadoop cluster on Red Hat Linux.
  • Involved in loading data from LINUX file system to HDFS.
  • Worked on installing the cluster, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
  • Developed performance utilization charts, optimized and tuned SQL and designed physical databases.
  • Assisted developers with Teradata load utilities and SQL.
  • Researched Sources and identified necessary Business Components for Analysis.
  • Gathered the required information from the users.
  • Interacted with different system groups for analysis of systems.
  • Created tables and views in Teradata according to the requirements.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios (see the HBase client sketch after this list).
  • Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
  • Implemented best income logic using Pig scripts and UDFs.
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Responsible to manage data coming from different sources.
  • Involved in loading data from UNIX file system to HDFS.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Provided cluster coordination services through ZooKeeper.
  • Managed and reviewed Hadoop log files.
  • Managed jobs using the Fair Scheduler.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Responsible for cluster maintenance: adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and Hadoop log files.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Re-engineered customer account software systems used by brokerage teams; developed web user interfaces for trading inquiries and supported parallel systems.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
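
A minimal sketch of creating an HBase table and writing a row with the Java client, in the spirit of the PII-table and Oracle-to-HBase bullets above; the table name, column family, and row key scheme are illustrative assumptions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PiiTableLoader {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Admin admin = connection.getAdmin()) {

                TableName tableName = TableName.valueOf("portfolio_pii");    // assumed table name
                if (!admin.tableExists(tableName)) {
                    HTableDescriptor desc = new HTableDescriptor(tableName);
                    desc.addFamily(new HColumnDescriptor("d"));              // single column family
                    admin.createTable(desc);
                }

                // Write one row keyed by a hypothetical sysprin identifier.
                try (Table table = connection.getTable(tableName)) {
                    Put put = new Put(Bytes.toBytes("sysprin-0001"));
                    put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("ssn_hash"),
                                  Bytes.toBytes("ab12cd34"));
                    table.put(put);
                }
            }
        }
    }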

Environment: MapReduce, Java (jdk1.6), Flat files, Oracle 11g/10g, Netezza, UNIX, Sqoop, Hive, Oozie.

Confidential, Detroit, MI

Hadoop Consultant

Responsibilities:

  • Worked extensively on importing data using Sqoop and Flume.
  • Responsible for creating complex tables using Hive.
  • Created partitioned tables in Hive for best performance and faster querying.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS.
  • Experience with professional software engineering practices and best practices for the full software development life cycle, including coding standards, code reviews, source control management, and build processes.
  • Worked collaboratively with all levels of business stakeholders to architect, implement, and test a big data analytical solution built from disparate sources.
  • Involved in source system analysis, data analysis, and data modeling through to ETL (extract, transform, and load).
  • Wrote multiple MapReduce programs to extract, transform, and aggregate data from multiple file formats, including XML, JSON, CSV, and other compressed file formats (a minimal sketch follows this list).
  • Handled structured and unstructured data and applied ETL processes.
  • Developed Hive queries for the analysts.
  • Prepared developer (unit) test cases and executed developer testing.
  • Created and modified shell scripts for scheduling various data cleansing scripts and the ETL loading process.
  • Supported and assisted QA engineers in understanding, testing, and troubleshooting.
  • Wrote build scripts using Ant and participated in the deployment of one or more production systems.
  • Production Rollout Support that includes monitoring the solution post go-live and resolving any issues that are discovered by the client and client services teams.
  • Designed and documented resolutions for operational problems by following standards and procedures, using the software reporting tool JIRA.
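
A minimal sketch of a MapReduce aggregation of the kind described above, here summing a numeric field from CSV input; the input layout (category in column 0, amount in column 2) and class names are assumptions for illustration:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CsvAmountSum {

        // Map: parse each CSV line and emit (category, amount).
        public static class SumMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(",");
                if (fields.length > 2) {
                    try {
                        double amount = Double.parseDouble(fields[2].trim());
                        context.write(new Text(fields[0]), new DoubleWritable(amount));
                    } catch (NumberFormatException e) {
                        // skip header rows and malformed records
                    }
                }
            }
        }

        // Reduce: sum the amounts for each category.
        public static class SumReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
            @Override
            protected void reduce(Text key, Iterable<DoubleWritable> values, Context context)
                    throws IOException, InterruptedException {
                double total = 0;
                for (DoubleWritable v : values) {
                    total += v.get();
                }
                context.write(key, new DoubleWritable(total));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "csv amount sum");
            job.setJarByClass(CsvAmountSum.class);
            job.setMapperClass(SumMapper.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(DoubleWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }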

Environment: MapReduce, Java (jdk1.6), Flat files, Oracle 11g/10g, Netezza, UNIX, Sqoop, Hive, Oozie.

Confidential, Dallas, TX

Java Developer

Responsibilities:

  • Writing application code using Core Java, J2EE, Servlets, JSP, Hibernate, and Spring with RESTful web services, using the Maven and Ant build tools.
  • Used the Spring framework for DI/IoC and the Spring MVC design pattern, configuring application context files and performing database object mapping using Hibernate annotations.
  • Involved in implementing DAO pattern for database connectivity and Hibernate for object persistence.
  • Designed front ends using JSPs, JSTL Tag Libs, Display Tags, JavaScript, HTML, CSS, jQuery, and the Dojo Toolkit.
  • Using Ant and Maven for project builds.
  • Designing and developing search components using Oracle Endeca Information Discovery platform.
  • Developed and configured the baseline, partial and delta pipelines.
  • Writing Java classes for each component in the pipeline.
  • Converting raw data into XML format with DTD validation, extracting URLs from these XML files, and reading and writing XML files using DOM and SAX parsers (see the DOM parsing sketch after this list).
  • Configuring the Search patterns, properties and dimensions.
  • Configuring Endeca platform using Web Studio and Developer Studio for implementing search pattern as required by client.
  • Performing logging and unit testing of the code using JUnit.
  • Writing use cases and sequence diagrams using UML.
  • Writing queries, stored procedures, and functions, and performing backend programming using Oracle SQL/PL-SQL.
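
A minimal sketch of DOM-based XML parsing with DTD validation, in the spirit of the URL-extraction bullet above; the input file name and the <url> element name are assumptions for the example:

    import java.io.File;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;

    public class UrlExtractor {
        public static void main(String[] args) throws Exception {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);                              // validate against the declared DTD

            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new File("records.xml"));    // assumed input file

            // Collect the text content of every <url> element.
            NodeList urls = doc.getElementsByTagName("url");
            for (int i = 0; i < urls.getLength(); i++) {
                System.out.println(urls.item(i).getTextContent());
            }
        }
    }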

Environment: Java/J2EE, JSP, Hibernate, Spring, Maven, RESTful Web Services, Endeca Information Discovery (Endeca Server, Developer Studio, Web Studio, Web Crawler), Eclipse, log4j, SLF4J, Ant, Oracle, SQL/PL-SQL, HTML, XML, JSON, JavaScript, CSS, Dojo, jQuery, and WebLogic.

Confidential, Detroit, MI

Java Developer

Responsibilities:

  • Developed the web interface using Struts, JavaScript, HTML, and CSS.
  • Extensively used the Struts controller component classes for developing the applications.
  • Involved in developing the business tier using stateless session beans (acting as a Session Facade) and message-driven beans.
  • Used JDBC and Hibernate to connect to the Oracle database.
  • Data sources were configured in the app server and accessed from the DAOs through Hibernate.
  • Used the Business Delegate, Service Locator, and DTO design patterns for designing the web module of the application.
  • Developed SQL stored procedures and prepared statements for updating and accessing data from database.
  • Involved in developing database-specific data access objects (DAOs) for Oracle (a minimal Hibernate DAO sketch follows this list).
  • Used CVS for source code control and JUnit for unit testing.
  • Used Eclipse to develop entity and session beans.
  • The entire application was deployed on WebSphere Application Server.
  • Followed coding and documentation standards.
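
A minimal sketch of the Hibernate-backed DAO pattern mentioned above; the Account entity, its mapping, and the method set are assumptions for illustration, and the SessionFactory is expected to be built elsewhere (for example from hibernate.cfg.xml):

    import org.hibernate.Session;
    import org.hibernate.SessionFactory;
    import org.hibernate.Transaction;

    // Hypothetical DAO; Account would be a separately mapped entity class.
    public class AccountDao {
        private final SessionFactory sessionFactory;

        public AccountDao(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;
        }

        // Persist a new Account inside a transaction.
        public void save(Account account) {
            Session session = sessionFactory.openSession();
            Transaction tx = session.beginTransaction();
            try {
                session.save(account);
                tx.commit();
            } catch (RuntimeException e) {
                tx.rollback();
                throw e;
            } finally {
                session.close();
            }
        }

        // Load an Account by primary key, or return null if it does not exist.
        public Account findById(Long id) {
            Session session = sessionFactory.openSession();
            try {
                return (Account) session.get(Account.class, id);
            } finally {
                session.close();
            }
        }
    }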

Environment: Java, J2EE, JDK, JavaScript, XML, Struts, JSP, Servlets, JDBC, EJB, Hibernate, Web services, JMS, JSF, JUnit, CVS, IBM WebSphere, Eclipse, Oracle 9i, Linux.
