
Hadoop Developer Resume


Irving, TX

SUMMARY

  • Proactive IT developer with 8 years of experience in the design and development of scalable systems using Hadoop technologies across a variety of environments.
  • Strong understanding of Hadoop architecture and hands-on experience with Hadoop components such as JobTracker, TaskTracker, NameNode, DataNode, and the HDFS framework.
  • Extensive experience in analyzing data with Hadoop ecosystem tools including Sqoop, Flume, Kafka, Storm, HDFS, Hive, Pig, Impala, Oozie, ZooKeeper, Solr, NiFi, Spark SQL, and Spark Streaming.
  • Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
  • Configured ZooKeeper, Cassandra, and Flume on the existing Hadoop cluster.
  • Experience importing and exporting data with Sqoop between the Hadoop Distributed File System (HDFS) and relational database systems.
  • Expertise in writing Hadoop jobs for analyzing data using HiveQL, Pig Latin (a data-flow language), and custom MapReduce programs in Java.
  • Experience creating custom UDFs for Pig and Hive to incorporate Python/Java methods and functionality into Pig Latin and HiveQL.
  • Experience in converting Hive queries into Spark transformations using Spark RDDs and Scala (see the sketch after this list).
  • Hands-on experience troubleshooting errors in the HBase shell, Pig, Hive, and MapReduce.
  • Hands-on experience provisioning and managing multi-tenant Cassandra clusters in public cloud environments: Amazon Web Services (AWS) EC2 and OpenStack.
  • Experience with NoSQL column-oriented databases such as HBase, Cassandra, and MongoDB, and their integration with Hadoop clusters.
  • Experience in maintaining a big data platform using open-source technologies such as Spark and Elasticsearch.
  • Experience in installing, configuring, supporting, and managing Hadoop clusters using Hortonworks and Cloudera (CDH3, CDH4) distributions on Amazon Web Services (AWS).
  • Experience configuring Flume agents for the transfer of data from external systems to HDFS.
  • Good understanding of YARN and Mesos.
  • Designed and built solutions for real-time data ingestion using Kafka, Storm, Spark Streaming, and various NoSQL databases.
  • Experienced in using Apache Hue and Ambari to manage and monitor Hadoop clusters.
  • Developed Scala scripts and UDFs, using both DataFrames/Spark SQL and RDDs/MapReduce in Spark, for data aggregation and queries, and wrote data back into RDBMSs through Sqoop.
  • Experience in understanding security requirements for Hadoop and integrating with Kerberos authentication and authorization infrastructure.
  • Good hands-on experience creating RDDs and DataFrames for the required input data and performing data transformations using Spark and Scala.
  • Good knowledge of the Spark Dataset API.
  • Knowledge in developing NiFi flow prototypes for data ingestion into HDFS.
  • Extensive experience working with Oracle, DB2, SQL Server, PL/SQL, and MySQL databases, and with core Java concepts such as OOP, multithreading, collections, and I/O.
  • Experienced in designing web applications using HTML5, CSS3, JavaScript, JSON, jQuery, AngularJS, Bootstrap, and Ajax under the Windows operating system.
  • Experience in service-oriented architecture using web services such as SOAP and RESTful services.
  • Knowledge of service-oriented architecture (SOA), workflows, and web services using XML, SOAP, and WSDL.
  • Extensive experience in middle-tier development using J2EE technologies such as JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate, and EJB.
  • Good interpersonal and communication skills, strong problem-solving skills, ease in exploring new technologies, and a good team member.
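
As a brief illustration of the Hive-to-Spark conversion work noted above, the following is a minimal Scala sketch. The customers table, its columns, and the normalizeCity UDF are hypothetical stand-ins, not artifacts of any specific project.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, sum, udf}

    object HiveToSparkExample {
      def main(args: Array[String]): Unit = {
        // Hive support lets Spark SQL read existing Hive tables directly
        val spark = SparkSession.builder()
          .appName("HiveToSparkExample")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical UDF, standing in for logic that previously lived in a Hive UDF
        val normalizeCity = udf((city: String) =>
          if (city == null) "UNKNOWN" else city.trim.toUpperCase)

        // DataFrame equivalent of: SELECT city, SUM(amount) FROM customers GROUP BY city
        val byCity = spark.table("customers")
          .withColumn("city", normalizeCity(col("city")))
          .groupBy("city")
          .agg(sum("amount").as("total_amount"))

        // The same aggregation expressed on the RDD API, for comparison
        val byCityRdd = spark.table("customers").rdd
          .map(row => (row.getAs[String]("city"), row.getAs[Double]("amount")))
          .reduceByKey(_ + _)

        byCity.show()
        spark.stop()
      }
    }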

TECHNICAL SKILLS

Big Data Ecosystem: HDFS, MapReduce, Hive, YARN, Pig, Sqoop, Kafka, Storm, Flume, Oozie, ZooKeeper, Apache Spark, Apache Tez, Impala, NiFi, Apache Solr, ActiveMQ, Scala.

NoSQL Databases: HBase, MongoDB, Cassandra

Programming Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, Scala, Python

Java/J2EE Technologies: JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery, AngularJS

Frameworks: MVC, Struts, Spring, Hibernate

Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux, and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP

Web/Application servers: Apache Tomcat, WebLogic, JBoss.

Version control: GIT, SVN, CVS

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP

PROFESSIONAL EXPERIENCE

Confidential, Irving, TX

Hadoop Developer

Responsibilities:

  • Involved in managing nodes on the Hadoop cluster and monitoring Hadoop cluster job performance using Cloudera Manager.
  • Optimized Hive queries using various file formats such as JSON, Avro, ORC, and Parquet.
  • Worked on Spark RDD transformations to map business analysis logic and applied actions on top of transformations.
  • Experienced in working with the Spark ecosystem, using Spark SQL and Scala queries on different formats such as text, Avro, and Parquet files.
  • Worked on Spark Streaming to pull real-time data from Kafka and store the stream data to HDFS (see the sketch after this list).
  • Developed Pig Latin scripts and Pig command-line transformations for data joins and custom processing of MapReduce outputs, and loaded tables from Hadoop to various clusters.
  • Built Talend jobs for data ingestion, enrichment, and provisioning.
  • Worked on migrating HiveQL into Impala to minimize query response time.
  • Involved in loading data from edge node to HDFS using shell scripting.
  • Worked with Kerberos and integrated it into the Hadoop cluster to harden it against unauthorized access.
  • Created Hive tables, dynamic partitions, and buckets for sampling, and worked on them using HiveQL.
  • Worked on Spark using Scala and Spark SQL for faster testing and processing of data.
  • Implemented a proof of concept using Kafka and MongoDB for processing streaming data.
  • Involved in advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Apache Spark written in Scala.
  • Loaded large volumes of data into HDFS using Apache Kafka.
  • Implemented Talend jobs to load data from different sources and integrated them with Kafka.
  • Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive, and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
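
The Kafka-to-HDFS streaming bullet above can be sketched roughly as below. This is a minimal outline in Scala against the spark-streaming-kafka-0-10 integration; the broker address, topic name, consumer group, and output path are placeholders.

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

    object KafkaToHdfs {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("KafkaToHdfs")
        val ssc = new StreamingContext(conf, Seconds(30)) // 30-second micro-batches

        // Consumer settings; the broker list and group id are placeholders
        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "broker1:9092",
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "hdfs-ingest",
          "auto.offset.reset" -> "latest"
        )

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          LocationStrategies.PreferConsistent,
          ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams)
        )

        // Persist each non-empty micro-batch to HDFS as text files
        stream.map(_.value).foreachRDD { rdd =>
          if (!rdd.isEmpty())
            rdd.saveAsTextFile(s"hdfs:///data/events/batch-${System.currentTimeMillis}")
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }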

Environment: MapReduce, HDFS, Spark, Scala, Kafka, Hive, Pig, Spark Streaming, MongoDB, Maven, Jenkins, UNIX, Python, MRUnit, Git.

Confidential, Perry, Iowa

Hadoop Developer

Responsibilities:

  • Worked on Spark SQL to handle structured data in Hive (see the sketch after this list).
  • Involved in creating Hive tables, loading data, writing Hive queries, and generating partitions and buckets for optimization.
  • Involved in migrating tables from RDBMS into Hive tables using Sqoop, and later generated visualizations using Tableau.
  • Worked on complex MapReduce programs to analyze data residing on the cluster.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Wrote Hive UDFs to sort structure fields and return complex data types.
  • Worked in AWS environment for development and deployment of custom Hadoop applications.
  • Involved in creating shell scripts to simplify the execution of all other scripts (Pig, Hive, Sqoop, Impala, and MapReduce) and to move data in and out of HDFS.
  • Created files and tuned SQL queries in Hive using HUE.
  • Involved in collecting and aggregating large amounts of log data using Storm and staging data in HDFS for further analysis.
  • Created Hive external tables using the Accumulo connector.
  • Knowledge in developing NiFi flow prototypes for data ingestion into HDFS.
  • Managed real-time data processing and real-time data ingestion in MongoDB and Hive using Storm.
  • Created custom Solr query components to optimize search matching.
  • Developed Spark scripts using Python shell commands as per the requirements.
  • Stored processed results in the data warehouse and maintained the data using Hive.
  • Experienced in working with the Spark ecosystem, using Spark SQL and Scala queries on different formats such as text and CSV files.
  • Created Oozie workflow and coordinator jobs to kick off jobs on time and on data availability.
  • Worked with NoSQL databases such as MongoDB, creating MongoDB tables to load large sets of semi-structured data.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Worked with, and learned a great deal from, Amazon Web Services (AWS) cloud services such as EC2, S3, and EMR.
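
As a rough illustration of the Spark SQL and Hive partitioning/bucketing work above, here is a minimal Scala sketch. The sales and sales_staging tables, their columns, and the partition column are hypothetical.

    import org.apache.spark.sql.SparkSession

    object HivePartitionLoad {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("HivePartitionLoad")
          .enableHiveSupport() // query Hive tables through Spark SQL
          .getOrCreate()

        // Permit Hive-style dynamic partitioning for the insert below
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

        // Partitioned table stored as ORC (names are illustrative)
        spark.sql(
          """CREATE TABLE IF NOT EXISTS sales (
            |  order_id BIGINT,
            |  amount   DOUBLE
            |)
            |PARTITIONED BY (order_date STRING)
            |STORED AS ORC""".stripMargin)

        // Dynamic-partition insert from a staging table (e.g. one loaded by Sqoop)
        spark.sql(
          """INSERT OVERWRITE TABLE sales PARTITION (order_date)
            |SELECT order_id, amount, order_date FROM sales_staging""".stripMargin)

        spark.stop()
      }
    }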

Environment: Cloudera, HDFS, MapReduce, Storm, Hive, Pig, Sqoop, MongoDB, Apache Spark, Python, Accumulo, Oozie Scheduler, Kerberos, AWS, Tableau, Java, UNIX shell scripts, HUE, NiFi, Solr, Git, Maven.

Confidential, Boston, MA

Hadoop Developer

Responsibilities:

  • Responsible for importing log files from various sources into HDFS using Flume.
  • Handled big data using a Hadoop cluster consisting of 40 nodes.
  • Performed complex HiveQL queries on Hive tables.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Exported data from DB2 to HDFS using Sqoop and developed MapReduce jobs using the Java API (see the sketch after this list).
  • Created final tables in Parquet format.
  • Developed Pig scripts for source data validation and transformation.
  • Developed shell and Python scripts to automate and provide control flow to Pig scripts.
  • Involved in unit testing of MapReduce jobs using MRUnit.
  • Utilized Hive and Pig to create BI reports.
  • Developed data integration programs in a Hadoop environment with the NoSQL data store Cassandra for data access and analysis.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.
  • Worked with Informatica MDM to create a single view of the data.
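
The MapReduce bullet above refers to jobs written against the Hadoop Java API; the following is a compact word-count-style sketch of such a job, written here in Scala against the same org.apache.hadoop.mapreduce API. Class names and paths are illustrative.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
    import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

    // Mapper: emit (token, 1) for every whitespace-separated token
    class TokenMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
      private val one = new IntWritable(1)
      private val word = new Text()
      override def map(key: LongWritable, value: Text,
                       ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit =
        value.toString.split("\\s+").filter(_.nonEmpty).foreach { tok =>
          word.set(tok)
          ctx.write(word, one)
        }
    }

    // Reducer: sum the counts emitted for each token
    class SumReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
      override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                          ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit = {
        var sum = 0
        val it = values.iterator()
        while (it.hasNext) sum += it.next().get
        ctx.write(key, new IntWritable(sum))
      }
    }

    object WordCount {
      def main(args: Array[String]): Unit = {
        val job = Job.getInstance(new Configuration(), "word count")
        job.setJarByClass(classOf[TokenMapper])
        job.setMapperClass(classOf[TokenMapper])
        job.setReducerClass(classOf[SumReducer])
        job.setOutputKeyClass(classOf[Text])
        job.setOutputValueClass(classOf[IntWritable])
        FileInputFormat.addInputPath(job, new Path(args(0)))
        FileOutputFormat.setOutputPath(job, new Path(args(1)))
        System.exit(if (job.waitForCompletion(true)) 0 else 1)
      }
    }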

Environment: Hortonworks, HDFS, Pig, Hive, MapReduce, Java (JDK 1.7), Informatica, Oozie, Linux/UNIX shell scripting, Cassandra, Python, Perl, Git, Maven, Jenkins.

Confidential, NJ

Java Developer

Responsibilities:

  • Effectively interacted with team members and business users for requirements gathering.
  • Involved in analysis, design, and implementation phases of the software development lifecycle (SDLC).
  • Implemented core Spring/J2EE patterns such as MVC, Dependency Injection (DI), and Inversion of Control (IoC).
  • Implemented REST web services with the Jersey API to handle customer requests (see the sketch after this list).
  • Developed test cases using JUnit and used Log4j as the logging framework.
  • Worked with HQL and the Criteria API for retrieving data elements from the database.
  • Developed the user interface using HTML, Spring tags, JavaScript, jQuery, and CSS.
  • Developed the application using the Eclipse IDE and worked in an Agile environment.
  • Designed and implemented front-end web pages using CSS, JSP, HTML, JavaScript, Ajax, and Struts.
  • Used the Eclipse IDE as the development environment to design, develop, and deploy Spring components on WebLogic.
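
A hedged sketch of the Jersey work above: a minimal JAX-RS resource, written here in Scala for consistency with the rest of this page (the original work was in Java). The /customers path, the id parameter, and the JSON payload are illustrative; deploying it requires a JAX-RS runtime such as Jersey on a servlet container.

    import javax.ws.rs.{GET, Path, PathParam, Produces}
    import javax.ws.rs.core.MediaType

    // Hypothetical resource exposed at /customers
    @Path("/customers")
    class CustomerResource {

      // GET /customers/{id} returns a small JSON document
      @GET
      @Path("/{id}")
      @Produces(Array(MediaType.APPLICATION_JSON))
      def getCustomer(@PathParam("id") id: String): String =
        s"""{"id": "$id", "status": "active"}"""
    }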

Environment: Java, J2EE, HTML, JavaScript, CSS, jQuery, Spring 3.0, JNDI, Hibernate 3.0, JavaMail, Web Services, REST, Oracle 10g, JUnit, Log4j, Eclipse, WebLogic 10.3.

Confidential

Java Developer

Responsibilities:

  • Designed and implemented the training and reports modules of the application using Servlets, JSP and Ajax.
  • Developed custom JSP tags for the application.
  • Wrote queries for fetching and manipulating data using the ORM framework iBATIS.
  • Used Quartz schedulers to run jobs sequentially at given times (see the sketch after this list).
  • Implemented design patterns like Filter, Cache Manager, and Singleton to improve the performance of the application.
  • Implemented the reports module of the application using Jasper Reports to display dynamically generated reports for business intelligence.
  • Deployed the application at the client's location on Tomcat Server.
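
As a rough illustration of the Quartz scheduling above, here is a minimal sketch in Scala (the original work was in Java; the Quartz 2.x API is the same on the JVM). The job name, cron expression, and report logic are placeholders.

    import org.quartz.{CronScheduleBuilder, Job, JobBuilder, JobExecutionContext, TriggerBuilder}
    import org.quartz.impl.StdSchedulerFactory

    // Hypothetical job body; the real jobs generated Jasper reports
    class ReportJob extends Job {
      override def execute(ctx: JobExecutionContext): Unit =
        println(s"Generating report at ${new java.util.Date}")
    }

    object SchedulerDemo {
      def main(args: Array[String]): Unit = {
        val scheduler = StdSchedulerFactory.getDefaultScheduler

        val job = JobBuilder.newJob(classOf[ReportJob])
          .withIdentity("reportJob")
          .build()

        // Cron trigger: fire daily at 06:00 (placeholder schedule)
        val trigger = TriggerBuilder.newTrigger()
          .withIdentity("reportTrigger")
          .withSchedule(CronScheduleBuilder.cronSchedule("0 0 6 * * ?"))
          .build()

        scheduler.scheduleJob(job, trigger)
        scheduler.start()
      }
    }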

Environment: HTML, JavaScript, Ajax, Java, Servlets, JSP, iBATIS, Tomcat Server, SQL Server, Jasper Reports.
