
Hadoop Developer Resume


TX

SUMMARY:

  • Proactive IT developer with eight years of experience in the design and development of scalable systems using Hadoop technologies across various environments.
  • Strong understanding of Hadoop architecture and hands-on experience with Hadoop components such as Job Tracker, Task Tracker, Name Node, Data Node, and the HDFS framework.
  • Extensive experience in analyzing data using Hadoop ecosystem tools including Sqoop, Flume, Kafka, Storm, HDFS, Hive, Pig, Impala, Oozie, Zookeeper, Solr, NiFi, Spark SQL, and Spark Streaming.
  • Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
  • Configured Zookeeper, Cassandra, and Flume on the existing Hadoop cluster.
  • Experience in importing and exporting data between Hadoop Distributed File System (HDFS) and relational database systems using Sqoop.
  • Expertise in writing Hadoop jobs to analyze data using HiveQL, Pig Latin (a data-flow language), and custom MapReduce programs in Java.
  • Experience in creating custom UDFs for Pig and Hive to bring Python/Java methods and functionality into Pig Latin and HiveQL (a minimal sketch of such a UDF appears after this list).
  • Knowledge of developing NiFi flow prototypes for data ingestion into HDFS.
  • Experience in converting Hive queries into Spark transformations using Spark RDDs and Scala.
  • Hands on Experience in troubleshooting errors in HBase Shell, Pig, Hive, and MapReduce.
  • Hands-on experience in provisioning and managing multi-tenant Cassandra clusters on public cloud environments: Amazon Web Services (AWS) EC2 and OpenStack.
  • Experience with NoSQL databases such as HBase, Cassandra, and MongoDB, and their integration with the Hadoop cluster.
  • Good knowledge of integrating Mesos (DC/OS) into the cloud platform.
  • Experience in maintaining the big data platform using open-source technologies such as Spark and Elasticsearch.
  • Implemented partitioning on several fact tables in the data warehouse using clustered columnstore indexes.
  • Experience in installing, configuring, supporting, and managing Hadoop clusters using Hortonworks and Cloudera (CDH3, CDH4) distributions on Amazon Web Services (AWS).
  • Experience in configuring Flume agents to transfer data from external systems into HDFS.
  • Good knowledge of the SDLC, with experience in waterfall and agile methodologies such as Scrum.
  • Experienced in building data pipelines using Kafka and Akka to handle terabytes of data.
  • Good experience in general data analytics on distributed computing clusters such as Hadoop, using Apache Spark, Impala, and Scala.
  • Good understanding of YARN and Mesos.
  • Developed and deployed Apache NiFi flows across various environments, optimized NiFi data flows, and wrote QA scripts in Python to track missing files.
  • Designed and built solutions for real-time data ingestion using Kafka, Storm, Spark Streaming, and various NoSQL databases.
  • Developed Scala scripts and UDFs using both DataFrames/SQL and RDDs/MapReduce in Spark for data aggregation and queries, and wrote data back into RDBMS through Sqoop.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication and authorization infrastructure.
  • Good hands-on experience in creating RDDs and DataFrames for the required input data and performing data transformations using Spark with Scala.
  • Good knowledge of Spark Datasets.
  • Extensive experience working with Oracle, DB2, SQL Server, PL/SQL, and MySQL databases, and with core Java concepts such as OOP, multithreading, collections, and I/O.
  • Experienced in designing web applications using HTML5, CSS3, JavaScript, JSON, jQuery, AngularJS, Bootstrap, and Ajax on the Windows operating system.
  • Experience in Service-Oriented Architecture using web services such as SOAP and REST.
  • Knowledge of service-oriented architecture (SOA), workflows, and web services using XML, SOAP, and WSDL.
  • Extensive experience in middle-tier development using J2EE technologies such as JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate, and EJB.
  • Good interpersonal and communication skills, strong problem-solving skills, ability to pick up new technologies with ease, and a good team member.
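
The summary above mentions custom Hive UDFs written in Java. The sketch below is a minimal, hypothetical example of that pattern, not code taken from any project listed here; the package, class name, and normalization logic are assumptions.

```java
// Hypothetical Hive UDF: trims and lower-cases a string column.
// Package name, class name, and logic are illustrative assumptions.
package com.example.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class NormalizeText extends UDF {
    // Hive calls evaluate() once per row; returning null propagates SQL NULL.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Such a UDF would typically be packaged as a JAR, registered in HiveQL with ADD JAR and CREATE TEMPORARY FUNCTION, and then called like any built-in function.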

TECHNICAL SKILLS:

Big Data Ecosystem: HDFS, MapReduce, Hive, YARN, Pig, Sqoop, Kafka, Storm, Flume, Oozie, Zookeeper, Apache Spark, Apache Tez, Impala, NiFi, Akka, Apache Solr, ActiveMQ, Scala

NoSQL Databases: HBase, MongoDB, Cassandra

Programming Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, Scala, Python

Java/J2EE Technologies: JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery, AngularJS

Frameworks: MVC, Struts, Spring, Hibernate

Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux, and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP, Play framework

Web/Application servers: Apache Tomcat, WebLogic, JBoss.

Version control: GIT, SVN, CVS

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP

PROFESSIONAL EXPERIENCE:

Confidential, TX

Hadoop Developer

Responsibilities:

  • Optimized Hive queries using various file formats such as JSON, Avro, ORC, and Parquet.
  • Worked on Spark RDD transformations to implement business analysis and applied actions on top of the transformations.
  • Experienced in working with the Spark ecosystem, using Spark SQL and Scala queries on different formats such as text, Avro, and Parquet files.
  • Worked on Spark Streaming to consume real-time data from Kafka and store the streamed data in HDFS (a minimal sketch of such a job appears after this list).
  • Developed Pig Latin scripts and Pig command-line transformations for data joins and custom processing of MapReduce outputs, and loaded tables from Hadoop to various clusters.
  • Developed Talend jobs for data ingestion, enrichment, and provisioning.
  • Worked on migrating HiveQL queries to Impala to minimize query response time.
  • Involved in loading data from edge node to HDFS using shell scripting.
  • Used DataFrames for data transformation.
  • Worked with Kerberos and integrated it into the Hadoop cluster to make it more secure against unauthorized access.
  • Migrated an existing on-premises application to AWS; used AWS services such as EC2 and S3 for small data sets.
  • Created Hive tables, dynamic partitions, buckets for sampling, and working on them using HQL.
  • Worked on Spark using Scala and Spark SQL for faster testing and processing of data.
  • Implemented a proof of concept using Kafka and HBase for processing streaming data.
  • Involved in advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Apache Spark written in Scala.
  • Worked with BI teams in generating reports and designing ETL workflows on Tableau; deployed data from various sources into HDFS and built reports using Tableau.
  • Written Python scripts to analyze the data of the customers.
  • Implemented Talend jobs to load data from different sources and integrated with Kafka.
  • Integrated Oozie with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Map-Reduce, Pig, Hive, and Sqoop) as well as system specific jobs (such as Java programs and shell scripts).
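
As referenced above, one responsibility was a Spark Streaming job that consumed data from Kafka and stored it in HDFS. The following is a hedged sketch of that kind of job against the spark-streaming-kafka-0-10 Java API; the broker address, topic name, group id, batch interval, and output path are assumptions, not details of the actual project.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfsJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-to-hdfs");
        // 30-second micro-batches; the interval is an assumption.
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");       // assumed broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "hdfs-sink");                   // assumed group id

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                ssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("events"), kafkaParams)); // assumed topic

        // Write each non-empty micro-batch of message values to a time-stamped HDFS directory.
        stream.map(ConsumerRecord::value)
              .foreachRDD((rdd, time) -> {
                  if (!rdd.isEmpty()) {
                      rdd.saveAsTextFile("hdfs:///data/raw/events/" + time.milliseconds());
                  }
              });

        ssc.start();
        ssc.awaitTermination();
    }
}
```

In production such a job would typically also enable checkpointing and manage Kafka offsets explicitly; both are omitted here to keep the sketch short.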

Environment: MapReduce, HDFS, Spark, Scala, Python, Kafka, Hive, Pig, Spark Streaming, Talend, HBase, Tableau, Maven, Jenkins, UNIX, MRUnit, Git.

Confidential, Iowa

Hadoop Developer.

Responsibilities:

  • Worked on Spark SQL to handle structured data in Hive.
  • Involved in creating Hive tables, loading data, writing Hive queries, and creating partitions and buckets for optimization.
  • Involved in migrating tables from RDBMS into Hive tables using Sqoop and later generating visualizations using Tableau.
  • Worked on complex MapReduce programs to analyze data that exists on the cluster.
  • Analyzed substantial data sets by running Hive queries and Pig scripts.
  • Wrote Hive UDFs to sort struct fields and return complex data types.
  • Worked in AWS environment for development and deployment of custom Hadoop applications.
  • Involved in creating Shell scripts to simplify the execution of all other scripts (Pig, Hive, Sqoop, Impala and MapReduce) and move the data inside and outside of HDFS.
  • Created files and tuned SQL queries in Hive using Hue.
  • Involved in collecting and aggregating large amounts of log data using Storm and staging data in HDFS for further analysis.
  • Created the Hive external tables using Accumulo connector.
  • Managed real time data processing and real time Data Ingestion in MongoDB and Hive using Storm.
  • Created custom Solr query components to optimize search matching.
  • Developed Spark scripts by using Python shell commands.
  • Stored the processed results in the data warehouse and maintained the data using Hive.
  • Experienced in working with the Spark ecosystem, using Spark SQL and Scala queries on different formats such as text and CSV files (a minimal sketch appears after this list).
  • Created Oozie workflow and Coordinator jobs to kick off the jobs on time for data availability.
  • Worked with NoSQL databases such as MongoDB, creating MongoDB collections to load large sets of semi-structured data.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs, which run independently with time and data availability.
  • Worked with and learned a great deal about Amazon Web Services (AWS) cloud services such as EC2, S3, and EMR.
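
The list above mentions Spark SQL work over Hive data and over text/CSV inputs; below is a minimal sketch of that pattern in Java. The input path, column names, aggregation, and target table are illustrative assumptions.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class CsvToHiveAggregation {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("csv-to-hive-aggregation")
                .enableHiveSupport()                 // allows reading/writing Hive tables
                .getOrCreate();

        // Read a header-bearing CSV from HDFS; the path and schema are assumptions.
        Dataset<Row> orders = spark.read()
                .option("header", "true")
                .option("inferSchema", "true")
                .csv("hdfs:///data/staging/orders.csv");

        orders.createOrReplaceTempView("orders");

        // Plain Spark SQL over the registered view.
        Dataset<Row> dailyTotals = spark.sql(
                "SELECT order_date, SUM(amount) AS total_amount "
                + "FROM orders GROUP BY order_date");

        // Persist the aggregate as a Hive table for downstream reporting (e.g. Tableau).
        dailyTotals.write().mode("overwrite").saveAsTable("reporting.daily_totals");

        spark.stop();
    }
}
```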

Environment: HDFS, MapReduce, Storm, Hive, Pig, Sqoop, MongoDB, Apache Spark, Python, Accumulo, Oozie Scheduler, Kerberos, AWS, Tableau, Java, UNIX Shell scripts, HUE, Solr, Git, Maven.

Confidential, MA

Hadoop Developer.

Responsibilities:

  • Responsible for importing log files from various sources into HDFS using Flume.
  • Handled big data using a Hadoop cluster consisting of 40 nodes.
  • Performed complex HiveQL queries on Hive tables.
  • Captured the data logs from web servers into HDFS using Flume & Splunk for analysis.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Exported data from DB2 to HDFS using Sqoop and developed MapReduce jobs using the Java API (a minimal sketch appears after this list).
  • Created final tables in Parquet format.
  • Developed PIG scripts for source data validation and transformation.
  • Developed Shell, and Python scripts to automate and provide Control flow to Pig scripts.
  • Involved in unit testing of MapReduce jobs using MRUnit.
  • Utilized Hive and Pig to create BI reports.
  • Developed data integration programs in a Hadoop environment with the NoSQL data store Cassandra for data access and analysis.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, Pig, and Sqoop.
  • Worked with Informatica MDM in creating single view of the data.
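
One bullet above refers to MapReduce jobs written against the Java API over data brought in from DB2 via Sqoop. The sketch below shows the general shape of such a job; the tab-delimited input format, the choice of the first field as the grouping key, and the HDFS paths are assumptions made for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RecordCountByKey {

    public static class KeyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            // Assume the first tab-delimited field is the grouping key.
            String[] fields = line.toString().split("\t");
            context.write(new Text(fields[0]), ONE);
        }
    }

    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> counts, Context context)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable count : counts) {
                total += count.get();
            }
            context.write(key, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "record-count-by-key");
        job.setJarByClass(RecordCountByKey.class);
        job.setMapperClass(KeyMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path("/data/db2_export"));      // assumed input
        FileOutputFormat.setOutputPath(job, new Path("/data/record_counts")); // assumed output
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```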

Environment: Hortonworks, HDFS, Pig, Hive, MapReduce, Java, Informatica, Oozie, Linux/Unix Shell scripting, Cassandra, Python, Perl, Java (JDK 1.7), Git, Maven, Jenkins.

Confidential, NJ

Java Developer.

Responsibilities:

  • Effectively interacted with team members and business users for requirements gathering.
  • Involved in analysis, design, and implementation phases of the software development lifecycle (SDLC).
  • Implemented Spring core J2EE patterns such as MVC, Dependency Injection (DI), and Inversion of Control (IoC).
  • Implemented REST web services with the Jersey API to handle customer requests (a minimal sketch appears after this list).
  • Developed test cases using JUnit and used Log4j as the logging framework.
  • Worked with HQL and the Criteria API for retrieving data elements from the database.
  • Developed user interface using HTML, Spring Tags, JavaScript, JQuery, and CSS.
  • Developed the application using Eclipse IDE and worked under Agile Environment.
  • Designed and implemented front-end web pages using CSS, JSP, HTML, JavaScript, Ajax, and Struts.
  • Used the Eclipse IDE as the development environment to design, develop, and deploy Spring components on WebLogic.
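
The Jersey-based REST services mentioned above would typically take the shape of the hypothetical JAX-RS resource below; the resource path, the Customer DTO, and the in-memory lookup are assumptions standing in for the real Spring-backed service.

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/customers")
public class CustomerResource {

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Response getCustomer(@PathParam("id") long id) {
        // In the real application this would delegate to a Spring-managed service;
        // a placeholder lookup keeps the sketch self-contained.
        Customer customer = findCustomer(id);
        if (customer == null) {
            return Response.status(Response.Status.NOT_FOUND).build();
        }
        return Response.ok(customer).build();
    }

    private Customer findCustomer(long id) {
        return id == 1 ? new Customer(1, "Sample Customer") : null;
    }

    // Hypothetical DTO serialized to JSON by the configured provider.
    public static class Customer {
        public long id;
        public String name;

        public Customer(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }
}
```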

Environment: Java, J2EE, HTML, JavaScript, CSS, jQuery, Spring 3.0, JNDI, Hibernate 3.0, JavaMail, Web Services, REST, Oracle 10g, JUnit, Log4j, Eclipse, WebLogic 10.3.

Confidential

Java Developer.

Responsibilities:

  • Designed and implemented the training and reports modules of the application using Servlets, JSP and Ajax.
  • Developed custom JSP tags for the application.
  • Wrote queries for fetching and manipulating data using the iBATIS ORM framework.
  • Used Quartz schedulers to run jobs sequentially at scheduled times (a minimal sketch appears after this list).
  • Implemented design patterns like Filter, Cache Manager, and Singleton to improve the performance of the application.
  • Implemented the reports module of the application using Jasper Reports to display dynamically generated reports for business intelligence.
  • Deployed the application in client’s location on Tomcat Server.
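
The Quartz scheduling mentioned above generally follows the pattern sketched below; the job class, trigger schedule, and report-generation placeholder are assumptions rather than details of the actual application.

```java
import org.quartz.CronScheduleBuilder;
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

public class NightlyReportScheduler {

    // The work to run; in the application this would generate the Jasper reports.
    public static class ReportJob implements Job {
        @Override
        public void execute(JobExecutionContext context) {
            System.out.println("Generating nightly reports...");
        }
    }

    public static void main(String[] args) throws SchedulerException {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();

        JobDetail job = JobBuilder.newJob(ReportJob.class)
                .withIdentity("nightlyReports")
                .build();

        // Fire every day at 01:00; the schedule itself is an assumption.
        Trigger trigger = TriggerBuilder.newTrigger()
                .withSchedule(CronScheduleBuilder.dailyAtHourAndMinute(1, 0))
                .build();

        scheduler.scheduleJob(job, trigger);
        scheduler.start();
    }
}
```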

Environment: HTML, JavaScript, Ajax, Java, Servlets, JSP, iBATIS, Tomcat Server, SQL Server, Jasper Reports.
