
Sr. Hadoop Developer Resume


Plano, TX

PROFESSIONAL SUMMARY:

  • 8+ years of experience in the IT industry, including Java, SQL, the Big Data environment, and the Hadoop ecosystem, covering the design, development, and maintenance of various applications.
  • Excellent understanding of Hadoop architecture, its daemons, and components such as HDFS, YARN, Resource Manager, Node Manager, Name Node, and Data Node on the HDP and CDH distributions.
  • Experience in developing MapReduce jobs with the Java API in Hadoop.
  • Implemented data ingestion between RDBMSs and HDFS using Sqoop, in both directions.
  • Involved in developing Pig Latin scripts for data transformation and migration.
  • Handled structured data using Hive.
  • Wrote ad-hoc queries for moving data from HDFS to Hive and analyzed the data using HiveQL.
  • Experience in writing custom UDFs in Java for Hive and Pig to extend their functionality.
  • Good knowledge of serialization formats such as SequenceFile, Avro, and Parquet. Worked with RDBMSs including MySQL and Oracle.
  • Experience in Cluster maintenance (Adding and removing cluster nodes, Cluster Monitoring and Troubleshooting, Racks, Disk Topology, Manage and review data backups, Manage and review Hadoop log files).
  • Loaded local data into HDFS using Apache NiFi.
  • Scheduled workflow using Oozie workflow Engine.
  • Authentication and authorization management for Hadoop cluster users using Kerberos and Sentry.
  • Implemented MapReduce jobs using Spark and Spark SQL with Scala.
  • Experienced with real-time data processing mechanisms in the Big Data ecosystem such as Apache Kafka, Storm, Spark Streaming, and Flume.
  • Experience in data collection, processing, and streaming with Kafka.
  • Working experience with NoSQL databases including HBase and Cassandra.
  • Created data visualizations with matplotlib, ggplot, GraphX, and Tableau for reports.
  • Worked with Big Data Hadoop applications using Talend in the cloud through Amazon Web Services (AWS) EC2 and S3.
  • Experience working with Teradata and batch-processing data using distributed computing.
  • Good knowledge of the graph databases JanusGraph and Neo4j.
  • Front-end development with HTML5/4, CSS3, JavaScript, jQuery, AngularJS, AJAX, and Bootstrap.
  • Excellent at implementing object-oriented programming techniques.
  • Strong in core Java, data structures, algorithm design, Object-Oriented Design (OOD), and Java components such as the Collections Framework, exception handling, the I/O system, and multithreading.
  • Hands-on experience with MVC architecture and Java EE frameworks such as Struts 2, Spring MVC, and Hibernate.
  • Worked in development environments using Git, JIRA, Jenkins, and Agile/Scrum and Waterfall methodologies.
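The custom Hive UDFs mentioned above boil down to a plain Java transformation wrapped in Hive's UDF interface. As a minimal, self-contained sketch (hypothetical class and method names; a real UDF would extend org.apache.hadoop.hive.ql.exec.UDF and operate on Hadoop Writable types), the core logic might look like this:

```java
// Sketch of the core logic behind a custom Hive UDF (hypothetical example).
// The Hive plumbing (extending UDF, Writable wrappers) is omitted so the
// class compiles and runs without Hive dependencies.
public class MaskUdfSketch {
    // Mask all but the last four characters of an identifier,
    // as a data-cleaning UDF might do before analysis.
    public static String mask(String value) {
        if (value == null || value.length() <= 4) {
            return value;
        }
        int keep = 4;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length() - keep; i++) {
            sb.append('*');
        }
        sb.append(value.substring(value.length() - keep));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(mask("4111111111111111")); // prints ************1111
    }
}
```

In a real deployment the compiled jar would be registered in Hive with `ADD JAR` and `CREATE TEMPORARY FUNCTION` before use in HiveQL.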

TECHNICAL SKILLS:

Hadoop Ecosystem: Hadoop, MapReduce, Pig, Hive, Impala, Sqoop, Flume, Kafka, NiFi, HBase, Oozie, Zookeeper, Kerberos, Sentry

Programming Languages: C, C++, Java, Scala, SQL

Databases: Oracle 10g, MySQL, SQL Server, Cassandra, JanusGraph, Neo4j (graph databases).

Cloud Platform: Amazon Web Services (EC2, S3, EMR)

IDEs: Eclipse, IntelliJ IDEA.

Collaboration: Git, JIRA, Jenkins

Web Development: HTML5/4, CSS3, JavaScript, jQuery, AngularJS, AJAX, Bootstrap.

Java/J2EE Technologies: Servlets, JSP (EL, JSTL, Custom Tags), JSF, Apache Struts, JUnit, Hibernate 3.x, Log4J, Java Beans, EJB 2.0/3.0, JDBC, RMI, JMS, JNDI.

Spark Technologies: Spark Core, Spark SQL, Spark Streaming, Kafka, Storm.

PROFESSIONAL EXPERIENCE:

Confidential, Plano, TX

Sr. Hadoop Developer

Responsibilities:

  • Analyzing the Hadoop cluster and different big data analytical and processing tools, including Pig, Hive, Sqoop, Spark with Scala and Java, and Spark Streaming.
  • Implementing Spark Streaming applications to consume data from Kafka topics, writing the processed streams to HBase, and streaming data using Spark with Kafka.
  • Involved in the architecture and design of a distributed time-series database platform using NoSQL technologies such as Hadoop/HBase.
  • Implementing Spark solutions to generate reports from Cassandra data; experienced in fetching and loading data in Cassandra.
  • Writing HiveQL to analyze the number of unique visitors and their visit information such as views and most visited pages.
  • Monitoring workload, job performance, and capacity planning using Cloudera Manager. Responsible for configuring the deployment environment to run the application using the Jetty server and WebLogic 10, with a Postgres database at the back end.
  • Managing multiple AWS accounts with multiple VPCs for both production and non-production environments.
  • Implementing ORM through Hibernate, preparing the database model for the project, and following the Scrum methodology for application development.
  • Supporting MapReduce programs running on the cluster and developing multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Implementing data ingestion into HDFS using Sqoop and running Pig scripts on huge chunks of data.
  • Using Pig for transformations, event joins, the Elephant Bird API, and pre-aggregations performed before loading JSON-format files onto HDFS.
  • Resolving performance issues in Pig and Hive with an understanding of MapReduce physical plan execution, using debugging commands to run code in an optimized way.
  • Good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
  • Creating the AWS VPC network for the installed instances and configuring the Security Groups and Elastic IPs accordingly.
  • Working with the Amazon EMR framework for processing data on EMR and EC2 instances.
  • Supporting setup of the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
  • Configuring Spark Streaming to receive real-time data from Apache Kafka and store the stream data to HDFS using Scala.
  • Using Spark to perform analytics on data in Hive; experienced with ETL working with Hive and MapReduce.
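The MapReduce jobs for data cleaning described above follow the classic map/shuffle/reduce pattern. As a hedged sketch with no Hadoop dependencies (class name is illustrative), the word-count version of that pattern can be shown with plain Java collections:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Minimal sketch of the MapReduce word-count pattern in plain Java.
// In a real Hadoop job the map step lives in a Mapper class and the
// reduce step in a Reducer; the framework performs the shuffle.
public class WordCountSketch {
    public static Map<String, Long> wordCount(List<String> lines) {
        return lines.stream()
                // map: emit one token per word on every line
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(w -> !w.isEmpty())
                // shuffle + reduce: group identical words and count them
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("big data big cluster", "data node");
        System.out.println(wordCount(lines));
    }
}
```

The same grouping-and-aggregation shape underlies the Hive and Pig jobs listed above; only the execution engine changes.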

Environment: Hadoop 2.6.0, HDFS, MapReduce, Spark Streaming, Spark Core, Spark SQL, Scala, Pig 0.14, Hive 1.2.1, Sqoop 1.4.4, Flume 1.6.0, Kafka, JSON, HBase.

Confidential - New York, NY

Sr. Big data Developer

Responsibilities:

  • Evaluated the suitability of Hadoop and its ecosystem for the project and implemented various proof-of-concept (POC) applications on both distributed data centers and cloud-based services, eventually adopting them to benefit from the Big Data Hadoop initiative.
  • Configured Hadoop clusters with Cloudera CDH4 as a hot standby to tackle failover situations and achieve full-stack resiliency.
  • Deployed a Hadoop cluster in Azure HDInsight to compare scalability and cost-effectiveness; queried the Hadoop cluster using PowerShell, Hue, and the remote console.
  • Estimated software and hardware requirements for the Name Node and Data Nodes and planned the cluster.
  • Developed ETL transformations that sourced from various sources such as an e-shopping website and in-person shopping records.
  • Did a POC on processing unstructured data in Azure Blob storage.
  • Integrated Hive and HBase, loaded data into HDFS, and bulk-loaded the cleaned data into HBase.
  • Wrote MapReduce programs and Hive UDFs in Java where the functionality was too complex.
  • Involved in loading data from the Linux file system to HDFS.
  • Developed Hive queries for the analysis, to categorize different items.
  • Designed and created Hive external tables using a shared metastore instead of the default Derby metastore, with partitioning, dynamic partitioning, and buckets.
  • Delivered a POC of Flume to handle real-time log processing for attribution reports.
  • Performed sentiment analysis on reviews of the products on the client's website.
  • Implemented real-time analytics with Apache Kafka and Storm.
  • Tested Spark on real-time data; did frequent-item mining in real time by implementing associative-rule mining.
  • Exported the resulting sentiment analysis data to Tableau for creating dashboards.
  • Used JUnit for unit testing MapReduce jobs.
  • Maintained system integrity of all sub-components (primarily HDFS, MapReduce, HBase, and Hive).
  • Reviewed peers' table creation in Hive, data loading, and queries.
  • Monitored system health and logs and responded accordingly to any warning or failure conditions.
  • Responsible for managing the test data coming from different sources.
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Involved in unit testing, interface testing, system testing, and user acceptance testing of the workflow tool.

Environment: Apache Hadoop, HDFS, Hive, MapReduce, Java, Flume, Hortonworks, Cloudera, Oozie, MySQL, UNIX, Core Java, and Pentaho

Confidential, Arizona

Big Data Engineer

Responsibilities:

  • Developed simple and complex MapReduce streaming jobs using Java and implemented them using Pig.
  • Ingested data between Oracle and HDFS using Sqoop, in both directions.
  • Extensively used Pig for data cleansing.
  • Analyzed the data by running Pig Latin scripts to study customer behavior.
  • Handled structured and unstructured data and applied ETL processes.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Wrote multiple UDF programs in Java for data extraction, transformation, and aggregation from multiple file formats (XML, JSON, and CSV).
  • Developed and maintained complex outbound notification applications that run on custom architectures, using Core Java, J2EE, SOAP, XML, JMS, JBoss, and Web Services.
  • Developed Spark code using Scala and Spark SQL for faster testing and data processing.
  • Experienced in monitoring and debugging performance issues on Linux (RHEL and CentOS).
  • Involved in production rollout support, which included monitoring the solution post go-live and resolving any issues discovered prior to rollout.
  • Integrated Hadoop with Kafka; uploaded clickstream data from Kafka to HDFS.
  • Documented operational issues following standards and procedures in the issue-tracking tool JIRA.
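The Java UDFs for extracting fields from XML, JSON, and CSV mentioned above can be sketched at the CSV level as plain Java. This is a hypothetical illustration (class and method names are my own); a production job would use a proper CSV parser to handle quoting and escapes:

```java
// Sketch of CSV field extraction of the kind a Java UDF might perform.
// Hypothetical example: naive comma splitting, no quote handling.
public class CsvFieldSketch {
    // Return the trimmed field at the given index, or null if out of range.
    public static String field(String row, int index) {
        if (row == null) {
            return null;
        }
        String[] parts = row.split(",", -1); // -1 keeps trailing empty fields
        return (index >= 0 && index < parts.length) ? parts[index].trim() : null;
    }

    public static void main(String[] args) {
        System.out.println(field("1001, laptop, 899.99", 1)); // prints laptop
    }
}
```

Equivalent extraction logic for XML and JSON would delegate to a parser (e.g., JAXB or a JSON library) rather than string splitting.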

Environment: Pig, MapReduce, Sqoop, Kafka, Spark, HBase, Oozie, Java, JIRA, Scala, J2EE, XML, SOAP, JSON, JBoss, CSV, Linux, RHEL, CentOS.

Confidential

Big Data/Hadoop Developer

Responsibilities:

  • Worked on a Hadoop cluster of 83 nodes with 896 terabytes of capacity.
  • Worked on MapReduce jobs, Hive, and Pig.
  • Involved in requirement analysis, design, and development.
  • Imported and exported data into Hive and HBase using Sqoop from an existing SQL Server database.
  • Experience working on processing unstructured data using Pig and Hive.
  • Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Developed Hive queries, Pig scripts, and Spark SQL queries to analyze large datasets.
  • Exported the result set from Hive to MySQL using Sqoop.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Worked on debugging and performance tuning of Hive and Pig jobs.
  • Gained experience in managing and reviewing Hadoop log files.
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Used HBase as the NoSQL database.
  • Actively involved in code review and bug fixing to improve performance.

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Flume, Linux, HBase, Java, Oozie.

Confidential

Software Developer

Responsibilities:

  • Interacted with business managers to transform requirements into technical solutions.
  • Followed Agile software development with the Scrum methodology.
  • Worked with Java, J2EE, Struts, Spring, Web Services, and Hibernate in a fast-paced development environment.
  • Performed server-side coding and development using Spring, Hibernate, and Web Services, with exception handling and Java Collections (Set, List, Map), in Windows and Linux environments.
  • Involved in defect tracking as well as planning using JIRA.
  • Resolved a complicated production issue for business managers in which record counts were displayed incorrectly.
  • Created and modified Struts actions and worked with Struts validations.
  • Worked on Spring application framework features, the IoC container and AOP, and integrated Spring with Hibernate using the HibernateTemplate.
  • Developed an enterprise inter-process communication framework using Spring RESTful Web Services. Developed SOAP and REST Web Services (JAXB, JSON, JAX-RS, JAX-WS) and the Hibernate persistence layer.
  • Implemented the Spring MVC framework in the presentation tier for all the essential controls. Used the Log4j utility to generate run-time logs.
  • Prepared unit and system testing specification documents and performed unit and system testing of the application.
  • Reviewed the code to ensure adherence to Java coding standards.
  • Developed the Functional Requirement Document based on users' requirements.

Environment: Core Java, Servlets, Spring 3.0, Spring MVC, Hibernate, REST Web Services, SQL Developer, Apache Tomcat 7.0, MongoDB, Multithreading, WebSphere, Agile Methodology, Design Patterns, Apache Maven, JUnit.

Confidential

Java Developer

Responsibilities:

  • Involved in the complete Software Development Lifecycle (SDLC) using the Agile iterative development methodology. Interacted with the end users and participated in SCRUM meetings.
  • Developed an end-user-friendly GUI using JSP, HTML, DHTML, JavaScript, and CSS.
  • Implemented CSS manipulation, HTML event functions, and JavaScript effects and animations using jQuery.
  • Involved in development of the application using Struts, RAD, and an Oracle database.
  • Developed the data access layer using the Hibernate ORM framework.
  • Coded numerous DAOs using HibernateDaoSupport. Used Criteria, HQL, and SQL as the query languages in Hibernate mapping.
  • Used Web Services for transmission of large blocks of XML data using SOAP.
  • Used XML for data exchange and schemas (XSDs) for XML validation. Used XSLT for transformation of XML.
  • Wrote numerous test cases for unit testing of the code using the JUnit testing framework.
  • Used Log4j to implement logging facilities. Used ClearCase for version control.
  • Used Ant as a build tool.
  • Configured and deployed the application on WebSphere Application Server.

Environment: Java, Java EE, WebSphere Application Server, SOAP, Eclipse, Struts, Hibernate, Web Services, HTML, CSS, XML, Ant, UML, JavaScript, jQuery, Rational Rose, JUnit, Log4j, ClearCase, Windows XP.
