Hadoop Developer Resume

Hazelwood, MO

SUMMARY

  • 7+ years of IT experience, including strong experience in the Big Data ecosystem and Java/J2EE technologies.
  • Hands-on experience with the Hadoop ecosystem, including HDFS, Spark, Hive, Pig, Sqoop, Impala, Oozie, Flume, Kafka, HBase, ZooKeeper, and MapReduce.
  • Extensive understanding of Hadoop architecture, workload management, schedulers, and scalability, and of components such as YARN and MapReduce.
  • Hands-on experience with the RDD architecture, implementing Spark operations on RDDs and optimizing transformations and actions in Spark (see the sketch after this list).
  • Exposure to Spark Streaming and Spark SQL in a production environment.
  • Built, configured, monitored, and supported Cloudera Hadoop (CDH5) clusters.
  • Good knowledge of Hadoop cluster architecture and cluster monitoring.
  • Experience in managing and reviewing Hadoop log files.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Extensive experience with data-ingestion technologies such as Sqoop, Flume, and Kafka.
  • Expertise in Java, Scala, and scripting languages such as Python.
  • Deep understanding and knowledge of NoSQL databases such as MongoDB, HBase, and Cassandra.
  • Involved in setting up standards and processes for Hadoop-based application design and implementation.
  • Well versed in importing and exporting data with Sqoop between HDFS and relational database systems.
  • Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and Core Java design patterns.
  • Expert in managing Hadoop clusters using the Cloudera Manager tool.
  • Involved in the complete project life cycle: design, development, testing, and implementation of client/server and web applications.
  • Experience administering Linux (Red Hat): installation, configuration, troubleshooting, security, backups, performance monitoring, and fine-tuning.
  • Extensive experience working with Oracle, DB2, SQL Server, and MySQL databases, and scripting to deploy monitors and checks and to automate critical system-admin functions.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Experience in Java, JSP, Servlets, EJB, WebLogic, WebSphere, Hibernate, Spring, JBoss, JDBC, RMI, JavaScript, Ajax, jQuery, XML, and HTML.
  • Ability to adapt to evolving technology, with a strong sense of responsibility and accomplishment.
  • Conversant with Agile methodology standards and Test-Driven Development.
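
A minimal PySpark sketch of the RDD operations described above; the input path and record layout are hypothetical:

    from pyspark import SparkContext

    sc = SparkContext(appName="rdd-sketch")

    # Hypothetical input: one CSV record per line, "user_id,event,amount"
    lines = sc.textFile("hdfs:///data/events/part-*")

    # Transformations are lazy: parse, filter, and aggregate per user
    totals = (lines.map(lambda line: line.split(","))
                   .filter(lambda f: f[1] == "purchase")
                   .map(lambda f: (f[0], float(f[2])))
                   .reduceByKey(lambda a, b: a + b))

    # Actions such as take() trigger execution of the whole lineage
    for user_id, total in totals.take(10):
        print(user_id, total)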

TECHNICAL SKILLS

Big Data Skillset: Frameworks & Environments: Cloudera CDH, Hortonworks HDP, Hadoop 1.x/2.x, HDFS, MapReduce, Pig, Hive, Impala, HBase, Data Lake, Cassandra, MongoDB, Sqoop, Oozie, ZooKeeper, Flume, Apache Spark, Storm, Kafka, YARN, Falcon, Avro

Java & J2EE Technologies: Core Java (Java 8, JavaFX), Hibernate, Spring, JSP, Servlets, JavaBeans, JDBC, Java Sockets, JavaScript, jQuery, JSF, PrimeFaces, XML, EJB, HTML, XHTML, CSS, SOAP, XSLT, DHTML

Messaging Services: JMS, MQ Series, MDB

J2EE MVC Frameworks: Struts …, Struts 2.1, Spring 3.2, Spring MVC, Spring Web Flow, AJAX

IDE Tools: IntelliJ, PyCharm, Eclipse

Web Services & Technologies: XML, HTML, XHTML, HTML5, AJAX, jQuery, CSS, JavaScript, AngularJS, VBScript, WSDL, SOAP, JDBC, ODBC; Architectures: REST, MVC

Databases & Application Servers: Oracle (8i, 9i, 10g, 11i), MySQL, DB2, MS Access, Teradata, PostgreSQL, Cassandra, HBase, MongoDB

Other Tools: Putty, WinSCP, GitLab, GitHub, SVN, CVS

PROFESSIONAL EXPERIENCE

Confidential, San Francisco, CA

Big Data Engineer

Responsibilities:

  • Attended daily stand-ups and other project meetings to gather new requirements.
  • As part of the Center of Excellence team, triaged and investigated application issues and built and delivered fixes.
  • Used Sqoop to pull data from different databases into HBase and Hive.
  • Gathered log data from various sources using Flume for further processing.
  • Imported millions of structured records from relational databases with Sqoop, processed them with Spark, and stored the results in HDFS in CSV format (see the first sketch after this list).
  • Took ownership of Control-M jobs and supported failing jobs.
  • Improved performance and optimized existing Hadoop algorithms using the Spark context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Worked with Elasticsearch to quickly store, analyze, and search large volumes of data.
  • Wrote Python scripts to automate jobs that pull data from DB2, Oracle, and Teradata (see the JDBC sketch after this list).
  • Wrote Scala code to help the team resolve issues related to PredictionIO.
  • Set up experiments to provide top-level recommendations.
  • Presented at weekly showcases with business stakeholders and product managers to review changes and adopt new inputs.
  • Generated different file formats such as JSON, Parquet, CSV, and TSV.
  • Verified Splunk alerts.
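
A minimal PySpark sketch of the Sqoop-import-then-process flow above; paths, schema, and column names are hypothetical (the Sqoop import itself is a separate CLI step):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sqoop-csv-pipeline").getOrCreate()

    # Sqoop lands the source table as delimited files under this HDFS path
    orders = (spark.read
              .option("header", "true")
              .option("inferSchema", "true")
              .csv("hdfs:///landing/sqoop/orders"))

    # Example transformation with Spark SQL over the imported data
    orders.createOrReplaceTempView("orders")
    daily = spark.sql("""
        SELECT order_date, COUNT(*) AS order_count, SUM(amount) AS total_amount
        FROM orders
        GROUP BY order_date
    """)

    # Store the processed result back into HDFS in CSV format
    (daily.write.mode("overwrite")
          .option("header", "true")
          .csv("hdfs:///processed/orders_daily"))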
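
And a sketch of the automated database pulls using Spark's JDBC reader; URLs, credentials, and table names are placeholders, and each vendor's JDBC driver must be on the classpath:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("jdbc-pull").getOrCreate()

    # One entry per source system; connection details are hypothetical
    sources = {
        "oracle":   ("jdbc:oracle:thin:@//ora-host:1521/ORCL", "SALES.ORDERS"),
        "db2":      ("jdbc:db2://db2-host:50000/SAMPLE",       "SALES.ORDERS"),
        "teradata": ("jdbc:teradata://td-host/DATABASE=sales", "ORDERS"),
    }

    for name, (url, table) in sources.items():
        df = (spark.read.format("jdbc")
              .option("url", url)
              .option("dbtable", table)
              .option("user", "etl_user")   # placeholder credentials
              .option("password", "***")
              .load())
        # Land each pull in HDFS for downstream processing
        df.write.mode("overwrite").parquet(f"hdfs:///landing/{name}/orders")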

Environment: HDP, Spark, HBase, Elasticsearch, Sqoop, Hive, Python, Scala, DB2, Oracle, Teradata, Flume, Splunk.

Confidential, Hazelwood, MO

Hadoop Developer

Responsibilities:

  • Worked with Hadoop ecosystem components such as HBase, Sqoop, ZooKeeper, Oozie, Hive, and Pig on the Cloudera Hadoop distribution.
  • Developed Pig and Hive UDFs in Java to extend Pig and Hive, and wrote Pig scripts for sorting, joining, filtering, and grouping data.
  • Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
  • Developed Spark programs using Scala, created Spark SQL queries, and built Oozie workflows for Spark jobs.
  • Prepared Oozie workflows with Sqoop actions to migrate data from relational databases such as Oracle and Teradata to HDFS.
  • Developed Spark programs that process data faster than standard MapReduce programs.
  • Created Hive tables with dynamic partitions and buckets for sampling, and worked on them using HiveQL (see the Hive sketch after this list).
  • Used Sqoop to load data into HBase and Hive.
  • Wrote Hive queries to analyze the data and generate end reports for business users.
  • Worked on scalable distributed computing systems, software architecture, data structures, and algorithms using Hadoop, Apache Spark, and Apache Storm.
  • Ingested streaming data into Hadoop using Spark, the Storm framework, and Scala.
  • Good experience with NoSQL databases such as MongoDB.
  • Created producer and consumer APIs using Kafka (see the Kafka sketch after this list).
  • Handled large datasets using Spark's in-memory capabilities, broadcast variables, efficient joins, transformations, and other features (see the broadcast-join sketch after this list).
  • Wrote Spark code and Spark SQL/Streaming jobs for faster testing and processing of data.
  • Used Spark for interactive queries, processing of streaming data, and integration with popular NoSQL databases for high volumes of data.
  • Developed a data pipeline using Kafka, HBase, Mesos, Spark, and Hive to ingest, transform, and analyze customer behavioral data.
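
A sketch of the Hive table work above, run through the (assumed) PyHive client; host, database, and column names are hypothetical:

    from pyhive import hive  # assumed HiveServer2 client library

    conn = hive.Connection(host="hive-host", port=10000, username="etl_user")
    cur = conn.cursor()

    # Partitioned, bucketed table for sampling (names are illustrative)
    cur.execute("""
        CREATE TABLE IF NOT EXISTS sales.events (
            user_id STRING,
            event STRING,
            amount DOUBLE
        )
        PARTITIONED BY (event_date STRING)
        CLUSTERED BY (user_id) INTO 32 BUCKETS
        STORED AS ORC
    """)

    # Dynamic partitioning: Hive derives event_date from the SELECT's last column
    cur.execute("SET hive.exec.dynamic.partition.mode=nonstrict")
    cur.execute("""
        INSERT OVERWRITE TABLE sales.events PARTITION (event_date)
        SELECT user_id, event, amount, event_date
        FROM sales.events_staging
    """)

    # Bucketing enables efficient sampling over the table
    cur.execute("SELECT * FROM sales.events TABLESAMPLE(BUCKET 1 OUT OF 32 ON user_id)")
    print(cur.fetchmany(5))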
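
A minimal producer/consumer sketch using the (assumed) kafka-python client; the broker address and topic name are placeholders:

    import json
    from kafka import KafkaProducer, KafkaConsumer

    BROKERS = ["kafka-host:9092"]  # placeholder broker list

    # Producer side: publish JSON-encoded events to a hypothetical topic
    producer = KafkaProducer(
        bootstrap_servers=BROKERS,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    producer.send("customer-events", {"user_id": "u1", "event": "click"})
    producer.flush()

    # Consumer side: read the same topic as part of a consumer group
    consumer = KafkaConsumer(
        "customer-events",
        bootstrap_servers=BROKERS,
        group_id="behavior-pipeline",
        auto_offset_reset="earliest",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )
    for message in consumer:
        print(message.offset, message.value)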
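
And a broadcast-join sketch in PySpark; the table paths and join key are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.appName("broadcast-join").getOrCreate()

    # Hypothetical inputs: a large fact table and a small dimension table
    events = spark.read.parquet("hdfs:///warehouse/events")  # large
    users = spark.read.parquet("hdfs:///warehouse/users")    # small

    # Broadcasting the small side ships it to every executor,
    # avoiding a shuffle of the large table
    enriched = events.join(broadcast(users), "user_id")
    enriched.write.mode("overwrite").parquet("hdfs:///warehouse/events_enriched")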

Environment: Hadoop, HDFS, CDH, Pig, Hive, Oozie, ZooKeeper, HBase, Spark, Storm, Spark SQL, NoSQL, Scala, Kafka, Mesos, MongoDB

Confidential, Atlanta, GA

Hadoop Developer

Responsibilities:

  • Participated in discussions with business users to gather the required knowledge.
  • Analyzed the requirements to develop the framework.
  • Designed and developed the architecture for a data-services ecosystem spanning relational, NoSQL, and Big Data technologies.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data techniques.
  • Developed Java Spark Streaming jobs to load raw files and the corresponding processed-metadata files into AWS S3 and an Elasticsearch cluster.
  • Implemented PySpark logic to transform and process data in various formats such as XLSX, XLS, JSON, and TXT (see the sketch after this list).
  • Built scripts to load PySpark-processed files into a Redshift database, using a variety of PySpark logic.
  • Developed scripts to monitor and capture the state of each file as it moves through the pipeline.
  • Designed and developed a real-time stream-processing application using Spark, Kafka, Scala, and Hive to perform streaming ETL and apply machine learning (see the streaming sketch after this list).
  • Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources.
  • Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs, and used Oozie operational services for batch processing and dynamic workflow scheduling.
  • Migrated existing applications and developed new applications using AWS cloud services.
  • Worked with data investigation, discovery, and mapping tools to scan every data record from many sources.
  • Implemented shell scripts to automate the whole process.
  • Extracted data from SQL Server to create automated visualization reports and dashboards in Tableau.
  • Responsible for cluster maintenance: adding and removing cluster nodes, monitoring and troubleshooting, and managing and reviewing data backups and log files.
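
A PySpark sketch of the file-processing-to-Redshift flow; bucket names, schema, and connection details are placeholders (XLSX/XLS would need an extra spreadsheet reader, so only JSON and TXT are shown), and the Redshift JDBC driver must be on the classpath:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("file-to-redshift").getOrCreate()

    # Hypothetical raw inputs in mixed formats
    json_df = spark.read.json("s3a://raw-bucket/incoming/*.json")
    txt_df = spark.read.option("delimiter", "\t").csv("s3a://raw-bucket/incoming/*.txt")

    # Basic cleanup; the TXT frame would follow the same path
    cleaned = json_df.dropDuplicates().na.drop(subset=["record_id"])

    # Load into Redshift over JDBC (URL, table, and credentials are placeholders)
    (cleaned.write.format("jdbc")
        .option("url", "jdbc:redshift://example-cluster:5439/analytics")
        .option("dbtable", "public.records")
        .option("user", "etl_user")
        .option("password", "***")
        .mode("append")
        .save())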
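
And a streaming-ETL sketch, shown here with PySpark Structured Streaming over Kafka; the topic, schema, and paths are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StructType, StringType, DoubleType

    spark = SparkSession.builder.appName("streaming-etl").getOrCreate()

    # Hypothetical event schema and topic name
    schema = (StructType()
              .add("user_id", StringType())
              .add("event", StringType())
              .add("amount", DoubleType()))

    raw = (spark.readStream.format("kafka")
           .option("kafka.bootstrap.servers", "kafka-host:9092")
           .option("subscribe", "customer-events")
           .load())

    # Kafka delivers bytes; parse the value column into typed fields
    events = (raw.select(from_json(col("value").cast("string"), schema).alias("e"))
                 .select("e.*")
                 .filter(col("amount") > 0))

    # Continuously append the transformed stream to a hypothetical HDFS path
    query = (events.writeStream
             .format("parquet")
             .option("path", "hdfs:///streaming/events")
             .option("checkpointLocation", "hdfs:///streaming/checkpoints/events")
             .start())
    query.awaitTermination()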

Environment: AWS S3, Java, Maven, Python, Spark, Kafka, Elasticsearch, Amazon Redshift, shell scripts, PySpark, Pig, Hive, Oozie, JSON

Confidential, Seattle, WA

Hadoop Developer

Responsibilities:

  • Developed simple to complex MapReduce jobs in Java for processing and validating data.
  • Developed a data pipeline using Sqoop, Spark, MapReduce, and Hive to ingest, transform, and analyze customer behavioral data.
  • Exported analyzed data to relational databases using Sqoop so the BI team could generate visualization reports.
  • Implemented Spark using Python and Spark SQL for faster data processing, and implemented algorithms for real-time analysis in Spark.
  • Used Spark for interactive queries, processing of streaming data, and integration with popular NoSQL databases for high volumes of data.
  • Used the Spark-Cassandra Connector to load data to and from Cassandra, and streamed data in real time using Spark with Kafka (see the connector sketch after this list).
  • Developed Kafka producers and consumers in Java, integrated them with Apache Storm, and ingested data into HDFS and HBase by implementing rules in Storm.
  • Built a prototype for real-time analysis using Spark Streaming and Kafka.
  • Moved log files generated from various sources to HDFS for further processing through Flume.
  • Created Hive tables, worked on them using HiveQL, and performed data analysis using Hive and Pig.
  • Developed Oozie workflows to manage and schedule jobs on the Hadoop cluster, triggering daily, weekly, and monthly batch cycles.
  • Experienced with job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
  • Extended Hive and Pig core functionality by writing custom user-defined functions (UDFs).
  • Used Impala to pull data from Hive tables.
  • Worked with Apache Flume to collect and aggregate large amounts of log data and store it on HDFS for further analysis.
  • Created and developed an end-to-end data ingestion pipeline on Hadoop.
  • Involved in the architecture and design of a distributed time-series database platform using NoSQL technologies such as Hadoop/HBase and ZooKeeper.
  • Integrated the NoSQL database HBase with MapReduce to move bulk amounts of data into HBase.
  • Efficiently put and fetched data to/from HBase by writing MapReduce jobs.
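
A PySpark sketch of the Spark-Cassandra Connector usage above; host, keyspace, and table names are placeholders, and the DataStax connector package must be supplied at submit time:

    from pyspark.sql import SparkSession

    # Assumes the connector is supplied, e.g.
    #   spark-submit --packages com.datastax.spark:spark-cassandra-connector_2.12:3.4.0 ...
    spark = (SparkSession.builder
             .appName("cassandra-io")
             .config("spark.cassandra.connection.host", "cassandra-host")  # placeholder
             .getOrCreate())

    # Read a hypothetical keyspace/table into a DataFrame
    users = (spark.read.format("org.apache.spark.sql.cassandra")
             .options(keyspace="sales", table="users")
             .load())

    # Transform, then write the result back to another Cassandra table
    active = users.filter("active = true")
    (active.write.format("org.apache.spark.sql.cassandra")
        .options(keyspace="sales", table="active_users")
        .mode("append")
        .save())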

Environment: Hadoop, Kafka, Spark, Sqoop, Spark SQL, Spark Streaming, Hive, Scala, Pig, NoSQL, Impala, Oozie, HBase, ZooKeeper

Confidential

Java Developer

Responsibilities:

  • Identified system requirements and developed system specifications; responsible for high-level design and development of use cases.
  • Designed database connections using JDBC.
  • Organized and participated in meetings with clients and team members.
  • Developed the web-based Bristow application using J2EE (Spring MVC framework), POJOs, JSP, JavaScript, HTML, and jQuery, including business classes and queries to retrieve data from the backend.
  • Developed client-side validation using jQuery.
  • Worked with Bootstrap to develop responsive web pages.
  • Implemented client-side and server-side data validation using JavaScript.
  • Customized the data model for new applications using Hibernate ORM technology.
  • Implemented DAOs and DTOs using Spring with Hibernate ORM.
  • Implemented Hibernate as the ORM layer for transactions with the MySQL database.
  • Developed authentication and access control services for the application using Spring LDAP.
  • Built event-driven application features using AJAX, object-oriented JavaScript, JSON, and XML.
  • Developed asynchronous features using jQuery; valuable experience with form validation using regular expressions and jQuery Lightbox.
  • Used MySQL for the EIS layer.
  • Involved in the design and development of the UI using HTML, JavaScript, and CSS.
  • Designed and developed various data gathering forms using HTML, CSS, JavaScript, JSP and Servlets.
  • Developed user interface modules using JSP, Servlets and MVC framework.
  • Implemented J2EE standards and MVC2 architecture using the Struts framework.
  • Developed J2EE components on Eclipse IDE.
  • Used JDBC to invoke stored procedures and for database connectivity to SQL.
  • Deployed the applications on the Tomcat application server.
  • Developed RESTful web services using JSON.
  • Created Java Beans accessed from JSPs to transfer data across tiers.
  • Performed database modifications using SQL, PL/SQL, stored procedures, triggers, and views in Oracle 9i.

Environment: Java, JSP, Servlets, JDBC, Eclipse, Web services, Spring 3.0, Hibernate 3.0, MySQL, JSON, Struts, HTML, JavaScript, CSS.

Confidential

Java Developer

Responsibilities:

  • Used Eclipse to write code for JSPs and Servlets.
  • Involved in designing the user interface using JSPs.
  • Developed the application using core Java concepts.
  • Used JDBC to invoke stored procedures and for database connectivity to Oracle.
  • Used Struts Framework along with JSP, HTML5 to construct the dynamic web pages for the application.
  • Participated in feature team meetings and code review meetings.
  • Responsible for writing SQL queries using MySQL and Oracle 10g.
  • Developed various J2EE components like Servlets, JSP, AJAX, SAX, and JMS.
  • Used the Spring MVC framework to enable interactions between the JSP/view layer and implemented various design patterns.
  • Utilized JSP, HTML5, CSS3, Bootstrap, and AngularJS for front-end development.
  • Used JPA and Hibernate annotations for defining object relational metadata.
  • Implemented the business layer using core Java and Spring beans with dependency injection and Spring annotations.
  • Used a microservice architecture, with Spring Boot-based services interacting with and leveraging AWS to build, test, and deploy identity microservices.

Environment: Java, J2EE, JSP, JPA, AJAX, SAX, JMS, HTML5, CSS3, Bootstrap, AngularJS, JavaScript, Hibernate, Spring MVC, Eclipse, Oracle, SQL, MySQL, Spring Beans
