Hadoop Developer Resume

Long Beach, CA

SUMMARY

  • Over 5 years of progressive experience in the IT industry, with proven expertise in analysis, design, development, implementation, and testing of software applications using Big Data (Hadoop) and Java-based technologies.
  • 3 years of hands-on experience with Hadoop core and ecosystem components, including Spark, Scala, HDFS, MapReduce, Hive, Pig, Storm, Kafka, YARN, HBase, Oozie, ZooKeeper, Flume, and Sqoop.
  • Experience working with the Hortonworks and Cloudera Hadoop distributions.
  • Assisted in Cluster maintenance, Cluster Monitoring and Troubleshooting, Managing and Reviewing data backups and log files.
  • Developed multiple spark jobs in Scala for data cleaning, pre-processing and aggregating.
  • Expertise in working with the Hive data warehouse tool: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HiveQL queries (see the sketch after this list).
  • Streamed log files with low latency using Flume and managed the downstream data flow into the Hadoop ecosystem and its analysis segments.
  • Developed multiple MapReduce jobs in java for data cleaning, pre-processing.
  • Automated jobs that extract data from sources such as MySQL and push the result sets to the Hadoop Distributed File System.
  • Experience importing data from MySQL into HDFS using Sqoop.
  • Good knowledge of using Apache NiFi to automate data movement between different Hadoop systems.
  • Hands-on experience with NoSQL databases such as MongoDB, HBase, and Cassandra.
  • Hands-on experience setting up workflows with the Apache Oozie workflow engine for managing and scheduling Hadoop jobs.
  • Developed Pig Latin scripts for data cleansing and transformation.
  • Good knowledge of scripting languages such as Linux/Unix shell scripting and Python.
  • Hands-on experience importing unstructured data into HDFS using Flume.
  • Experience working with Build tools like Maven and SBT.
  • Hands-on experience configuring and working with Flume to load data from multiple sources directly into HDFS.
  • Experience with Kibana as a data visualization tool.
  • Experience working with databases such as Oracle, MySQL, IBM DB2, and Teradata.
  • Experience in database design using PL/SQL to write Stored Procedures, Functions, Triggers and strong experience in writing complex queries for Oracle.
  • Experienced and skilled Agile Developer with a strong record of excellent teamwork and successful coding.
  • Strong Problem Solving and Analytical skills and abilities to make Balanced & Independent Decisions.
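
A minimal sketch of the Hive partitioning and bucketing referenced above, written as a Spark-with-Scala job. The table name, columns, HDFS path, and bucket count are illustrative assumptions, not an actual project schema.

```scala
import org.apache.spark.sql.SparkSession

object HiveLayoutSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-layout-sketch")
      .enableHiveSupport() // register tables in the Hive metastore
      .getOrCreate()

    // Hypothetical raw extract landed on HDFS by Sqoop/Flume.
    val claims = spark.read
      .option("header", "true")
      .csv("/data/raw/claims")

    // Partitioning by load_date lets date filters prune whole directories;
    // bucketing by member_id pre-organizes rows for joins on that key.
    claims.write
      .partitionBy("load_date")
      .bucketBy(32, "member_id")
      .sortBy("member_id")
      .saveAsTable("claims_curated")

    // Only the partition for 2017-01-01 is scanned by this query.
    spark.sql(
      "SELECT member_id, COUNT(*) AS cnt FROM claims_curated " +
        "WHERE load_date = '2017-01-01' GROUP BY member_id"
    ).show()

    spark.stop()
  }
}
```

With this layout, a date filter reads only the matching partition directories instead of scanning the whole table.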

TECHNICAL SKILLS

Hadoop Ecosystem: Hadoop, HDFS, MapReduce, Hive, Pig, Spark Streaming, Scala, Kafka, Storm, ZooKeeper, HBase, YARN, Spark, Sqoop, Flume, Mahout

Programming Languages: C++, JAVA, Python, Scala

Hadoop Distributions: Apache Hadoop, Cloudera Distribution of Hadoop (CDH3, CDH4, CDH5), and Hortonworks Data Platform (HDP)

NoSQL Databases: HBase, Cassandra, MongoDB

Query Languages: HiveQL, SQL, PL/SQL, Pig Latin

Web Technologies: Java, J2EE, Struts, Spring, JSP, Servlet, JDBC, EJB, JavaScript

IDEs: Eclipse, NetBeans

Frameworks: MVC, Struts, Spring, Hibernate

Build Tools: Ant, Maven

Databases: Oracle, MySQL, MS Access, DB2, Teradata

Operating Systems: Windows, Linux (Red Hat, CentOS), Unix

Scripting Languages: Shell scripting

Version Control system: SVN, GIT, CVS

PROFESSIONAL EXPERIENCE

Confidential, Long Beach, CA

Hadoop Developer

Responsibilities:

  • Collected member, provider, and claims data from various SQL Server instances and ingested it into the Hadoop Distributed File System.
  • Used Talend as an ingestion tool.
  • Implemented Spark RDD transformations to map business analysis and applied actions on top of the transformations.
  • Created Hive tables on top of the transformed data.
  • Hands-on experience handling Hive queries through Spark SQL, integrated with the Spark environment and implemented in Scala.
  • Developed Spark jobs and Hive Jobs to summarize and transform data.
  • Automated end-to-end jobs that pull data from sources such as MySQL, push the result datasets to the Hadoop Distributed File System, and run Hive jobs, scheduled with Autosys.
  • Implemented multiple UDFs to execute the business logic.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark DataFrames and Scala (see the sketch after this list).
  • Built the rules for the provider use case by interacting with the provider team in the organization and created the extracts according to the requirements.
  • Implemented mail generation logic to alert when the automated pipeline refresh job fails.
  • Implemented Spark using Scala, utilizing DataFrames and the Spark SQL API for faster processing of data.
  • Experience in monitoring pipeline jobs and analyzing log files.
  • Categorized the provider data based on the requirements.
  • Implemented Spark POCs using Spark with Scala.
  • Used Spark for interactive queries, processing of streaming data, and integration with popular SQL databases for high volumes of data.
  • Analyzed cluster configurations and set the driver memory, executor memory, and number of cores accordingly.
  • Performed joins among large data sets and carried out performance tuning.
  • Hands-on experience with Spark Streaming to ingest data from multiple data sources into HDFS.
  • Monitored the refresh of HBase tables and performed data validation in the front-end application.
  • Created partitioned tables in Hive and mentored the analyst and test teams in writing Hive queries.
  • Involved in agile methodologies, daily scrum meetings, and sprint planning.
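
As referenced above, a minimal sketch of converting a HiveQL aggregation into Spark DataFrame transformations in Scala. The table and column names and the memory settings are illustrative assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum}

object HiveToDataFrameSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-to-dataframe-sketch")
      // The kind of per-cluster tuning described above.
      .config("spark.executor.memory", "4g")
      .config("spark.executor.cores", "4")
      .enableHiveSupport()
      .getOrCreate()

    // HiveQL original:
    //   SELECT provider_id, SUM(paid_amount) AS total_paid
    //   FROM provider_claims
    //   WHERE status = 'APPROVED'
    //   GROUP BY provider_id;
    val totals = spark.table("provider_claims")
      .filter(col("status") === "APPROVED")
      .groupBy("provider_id")
      .agg(sum("paid_amount").as("total_paid"))

    totals.write.mode("overwrite").saveAsTable("provider_totals")
    spark.stop()
  }
}
```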

Environment: Hadoop, Spark, HDFS, Hive, Flume, Sqoop, Oozie, HBase, MySQL, shell scripting, Red Hat Linux, Core Java 7, Eclipse, SBT.

Confidential, New York, NY

Hadoop Developer

Responsibilities:

  • Migrated complex MapReduce programs into Spark RDD transformations and actions.
  • Implemented Kafka high-level consumers to get data from Kafka partitions and move it into HDFS (see the sketch after this list).
  • Worked on analyzing the Hadoop cluster and different big data analytic tools, including MapReduce, Hive, and Spark.
  • Implemented custom Kafka encoders for custom input formats to load data into Kafka partitions.
  • Exported result sets from Hive to MySQL using the Sqoop export tool for further processing.
  • Evaluated the performance of Apache Spark in analyzing genomic data.
  • Implemented Apache NiFi flow topologies to perform cleansing operations before moving data into HDFS.
  • Implemented complex Hive UDFs to execute business logic with Hive queries.
  • Implemented Impala for data analysis.
  • Prepared Linux shell scripts for automating the process.
  • Implemented Spark RDD transformations to map business analysis and apply actions on top of transformations.
  • Automated end-to-end jobs that pull data from sources such as MySQL, push the result datasets to the Hadoop Distributed File System, and run MapReduce, Pig, and Hive jobs using Kettle and Oozie (workflow management).
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data with MapReduce, Hive, and Pig.
  • Involved in loading data from the Linux file system, servers, and Java web services using Kafka producers and partitions.
  • Evaluated usage of Oozie for Workflow Orchestration.
  • Worked with NoSQL databases such as HBase, creating tables to load large sets of semi-structured data coming from various sources.
  • Created partitioned tables in Hive and mentored the analyst and test teams in writing Hive queries.
  • Involved in cluster setup, monitoring, test benchmarks for results.
  • Involved in agile methodologies, daily scrum meetings, and sprint planning.
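
A minimal sketch of the Kafka-to-HDFS ingestion described in the consumer bullet above. It uses the modern KafkaConsumer API as a stand-in for the older high-level consumer; the broker, topic, group, and output path are illustrative assumptions.

```scala
import java.util.{Collections, Properties}

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.kafka.clients.consumer.KafkaConsumer

object KafkaToHdfsSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092")
    props.put("group.id", "hdfs-ingest")
    props.put("key.deserializer",
      "org.apache.kafka.common.serialization.StringDeserializer")
    props.put("value.deserializer",
      "org.apache.kafka.common.serialization.StringDeserializer")

    val consumer = new KafkaConsumer[String, String](props)
    consumer.subscribe(Collections.singletonList("web-events"))

    val fs  = FileSystem.get(new Configuration()) // reads core-site.xml
    val out = fs.create(new Path("/data/raw/web-events/part-00000"))

    try {
      while (true) {
        val records = consumer.poll(1000L) // wait up to 1s for a batch
        val it = records.iterator()
        while (it.hasNext) {
          out.write((it.next().value() + "\n").getBytes("UTF-8"))
        }
        out.hflush() // make the batch visible to HDFS readers
      }
    } finally {
      out.close()
      consumer.close()
    }
  }
}
```

A production pipeline would roll files and commit offsets per flushed batch; the sketch keeps only the core consume-and-write loop.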

Environment: Hadoop, Spark, HDFS, Pig, Hive, Flume, Sqoop, Kafka, Oozie, HBase, ZooKeeper, MySQL, shell scripting, Red Hat Linux, Core Java 7, Eclipse.

Confidential

Hadoop Developer

Responsibilities:

  • Experience in configuring, managing, supporting, and monitoring the Hadoop cluster using the Cloudera distribution.
  • Worked in an Agile scrum development model, analyzing the Hadoop cluster and different Big Data analytic tools including MapReduce, Pig, Hive, Flume, Oozie, and Sqoop.
  • Configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this list).
  • Loaded data into cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
  • Wrote custom MapReduce programs to analyze data and used Pig Latin to filter out unwanted data.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Used Pig as an ETL tool for transformations, joins, and pre-aggregations before storing the data in HDFS.
  • Implemented partitioning, dynamic partitions, and buckets in Hive to improve performance and organize the data in a logical fashion.
  • Loaded and transformed large data sets in different formats, including structured and semi-structured data.
  • Responsible for managing data coming from different sources.
  • Involved in creating Hive Tables, loading data and writing hive queries.
  • Involved in scheduling Oozie workflow engine to run jobs automatically.
  • Implemented a NoSQL database, HBase, for storing and processing different formats of data.
  • Involved in testing and coordinated with the business on user testing.
  • Involved in unit testing and delivered unit test plans and results documents.
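
As referenced in the MapReduce bullet above, a sketch of a map-only cleaning job. The project's jobs were written in Java; this drives the same Hadoop API from Scala, and the pipe-delimited five-field record layout is an illustrative assumption.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

// Emits only complete, trimmed records; malformed lines are dropped.
class CleaningMapper extends Mapper[Object, Text, Text, NullWritable] {
  override def map(key: Object, value: Text,
                   ctx: Mapper[Object, Text, Text, NullWritable]#Context): Unit = {
    val fields = value.toString.split('|')
    if (fields.length == 5 && fields.forall(_.trim.nonEmpty)) {
      ctx.write(new Text(fields.map(_.trim).mkString("|")), NullWritable.get())
    }
  }
}

object CleaningJob {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "data-cleaning")
    job.setJarByClass(getClass)
    job.setMapperClass(classOf[CleaningMapper])
    job.setNumReduceTasks(0) // map-only: no aggregation needed
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[NullWritable])
    FileInputFormat.addInputPath(job, new Path(args(0)))
    FileOutputFormat.setOutputPath(job, new Path(args(1)))
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```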

Environment: Apache Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Oozie, HBase, UNIX shell scripting, ZooKeeper, Java, Eclipse.

Confidential

Java/J2EE Developer

Responsibilities:

  • Worked with Java, J2EE, Struts, web services, and Hibernate in a fast-paced development environment.
  • Followed agile methodology, interacted directly with the client on features, implemented optimal solutions, and tailored the application to customer needs.
  • Involved in the design and implementation of the web tier using Servlets and JSP.
  • Used Apache POI for reading Excel files.
  • Developed the user interface using JSP and JavaScript to view all online trading transactions.
  • Designed and developed Data Access Objects (DAOs) to access the database (see the sketch after this list).
  • Used the DAO Factory and Value Object design patterns to organize and integrate the Java objects.
  • Coded Java Server Pages for dynamic front-end content that uses Servlets and EJBs.
  • Coded HTML pages using CSS for static content, with JavaScript for validations.
  • Used JDBC API to connect to the database and carry out database operations.
  • Used JSP and JSTL Tag Libraries for developing User Interface components.
  • Performed code reviews.
  • Performed unit testing, system testing and integration testing.
  • Involved in building and deployment of application in Linux environment.
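
A minimal sketch of the DAO-and-factory shape described above. The project code was Java; this Scala version makes the same JDBC calls, and the table, columns, and connection details are illustrative assumptions.

```scala
import java.sql.{Connection, DriverManager}
import scala.collection.mutable.ListBuffer

case class Trade(id: Long, symbol: String, quantity: Int) // value object

class TradeDao(url: String, user: String, password: String) {
  // All JDBC plumbing stays behind this one helper.
  private def withConnection[A](f: Connection => A): A = {
    val conn = DriverManager.getConnection(url, user, password)
    try f(conn) finally conn.close()
  }

  // Callers get value objects back and never see JDBC types.
  def findBySymbol(symbol: String): List[Trade] = withConnection { conn =>
    val stmt = conn.prepareStatement(
      "SELECT id, symbol, quantity FROM trades WHERE symbol = ?")
    stmt.setString(1, symbol)
    val rs  = stmt.executeQuery()
    val buf = ListBuffer.empty[Trade]
    while (rs.next()) {
      buf += Trade(rs.getLong("id"), rs.getString("symbol"), rs.getInt("quantity"))
    }
    buf.toList
  }
}

// Factory so the web tier asks for a DAO rather than constructing one.
object DaoFactory {
  def tradeDao(): TradeDao =
    new TradeDao("jdbc:oracle:thin:@db-host:1521:ORCL", "app_user", "app_pass")
}
```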

Environment: Java, J2EE, JDBC, Struts, Servlets, JSP, JavaScript, HTML, SQL, Hibernate, Eclipse, Apache POI, CSS.
