Hadoop Developer Resume
New York, NY
SUMMARY
- 4 years of hands-on experience with Hadoop, HDFS, MapReduce, and the Hadoop ecosystem.
- Highly experienced as a Hadoop Developer with the Hadoop Distributed File System and ecosystem (HDFS, MapReduce, Hive, Sqoop, Impala, HBase, Flume, Pig, Apache Kafka, Spark SQL).
- Involved in HDFS maintenance and loading of structured and unstructured data from different sources.
- Experienced in importing and exporting data between RDBMS and HDFS, Hive tables, and HBase using Sqoop (a sketch of this import pattern follows this list).
- Loaded and transformed large sets of structured and semi-structured data.
- Created SQL tables and indexes, and wrote queries to update and manipulate data stored in the database.
- Wrote and ran different types of SQL queries, including CREATE, INSERT, and UPDATE statements, against the given databases.
- Responsible for creating databases, tables, clustered/non-clustered indexes, unique/check constraints, and views.
- Involved in creating Hive tables, loading data and writing Hive queries.
- Hands-on experience performing analytics on structured data in Hive using HiveQL queries, views, partitioning, and bucketing (see the partitioning/bucketing sketch after this list).
- Experience working with Spark SQL to process data into Hive tables.
- Worked with different file formats such as JSON, Parquet, Avro, SequenceFile, ORC, and plain text.
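Sqoop itself is invoked from the command line; to keep all sketches in this document in one language (Scala, which the experience below uses), here is the equivalent RDBMS-to-Hive import expressed with Spark's JDBC reader. This is a stand-in for the Sqoop workflow, not Sqoop itself, and the connection details, database, and table names are purely illustrative.

```scala
import org.apache.spark.sql.SparkSession

object RdbmsToHiveImport {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("RdbmsToHiveImport")
      .enableHiveSupport() // needed to write managed Hive tables
      .getOrCreate()

    // Hypothetical connection details; the MySQL JDBC driver jar
    // must be on the classpath.
    val ordersDf = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://db-host:3306/sales")
      .option("dbtable", "orders")
      .option("user", "etl_user")
      .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
      .load()

    // Land the rows in a Hive table, analogous to `sqoop import --hive-import`.
    ordersDf.write.mode("overwrite").saveAsTable("staging.orders")

    spark.stop()
  }
}
```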
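And a minimal sketch of the partitioning and bucketing mentioned in the Hive bullet, written through Spark with Hive support (matching the Spark SQL bullet above). The table, columns, and bucket count are illustrative; Spark's bucketBy produces a bucketed layout analogous to HiveQL's CLUSTERED BY ... INTO n BUCKETS, though the on-disk format is Spark's own.

```scala
import org.apache.spark.sql.SparkSession

object HivePartitionBucketSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HivePartitionBucketSketch")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Illustrative rows standing in for a real feed.
    val events = Seq(
      (1L, 101L, 25.0, "2017-01-01"),
      (2L, 102L, 40.0, "2017-01-01"),
      (3L, 101L, 15.5, "2017-01-02")
    ).toDF("event_id", "customer_id", "amount", "ds")

    // Partition by date, bucket by customer -- the layout HiveQL's
    // PARTITIONED BY (ds) / CLUSTERED BY (customer_id) DDL describes.
    events.write
      .partitionBy("ds")
      .bucketBy(8, "customer_id")
      .sortBy("customer_id")
      .mode("overwrite")
      .saveAsTable("sales_events")

    // A typical analytic query confined to one partition.
    spark.sql("""
      SELECT customer_id, SUM(amount) AS total
      FROM sales_events
      WHERE ds = '2017-01-01'
      GROUP BY customer_id
    """).show()

    spark.stop()
  }
}
```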
TECHNICAL SKILLS
Hadoop Framework: Cloudera, HDFS, Hive, Spark, PySpark, Spark SQL, HBase, Sqoop, ZooKeeper, Oozie, MapReduce, Pig, Hue
Programming Languages: Python, Scala
Tools: Tableau, VMware, MS Office, MS Word, MS Project
Databases: MySQL, SQL Server, NoSQL (MongoDB, HBase), SQL Workbench
Operating Systems: Linux, MacOS, Windows
PROFESSIONAL EXPERIENCE
Hadoop Developer
Confidential - New York, NY
Responsibilities:
- Installed and configured Hive, Flume, HBase, Spark, Pig, Sqoop, and Oozie on the Hadoop cluster.
- Experienced in storage, querying, processing, and analysis of big data.
- Excellent knowledge of Hadoop architecture and ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, MapReduce, Hive, Sqoop, Kafka, HBase, MongoDB, Oozie, Zookeeper, Flume, Impala, and Spark with Scala, Spark SQL, and PySpark.
- Knowledgeable in installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, Zookeeper, and Flume.
- Experienced in analyzing data using HiveQL, HBase, and custom MapReduce programs.
- Experienced in importing and exporting data with Sqoop between HDFS and relational database systems.
- Expertise in implementing ad-hoc queries using HiveQL, and good knowledge of creating Hive tables and loading and analyzing data with Hive queries.
- Expertise in developing Hive generic UDFs to incorporate complex business logic into HiveQL (a UDF sketch follows this list).
- Developed Apache Spark jobs using Scala in a test environment for faster data processing, and used Spark SQL for querying.
- Experienced in Spark Core, Spark RDDs, and Spark deployment architectures.
- Experienced working with Spark Streaming, Spark SQL, and Kafka for real-time data processing (see the Kafka sketch after this list).
- Used the Spark DataFrames API on the Cloudera platform to perform analytics on Hive data.
- Loaded CSV, TXT, Avro, and Parquet files using Scala, Python, and Spark SQL in the Spark framework, processed the data as Spark DataFrames and RDDs, and saved the results to HDFS in different formats (a sketch follows this list).
- Imported data from sources such as AWS S3 and the local file system into Spark RDDs.
- Experienced in performing real-time analytics on NoSQL databases such as HBase and MongoDB.
- Extracted data from MongoDB into HDFS using Sqoop and wrote queries to generate reports.
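A minimal sketch of a Hive generic UDF corresponding to the UDF bullet above. These are more commonly written in Java, but the same API compiles from Scala; the masking logic and the function name are purely illustrative.

```scala
import org.apache.hadoop.hive.ql.exec.UDFArgumentException
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredObject
import org.apache.hadoop.hive.serde2.objectinspector.{ObjectInspector, PrimitiveObjectInspector}
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory

// A hypothetical UDF that masks the local part of an e-mail address.
class MaskEmail extends GenericUDF {
  private var inputOI: PrimitiveObjectInspector = _

  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
    if (arguments.length != 1)
      throw new UDFArgumentException("mask_email takes exactly one argument")
    inputOI = arguments(0).asInstanceOf[PrimitiveObjectInspector]
    PrimitiveObjectInspectorFactory.javaStringObjectInspector
  }

  override def evaluate(arguments: Array[DeferredObject]): AnyRef = {
    val raw = arguments(0).get()
    if (raw == null) return null
    val email = inputOI.getPrimitiveJavaObject(raw).toString
    // Keep the first character and the domain, mask the rest.
    email.replaceAll("(^[^@])[^@]*(@.*)", "$1***$2")
  }

  override def getDisplayString(children: Array[String]): String =
    s"mask_email(${children.mkString(", ")})"
}
```

Once packaged in a jar, such a function is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION mask_email AS 'MaskEmail', and can then be called from HiveQL like any built-in.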
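For the real-time bullet, one way to wire Kafka into Spark is Structured Streaming's Kafka source, sketched below (the DStream-based Spark Streaming API is the older alternative). The broker addresses and topic name are made up, and the spark-sql-kafka-0-10 package must be on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object KafkaStreamSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("KafkaStreamSketch")
      .getOrCreate()
    import spark.implicits._

    // Subscribe to a hypothetical topic.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
      .option("subscribe", "clickstream")
      .load()

    // Kafka delivers key/value as binary; cast the value to text and
    // count occurrences of each distinct line.
    val counts = raw
      .selectExpr("CAST(value AS STRING) AS line")
      .groupBy($"line")
      .count()

    // Stream the running counts to the console.
    val query = counts.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```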
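A sketch of the multi-format loading and saving described above, with made-up paths: a CSV read from S3, a Parquet read from the local file system, a hop to the RDD API, and a Parquet write back to HDFS (ORC is analogous via .orc; Avro additionally needs the spark-avro package, and s3a access needs hadoop-aws plus credentials).

```scala
import org.apache.spark.sql.SparkSession

object MultiFormatLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("MultiFormatLoad")
      .getOrCreate()

    // CSV from S3 with a header row and inferred column types.
    val ordersDf = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("s3a://example-bucket/raw/orders.csv")

    // Parquet from the local file system.
    val customersDf = spark.read.parquet("file:///data/staging/customers.parquet")

    // The same data is reachable as an RDD when row-level control is needed.
    val firstColumn = ordersDf.rdd.map(row => row.get(0))
    println(firstColumn.take(5).mkString(", "))

    // Join and persist back to HDFS in a different format;
    // assumes both sides share a customer_id column.
    ordersDf.join(customersDf, Seq("customer_id"))
      .write.mode("overwrite")
      .parquet("hdfs:///warehouse/curated/orders_enriched")

    spark.stop()
  }
}
```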
Environment: Hadoop, HDFS, Hive, Sqoop, Oozie, SQL, Kafka, Spark, Scala, AWS, Big Data Integration, Impala.
Hadoop Developer
Confidential - New York, NY
Responsibilities:
- Involved in the end-to-end process of setting up the Hadoop cluster, and performed installation, configuration, and monitoring of the cluster.
- Responsible for cluster maintenance, commissioning and decommissioning DataNodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and Hadoop log files.
- Monitored systems and services, and handled architecture design and implementation of Hadoop deployments, configuration management, backup, and disaster-recovery systems and procedures.
- Configured property files such as core-site.xml, hdfs-site.xml, and mapred-site.xml based on the job requirements (a sketch of working with these properties follows this list).
- Imported and exported data to and from HDFS using Sqoop.
- Installed various Hadoop ecosystem components and Hadoop daemons.
- Installed and configured Sqoop and HBase.
- Followed standard backup policies to ensure high availability of the cluster.
- Analyzed system failures, identified root causes, and recommended courses of action; documented system processes and procedures for future reference.
- Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Monitored workload and job performance and performed capacity planning using Cloudera Manager.
- Installed and configured Flume, Hive, HBase, Spark, Pig, Sqoop, and Oozie on the Hadoop cluster.
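The property files named above are plain XML; as a minimal sketch, the same keys can be read (and overridden per job) through Hadoop's Configuration API. The values shown in the comments are illustrative.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object ClusterConfigCheck {
  def main(args: Array[String]): Unit = {
    // Configuration picks up core-site.xml / hdfs-site.xml from the classpath.
    val conf = new Configuration()

    // Keys that live in those property files:
    println(conf.get("fs.defaultFS"))    // from core-site.xml, e.g. hdfs://namenode:8020
    println(conf.get("dfs.replication")) // from hdfs-site.xml

    // A per-job override, equivalent to changing the XML for this job only.
    conf.set("dfs.replication", "2")

    // Sanity check against the cluster the configuration points at.
    val fs = FileSystem.get(conf)
    println(fs.exists(new Path("/user")))
    fs.close()
  }
}
```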
Environment: Hadoop, HDFS, Spark, Hive, Sqoop, Linux, Cloudera