
Sr. Hadoop Developer Resume


Wayzata, MN

SUMMARY:

  • 10+ years of professional IT experience in the analysis, design, administration, development, deployment, and maintenance of critical software and big data applications, working on Big Data platforms as both a Developer and an Administrator.
  • Hands-on experience developing and deploying enterprise applications using major Hadoop ecosystem components such as MapReduce, YARN, Hive, Pig, HBase, Flume, Sqoop, Spark Streaming, Spark SQL, Storm, Kafka, Oozie, and Cassandra.
  • Hands-on experience using the MapReduce programming model for batch processing of data stored in HDFS.
  • Exposure to administrative tasks such as installing Hadoop and ecosystem components like Hive and Pig.
  • Installed and configured multiple Hadoop clusters of different sizes and with ecosystem components like Pig, Hive, Sqoop, Flume, HBase, Oozie and Zookeeper.
  • Worked on all major Hadoop distributions: Cloudera and Hortonworks.
  • Responsible for designing and building a Data Lake using Hadoop and its ecosystem components.
  • Very good experience in the complete project life cycle (design, development, testing, and implementation) of client-server and web applications.
  • Developed Spark applications using Scala and Java, and implemented an Apache Spark data processing project to handle data from various RDBMS and streaming sources.
  • Worked with Spark to improve the performance and optimization of existing algorithms in Hadoop, using Spark Context, Spark SQL, Spark MLlib, DataFrames, pair RDDs, and Spark on YARN.
  • Experienced in using Apache Spark's in-memory computing capabilities, with code written in Scala, to implement advanced procedures such as text analytics and processing.
  • Experience in the installation, configuration, management, support, and monitoring of Hadoop clusters using various distributions such as Apache and Cloudera.
  • Experience with middleware architectures built on Sun Java technologies such as J2EE and Servlets, and with application servers such as WebSphere and WebLogic.
  • Used different Spark modules such as Spark Core, RDDs, DataFrames, and Spark SQL.
  • Converted various Hive queries into the required Spark transformations and actions (a short Scala sketch follows this list).
  • Experience working on the open-source Apache Hadoop distribution with technologies like HDFS, MapReduce, Python, Pig, Hive, Hue, HBase, Sqoop, Oozie, Zookeeper, Spark, Spark Streaming, Storm, Kafka, Cassandra, Impala, Snappy, Greenplum, MongoDB, and Mesos.
  • In-depth knowledge of Scala and experience building Spark applications with it.
  • Good experience working with Tableau and Spotfire, enabling JDBC/ODBC connectivity from those tools to Hive tables.
  • Designed neat and insightful dashboards in Tableau.
  • Worked on and designed an array of reports, including Crosstab, Chart, Drill-Down, Drill-Through, Customer-Segment, and geodemographic-segmentation reports.
  • Deep understanding of Tableau features such as site and server administration, calculated fields, table calculations, parameters, filters (normal and quick), highlighting, level of detail, granularity, aggregation, reference lines, and many more.
  • Adequate knowledge of Scrum, Agile and Waterfall methodologies.
  • Designed and developed multiple J2EE Model 2 (MVC-based) web applications.
  • Worked with various tools and IDEs such as Eclipse, IBM Rational, the Apache Ant build tool, MS Office, PL/SQL Developer, and SQL*Plus.
  • Highly motivated, with the ability to work independently or as an integral part of a team, and committed to the highest levels of professionalism.
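
As an illustration of the Hive-to-Spark conversions above, here is a minimal sketch in Scala, assuming Spark 2.x with Hive support; the sales table and its region/amount columns are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.sum

    object HiveToSparkSketch {
      def main(args: Array[String]): Unit = {
        // enableHiveSupport() lets Spark resolve tables from the Hive metastore.
        val spark = SparkSession.builder()
          .appName("HiveToSparkSketch")
          .enableHiveSupport()
          .getOrCreate()

        // Original HiveQL:
        //   SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
        // The same logic as DataFrame transformations plus one action:
        val totals = spark.table("sales")       // read the Hive table
          .groupBy("region")                    // transformation
          .agg(sum("amount").as("total"))       // transformation

        totals.show()                           // action: triggers execution
        spark.stop()
      }
    }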

TECHNICAL SKILLS:

Big Data Technologies: Hadoop, Spark, Kafka, Flume, HDFS, Hive, Impala, MapReduce, Sqoop, Oozie

Distributions: Cloudera, Hortonworks

Programming Languages: Python, Scala and Java

Web Technologies: HTML, CSS, JavaScript, J2EE, Servlets, JSP, XML

Cloud: AWS (EC2, S3)

Databases: DB2, MySQL, HBase, Cassandra

DB Languages: SQL, PL/SQL.

Operating Systems: Linux, UNIX, Windows

IDE/Testing Tools: Eclipse, IntelliJ, PyCharm

PROFESSIONAL EXPERIENCE:

Sr. Hadoop Developer

Confidential - Wayzata, MN

Responsibilities:

  • Worked with Business Analysts and helped represent the business domain details.
  • Gathered information from different nodes into a Greenplum database and then ran Sqoop incremental loads into HDFS (see the Sqoop sketch after this list).
  • Imported real-time data into Hadoop using Kafka and implemented the corresponding Oozie jobs.
  • Involved in loading data from the Linux file system into HDFS.
  • Wrote MapReduce jobs for text mining and worked with the predictive analytics team to check the output against requirements.
  • Wrote Hive UDFs to meet requirements and to handle different schemas and XML data.
  • Used Pig as an ETL tool for transformations, event joins, traffic filtering, and pre-aggregations before storing the data in HDFS.
  • Wrote Hive and Pig scripts to join the raw data with the lookup data and to perform aggregations per business requirements.
  • Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
  • Involved in writing Flume and Hive scripts to extract, transform, and load the data into the database.
  • Implemented partitioning and bucketing in Hive based on the requirements (see the Hive DDL sketch below).
  • Connected Tableau from the client side to the AWS IP addresses and viewed the end results.
  • Developed Oozie workflows and coordinators to automate Hive, MapReduce, Pig, and other jobs.
  • Created test cases as part of enhancement rollouts and was involved in unit-level and integration-level testing.
  • Worked with Snappy compression and different file formats.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
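
A minimal sketch of the Greenplum-to-HDFS incremental load described above. The JDBC URL, credentials, table, and check column are hypothetical placeholders; Greenplum is reached here through its PostgreSQL-compatible JDBC driver.

    # Incremental append import: each run pulls only rows whose event_id
    # exceeds the saved high-water mark (--last-value from the prior run).
    sqoop import \
      --connect jdbc:postgresql://gp-host:5432/analytics \
      --username etl_user -P \
      --table web_events \
      --target-dir /data/raw/web_events \
      --incremental append \
      --check-column event_id \
      --last-value 1000000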
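
And a sketch of the Hive partitioning and bucketing mentioned above; the table, columns, and bucket count are illustrative placeholders.

    -- Partition by load date for coarse pruning; bucket by user_id for
    -- sampling and bucket-map joins.
    CREATE TABLE web_events_part (
      event_id BIGINT,
      user_id  BIGINT,
      payload  STRING
    )
    PARTITIONED BY (load_date STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS
    STORED AS ORC;

    -- Dynamic-partition insert from a raw staging table:
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;
    SET hive.enforce.bucketing=true;  -- needed on older Hive releases
    INSERT OVERWRITE TABLE web_events_part PARTITION (load_date)
    SELECT event_id, user_id, payload, load_date FROM web_events_raw;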

    Environment: Hadoop, MapReduce, Cloudera Manager, HDFS, Hive, Pig, Sqoop, Spark, Oozie, Impala, Greenplum, Kafka, SQL, Java (JDK 1.6), Eclipse.

Hadoop Developer

Confidential - Sunnyvale, CA

Responsibilities:

  • Worked on analyzing data and writing Hadoop MapReduce jobs using the Java API, Pig, and Hive.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in loading data from edge node to HDFS using shell scripting.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Worked with different compression techniques (LZO, Snappy, Bzip2, etc.) to save storage and optimize data transfer over the network.
  • Analyzed large, critical datasets using Cloudera, HDFS, HBase, MapReduce, Hive, Hive UDFs, Pig, Sqoop, Zookeeper, and Spark.
  • Developed custom aggregate functions using Spark SQL and performed interactive querying (a Scala sketch follows this list).
  • Used Sqoop to load the data into HBase and Hive.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode high availability, capacity planning, and slot configuration.
  • Created Hive tables with dynamic partitions and buckets for sampling, and worked on them using HiveQL.
  • Used Pig to parse the data and store it in Avro format.
  • Stored the data in tabular formats using Hive tables and Hive SerDes.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis (see the agent config sketch after this list).
  • Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
  • Implemented a script to transmit information from Oracle to HBase using Sqoop.
  • Implemented MapReduce programs to handle semi-structured and unstructured data such as XML, JSON, and sequence files derived from log files.
  • Fine-tuned Pig queries for better performance.
  • Involved in writing shell scripts to export log files to the Hadoop cluster through an automated process.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
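
As a companion to the custom aggregate functions noted above, here is a minimal Scala sketch using Spark SQL's typed Aggregator API (Spark 2.x); the average-of-doubles aggregate and the sample values are illustrative stand-ins, not the production functions.

    import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
    import org.apache.spark.sql.expressions.Aggregator

    // Intermediate buffer for a simple running average.
    case class AvgBuffer(sum: Double, count: Long)

    // Custom aggregate: input Double, buffer AvgBuffer, output Double.
    object TypedAvg extends Aggregator[Double, AvgBuffer, Double] {
      def zero: AvgBuffer = AvgBuffer(0.0, 0L)
      def reduce(b: AvgBuffer, x: Double): AvgBuffer =
        AvgBuffer(b.sum + x, b.count + 1)
      def merge(a: AvgBuffer, b: AvgBuffer): AvgBuffer =
        AvgBuffer(a.sum + b.sum, a.count + b.count)
      def finish(b: AvgBuffer): Double =
        if (b.count == 0) 0.0 else b.sum / b.count
      def bufferEncoder: Encoder[AvgBuffer] = Encoders.product[AvgBuffer]
      def outputEncoder: Encoder[Double] = Encoders.scalaDouble
    }

    object AggregateSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("AggregateSketch").master("local[*]").getOrCreate()
        import spark.implicits._

        // Interactive query: apply the custom aggregate as a typed column.
        Seq(1.0, 2.0, 3.0).toDS()
          .select(TypedAvg.toColumn.name("avg_value"))
          .show()
        spark.stop()
      }
    }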
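
And a minimal sketch of the Flume log collection above: one agent tailing an application log into HDFS. The agent name, paths, and capacity are hypothetical placeholders.

    # a1: exec source -> memory channel -> HDFS sink
    a1.sources = r1
    a1.channels = c1
    a1.sinks = k1

    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /var/log/app/app.log
    a1.sources.r1.channels = c1

    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 10000

    a1.sinks.k1.type = hdfs
    a1.sinks.k1.channel = c1
    a1.sinks.k1.hdfs.path = /data/staging/logs/%Y-%m-%d
    a1.sinks.k1.hdfs.fileType = DataStream
    a1.sinks.k1.hdfs.useLocalTimeStamp = true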

Environment: Hadoop, MapReduce, HDFS, YARN, Sqoop, Oozie, Pig, Hive, HBase, Spark, Java, Eclipse, UNIX shell scripting, Python, Hortonworks.

Hadoop Developer

Confidential - Bethlehem, PA

Responsibilities:

  • Worked closely with the business analysts to convert business requirements into technical requirements, and prepared low- and high-level documentation.
  • Wrote MR jobs for encryption and for converting text data into Avro format.
  • Joined raw data with the reference data using Pig scripting.
  • Wrote scripts to copy data between different clusters and between different UNIX file systems.
  • Wrote MR jobs to cleanse the data and copy it from our cluster to an AWS cluster.
  • Developed Spark SQL scripts to handle different data sets and verified their performance against MR jobs.
  • Connected Tableau from the client side to the AWS IP addresses and viewed the end results.
  • Developed Oozie workflows and coordinators to automate the jobs.
  • Wrote Hive UDFs to handle different Avro schemas.
  • Moved large datasets hourly in the Avro file format and ran Hive and Impala queries over them.
  • Worked with Snappy compression and different file formats.
  • Developed a shell script to back up the NameNode metadata (see the sketch after this list).
  • Used Cloudera Manager to monitor the health of jobs running on the cluster.
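
A minimal sketch of the NameNode metadata backup above, assuming Hadoop 2.x, where hdfs dfsadmin -fetchImage downloads the latest fsimage checkpoint; the backup path and retention count are hypothetical.

    #!/bin/bash
    # Nightly NameNode metadata backup (sketch).
    BACKUP_ROOT=/backup/namenode              # placeholder backup location
    DEST="$BACKUP_ROOT/$(date +%Y-%m-%d)"
    mkdir -p "$DEST"

    # Fetch the most recent fsimage from the active NameNode.
    hdfs dfsadmin -fetchImage "$DEST"

    # Retain only the 14 most recent daily backups.
    ls -dt "$BACKUP_ROOT"/*/ | tail -n +15 | xargs -r rm -rf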

Environment: Hadoop, MapReduce, Cloudera Manager, HDFS, Hive, Pig, Sqoop, Spark, Oozie, Impala, Greenplum, Kafka, SQL, Java (JDK 1.6), Eclipse.
