
Hadoop Developer Resume


MI

PROFESSIONAL SUMMARY:

  • Overall 8+ years of professional IT experience in software development, including 4 years of experience in the ingestion, storage, querying, processing and analysis of Big Data using Hadoop technologies and solutions.
  • Excellent understanding of Hadoop architecture and the various components of the Hadoop ecosystem, such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, MapReduce and YARN.
  • Hands-on experience using Hadoop ecosystem components such as MapReduce, HDFS, Hive, Pig, Sqoop, Spark, Flume, ZooKeeper, Hue, Kafka, Storm and Impala.
  • Experience with Agile methodology.
  • Experienced in using Spark to improve the performance and optimization of existing algorithms in Hadoop with Spark Context, Spark SQL, DataFrames, pair RDDs and Datasets (see the sketch after this list).
  • Developed Kafka producers that compress and bind many small files into larger Avro and Sequence files before writing to HDFS, to make the best use of the Hadoop block size.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase and custom MapReduce programs in Java.
  • Experience in developing custom UDFs in Java to extend Hive and Pig Latin functionality.
  • Expertise in job workflow scheduling and monitoring tools like Oozie.
  • Developed simple to complex MapReduce jobs using Hive and Pig to handle files in multiple formats such as JSON, text, XML and SequenceFile.
  • Worked extensively with combiners, partitioning and the distributed cache to improve the performance of MapReduce jobs.
  • Experience working with different data sources such as flat files, XML files, log files and databases.
  • Very good understanding and working knowledge of object-oriented programming (OOP).
  • Expertise in application development using Scala, RDBMS, and UNIX shell scripting.
  • Experience developing Scala applications for loading/streaming data into NoSQL databases (HBase) and into HDFS.
  • Worked on ingesting log data into Hadoop using Flume.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in importing and exporting data with Sqoop between HDFS and relational database management systems.
  • Collected and stored streaming data (log data) in HDFS using Apache Flume.
  • Experience in optimizing queries by creating clustered and non-clustered indexes and indexed views, and in applying data modeling concepts.
  • Experience with scripting languages (Scala, Pig, Python and shell) to manipulate data.
  • Worked with relational database systems (RDBMS) such as MySQL and NoSQL database systems such as HBase; basic knowledge of MongoDB and Cassandra.
  • Hands-on experience in identifying and resolving performance bottlenecks at various levels, such as sources, mappings and sessions.
  • Highly motivated, adaptive and a quick learner.
  • Ability to adapt to evolving technology, with a strong sense of responsibility and accomplishment.
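
As a minimal illustration of the Spark SQL and pair-RDD work described above, the sketch below assumes a hypothetical Hive table web_logs with user_id (string) and bytes (bigint) columns; it pushes a query through Spark SQL, re-expresses the aggregation as a pair RDD and writes the result to HDFS as Parquet.

```scala
import org.apache.spark.sql.SparkSession

object SummaryJob {
  def main(args: Array[String]): Unit = {
    // Hive-enabled session so spark.sql can read existing Hive tables.
    val spark = SparkSession.builder()
      .appName("SummaryJob")
      .enableHiveSupport()
      .getOrCreate()

    // Spark SQL over a hypothetical Hive table.
    val logs = spark.sql("SELECT user_id, bytes FROM web_logs")

    // The same aggregation expressed as a pair RDD (bytes assumed to be BIGINT).
    val bytesPerUser = logs.rdd
      .map(row => (row.getString(0), row.getLong(1)))
      .reduceByKey(_ + _)

    // Persist the result back to HDFS as Parquet.
    import spark.implicits._
    bytesPerUser.toDF("user_id", "total_bytes")
      .write.mode("overwrite")
      .parquet("hdfs:///warehouse/summary/bytes_per_user")

    spark.stop()
  }
}
```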

TECHNICAL SKILLS:

  • Hadoop, HDFS, Yarn, Map Reduce, Spark, Hive, Pig, Sqoop
  • Flume, Kafka, Storm, Oozie, Zookeeper, Impala, Hue.
  • HBase, Cassandra, MongoDB
  • Cloudera Manager, Hortonworks, Java, Scala.
  • Oracle 8i, 9i, 10g, 11g, MS SQL Server.
  • TCP/IP, DNS, NIS, NIS+, NFS, AutoFS.
  • CentOS, Ubuntu, Linux, Windows.

PROFESSIONAL EXPERIENCE:

Confidential, MI

Hadoop Developer

Responsibilities:

  • Implemented data ingestion using Sqoop and Spark, loading data from various RDBMS sources.
  • Responsible for the design and development of Spark SQL scripts based on functional specifications.
  • Handled data cleansing and transformation tasks using Spark with Scala and Hive.
  • Involved in converting Hive queries into Spark DataFrames and Datasets using Scala (see the sketch after this list).
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames and pair RDDs.
  • Responsible for handling large datasets using partitions, Spark in-memory capabilities, broadcast variables, and effective and efficient joins and transformations during the ingestion process itself.
  • Implemented data consolidation using Spark and Hive to generate data in the required formats, applying ETL tasks for data repair, massaging data to identify the source for audit purposes, and filtering data before storing it back to HDFS.
  • Used Spark API over Hadoop YARN to perform analytics on data in Hive.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Scala.
  • Developed ETL to normalize this data and publish it in Impala.
  • Responsible for job management using the Fair Scheduler, and developed job processing scripts using Oozie workflows.
  • Wrote a shell script to convert all Hive internal tables to external tables.
  • Integrated Hive with HBase.
  • Responsible for performance tuning of Spark applications: setting the right batch interval, the correct level of parallelism and memory usage.
  • Imported and exported data into HDFS, Hive and Pig using Sqoop.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that invoke MapReduce jobs in the backend.
  • Implemented workflows using the Apache Oozie framework to automate tasks.
  • Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
  • Worked with different file formats such as text, SequenceFile, Avro, ORC and Parquet.
  • Responsible for managing data coming from different sources.
  • Responsible for loading and transforming large sets of structured, semi-structured and unstructured data.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
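
A minimal sketch of the Hive-query-to-Dataset conversion and broadcast join referenced above; the staging.transactions and staging.accounts tables, their columns and the output path are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

// Hypothetical record layout for the typed Dataset.
case class Transaction(txnId: String, accountId: String, amount: Double)

object ConsolidationJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ConsolidationJob")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // A Hive query converted into a typed Dataset (column aliases match the case class).
    val txns = spark.sql(
      "SELECT txn_id AS txnId, account_id AS accountId, amount FROM staging.transactions"
    ).as[Transaction]

    // Small dimension table broadcast to every executor for an efficient join.
    val accounts = spark.sql("SELECT account_id AS accountId, branch FROM staging.accounts")
    val enriched = txns.join(broadcast(accounts), Seq("accountId"))

    // Store the consolidated output back to HDFS in the required format.
    enriched.write.mode("overwrite").parquet("hdfs:///data/consolidated/transactions")

    spark.stop()
  }
}
```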

Environment: Scala, Hive, HBase, Flume, Java, Impala, Pig, Spark, Oozie, Oracle, YARN, JUnit, Unix, Hortonworks, Sqoop, HDFS, Python.

Confidential St, San Francisco, CA

Hadoop Developer

Responsibilities:

  • Involved in file movements between HDFS and AWS S3, and worked extensively with S3 buckets in AWS.
  • Developed use cases for processing real-time streaming data using tools such as Spark Streaming.
  • Handled large datasets using partitions, Spark in-memory capabilities, broadcast variables, and effective and efficient joins and transformations.
  • Imported required tables from RDBMS to HDFS using Sqoop, and used Spark and Kafka to stream data into HBase in real time (see the sketch after this list).
  • Enhanced and optimized product Spark code to aggregate, group and run data mining tasks using the Spark framework, and handled JSON data.
  • Developed Spark code using Scala and Spark SQL for faster testing and data processing.
  • Responsible for batch processing of data sources using Apache Spark.
  • Developed predictive analytics using the Apache Spark Scala APIs.
  • Developed MapReduce jobs with the Java API to parse the raw data and store the refined data.
  • Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
  • Involved in identifying job dependencies to design workflows for Oozie and YARN resource management.
  • Worked on a product team using Agile Scrum methodology to design, develop, deploy and support solutions that leverage the client's big data platform.
  • Integrated Apache Storm with Kafka to perform web analytics, uploading clickstream data from Kafka to HDFS, HBase and Hive through Storm.
  • Designed and coded from specifications; analyzed, evaluated, tested, debugged, documented and implemented complex software applications.
  • Tuned Hive and Pig scripts to improve performance and solved performance issues in both, with an understanding of joins, grouping and aggregation and how they translate to MapReduce jobs.
  • Created partitions and buckets based on state for further processing using bucket-based Hive joins.
  • Implemented Cloudera Manager on an existing cluster.
  • Worked extensively with the Cloudera Distribution of Hadoop (CDH 5.x, CDH 4.x).
  • Responsible for troubleshooting, debugging and fixing wrong or missing data in the relational database (MySQL).
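
A minimal sketch of the Kafka-to-Spark-Streaming ingestion described above, assuming a hypothetical clickstream topic whose values are comma-separated with the page URL as the first field; per-batch counts are simply written to HDFS here, and the onward write to HBase mentioned in the bullets is omitted for brevity.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object ClickStreamIngest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ClickStreamIngest")
    val ssc = new StreamingContext(conf, Seconds(10)) // 10-second batch interval

    // Kafka consumer settings; broker address, group id and topic are placeholders.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker1:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "clickstream-ingest",
      "auto.offset.reset" -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("clickstream"), kafkaParams))

    // Count page hits per micro-batch and write the counts to HDFS.
    stream.map(record => (record.value().split(",")(0), 1L))
      .reduceByKey(_ + _)
      .saveAsTextFiles("hdfs:///data/clickstream/page_counts")

    ssc.start()
    ssc.awaitTermination()
  }
}
```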

Environment: HDFS, MapReduce, Java API, JSP, JavaBeans, Pig, Hive, Sqoop, Flume, Oozie, HBase, Kafka, Impala, Spark Streaming, Storm, YARN, Eclipse, Unix Shell Scripting, Cloudera.

Confidential Co, Saint Louis, MO

Hadoop Developer

Responsibilities:

  • Performed data ingestion using an open-source Hadoop distribution to process structured, semi-structured and unstructured datasets, using Apache tools such as Flume and Sqoop to load data into Hive and NoSQL databases such as HBase.
  • Developed job flows in Oozie to automate workflows for Pig and Hive jobs.
  • Designed and built the reporting application that uses Spark SQL to fetch and generate reports on HBase table data.
  • Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
  • Implemented helper classes that access HBase directly from Java using the Java API (see the sketch after this list).
  • Integrated MapReduce with HBase to bulk import large amounts of data into HBase using MapReduce programs.
  • Responsible for converting ETL operations to the Hadoop system using Pig Latin operations, transformations and functions.
  • Extracted the needed data from the server into HDFS and bulk loaded the cleaned data into HBase.
  • Handled time series data using HBase to store data and perform time-based analytics, improving query retrieval time.
  • Participated with admins in installing and configuring MapReduce, Hive and HDFS.
  • Implemented a CDH3 Hadoop cluster on CentOS and assisted with performance tuning and monitoring.
  • Used Impala to analyze data ingested into HBase and compute various metrics for reporting on the dashboard.
  • Managed and reviewed Hadoop log files.
  • Involved in review of functional and non-functional requirements.
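
A minimal sketch of an HBase access helper of the kind described above, written against the HBase client API; the sensor_readings table, the d column family and the reverse-timestamp rowkey scheme are hypothetical.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

// Thin helper around the HBase client API for time-series writes and point reads.
object SensorStore {
  private val conf = HBaseConfiguration.create() // picks up hbase-site.xml from the classpath
  private val connection = ConnectionFactory.createConnection(conf)
  private val table = connection.getTable(TableName.valueOf("sensor_readings"))
  private val family = Bytes.toBytes("d")

  // Reverse-timestamp rowkey: the newest readings for a sensor sort first in a scan.
  private def rowKey(sensorId: String, ts: Long): Array[Byte] =
    Bytes.toBytes(s"$sensorId-${Long.MaxValue - ts}")

  def write(sensorId: String, ts: Long, value: Double): Unit = {
    val put = new Put(rowKey(sensorId, ts))
    put.addColumn(family, Bytes.toBytes("value"), Bytes.toBytes(value))
    table.put(put)
  }

  def read(sensorId: String, ts: Long): Option[Double] = {
    val result = table.get(new Get(rowKey(sensorId, ts)))
    Option(result.getValue(family, Bytes.toBytes("value"))).map(bytes => Bytes.toDouble(bytes))
  }

  def close(): Unit = { table.close(); connection.close() }
}
```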

Environment: Hortonworks Hadoop 2.0, EMP, Cloud Infrastructure (Amazon AWS), Java, Python, HBase, Hadoop Ecosystem, Linux, Scala.

Confidential, Philadelphia, PA

Hadoop Developer

Responsibilities:

  • Involved in designing and developing Hadoop MapReduce jobs using the Java Runtime Environment for batch processing to search and match the scores.
  • Involved in developing Hadoop MapReduce jobs for merging and appending the repository data.
  • Worked on developing applications with Hadoop big data technologies: Pig, Hive, MapReduce and Oozie.
  • Enabled speedy reviews and first-mover advantages by using Oozie workflows to automate the data loading process into the Hadoop Distributed File System (HDFS) and by using Pig to preprocess the data.
  • Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive, Sqoop and Flume).
  • Worked on the Oozie workflow engine for job scheduling.
  • Imported and exported large sets of data into and out of HDFS using Sqoop.
  • Used Java to read data from the MySQL database and transfer it to HDFS (see the sketch after this list).
  • Transferred log files from the log-generating servers into HDFS.
  • Read the generated log data from HDFS using advanced HiveQL (serialization/deserialization with SerDes).
  • Executed HiveQL commands on the CLI (command line interface) and transferred the required output data back to HDFS.
  • Worked on Hive partitioning and bucketing concepts and created Hive external and internal tables with partitions.
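
A minimal sketch of this kind of JDBC-to-HDFS transfer, shown in Scala for consistency with the other sketches; the connection URL, credentials, customers table and target path are placeholders.

```scala
import java.sql.DriverManager
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Copy rows from a MySQL table into a delimited file on HDFS.
object MySqlToHdfs {
  def main(args: Array[String]): Unit = {
    val fs = FileSystem.get(new Configuration())                 // uses core-site.xml defaults
    val out = fs.create(new Path("/data/staging/customers.csv")) // hypothetical target path

    // Placeholder connection details; real values would come from configuration.
    val conn = DriverManager.getConnection(
      "jdbc:mysql://dbhost:3306/sales", "etl_user", "secret")
    val rs = conn.createStatement().executeQuery(
      "SELECT id, name, city FROM customers")

    try {
      while (rs.next()) {
        out.writeBytes(s"${rs.getLong("id")},${rs.getString("name")},${rs.getString("city")}\n")
      }
    } finally {
      out.close()
      conn.close()
    }
  }
}
```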

Environment: Hadoop, MapReduce, HDFS, Hive, SQL, Pig, ZooKeeper, MongoDB, CentOS, Cloudera Manager, Sqoop, Oozie, MySQL, HBase, Solr, Java.

Confidential Co, Chicago, IL

Java Developer

Responsibilities:

  • Developed JavaScript behavior code for user interaction.
  • Developed UI screens for a data entry application using a Java GUI.
  • Implemented the project according to the Software Development Life Cycle (SDLC).
  • Developed front-end screens using JSP with tag libraries and HTML pages.
  • Followed coding guidelines and updated status to leads on time.
  • Involved in the requirements gathering, analysis, design, development, testing and maintenance phases of the application.
  • Used core Java concepts such as collections, generics, exception handling, I/O and concurrency to develop business logic.
  • Used JSON for serializing and deserializing data sent to or received from JSP pages.
  • Worked closely with QA, business and architects to resolve defects quickly and meet deadlines.
  • Ensured all open issues and/or risks were documented prior to moving to the next testing stage.
  • Involved in writing integration tests and testing the workflow of the service.
  • Involved in writing JUnit test cases and testing functionality, as well as smoke testing and integration testing.
  • Created style sheets (CSS) to control the look and feel of the entire site.
  • Developed client-side screens using HTML.
  • Used Eclipse as the IDE.
  • Wrote multiple MapReduce programs in Java for data analysis.
  • Involved in submitting and tracking MapReduce jobs using the Job Tracker.
  • Used HTML and CSS as view components in MVC.
  • Verified all entry/exit criteria were completed with appropriate sign-off.
  • The work consisted mainly of parsing data from the source databases into the warehouse.

Environment: Core Java, JavaScript, GUI, HTML, CSS, JUnit, Eclipse, UML, JSON, XML, Web Services, WSDL, Unix, MVC, JSP.

Confidential

Java/J2EE Developer

Responsibilities:

  • Participated in Agile Scrum methodology for application development: analysis, design, coding, and unit and integration testing of business applications in an object-oriented environment.
  • Designed UML (Unified Modeling Language) diagrams for new enhancements.
  • Created requirement documents and designed the requirements using UML diagrams.
  • Used Eclipse for the development, testing and debugging of the application.
  • Used Firebug as a debugger.
  • Implemented services using Core Java, and developed and deployed UI-layer logic for sites using JSP.
  • Extensively used the Java Collections framework and exception handling.
  • Worked with OOP concepts.
  • Developed the application in an Agile environment with constant changes in application scope and deadlines.
  • Worked on database queries against an Oracle instance.
  • Involved in integration and system testing and user acceptance testing (UAT).
  • Supported the application whenever production issues were encountered.
  • Wrote and executed test scripts using JUnit.
  • Involved in bug fixing and production support maintenance; integrated various modules and deployed them on WebSphere.
  • Developed the user interface presentation screens using HTML, XML and CSS.
  • Designed the user interface of the application using HTML, CSS, JSP and JavaScript.
  • Involved in writing complex SQL queries and stored procedures in PL/SQL to access data from the Oracle database.

Environment: Core Java, J2EE, HTML, JSP, CSS, Eclipse, SQL, PL/SQL, Design Patterns, WebSphere Application Server, Tomcat, Web Services, Oracle, XML, Firebug.
