Hadoop Developer Resume

Boston, MA

SUMMARY

  • Around 8 years of experience in full software development life cycle from concept through the delivery of applications and customizable solutions with emphasis on Object Oriented Programming, Java/J2EE, SQL and Hadoop / Big Data technologies.
  • Excellent understanding of Big Data and Hadoop Ecosystems.
  • Hands-on experience in setting up and configuring Apache Hadoop and Cloudera CDH clusters on Ubuntu and Red Hat Linux distributions.
  • Experience in importing and exporting data between relational databases and HDFS using Sqoop.
  • Knowledge of professional software engineering practices and SDLC best practices, including coding standards, code reviews, source control management, build processes, testing, and operations.
  • Hands-on experience with Hadoop ecosystem components (HDFS, MapReduce, Pig, Hive, HBase, Sqoop, Flume, Oozie, Spark (Python), Storm, MongoDB, Cassandra).
  • Involved in installing and configuring Hadoop ecosystem components alongside the Hadoop administrator.
  • Hands-on experience in performing analytics on structured data in Hive using HiveQL queries, views, partitioning, bucketing, and UDFs.
  • Tuned Hive queries and Java MapReduce programs to reduce execution time and improve scalability.
  • Extensive experience in configuring Flume to stream data into HDFS.
  • Hands-on experience with Teradata migration to the Hadoop platform.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Worked efficiently with both structured and unstructured data.
  • Experience in managing and reviewing Hadoop log files.
  • Hands-on experience in Python scripting, Jenkins deployments, and Linux shell scripting.
  • Automated Spark SQL scripts using Unix shell scripts and optimized Hive queries for better performance.
  • Performed Data Ingestion to Hadoop file system from different data sources.
  • Analyzed different file formats and large data sets by running Hive queries and Pig scripts.
  • Collected streaming data (log data) with Apache Flume and stored it in HDFS.
  • Extensively used Apache Sqoop for efficiently transferring bulk data between Apache Hadoop and relational databases (Teradata).
  • Responsible for creating, modifying and deleting topics (Kafka Queues) as and when required by the Business team.
  • Developed test cases and POCs to benchmark and verify data flow through the Kafka clusters.
  • Automated Sqoop, Hive, and Pig jobs using Oozie scheduling.
  • Extensive knowledge of NoSQL databases such as HBase, MongoDB, and Cassandra.
  • Good knowledge of writing and using user-defined functions in Hive, Pig, and MapReduce.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Able to translate complex business requirements into technical specifications.
  • Familiar with version control tools such as Git.
  • Good working experience with different operating systems, including UNIX/Linux, Mac OS X, and Windows.
  • Strong documentation skills, including design documents with UML diagrams.
  • Experienced in working with senior level managers, business people and developers across multiple disciplines.
  • Held weekly meetings with technical collaborators and actively participated in data warehousing and data analysis according to customer needs.
  • Strong written and oral communication.
  • Ability to quickly learn and adapt to the new working environment and emerging new technologies.

TECHNICAL SKILLS

Big Data/Hadoop: Hadoop Ecosystem (Hive, Pig, Sqoop, Flume, ZooKeeper, Oozie, MR, HBase), Kafka, Storm and Spark (Spark SQL, Scala)

Scripting Language: Unix Shell Scripting, Python

Methodologies: Agile, Waterfall model

Technologies: Hadoop, Spark, Scala, DB2, Core Java, JDBC, JavaScript, SQL

Database: Teradata, SQL, MySQL, DB2, HBase, Cassandra

Servers: Tomcat

IDE: Eclipse, NetBeans

PROFESSIONAL EXPERIENCE

Confidential, Boston, MA

Hadoop Developer

Responsibilities:

  • Involved in the complete SDLC of a big data project, including requirement analysis, design, coding, testing, and production.
  • Used Sqoop extensively to import and export data between RDBMS and Hive tables, including incremental imports, and created saved Sqoop jobs that track the last imported value.
  • Created Talend workflows to configure various jobs using Big Data connectors such as Hive and HBase, providing a clean data abstraction between producers and consumers.
  • Created Map Reduce programs in order to analyze data and used Pig Latin to transform data.
  • Installed and configured Hive and wrote Hive Generic UDF to successfully implement business requirements.
  • Involved in creating hive tables, loading data into tables and writing Hive queries.
  • Used different compression techniques, such as LZO and Snappy, in Hive tables to reduce disk usage and optimize data transfer over the network.
  • Implemented custom interceptors for Flume to filter data and defined channel selectors to multiplex the data into different sinks.
  • Implemented a POC to migrate MapReduce jobs to Spark RDD transformations using Scala.
  • Developed scripts and AutoSys jobs to schedule an Oozie bundle (a group of coordinators) consisting of various Hadoop programs.
  • Hands-on experience accessing HBase data and performing CRUD operations against it using Talend.

Environment: Hadoop, Spark, HDFS, MapReduce, HBase, Talend, Hive, Flume, Sqoop, Pig, CDH

Confidential, Los Angeles, CA

Hadoop Developer

Responsibilities:

  • Worked extensively in creating MapReduce jobs to power data for search and aggregation
  • Designed a data warehouse using Hive
  • Worked extensively with Sqoop for importing metadata from Oracle
  • Extensively used Pig for data cleansing
  • Created partitioned tables in Hive
  • Worked with business teams and created Hive queries for ad hoc access.
  • Evaluated usage of Oozie for Workflow Orchestration
  • Mentored the analyst and test teams in writing Hive queries
  • Gained very good business knowledge on health insurance, claim processing, fraud suspect identification, appeals process etc.

Environment: Hadoop, MapReduce, HDFS, Hive, Java (jdk1.6), Hadoop distribution of Hortonworks, Oozie, Oracle 11g/10g.

Confidential

Java Developer

Responsibilities:

  • Involved in design and development phases of Software Development Life Cycle (SDLC).
  • Applied multithreading concepts in Java classes to avoid deadlocks.
  • Involved in High Level Design and prepared Logical view of the application.
  • Involved in object-oriented design and development using UML; created Use Case, Class, and Sequence diagrams; and took part in the complete development, testing, and maintenance process of the application.
  • Created Core Java interfaces and abstract classes for different functionalities.
  • Responsible for analysis, design, development, and integration of UI components with the backend using J2EE technologies such as Servlets, JSP, and JDBC.

Confidential

Application Developer

Responsibilities:

  • Worked on DB2 and SQL for FSDB.
  • Involved in source analysis and Inventory Phase of the Project.
  • Involved in Coding new modules, bug fixing, testing of Jobs and ABEND handling.
  • Involved in preparing the project report and conducted knowledge transfer (KT) sessions for new members of the team.
  • Involved in Unit testing, System Testing, UAT, Integration Testing, Regression Testing and Deployments.
  • Involved in distribution & management of project work with other vendors like Accenture.