Hadoop Developer Resume

Oklahoma City, OK

SUMMARY:

  • Over 6 years of work experience in application and product development across the full SDLC, primarily using Hadoop, Java/J2EE, Mainframe, and ETL technologies.
  • Good experience with the Hadoop stack on Cloudera CDH and Core Java, including 2 years of comprehensive experience with the Hadoop ecosystem: MapReduce, HDFS, Spark, and AWS.
  • Passionate about working in Big Data and analytics environments.
  • Proven skills in establishing strategic direction while remaining technically strong in design, implementation, and deployment; collected and translated business requirements into robust, scalable distributed architectures.
  • Experience writing MapReduce programs on Apache Hadoop to work with Big Data.
  • Experience in installing, configuring, supporting, and monitoring Hadoop clusters using Apache and Cloudera distributions and AWS.
  • In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
  • Experience in using Pig, Hive, Sqoop, HBase, and Cloudera Manager.
  • Extensive experience with big data ETL and query tools such as Pig Latin and HiveQL.
  • Worked on the Kafka messaging system; able to ingest data from Kafka into Spark.
  • Hands-on experience with big data ingestion tools such as Flume and Sqoop.
  • Experience in importing and exporting data with Sqoop from HDFS to relational database systems and vice versa.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Expertise in writing Hadoop jobs for analyzing data using Hive and Pig.
  • Extended Hive and Pig core functionality by writing custom UDFs; a minimal Java sketch follows this summary.
  • Experience with the Cloudera CDH3, CDH4, and CDH5 distributions.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Experienced in integrating various data sources and formats, including Java 1.5 applications, RDBMS, shell scripting, spreadsheets, and text files.
  • Familiar with the Java virtual machine (JVM) and multi-threaded processing.
  • Set up standards and processes for Hadoop-based application design and implementation.
  • Experience in managing and reviewing Hadoop log files.
  • Extensive experience with SQL, PL/SQL, and database concepts.
  • Worked on NoSQL databases, including HBase and Cassandra.
  • Knowledge of job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
  • Experience in developing solutions to analyze large data sets efficiently.
  • Experience in designing, developing, and implementing connectivity products that allow efficient exchange of data between our core database engine and the Hadoop ecosystem.
  • Good understanding of XML methodologies (XML, XSL, XSD), including Web Services.
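
As a concrete illustration of the custom UDF work noted above, here is a minimal sketch of a Hive UDF in Java; the class name and its behavior (trimming and lower-casing a string column) are illustrative assumptions, not code from an actual engagement.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal Hive UDF sketch: trims and lower-cases a string column.
// Class name and behavior are illustrative, not from a real project.
public final class LowerTrimUDF extends UDF {
    public Text evaluate(final Text input) {
        if (input == null) {
            return null; // preserve SQL NULL semantics
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}

A UDF like this would typically be packaged into a JAR, registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION, and then called like any built-in function in HiveQL.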

TECHNICAL SKILLS:

Hadoop / Spark / AWS: Hive, Sqoop, Pig, Puppet, Ambari, HBase, MongoDB, Cassandra, PowerPivot, Flume, Spark, AWS, Apache Storm

Java & J2EE Technologies: Core Java 1.5, Servlets 2.4

Operating Systems: Windows 95/98/2000/XP/Vista/7/8, Unix, Linux, Solaris

IDE Tools: Eclipse 3.2.2, NetBeans 6.1, RSA, RAD, Oracle WebLogic Workshop

Methodologies: Agile/Scrum, Waterfall

Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL

Programming or Scripting Languages: C, Java, SQL, Unix Shell Scripting, Python, Scala

Database: Oracle 11g/10g/9i, MySQL, Teradata, MS SQL Server

PROFESSIONAL EXPERIENCE:

Confidential, Oklahoma City, OK

Hadoop Developer

Responsibilities:

  • Participated in requirement gathering and converted the requirements into technical specifications.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Wrote streaming MapReduce programs in Java.
  • Analyzed log files through Hive, loaded JSON-format data into Hive, and worked on external and internal tables and Hive optimization techniques.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Set up instances on AWS and connected to them.
  • Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response; a minimal Java sketch follows this list.
  • Stored data in Avro and Parquet formats.
  • Deployed instances and connected to them from the console.
  • Worked on Spark SQL and created data warehouses in both Spark and Hive.
  • Wrote Spark RDDs to Hive through Spark SQL.
  • Worked on Hadoop security using Kerberos.
  • Responsible for managing data from multiple sources.
  • Worked on web server logs and created data pipelines.
  • Extracted data from Cassandra through Sqoop, placed it in HDFS, and processed it.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Responsible for understanding the input feeds and expected outputs.
  • Responsible for overseeing and writing scripts that process data and make it ready for the analysts.
  • Gained good experience with NoSQL databases.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Responsible for loading data from Linux file systems to HDFS.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
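
As referenced in the Spark RDD bullet above, here is a minimal sketch of loading web server logs into a Spark RDD and computing counts in memory with the Spark 1.x Java API; the HDFS paths, the log field layout, and the use of Java 8 lambdas for brevity are illustrative assumptions.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

// Minimal Spark sketch: load web server logs from HDFS into an RDD,
// cache them in memory, and count hits per HTTP status code.
public class LogStatusCounts {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("LogStatusCounts");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> logs = sc.textFile("hdfs:///data/weblogs/*.log");
        logs.cache(); // keep the RDD in memory for repeated computation

        JavaPairRDD<String, Integer> hitsByStatus = logs
                .map(line -> line.split(" "))
                .filter(fields -> fields.length > 8)             // skip malformed lines
                .mapToPair(fields -> new Tuple2<>(fields[8], 1)) // status code field
                .reduceByKey((a, b) -> a + b);

        hitsByStatus.saveAsTextFile("hdfs:///output/hits_by_status");
        sc.stop();
    }
}

The same pair RDD could equally be registered as a table through Spark SQL and written to Hive, matching the Spark SQL bullets above.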

Environment: Hadoop, Java (JDK 1.6), Hive, Pig, Sqoop, MapReduce, flat files, Oracle 11g/10g, MySQL, Linux, Spark, AWS

Confidential, Boston, MA

Hadoop Developer

Responsibilities:
  • Worked on the proof of concept for the Apache Hadoop framework initiation.
  • Involved in various phases of the Software Development Life Cycle.
  • Worked closely with various levels of individuals to coordinate and prioritize multiple projects.
  • Worked in the BI team on Big Data Hadoop cluster implementation and data integration, developing large-scale system software.
  • Involved in source system analysis, data analysis, and data modeling for ETL (Extract, Transform, and Load).
  • Worked on tuning Hive and Pig scripts to improve performance.
  • Worked extensively on creating MapReduce jobs to power data for search and aggregation; a minimal Java sketch follows this list.
  • Designed a data warehouse using Hive.
  • Developed Oozie workflows for scheduling and orchestrating the ETL process.
  • Handled structured and unstructured data and applied ETL processes.
  • Moved data between database systems/mainframe and HDFS in both directions.
  • Extensively used Pig for data cleansing and created partitioned tables in Hive.
  • Developed Pig Latin scripts to extract the data from the web server output files and load it into HDFS.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Developed Hive queries for the analysts.
  • Involved in database migrations to transfer data from one database to another and in the complete virtualization of many client applications.
  • Wrote build scripts using Ant and participated in the deployment of one or more production systems.
  • Provided production rollout support, which included monitoring the solution post go-live and resolving any issues discovered by the client and client services teams.
  • Designed and documented operational problems following standards and procedures, using the software reporting tool JIRA.
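
As referenced in the MapReduce bullet above, here is a minimal sketch of an aggregation-style MapReduce job in Java; the record layout (key in the first tab-separated field) and the input/output paths are illustrative assumptions.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Minimal MapReduce aggregation sketch: count records per key,
// where the key is the first tab-separated field of each line.
public class KeyCountJob {
    public static class KeyMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text outKey = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            outKey.set(line.toString().split("\t")[0]);
            ctx.write(outKey, ONE);
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> counts, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) {
                sum += c.get();
            }
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "key-count");
        job.setJarByClass(KeyCountJob.class);
        job.setMapperClass(KeyMapper.class);
        job.setCombinerClass(SumReducer.class); // pre-aggregate on the map side
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}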

Environment: Apache Hadoop, Java (JDK 1.6), DataStax, flat files, Oracle 11g/10g, MySQL, Toad 9.6, Windows XP, UNIX, Sqoop, Hive, Oozie.

Confidential, Dallas, TX

Hadoop Developer

Responsibilities:
  • Imported and exported data into HDFS, Hive, and Pig using Sqoop.
  • Responsible for managing data coming from different sources.
  • Migrated databases, logic, and reporting systems built in Access and Excel to long-term automated systems using Spark.
  • Built data pipelines using Pig and Java/Scala MapReduce to store data onto HDFS.
  • Designed an ETL data pipeline flow to ingest data from RDBMS sources into Hadoop using shell scripts, Sqoop, packages, and MySQL.
  • Developed business logic using Scala.
  • Responsible for loading data from UNIX file systems to HDFS; installed and configured Hive and wrote Pig/Hive UDFs.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
  • Wrote MapReduce (Hadoop) programs to convert text files into Avro and load them into Hive tables.
  • Implemented workflows using the Apache Oozie framework to automate tasks.
  • Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
  • Developed design documents considering all possible approaches and identifying the best of them.
  • Loaded data into HBase using both bulk and non-bulk loads; a minimal Java sketch of the non-bulk path follows this list.
  • Developed scripts and automated end-to-end data management and synchronization between all the clusters.
  • Implemented MapReduce programs to handle semi-structured and unstructured data such as XML, JSON, and sequence files from log files.
  • Fine-tuned Pig queries for better performance.
  • Involved in writing shell scripts to export log files to the Hadoop cluster through an automated process.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
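
As referenced in the HBase load bullet above, here is a minimal sketch of the non-bulk load path using the HBase 1.x Java client Put API; the table, column family, and qualifier names are illustrative assumptions.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Minimal non-bulk HBase load sketch: write one row via the client Put API.
// Table name, column family, and qualifier are illustrative assumptions.
public class HBasePutExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("web_events"))) {
            Put put = new Put(Bytes.toBytes("row-0001"));
            put.addColumn(Bytes.toBytes("d"),            // column family
                          Bytes.toBytes("url"),          // qualifier
                          Bytes.toBytes("/index.html")); // value
            table.put(put);
        }
    }
}

A bulk load, by contrast, stages HFiles (for example via ImportTsv or a MapReduce job writing through HFileOutputFormat) and hands them to the region servers with completebulkload, bypassing the normal write path.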

Confidential

Java Developer

Responsibilities:

  • Responsible for design, development, Java application architecture, use cases, flowcharts, application flow, prototypes, and proof-of-concept sample code.
  • Responsible for writing the detailed design specification document and implementing all business rules.
  • Designed and developed web pages using HTML and JSPs with JSTL tags.
  • Wrote a data access component to perform DML operations using JDBC; a minimal Java sketch follows this list.
  • Developed complex PL/SQL queries and utilized stored procedures and triggers to interact with the Oracle database.
  • Involved in regression testing of the application using JUnit.
  • Wrote client-side form-based validations using JavaScript.
  • Developed scripts for EJB deployment, build releases, and generating daily logs on both NT and UNIX; involved in writing complicated queries and stored procedures using SQL, PL/SQL, and Oracle.
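
As referenced in the JDBC bullet above, here is a minimal sketch of a data access component performing a DML operation with JDBC; the connection URL, credentials, and table/column names are illustrative assumptions, and try-with-resources (Java 7+) is used for brevity where the original JDK 1.3 code would have used explicit finally blocks.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Minimal JDBC DML sketch: parameterized INSERT against an Oracle database.
// URL, credentials, and schema are illustrative assumptions.
public class CustomerDao {
    public void insertCustomer(int id, String name) throws Exception {
        Class.forName("oracle.jdbc.driver.OracleDriver"); // pre-JDBC-4 driver loading
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@localhost:1521:ORCL", "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO customers (id, name) VALUES (?, ?)")) {
            ps.setInt(1, id);
            ps.setString(2, name);
            ps.executeUpdate();
        }
    }
}

Using PreparedStatement with bind parameters, rather than string concatenation, keeps the DML safe from SQL injection and lets the database reuse the statement plan.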

Environment: Java (JDK 1.3.x), EJB 1.1, JSP, WebSphere 4.x/5.x, Eclipse 3.1, WSAD 4.0/5.x, Oracle 8i on Windows 2000 and UNIX environments.
