
Sr. Hadoop Developer Resume


Detroit, MI

PROFESSIONAL SUMMARY:

  • Over 8 years of professional IT experience, including 4+ years in the Big Data ecosystem and Java/J2EE related technologies.
  • 4+ years of comprehensive experience implementing complete Hadoop solutions, including HDFS, MapReduce, HBase, Hive, Pig, Impala, Oozie, Sqoop, Spark SQL, YARN/MRv1/MRv2, Flume, Kafka, Solr, ZooKeeper, Storm, MongoDB and Cassandra.
  • Experience using MapReduce to process data for extraction, transformation and aggregation.
  • Experience implementing complex MapReduce algorithms to perform map-side joins using the distributed cache.
  • Experienced in working with the Hive data warehousing infrastructure to analyze large structured datasets.
  • Extended Hive and Pig core functionality by writing custom UDFs.
  • Experienced in handling ETL transformations using Pig Latin scripts, expressions, join operations and custom UDFs for evaluating, filtering and storing data.
  • Hands-on experience writing HiveQL and Spark SQL queries.
  • Experience importing and exporting data between HDFS and traditional databases such as Teradata and Oracle using Sqoop.
  • Working experience writing Sqoop jobs for transferring bulk data between Apache Hadoop and structured data stores.
  • Good knowledge of processing log files to extract data and import it into HDFS using Flume.
  • Expertise and knowledge in using job scheduling and monitoring tools such as Oozie and Airflow.
  • Experience with the Oozie Workflow Engine in scheduling jobs for MapReduce, Pig, Hive and Kafka.
  • Good working experience with Spark (Spark Streaming, Spark SQL), Scala and Kafka.
  • Experience writing producers/consumers and creating messaging-centric applications using Apache Kafka.
  • Knowledge of integrating Apache Storm with Apache Kafka for stream processing.
  • Experience using ZooKeeper to coordinate servers in clusters and maintain data consistency.
  • Expert in handling real-time queries using different NoSQL databases including Cassandra, MongoDB and HBase.
  • Extracted data from MongoDB through Sqoop, placed it in HDFS and processed it.
  • Developed Java APIs for retrieval and analysis of data in NoSQL databases such as HBase and Cassandra.
  • Experience in data modeling and working with Cassandra Query Language (CQL).
  • Good at using Sqoop to load data to and from Cassandra clusters.
  • Experience converting business processes into RDD transformations using Apache Spark and Scala.
  • Excellent understanding of abstraction using Scala and Spark.
  • Knowledge of running Hive queries through Spark SQL within the Spark environment (a brief sketch follows this summary).
  • Good knowledge of using Splunk for data monitoring and visualization.
  • Used the Splunk UI in production support to perform log analysis.
  • Proficient in writing Spark scripts using Scala and Python.
  • Developed shell and Python scripts to address production issues.
  • Experience with SDLC methodologies such as Waterfall and Agile, and with Object-Oriented Analysis and Design (OOAD).
  • Involved in daily Scrum meetings to discuss development progress.
  • Experience with cloud technologies such as Amazon Web Services (AWS).
  • Expertise in Hadoop distributions such as Amazon EMR (on AWS EC2), Cloudera, Hortonworks and MapR.
  • Involved in deploying a content cloud platform on Amazon AWS using EC2, S3 and EBS.
  • Used different Hadoop components in Talend to design the framework.
  • Worked on Talend to run ETL jobs on data in HDFS.
  • Experience working with DB2, Oracle and MySQL databases. Extensive experience with SQL and PL/SQL using tables, triggers, views, packages and stored procedures.
  • Strong hands-on experience in development of client/server applications using Java/J2EE, XML and MVC frameworks such as Spring, Struts and Hibernate.
  • Expertise in tools and utilities such as Eclipse, TOAD for Oracle, Rational Rose (UML), WSAD, RAD, Ant and Maven.
  • Good experience with Windows, Linux and UNIX platforms.
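
As referenced in the Spark SQL bullet above, the following is a minimal sketch of running a Hive query through Spark SQL. It assumes a Spark 2.x build with Hive support on the classpath; the web_logs table and its columns are hypothetical, not taken from any project described in this resume.

```scala
import org.apache.spark.sql.SparkSession

object HiveOnSparkSketch {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport lets Spark SQL resolve tables registered in the Hive metastore
    val spark = SparkSession.builder()
      .appName("hive-on-spark-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical Hive table web_logs(log_date, status, ...): count hits per status per day
    val statusCounts = spark.sql(
      """SELECT log_date, status, COUNT(*) AS hits
        |FROM web_logs
        |GROUP BY log_date, status""".stripMargin)

    statusCounts.show(20)
    spark.stop()
  }
}
```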

TECHNICAL SKILLS:

Programming Languages: C, Java, Scala, SQL, PL/SQL, Python

Distributed File Systems: Apache Hadoop HDFS

Hadoop Distributions: Amazon AWS/EMR, Cloudera, Hortonworks, and MapR

Hadoop Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Azkaban, Oozie, Zookeeper, Flume, Spark SQL and Apache Kafka

NoSQL Databases: Cassandra, HBase, MongoDB

Databases: Oracle, MySQL.

Search Platforms: Apache Solr

In-Memory/Stream Processing: Apache Spark, Apache Spark Streaming, Apache Storm

Operating Systems: Windows, UNIX, LINUX

Cloud Platforms: Amazon AWS, OpenStack.

Application Servers: JBoss, Tomcat, WebLogic, WebSphere

Web Services: SOAP, REST, WSDL, JAXB, and JAXP

Frameworks: Hibernate, Spring, Struts, JMS, EJB

PROFESSIONAL EXPERIENCE:

Confidential, Detroit, MI

Sr. Hadoop Developer

Responsibilities:

  • Involved in various phases of development, analyzing and developing the system following the Agile Scrum methodology.
  • Exported the analyzed data from HDFS to relational databases (MySQL, Oracle) using Sqoop.
  • Developed data pipelines using Flume, Sqoop, Pig and Java MapReduce to ingest data into HDFS for analysis.
  • Used Flume to dump the application server logs into HDFS.
  • Responsible for writing MapReduce jobs to handle files in multiple formats (JSON, text, XML, etc.).
  • Worked on reading multiple data formats on HDFS using Scala.
  • Extensively worked on combiners, partitioning and the distributed cache to improve the performance of MapReduce jobs.
  • Developed Spark jobs using Scala in the test environment for faster data processing and used Spark SQL for querying.
  • Designed the technical solution for real-time analytics using Kafka and HBase.
  • Streamed real-time data using Spark with Kafka.
  • Implemented a Spark solution to enable real-time reports from Cassandra data.
  • Configured Spark Streaming to receive real-time data from Kafka and store it in HDFS (see the sketch after this list).
  • Used Kafka to load data into HDFS and move data into NoSQL databases (Cassandra).
  • Built Cassandra clusters on both physical machines and AWS.
  • Imported data from different sources such as AWS S3 and the local file system into Spark RDDs.
  • Experience developing and maintaining applications written for Amazon Simple Storage Service (S3), AWS Elastic Beanstalk and AWS CloudFormation.
  • Along with the infrastructure team, involved in designing and developing a Kafka and Storm based data pipeline.
  • Implemented a messaging system for different data sources using Apache Kafka and configured high-level consumers for online and offline processing.
  • Worked on Talend Metadata Manager; analyzed and implemented different use cases for handling various types of metadata.
  • Experience in improving the performance of Talend jobs.
  • Implemented UDFs, UDAFs and UDTFs in Java for Hive to process data that cannot be handled by Hive's built-in functions.
  • Wrote Impala scripts for ETL.
  • Used Impala to read, write and query data in HDFS.
  • Experience migrating HiveQL to Impala to minimize query response time.
  • Involved in creating Hive internal/external tables, loading them with data and troubleshooting Hive jobs.
  • Created partitioned tables in Hive for best performance and faster querying.
  • Involved in integrating Hive queries into the Spark environment using Spark SQL.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Extracted data from MySQL into HDFS using Sqoop.
  • Developed Pig Latin scripts for extracting data.
  • Extensively used Pig for data cleansing and Hive queries for the analysts.
  • Created Pig script jobs with a focus on query optimization.
  • Wrote shell and Python scripts for job automation.
  • Involved in daily Scrum meetings to discuss development progress.
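
A minimal sketch of the Kafka-to-HDFS flow referenced in the Spark Streaming bullet above, assuming the spark-streaming-kafka-0-10 integration; the broker list, topic name, consumer group and output path are placeholders rather than values from the original project.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

object KafkaToHdfsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-to-hdfs-sketch")
    val ssc  = new StreamingContext(conf, Seconds(30)) // 30-second micro-batches

    // Placeholder Kafka connection settings
    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092,broker2:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "hdfs-ingest",
      "auto.offset.reset"  -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean))

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("app-events"), kafkaParams))

    // Persist each non-empty batch of raw messages to a time-stamped HDFS directory
    stream.map(_.value).foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        rdd.saveAsTextFile(s"hdfs:///data/raw/app-events/${System.currentTimeMillis()}")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```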

Environment: Hadoop, Java, MapReduce, HDFS, Hive, Pig, Sqoop, Flume, Linux, Python, Spark, Impala, Scala, Kafka, Storm, Shell Scripting, XML, Eclipse, Cloudera, DB2, SQL Server, MySQL, Autosys, Talend, AWS, HBase.

Confidential, Champaign, IL

 Hadoop Developer

Responsibilities:

  • Actively participated with the development team to meet specific customer requirements and proposed effective Hadoop solutions.
  • Ingested structured data into the data lake from RDBMS sources using Sqoop jobs scheduled through Oozie workflows for incremental loads.
  • Ingested streaming data into the data lake using Flume.
  • Hands-on work with log files to extract data and copy it into HDFS using Flume.
  • Developed a data pipeline using Flume to ingest customer behavioral data into HDFS for analysis.
  • Implemented custom interceptors for Flume to filter data and defined channel selectors to multiplex the data into different sinks.
  • Developed MapReduce programs in Java that run on the Hadoop cluster.
  • Used the Avro data serialization system and Avro tools to handle Avro data files in MapReduce programs.
  • Implemented data validation using MapReduce programs to remove unnecessary records before moving data into Hive tables.
  • Implemented UDFs, UDAFs and UDTFs in Java for Hive to process data that cannot be handled by Hive's built-in functions.
  • Implemented optimized map-side joins to combine data from different data sources and clean it.
  • Designed and implemented custom writables, custom input formats, custom partitioners and custom comparators.
  • Involved in creating Hive internal/external tables, loading them with data and troubleshooting Hive jobs.
  • Used the RegEx, JSON and Avro SerDes packaged with Hive for serialization and de-serialization to parse the contents of streamed log data, and implemented Hive custom UDFs.
  • Experienced in using the Hive ORC format for better columnar storage, compression and processing.
  • Wrote Pig scripts for advanced analytics on the data for recommendations.
  • Performed update, insert and delete operations on data in MongoDB.
  • Extracted data from MongoDB through Sqoop, placed it in HDFS and processed it.
  • Wrote queries in MongoDB to generate reports displayed in the dashboard.
  • Experience deploying, managing and developing MongoDB clusters.
  • Worked on MongoDB database concepts such as locking, transactions, indexes, sharding, replication and schema design.
  • Involved in converting business transformations into Spark RDD operations using Scala (see the sketch after this list).
  • Involved in integrating Hive queries into the Spark environment using Spark SQL.
  • Computed complex logic and controlled data flow through the in-memory processing tool Apache Spark.
  • Experienced in configuring workflows using Oozie.
  • Involved in deploying multi-module applications using Maven and Jenkins.
  • Experienced in working in an agile environment and in on-site/offshore coordination.
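
A minimal, hypothetical sketch of expressing a business transformation as Spark RDD operations in Scala, in the spirit of the RDD conversion work above; the Txn record layout, file paths and "ONLINE" filter rule are invented for illustration only.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical record layout for a customer transaction feed
case class Txn(customerId: String, amount: Double, channel: String)

object TxnRddSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("txn-rdd-sketch"))

    // Parse pipe-delimited lines, keep records with exactly three fields and a
    // numeric amount, then total ONLINE spend per customer
    val totals = sc.textFile("hdfs:///data/txns/*.txt")
      .map(_.split('|'))
      .collect { case Array(id, amt, ch) if amt.matches("""-?\d+(\.\d+)?""") =>
        Txn(id, amt.toDouble, ch)
      }
      .filter(_.channel == "ONLINE")
      .map(t => (t.customerId, t.amount))
      .reduceByKey(_ + _)

    totals.saveAsTextFile("hdfs:///data/out/online_totals")
    sc.stop()
  }
}
```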

Environment: Hadoop Framework, MapReduce, Hive, Sqoop, Pig, HBase, Cassandra, Apache Kafka, MongoDB, Apache Spark, Storm, Flume, Oozie, Maven, Jenkins, Java (JDK 1.6), UNIX Shell Scripting, Oracle 11g/12c.

Confidential, Oklahoma City, OK

Hadoop Developer

Responsibilities:

  • Developed Pig scripts, Pig UDFs, Hive scripts and Hive UDFs to analyze HDFS data.
  • Used Sqoop to export data from HDFS to the RDBMS.
  • Experience with Hadoop ecosystem components HDFS, MapReduce, Hive, Pig, Sqoop and HBase.
  • Expertise with web-based GUI architecture and development using HTML, CSS, AJAX, jQuery, AngularJS and JavaScript.
  • Developed MapReduce programs for refined queries on big data.
  • Involved in loading data from the UNIX file system to HDFS (a brief sketch follows this list).
  • Used Pig as an ETL tool to perform transformations, event joins, filtering and pre-aggregations before storing the data in HDFS.
  • Extracted data from databases into HDFS using Sqoop.
  • Handled importing of data from various data sources, performed transformations using Hive and Pig, and loaded data into HDFS.
  • Used Pig's predefined functions to convert fixed-width files to delimited files.
  • Used Hive join queries to join multiple tables of a source system and load them into Elasticsearch tables.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Experienced in managing and reviewing Hadoop log files.
  • Developed Sqoop scripts to import and export data from relational sources and handled incremental loading of customer and transaction data by date.
  • Tuned MapReduce configurations to optimize job run times.
  • Developed shell scripts to automate the cluster installation.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
  • Created Java operators to process data in DAG streams and load data into HDFS.
  • Developed custom input formats to implement custom record readers for different datasets.
  • Automated jobs for pulling data from the FTP server and loading it into Hive tables using Oozie workflows.
  • Experienced in using the Java REST API to perform CRUD operations on HBase data.
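
A minimal sketch of loading a file from the local UNIX file system into HDFS with the Hadoop FileSystem Java API, called here from Scala, as referenced in the list above; the source and target paths are placeholders, and the same copy is often done from a shell script with hdfs dfs -put.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object LocalToHdfsSketch {
  def main(args: Array[String]): Unit = {
    // Picks up fs.defaultFS from core-site.xml on the classpath
    val conf = new Configuration()
    val fs   = FileSystem.get(conf)

    // Placeholder paths: a staging file on the local UNIX file system
    // and the landing directory in HDFS
    val local  = new Path("file:///data/staging/daily_extract.csv")
    val target = new Path("/user/etl/landing/daily_extract.csv")

    // delSrc = false keeps the local copy; overwrite = true replaces any existing file
    fs.copyFromLocalFile(false, true, local, target)
    fs.close()
  }
}
```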

Environment: Hadoop, MapReduce, HDFS, HBase, Hive, Java, SQL, Cloudera Manager, Sqoop, Flume, Oozie, Shell Scripts, Java JDK 1.6, Eclipse.

Confidential

Java Developer

Responsibilities:

  • Coded front-end components using HTML, JavaScript and jQuery; back-end components using Java, Spring and Hibernate; service-oriented components using RESTful and SOAP-based web services; and rules-based components using JBoss Drools.
  • Developed the presentation layer using JSP, HTML, CSS and jQuery.
  • Developed JSP custom tags for the front end.
  • Wrote JavaScript code for input validation.
  • Used the Apache CXF open source tool to generate Java stubs from WSDL.
  • Developed and consumed SOAP web services using the JBoss ESB framework.
  • Developed the web services client using SOAP and WSDL descriptions to verify the credit history of new customers before providing a connection.
  • Developed RESTful web services within the ESB framework and used content-based routing to route messages through the ESB.
  • Designed and developed Java batch programs in Spring Batch.
  • Followed test-driven development by maintaining JUnit and FlexUnit test cases throughout the application.
  • Developed stand-alone Java batch applications with Spring and Hibernate.
  • Involved in writing the database schema through Hibernate.
  • Designed and developed the DAO layer to Hibernate standards to access data from IBM DB2.
  • Developed the UI panels using JSF, XHTML, CSS, Dojo and jQuery.

Environment: Java 6 - JDK 1.6, JEE, Spring 3.1 framework, Spring Model View Controller (MVC), Java Server Pages (JSP) 2.0, Servlets 3.0, JDBC 4.0, AJAX, RESTful and SOAP web services, JSON, JavaBeans, jQuery, JavaScript, Oracle 10g, JUnit, HtmlUnit, XSLT, HTML/DHTML.

Confidential

Java Developer

Responsibilities:

  • Performed requirements gathering and analysis and prepared the requirements specification document.
  • Provided high-level systems design, specifying class diagrams, sequence diagrams and activity diagrams.
  • Developed a web-based system with HTML5, XHTML, JSTL, custom tags and Tiles using the Struts framework.
  • Involved in implementing the presentation layer logic using HTML, CSS, JavaScript and XHTML.
  • Used Asynchronous JavaScript and XML (AJAX) for a more interactive front end.
  • Responsible for development of the core backend business logic using Java.
  • Developed Servlets, Action classes and ActionForm classes and configured the struts-config.xml file.
  • Performed client-side validations using the Struts Validation Framework.
  • Provided on call support based on the priority of the issues.

Environment: Core Java, Servlets, Struts, JSP, XML, XSLT, JavaScript.
