
Hadoop Developer Resume


Dublin, OH

SUMMARY

  • 8+ years of professional IT experience in analyzing requirements and designing and building highly distributed, mission-critical products and applications.
  • 3+ years of data analytics experience on Apache Hadoop using the Cloudera and Hortonworks distributions.
  • Expertise in core Hadoop and the Hadoop technology stack, which includes HDFS, MapReduce, Oozie, Hive, Sqoop, Pig, Flume, HBase, Spark, Storm, Kafka and Zookeeper.
  • Experience in the AWS cloud environment, including S3 storage, EC2 instances and application deployment.
  • In - depth knowledge of Statistics, Machine Learning, Data mining.
  • Well versed in installation, configuration, supporting and managing of Big Data and underlying infrastructure of Hadoop Cluster.
  • Experienced in implementing complex algorithms on semi-structured and unstructured data using MapReduce programs.
  • Experienced in working with structured data using HiveQL, join operations, Hive UDFs, partitions, bucketing and internal/external tables.
  • Experienced in migrating ETL-style operations using Pig transformations, operators and UDFs.
  • Good knowledge on Python.
  • Used Spark Streaming to collect data from Kafka in near real time and perform the necessary transformations and aggregations on the fly to build the common learner data model, persisting the data in a NoSQL store (HBase); a sketch of this pattern appears at the end of this summary.
  • Specialization in data ingestion, processing and development from various RDBMS data sources into a Hadoop cluster using MapReduce, Pig, Hive and Sqoop.
  • Configured different topologies for the Storm cluster and deployed them on a regular basis.
  • Experienced in implementing a unified data platform to ingest data from different sources using Apache Kafka brokers, clusters, and Java producers and consumers.
  • Excellent Working Knowledge in Spark Core, Spark SQL, Spark Streaming.
  • Experienced in working with in-memory processing frameworks such as Spark, including Spark transformations, Spark SQL and Spark Streaming.
  • Experienced in providing user-based recommendations by implementing collaborative filtering and matrix factorization, and classification techniques such as random forest, SVM and k-NN using the Spark MLlib library.
  • Excellent understanding and knowledge of NoSQL databases like HBase, Cassandra and MongoDB, as well as Teradata and data warehousing.
  • Installed and configured Cassandra; good knowledge of the Cassandra architecture, read and write paths, and querying.
  • Involved in NoSQL (DataStax Cassandra) database design, integration and implementation, and wrote scripts and invoked them using cqlsh.
  • Involved in data modeling in Cassandra and in implementing sharding and replication strategies in MongoDB.
  • Developed fan-out workflows using Flume to ingest data from various sources such as web servers and REST APIs, landing the data in Hadoop via the HDFS sink.
  • Experienced in implementing custom Flume interceptors and serializers for specific customer requirements.
  • Experience in setting up standards and processes for Hadoop-based application design and implementation.
  • Worked on importing and exporting data between MySQL and HDFS, Hive and HBase using Sqoop.
  • Monitored log input from several data centers via Spark Streaming; the data was analyzed in Apache Storm, then parsed and saved into Cassandra.
  • Experience in importing and exporting data using Sqoop between HDFS and relational database systems such as MySQL.
  • Experience in developing strategies for Extraction, Transformation and Loading (ETL) of data from various sources into data warehouses and data marts using Informatica.
  • Excellent understanding / knowledge of Hadoop architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node and MapReduce programming paradigm.
  • Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications and HDFS.
  • Experience in managing Hadoop clusters using Cloudera Manager Tool.
  • Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
  • Experience in administering, installation, configuration, troubleshooting, security, backup, performance monitoring and fine tuning of Red Hat Linux.
  • Worked on cluster coordination services through ZooKeeper.
  • Actively involved in coding using core Java and collection APIs such as Lists, Sets and Maps.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Experience on different operating systems like UNIX, Linux and Windows.
  • Experience with Java multithreading, collections, interfaces, synchronization and exception handling.
  • Involved in writing PL/SQL stored procedures, triggers and complex queries.
  • Worked in Agile environment with active scrum participation.
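
The Kafka-to-HBase streaming pattern referenced above can be sketched as follows. This is a minimal, illustrative Java example against the Spark 1.x / Kafka 0.8 direct-stream API; the broker list, topic name, batch interval and record layout are assumptions, and the production job persisted the per-batch results to HBase through the HBase client Put API rather than printing them.

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import kafka.serializer.StringDecoder;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

import scala.Tuple2;

public class LearnerEventStream {
    public static void main(String[] args) throws Exception {
        // Master and other cluster settings are supplied by spark-submit.
        SparkConf conf = new SparkConf().setAppName("learner-event-stream");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, String> kafkaParams = new HashMap<>();
        kafkaParams.put("metadata.broker.list", "broker1:9092,broker2:9092"); // assumed brokers
        Set<String> topics = new HashSet<>(Arrays.asList("learner-events"));  // assumed topic

        // Direct (receiver-less) stream of (key, value) records from Kafka.
        JavaPairInputDStream<String, String> events = KafkaUtils.createDirectStream(
                jssc, String.class, String.class,
                StringDecoder.class, StringDecoder.class, kafkaParams, topics);

        // Aggregate on the fly: count events per learner id (assumed to be the
        // first comma-separated field of each record) within every batch.
        JavaPairDStream<String, Long> perLearner = events
                .mapToPair(record -> new Tuple2<>(record._2().split(",")[0], 1L))
                .reduceByKey(Long::sum);

        // Print the per-batch counts; the real job wrote them to HBase instead.
        perLearner.print();

        jssc.start();
        jssc.awaitTermination();
    }
}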

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, MongoDB, Cassandra, Flume, Oozie, Zookeeper, AWS, Spark, Kafka, Teradata, Storm, ETL, Informatica.

Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, Java Beans, Maven, Gradle, JUnit, TestNG.

IDEs: Eclipse, NetBeans, IntelliJ IDEA.

Frameworks: MVC, Struts, Hibernate, Spring.

Programming languages: C, C++, Java, Python, Ant scripts, Linux shell scripts

Databases: Oracle 11g/10g/9i, MySQL, DB2, MS SQL Server

Web Servers: WebLogic, WebSphere, Apache Tomcat

Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL, JAX-RS, RESTful, JAX-WS.

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP

Version Controls: CVS, SVN, GIT.

PROFESSIONAL EXPERIENCE

Confidential, Dublin, OH

Hadoop Developer

Responsibilities:

  • Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, HBase, Sqoop, Cassandra, ZooKeeper and AWS.
  • Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them using MapReduce programs.
  • Implemented MapReduce programs to retrieve Top-K results from an unstructured data set.
  • Migrated various Hive UDFs and queries into Spark SQL for faster processing as part of a POC implementation.
  • Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from HDFS to MySQL using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Worked in the AWS cloud environment with S3 storage and EC2 instances.
  • Developed fan-out workflows using Flume to ingest data from various sources such as web servers and REST APIs, landing the data in Hadoop via the HDFS sink.
  • Involved in migrating MongoDB from version 2.4 to 2.6, implementing new security features and designing more efficient groups.
  • Installed and configured Cassandra; good knowledge of the Cassandra architecture, read and write paths, and querying.
  • Implemented various ETL solutions per business requirements using Informatica.
  • Created ETL jobs to load JSON data and server data into MongoDB and transform MongoDB data into the data warehouse.
  • Involved in data modeling in Cassandra and MongoDB, and in choosing indexes and primary keys based on the client requirement.
  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS using Scala.
  • Used Spark for Parallel data processing and better performances.
  • Extensively used Pig for data cleansing and to extract data from web server output files and load it into HDFS.
  • Developed a data pipeline usingKafkaand Storm to store data into HDFS.
  • Implemented Kafka Java producers, created custom partitions, configured brokers and implemented high-level consumers to build the data platform (see the producer sketch after this list).
  • Implemented Storm topologies to preprocess data, implemented custom grouping to configure partitions.
  • Managed and reviewed Hadoop log files.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Responsible to manage data coming from different sources.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
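
A rough Java sketch of the Kafka producer work mentioned above is shown below; the broker address, topic name and key/value layout are assumptions, and the custom Partitioner illustrates one way records can be routed to specific partitions (Kafka 0.9+ producer API).

import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.Cluster;

public class EventProducer {

    /** Routes records to a partition by hashing the record key (null keys go to partition 0). */
    public static class KeyHashPartitioner implements Partitioner {
        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            int numPartitions = cluster.partitionsForTopic(topic).size();
            if (key == null) {
                return 0;
            }
            return Math.abs(key.hashCode() % numPartitions);
        }

        @Override public void close() {}

        @Override public void configure(Map<String, ?> configs) {}
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");  // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("partitioner.class", KeyHashPartitioner.class.getName());

        // Topic name and record layout are illustrative only.
        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("web-events", "user-42", "pageview,/home"));
        }
    }
}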

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java (JDK 1.6), SQL, Sqoop, Spark, Kafka, AWS, MongoDB, Storm, Cassandra, ETL, Informatica, Talend.

Confidential - Memphis, TN

Hadoop Developer

Responsibilities:

  • Installed and configured Cassandra; good knowledge of the Cassandra architecture, read and write paths, and querying using the Cassandra shell.
  • Worked on writing Map Reduce jobs to discover trends in data usage by customers.
  • Worked on and designed Big Data analytics platform for processing customer interface preferences and comments using Java, Hadoop, Hive and Pig.
  • Involved in Hive-HBase integration by creating Hive external tables with HBase as the storage format.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Experienced in defining job flows to run multiple Map Reduce and Pig jobs using Oozie.
  • Installed and configured Hive and wrote HiveQL scripts.
  • Experience with loading data into a relational database for reporting, dashboarding and ad-hoc analyses, which revealed ways to lower operating costs and offset the rising cost of programming.
  • Experience with creating ETL jobs to load JSON data and server data into MongoDB and transform MongoDB data into the data warehouse.
  • Involved in ETL code deployment and performance tuning of mappings in Informatica.
  • Created reports and dashboards using structured and unstructured data.
  • Experienced with performing analytics on Time Series data using HBase.
  • Implemented HBase coprocessors (observers) for event-based analysis.
  • Hands-on installation and configuration of CDH4 Hadoop cluster nodes on CentOS.
  • Implemented Hive generic UDFs to implement business logic.
  • Experienced with accessing Hive tables from Java applications through JDBC to perform analytics.
  • Experienced in running batch processes using Pig Scripts and developed Pig UDFs for data manipulation according to Business Requirements.
  • Experience with streaming workflow operations and Hadoop jobs using Oozie workflows, scheduled through AutoSys on a regular basis.
  • Applied the partitioning pattern in MapReduce to move records into different categories.
  • Developed Spark SQL scripts and was involved in converting Hive UDFs to Spark SQL UDFs.
  • Responsible for batch processing and real-time processing in HDFS and NoSQL databases.
  • Responsible for retrieval of data from Cassandra and ingestion into Pig.
  • Experience in customizing the MapReduce framework at various levels by writing custom InputFormats, RecordReaders, Partitioners and data types.
  • Experienced with multiple file formats in Hive, including Avro and SequenceFile.
  • Created and maintained technical documentation for launching Hadoop clusters and executing Pig scripts.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from PiggyBank and other sources (a minimal UDF sketch follows this list).
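
As an illustration of the Pig UDF work above, a minimal Java EvalFunc is sketched below; the class name and field semantics are assumptions, not the actual business logic.

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

/** Trims and upper-cases a chararray field; returns null for empty or bad input. */
public class NormalizeField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}

Packaged in a jar, it would be registered with REGISTER and invoked from a Pig Latin script like any PiggyBank function.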

Environment: Cassandra, MapReduce, Spark SQL, ETL, Pig scripts, Flume, Hadoop BI, Pig UDFs, Oozie, Avro, Hive, Java, Eclipse, ZooKeeper, Informatica.

Confidential, Indianapolis, IN

Hadoop Developer

Responsibilities:

  • Involved in the Complete Software development life cycle (SDLC) to develop the application.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, HBase, Sqoop, Cassandra and ZooKeeper.
  • Involved in loading data from the Linux file system to HDFS.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Implemented test scripts to support test driven development and continuous integration.
  • Developed multiple MapReduce jobs in Java for data cleaning (a minimal example appears after this list).
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Supported MapReduce programs running on the cluster.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Worked on tuning the performance of Pig queries.
  • Mentored analysts and the test team in writing Hive queries.
  • Installed the Oozie workflow engine to run multiple MapReduce jobs.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Worked on ZooKeeper for coordination between the master node and DataNodes.
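
A minimal sketch of the kind of map-only data-cleaning MapReduce job described above; the expected column count, delimiter and class names are assumptions.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CleanRecords {

    /** Drops malformed rows (wrong column count) and trims whitespace from each field. */
    public static class CleanMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        private static final int EXPECTED_COLUMNS = 5; // assumed schema width
        private final Text out = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",", -1);
            if (fields.length != EXPECTED_COLUMNS) {
                return; // skip malformed record
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) sb.append(',');
                sb.append(fields[i].trim());
            }
            out.set(sb.toString());
            context.write(out, NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "clean-records");
        job.setJarByClass(CleanRecords.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only cleaning job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}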

Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Linux, Java, Oozie, HBase, ZooKeeper.

Confidential, Orlando, FL

Java /J2EE Developer

Responsibilities:

  • Worked with business users to determine requirements and technical solutions.
  • Followed Agile methodology (Scrum Standups, Sprint Planning, Sprint Review, Sprint Showcase and Sprint Retrospective meetings).
  • Developed business components using core Java concepts and classes such as inheritance, polymorphism, collections, serialization and multithreading.
  • Used the Spring framework to handle application logic and to make calls to business objects configured as Spring beans.
  • Implemented and configured data sources and the session factory, and used HibernateTemplate to integrate Spring with Hibernate.
  • Developed web services to allow communication between applications through SOAP over HTTP with JMS and Mule ESB.
  • Actively involved in coding using core Java and collection APIs such as Lists, Sets and Maps.
  • Developed a web service (SOAP, WSDL) shared between the front end and the cable bill review system.
  • Implemented REST-based web services using JAX-RS annotations and the Jersey implementation for data retrieval with JSON (see the resource sketch after this list).
  • Developed Maven scripts to build and deploy the application onto WebLogic Application Server, ran UNIX shell scripts and implemented an auto-deployment process.
  • Used Maven as the build tool, scheduled/triggered by Jenkins.
  • Developed JUnit test cases for application unit testing.
  • Implemented Hibernate for data persistence and management.
  • Used SOAP UI tool for testing web services connectivity.
  • Used SVN as version control to check in the code, Created branches and tagged the code in SVN.
  • Used RESTful services to interact with the client by providing RESTful URL mappings.
  • Used the Log4j framework for application logging, tracking and debugging.
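
A minimal sketch of the JAX-RS/Jersey style of resource mentioned above; the path, resource name and JSON payload are assumptions, and the real service delegated to a business/service layer.

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

/** Returns account details as JSON over a RESTful URL mapping. */
@Path("/accounts")
public class AccountResource {

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Response getAccount(@PathParam("id") String id) {
        // In the real service this would call the business layer; here a stub payload is returned.
        String json = "{\"id\":\"" + id + "\",\"status\":\"ACTIVE\"}";
        return Response.ok(json).build();
    }
}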

Environment: JDK 1.6, Eclipse IDE, Core Java, J2EE, Spring, Hibernate, Unix, Web Services, SOAP UI, Maven, WebLogic Application Server, SQL Developer, Camel, JUnit, SVN, Agile, SONAR, Log4j, REST, JSON, JBPM.

Confidential

Java Developer

Responsibilities:

  • Involved in analysis, design and development of Expense Processing system.
  • Created user interfaces using JSP.
  • Developed the Web Interface using Servlets, Java Server Pages, HTML and CSS.
  • Developed the DAO objects using JDBC (a DAO sketch follows this list).
  • Developed business services using Servlets and Java.
  • Designed and developed user interfaces and menus using HTML5, JSP and JavaScript, with client-side and server-side validations.
  • Developed the GUI using JSP and the Struts framework.
  • Involved in developing the presentation layer using Spring MVC, AngularJS and jQuery.
  • Involved in designing the user interfaces using Struts Tiles Framework.
  • Used the Spring 2.0 framework for dependency injection and integrated it with the Struts framework and Hibernate.
  • Used Hibernate 3.0 in data access layer to access and update information in the database.
  • Experience in SOA (Service-Oriented Architecture), creating web services with SOAP and WSDL.
  • Developed JUnit test cases for all the developed modules.
  • Used Log4J to capture the log that includes runtime exceptions, monitored error logs and fixed the problems.
  • Used RESTful services to interact with the client by providing RESTful URL mappings.
  • Used CVS for version control across common source code used by developers.
  • Used ANT scripts to build the application and deployed it on WebLogic Application Server 10.0.
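
A minimal sketch of a JDBC DAO of the kind referenced above; the connection URL, credentials, table and column names are placeholders, not the actual expense schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Looks up an expense amount by id using plain JDBC. */
public class ExpenseDao {
    private static final String URL = "jdbc:oracle:thin:@dbhost:1521:ORCL"; // assumed DB URL
    private static final String USER = "app";
    private static final String PASSWORD = "secret";

    public Double findAmountById(long expenseId) throws SQLException {
        String sql = "SELECT amount FROM expenses WHERE expense_id = ?";
        try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, expenseId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getDouble("amount") : null;
            }
        }
    }
}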

Environment: Struts 1.2, Hibernate 3.0, Spring 2.5, JSP, Servlets, XML, SOAP, WSDL, JDBC, JavaScript, HTML, CVS, Log4j, JUnit, WebLogic App Server, Eclipse, Oracle, RESTful.
