
Hadoop Developer Resume


Westlake, TX

SUMMARY:

  • 6+ years of extensive IT experience as a Hadoop Developer.
  • Experience in installation, management and monitoring of Hadoop cluster using Cloudera Manager.
  • Experience using Apache Avro to provide both a serialization format for persistent data and a wire format for communication between Hadoop nodes.
  • Good understanding of NoSQL databases like MongoDB, Cassandra, and HBase.
  • Comprehensive experience in Big Data processing using the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce (MRv1 and YARN), Sqoop, Flume, Kafka, Oozie, Zookeeper, Spark and Impala.
  • Experienced in Hadoop ecosystem components such as Hadoop MapReduce, Cloudera, Hortonworks, HBase, Oozie, Hive, Sqoop, Pig, Tez, Flume, Kafka, Spark SQL, Storm, Spark, Scala, MongoDB, Couchbase and Cassandra.
  • Experience with Amazon Web Services (AWS), a suite of cloud-computing services that makes up an on-demand computing platform.
  • Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java (a minimal MapReduce sketch follows this summary).
  • Agile coaching of 6 different development teams with 6-10 members each; strategized and managed implementation of a new test automation system while maintaining the quarterly release schedule.
  • Good exposure to Big Data technologies and the Hadoop ecosystem; in-depth understanding of MapReduce and the Hadoop infrastructure.
  • Experience evaluating the relationship between Amazon Redshift and other big data systems on AWS.
  • Hands-on experience gathering information from different nodes into a Greenplum database and then performing Sqoop incremental loads into HDFS.
  • Recently started using Mahout for machine learning to identify more subtle classifiers.
  • Knowledge on Data Mining and Analysis including Regression models, Decision Trees, Association rule mining, customer segmentation, Hypothesis Testing and proficient in R Language (including R packages) and SAS.
  • Experienced in configuring Flume to stream data into HDFS.
  • Experienced in real-time Big Data solutions using HBase, handling billions of records.
  • Extensive experience in working with structured/semi-structured and Unstructured data by implementing complex MapReduce programs using design patterns.
  • Experience in designing and developing POCs using Scala, deploying them on the YARN cluster, and comparing the performance of Spark with Hive and SQL/Teradata.
  • Experience in developing and designing Web Services (SOAP and Restful Web services).
  • Extensive experience in writing SQL queries for Oracle, Hadoop and DB2 databases using SQLPLUS.
  • Experience working with Waterfall and Agile methodologies; analyzed and synthesized results from Joint Application Development (JAD) sessions.
  • Hands on experience in working with Oracle, DB2, MySQL and knowledge on SQL Server.
  • Hands on experience in using SQL and PL/SQL to write Stored Procedures, Functions and Triggers.
  • Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
  • Well-versed with multiple operating systems such as Windows, DOS, UNIX, Linux and Sun Solaris.
  • Provided direction on trouble resolution for ISP interfaces and customer interfaces. Provided technical support for WAN problems (T-1/ISDN/Frame Relay/HDLC). Excellent command of change management and coordinating deployments in distributed environments.
  • Hands-on experience with RAID using volume management software such as Logical Volume Manager, Veritas Volume Manager and Solaris Volume Manager.
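
As referenced in the summary above, the following is a minimal sketch of a custom MapReduce program in Java. The class names and the tab-separated column layout are hypothetical; the sketch only illustrates the general structure of the data-analysis jobs described, counting records per event type.

// Minimal sketch of a custom MapReduce job in Java; class names and the
// assumed tab-separated input layout are hypothetical.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class EventCountJob {

    // Mapper: emits (eventType, 1) for each record, assuming the event type
    // is the second tab-separated column of the input line.
    public static class EventMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text eventType = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 1) {
                eventType.set(fields[1]);
                context.write(eventType, ONE);
            }
        }
    }

    // Reducer: sums the counts emitted for each event type.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "event-count");
        job.setJarByClass(EventCountJob.class);
        job.setMapperClass(EventMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

A job like this would be packaged into a jar and submitted with the hadoop jar command, with the HDFS input and output paths passed as arguments.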

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Cassandra, Impala, Oozie, ZooKeeper, Redshift, Hortonworks, Greenplum, Amazon Web Services, EMR, MRUnit, Spark, Storm, Avro, RDBMS

Java & J2EE Technologies: Core Java, JDBC, Servlets, JSP, JNDI, Struts, Spring, Hibernate and Web Services (SOAP and Restful)

IDEs: Eclipse, NetBeans, MyEclipse, IntelliJ

Frameworks: MVC, Struts, Hibernate, Spring, MS Visual Studio

Programming languages: C, C++, Java, Python, Ant scripts, Linux shell scripts, R

Databases: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server, MongoDB, Graph DB

Web Servers: WebLogic, WebSphere, AOLserver, Apache Tomcat, Apache HTTP Server.

Web Technologies: HTML, XML, JavaScript, AJAX, RESTful WS, PHP, XHTML, WordPress

Network Protocols: ARP, UDP, HTTP, BGP, DNS, RPC and ICMP

ETL Tools: Informatica, QlikView, Microsoft SQL and Cognos

WORK EXPERIENCE:

Confidential, Westlake, TX

Hadoop Developer

Responsibilities:

  • Migrated HDFS data to an edge node on AWS (Amazon Web Services) and set up a Cloudera Impala environment in the cloud.
  • Responsible for building scalable distributed data solutions using a Hadoop cluster environment with the Hortonworks distribution on data nodes.
  • Developed custom aggregate functions using SparkSQL and performed interactive querying.
  • Created Data Marts and loaded the data using Informatica Tool.
  • Installed, configured and maintained Hadoop in a multi-cluster environment on virtual systems and worked with MapReduce, HBase, Hive, Pig, Pig Latin, Sqoop, Spark, Scala, Flume, ZooKeeper, etc.
  • Wrote MapReduce code to process data from social feeds, which arrived in various formats such as JSON, TSV and CSV, and to load it into the database.
  • Deployed AWS (Amazon Web Services) Big Data solutions using Redshift, MongoDB, Apache Hadoop, Spark and Cassandra.
  • Built and maintained standard operational procedures for all needed Greenplum implementations.
  • Developed Scala scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark for data aggregation, queries, and writing data back into RDBMS through Sqoop.
  • Installed, configured, monitored and maintained the Cloudera distribution on Red Hat Enterprise Linux.
  • Involved in Hadoop cluster tasks such as adding and removing nodes without any effect on running jobs and data.
  • Worked on the Hortonworks Big Data platform. Used Kafka as a messaging system to get data from different sources.
  • Worked on evaluation and analysis of the Hadoop cluster and different big data analytic tools including Pig, the HBase database and Sqoop.
  • Importing and exporting data into HDFS and Hive from different RDBMS using Sqoop.
  • Setup POC Hadoop cluster on Amazon EC2(AWS).
  • Developing MapReduce Program to transform/process data.
  • Installed and configured Flume, Hive, Pig, Sqoop, HBase on the Hadoop cluster.
  • Installed and configured Hive and wrote Hive UDFs (a minimal UDF sketch follows this list).
  • Wrote shell scripts to automate the data flow from the local file system to HDFS and then to NoSQL databases (HBase, MongoDB), and vice versa.
  • Loaded Hive data into a Greenplum database using the gpload utility for performing real-time aggregation.
  • Used Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Created proofs of concept for data science (Spark SQL, Spark Streaming, Spark ML) utilizing the Big Data ecosystem and AWS (Amazon Web Services) cloud computing (EMR).
  • Configured authentication and authorization with Kerberos, Centrify and LDAP.
  • Continuous monitoring and managing the Hadoop cluster using Cloudera Manager.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs.
  • Load log data into HDFS using Flume. Worked extensively in creating MapReduce jobs to power data for search and aggregation.
  • Implemented Spark using Scala and SparkSQL for faster testing and processing of data.
  • Worked on continuous scheduling of Informatica workflows.
  • Carried out Linux OS administration for Ubuntu version 14.04 and Red Hat 6.5.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Implemented Spark to migrate MapReduce jobs into Spark RDD transformations, streaming data using Apache Kafka and Spark Streaming.
  • Developed PIG Latin scripts to extract the data from the local file system output files to load into HDFS.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
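
As noted in the Hive bullet above, part of this work involved writing Hive UDFs. Below is a minimal sketch of such a UDF in Java; the class name and behavior (upper-casing a string column) are hypothetical and assume the classic org.apache.hadoop.hive.ql.exec.UDF API.

// Minimal sketch of a Hive UDF in Java; the class name and behavior are
// hypothetical, assuming the classic UDF API.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class UpperCaseUDF extends UDF {
    // Returns the input string upper-cased, passing nulls through.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().toUpperCase());
    }
}

After packaging into a jar, a UDF like this would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL.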

Confidential, Lakewood, NJ

Hadoop Developer

Responsibilities:

  • Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats.
  • Created a complete processing engine based on the Hortonworks distribution, enhanced for performance.
  • Worked on importing and exporting data from RDBMS into HDFS, HIVE and HBase using Sqoop.
  • Responsible for creating workflows and sessions using Informatica Workflow Manager and monitoring workflow runs and statistics in Informatica Workflow Monitor.
  • Developed Sqoop commands to pull the data from Teradata, Oracle and export into Greenplum.
  • Sent data extracts to SAS for analytics purposes.
  • Responsible for installation and configuration of Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Developed workflows using Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Experience in writing Pig Latin scripts to sort, group, join and filter the data as part of data transformation as per the business requirements.
  • Implemented scripts to transmit sysprin information from Oracle to HDFS using Sqoop.
  • Worked on partitioning the HIVE table and running the scripts in parallel to reduce the run time of the scripts.
  • Designed and deployed scalable, highly available and fault-tolerant systems on Amazon AWS across different regions/zones.
  • Load log data into HDFS using Kafka.
  • Stored MapReduce program output in Amazon storage and developed a script to move the data to Redshift for generating a dashboard using QlikView.
  • Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  • Used Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Implemented pushing the data from Hadoop to Greenplum.
  • Developed analytical components using Scala, Spark and Spark Streaming.
  • Extracted files from MySQL/DB2 through Sqoop and placed in HDFS and processed.
  • Analyzed the data by performing Hive queries and running Pig scripts to study the data.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources (a minimal Pig UDF sketch follows this list).
  • Implemented and delivered AWS (Amazon Web Services) infrastructure projects into operational delivery.
  • Continuously monitored and managed the Hadoop cluster using Ganglia.
  • Implemented real-time analytics with Apache Kafka and Storm.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Involved in writing queries in Spark SQL using Scala.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Developed Sqoop scripts to handle change data capture for processing incremental records between new arrived and existing data in RDBMS tables.
  • Supported setting up the QA environment and updating configuration for implementing scripts with Pig and Sqoop.
  • Extracted data from DB2 and Oracle source systems and loaded into Flat files using Informatica.
  • Deployed custom configured cluster on Amazon AWS.
  • Wrote MapReduce jobs using Pig Latin to move data from MySQL to HDFS on a regular basis.
  • Implemented testing scripts to support test driven development and continuous integration.
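
As noted in the Pig UDF bullet above, the following is a minimal sketch of a Pig EvalFunc UDF in Java; the class name and behavior (trimming and lower-casing a chararray field) are hypothetical.

// Minimal sketch of a Pig EvalFunc UDF in Java; the class name and the
// normalization behavior are hypothetical.
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class NormalizeField extends EvalFunc<String> {
    // Trims and lower-cases the first field of the input tuple.
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return ((String) input.get(0)).trim().toLowerCase();
    }
}

A UDF like this would be registered in a Pig Latin script with REGISTER and invoked inside a FOREACH ... GENERATE statement.
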
Confidential, Little Rock, AR

Java developer

Responsibilities:

  • Assisted the analysis team in performing the feasibility analysis of the project.
  • Designed Use Case diagrams, Class diagrams and Sequence diagrams and Object Diagrams in the detailed design phase of the project using Rational Rose 4.0.
  • Involved in the Analysis, Design, Implementation and Testing of Software Development Life Cycle (SDLC) of the project.
  • Developed the presentation layer of the project using HTML, JSP 2.0, JSTL and JavaScript technologies. Experienced in developing web-based applications using Python, Django, PHP, C++, XML, CSS, HTML, DHTML, JavaScript and jQuery.
  • Developed the complete business tier using stateless and stateful session beans with EJB 2.0 standards using WebSphere Studio Application Developer (WSAD 5.0).
  • Used various J2EE design patterns, like DTO, DAO, and Business Delegate, Service Locator, Session Facade, Singleton and Factory patterns.
  • Used Linux OS to convert the existing application to Windows.
  • Consumed Web Service for transferring data between different applications.
  • Integrated Spring DAO for data access using Hibernate.
  • Written complex SQL queries, stored procedures, functions and triggers in PL/SQL.
  • Configured and used Log4J for logging all the debugging and error information.
  • Developed Ant build scripts for compiling and building the project.
  • Used IBM WebSphere Portal and IBM WebSphere Application Server for deploying the applications.
  • Used CVS Repository for Version Control.
  • Created test plans and JUnit test cases and test suite for testing the application.
  • Good hands-on knowledge of UNIX commands, used to view the log files on the server.
  • Assisted in Developing testing plans and procedures for unit test, system test, and acceptance test.
  • Unit test case preparation and Unit testing as part of the development.
  • Used Log4J components for logging. Performed daily monitoring of log files and resolved issues.
  • Created Hibernate mapping files to map POJOs to DB tables (a minimal entity/DAO sketch follows this list).
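
As noted in the Hibernate bullet above, a minimal sketch of a mapped POJO and a Spring-style DAO is shown below. The entity, table and DAO names are hypothetical, and JPA annotations stand in here for the XML mapping files used on the project.

// Minimal sketch of a Hibernate-mapped POJO and a DAO; names are hypothetical
// and annotations stand in for the project's XML mapping files.
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

import org.hibernate.SessionFactory;

@Entity
@Table(name = "CUSTOMER")
public class Customer {
    @Id
    private Long id;
    private String name;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// Simple DAO wired with a Hibernate SessionFactory; assumes the caller (for
// example, Spring transaction support) manages the transaction boundaries.
class CustomerDao {
    private final SessionFactory sessionFactory;

    CustomerDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public Customer findById(Long id) {
        return (Customer) sessionFactory.getCurrentSession().get(Customer.class, id);
    }
}
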
Confidential, Charlotte, NC

Java Developer

Responsibilities:

  • Used Ajax for intensive user operations and client-side validations.
  • Developed the code using object oriented programming concepts.
  • Developed application service components and configured beans using Spring IoC, creation of Hibernate mapping files and generation of database schema.
  • Used Web Services for creating rate summary and used WSDL and SOAP messages for getting insurance plans from different module and used XML parsers for data retrieval.
  • Used JUnit for testing the web application.
  • Used JAXM for making distributed software applications communicate via SOAP and XML.
  • Used DB2 as the backend database. Experienced in MVW frameworks such as Django, Angular.js, JavaScript, jQuery and Node.js. Expert knowledge of and experience in object-oriented design and programming concepts.
  • Used SQL statements and procedures to fetch the data from the DB2 database.
  • Involved in writing the Spring configuration XML file that contains bean declarations; business classes are wired up to the front-end managed beans using the Spring IoC pattern.
  • Involved in creating various Data Access Objects (DAO) for addition, modification and deletion of records using various specification files.
  • Developed Ant Scripts for the build process and deployed in IBM WebSphere.
  • Implemented Log4J for logging errors, debugging and tracking using logger and appender components.
  • Developed the user interface using JSP, HTML, XHTML and JavaScript to simplify the complexities of the application.
  • Implemented Business processes such as user authentication, Transfer of Service using Session EJB's.
  • Involved in the Bug fixing of various applications reported by the testing teams in the application during the integration and used Bugzilla for the bug tracking.
  • Used Tortoise CVS as version control across common source code used by developers.
  • Deployed the applications on IBM Web Sphere Application Server.
  • Primarily responsible for design and development of Spring-based applications.
  • Involved in configuring JDBC connection pooling to access the oracle database.
  • Used JUnit for unit testing the application (a minimal test sketch follows this list). Developed and maintained Ant scripts.
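
As noted in the JUnit bullet above, a minimal sketch of a JUnit 4 test case is shown below; the class under test and the business rule are hypothetical and only illustrate the unit-testing approach used.

// Minimal sketch of a JUnit 4 test; the helper method stands in for
// hypothetical application business logic.
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class PremiumCalculatorTest {

    // Hypothetical business rule: a surcharge is added to the base premium.
    private int addSurcharge(int basePremium, int surcharge) {
        return basePremium + surcharge;
    }

    @Test
    public void surchargeIsAddedToBasePremium() {
        assertEquals(1100, addSurcharge(1000, 100));
    }
}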
