Sr. Hadoop Developer Resume
Plano, TX
SUMMARY
- 10+ years of overall IT experience, including work as a Hadoop Developer dealing with Apache Hadoop components such as HDFS, MapReduce, HiveQL, HBase, Pig, Hive, Sqoop, Oozie, Spark and Scala, and 6 years as a Java Developer using Java and Object-Oriented methodologies for a wide range of development, from enterprise applications to web-based applications.
- Experience in configuring, installing, benchmarking and managing Apache Hadoop and the Cloudera Hadoop distribution.
- Experience in deploying scalable Hadoop clusters on cloud environments like Amazon AWS and Rackspace, using Amazon S3 and S3N as the underlying file system for Hadoop.
- Experience in design and implementation of secure Hadoop clusters using Kerberos.
- Experience in managing the cluster resources by implementing fair scheduler and capacity scheduler.
- Experience in implementing Hadoop as a highly available service.
- Experience in upgrading Hadoop clusters across major versions.
- Experience in using Zookeeper for coordinating the distributed applications.
- Experience in deploying and managing the Hadoop cluster using Cloudera Manager.
- Experience in developing MapReduce programs and custom UDFs for data processing using Python.
- Experience in developing Scala scripts to run on Spark clusters.
- Hands on experience in installing, configuring, and using Hadoop ecosystem components like Hadoop Map Reduce, HDFS, HBase, Hive, Sqoop, Pig, Zookeeper and Flume.
- Good knowledge of Hadoop cluster architecture and monitoring the cluster.
- In-depth understanding of Data Structure and Algorithms.
- Experience in managing and troubleshooting Hadoop related issues.
- Expertise in setting up standards and processes for Hadoop based application design and implementation.
- Importing and exporting data using Sqoop from Relational Database Systems to HDFS and vice-versa.
- Experience in Object Oriented Analysis and Design (OOAD) and development of software using UML methodology; good knowledge of J2EE design patterns and Core Java design patterns.
- Expertise in various JAVA/J2EE technologies such as JSP 2.0, Servlets 2.x, Struts 1.2/2.0, Hibernate 2.0/3.0 ORM, Spring 2.0/3.0, JDBC.
- Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
- Possess excellent communication and analytical skills along with a can-do attitude.
- Experience in developing web applications with various open-source frameworks: Spring and Hibernate 2.0/3.0 ORM.
- Proficient in Core Java, J2EE, JDBC, Servlets, JSP, Exception Handling, Multithreading, EJB, XML, HTML5, CSS3, JavaScript, AngularJS.
- Used source debuggers and visual development environments.
- Experience in Testing and documenting software for client applications.
- Writing code to create single-threaded, multi-threaded or user-interface event-driven applications, either stand-alone or those which access servers or services.
- Good experience in object-oriented programming (OOP) concepts.
- Good experience in using data modelling techniques to find the results based on SQL and PL/SQL queries.
- Good working knowledge of the Spring Framework.
- Strong Experience in writing SQL queries.
- Experience working with different databases, such as Oracle, SQL Server and MySQL, and writing stored procedures, functions, joins, and triggers for different data models.
- Expertise in implementing Service-Oriented Architectures (SOA) with XML-based web services (SOAP/REST).
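As an illustration of the Python MapReduce work noted above, the map / shuffle / reduce phases can be sketched as a single-process word count (a simulation only; the sample input is hypothetical and no cluster is involved):

```python
# Illustrative single-process simulation of MapReduce word count:
# map emits (word, 1) pairs, shuffle groups pairs by key, reduce sums
# each group. This mirrors what a Hadoop Streaming mapper/reducer pair
# does, without a cluster; it is a sketch, not production code.
from collections import defaultdict

def map_phase(lines):
    # Map: emit (word, 1) for every word in every input line.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Shuffle: group all values under their key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: sum the values for each key.
    return {key: sum(values) for key, values in grouped.items()}

lines = ["Hadoop HDFS Hive", "hive pig hadoop"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["hadoop"])  # 2
print(counts["hive"])    # 2
```

In a real Hadoop Streaming job the map and reduce functions would run as separate processes reading stdin, with the framework performing the shuffle.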
TECHNICAL SKILLS
Big Data Technologies: Hadoop, HDFS, Hive, MapReduce, Pig, Sqoop, Flume, Oozie, HBase, Spark
Programming Languages: Java (5, 6, 7), Python, Scala
Databases/RDBMS: MySQL, SQL/PL-SQL, MS-SQL Server 2005, Oracle 9i/10g/11g
Scripting/ Web Languages: JavaScript, HTML5, CSS3, XML, SQL, Shell
NoSQL/Search: Cassandra, HBase, Elasticsearch
Operating Systems: Linux, Windows XP/7/8
Software Life Cycles: SDLC, Waterfall and Agile models
Office Tools: MS Office, MS Project, Risk Analysis tools, Visio
Utilities/Tools: Eclipse, Tomcat, NetBeans, JUnit, SQL, SVN, Log4j, SOAP UI, ANT, Maven, Automation and MRUnit
Cloud Platforms: Amazon EC2
PROFESSIONAL EXPERIENCE
Confidential, Plano, TX
Sr. Hadoop Developer
Responsibilities:
- Worked on Big Data Hadoop cluster implementation and data integration in developing large-scale system software.
- Developed Storm topologies to ingest data from various sources into the Hadoop data lake.
- Configured ActiveMQ for the enterprise and resolved ActiveMQ issues.
- Developed web application using HBase and Hive API to compare schema between HBase and Hive tables.
- Used JVM monitor to monitor threads and memory usage of HBase and Hive schema check web application.
- Developed code to generate Hive DDLs from source DDLs.
- Created HBase tables using Sqoop from the relational database Oracle.
- Developed a Python script to import JSON data into a MySQL database.
- Developed a Python script to import data from SQL Server into HDFS, and created Hive views on the data in HDFS using Spark.
- Created scripts to append data from a temporary HBase table to a target HBase table in Spark.
- Worked on NoSQL databases such as HBase; also used Spark for real-time streaming of data into the cluster.
- Written Spark programs in Scala and ran Spark jobs on YARN.
- Developed complex and Multi-step data pipeline using Spark.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Worked on Big Data integration and analytics based on Hadoop, SOLR, Spark, Kafka, Storm and webMethods technologies.
- Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
- Monitored YARN applications; troubleshot and resolved cluster-related system problems.
- Upgraded the Hadoop cluster from CDH3 to CDH4, set up a high-availability cluster and integrated Hive with existing applications.
- Assessed existing EDW (enterprise data warehouse) technologies and methods to ensure our EDW/BI architecture meets the needs of the business and enterprise and allows for business growth.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Captured data from existing databases that provide MySQL interfaces using Sqoop.
- Worked extensively with Sqoop for importing and exporting data between HDFS and relational database systems/mainframes.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Developed MapReduce pipeline jobs to process the data and create the necessary HFiles.
- Involved in loading the created HFiles into HBase for faster access to a large customer base without taking a performance hit.
- Collected and aggregated large amounts of log data using Apache Flume, staging the data in HDFS for further analysis.
- Involved in creating Pig tables, loading them with data and writing Pig Latin queries which run internally in a MapReduce fashion.
- Involved in writing Unix/Linux shell scripts for scheduling jobs and for writing Pig scripts and HiveQL.
- Developed scripts and automated data management from end to end, including sync-up between all the clusters.
- Involved in creating Hive tables, loading them with data and writing Hive queries which invoke and run MapReduce jobs in the backend.
- Assisted in performing unit testing of MapReduce jobs using MRUnit.
- Assisted in exporting data into Cassandra and writing column families to provide fast listing outputs.
- Used the Oozie scheduler to automate the pipeline workflow and orchestrate the MapReduce extraction jobs.
- Used Zookeeper for providing coordination services to the cluster.
- Worked with the Hue GUI for easy job scheduling, file browsing, job browsing and metastore management.
Environment: Apache Hadoop, HDFS, Hive, Java, Sqoop, Spark, Cloudera CDH4, Oracle, MySQL, Tableau, Talend, Elastic search, Kibana, SFTP.
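The Hive-DDL-generation work above can be illustrated with a minimal Python sketch. The table name, column list, and type mapping below are hypothetical examples, not the actual production mapping:

```python
# Hypothetical sketch: generate a Hive DDL from source (MySQL-style)
# column definitions. The TYPE_MAP is illustrative and incomplete;
# a real converter would cover many more source types.

TYPE_MAP = {
    "varchar": "STRING",
    "text": "STRING",
    "int": "INT",
    "bigint": "BIGINT",
    "datetime": "TIMESTAMP",
    "decimal": "DOUBLE",
}

def to_hive_type(src_type: str) -> str:
    # Strip a length suffix like varchar(255) before the lookup.
    base = src_type.split("(")[0].strip().lower()
    return TYPE_MAP.get(base, "STRING")  # default to STRING when unknown

def generate_hive_ddl(table, columns):
    # columns is a list of (name, source_type) pairs.
    cols = ",\n".join(f"  {name} {to_hive_type(t)}" for name, t in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n{cols}\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'\n"
        "STORED AS TEXTFILE;"
    )

print(generate_hive_ddl("customers", [("id", "bigint"), ("name", "varchar(255)")]))
```

The generated statement can then be executed via `hive -e` or the HiveServer2 JDBC interface.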
Confidential, Kansas City, MO
Software Developer - Big Data
Responsibilities:
- Worked with the BI team on Big Data Hadoop cluster implementation and data integration while developing large-scale system software.
- Installed/configured/maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Flume, Oozie, Zookeeper and Sqoop.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and Hadoop log files.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Extensively involved in installation and configuration of the Cloudera distribution of Hadoop: NameNode, Secondary NameNode, JobTracker, TaskTrackers and DataNodes.
- Created a POC to store server log data in MongoDB to identify system alert metrics.
- Implemented a Hadoop framework to capture user navigation across the application to validate the user interface and provide analytic feedback.
- Monitored Hadoop cluster job performance, performed capacity planning and managed nodes on Hadoop cluster.
- Used Zookeeper operational services for coordinating cluster and scheduling workflows.
- Proficient in using Cloudera Manager, an end-to-end tool to manage Hadoop operations.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Performed analysis on the unused user navigation data by loading it into HDFS and writing MapReduce jobs. The analysis provided inputs to the new APM front-end developers and the Lucent team.
- Wrote MapReduce jobs using Java API and Pig Latin.
- Loaded the data from Teradata to HDFS using Teradata Hadoop connectors.
- Used Flume to collect, aggregate and store the web log data onto HDFS.
- Wrote Pig scripts to run ETL jobs on the data in HDFS and perform further testing.
- Used Hive to do analysis on the data and identify different correlations.
- Involved in HDFS maintenance and administering it through the Hadoop Java API.
- Imported data using Sqoop to load data from MySQL to HDFS and Hive on a regular basis.
- Wrote Hive queries for data analysis to meet the business requirements.
- Automated all the jobs for pulling data from an FTP server and loading it into Hive tables using Oozie workflows.
- Involved in creating Hive tables, working on them using HiveQL and performing data analysis using Hive and Pig.
- Supported MapReduce programs running on the cluster.
- Maintained and monitored clusters.
- Used Qlikview and D3 for visualization of queries required by the BI team.
- Defined UDFs using Pig and Hive in order to capture customer behavior.
- Designed and implemented MapReduce jobs to support distributed processing using Java, Hive and Apache Pig.
- Created Hive external tables on the MapReduce output before partitioning and bucketing were applied to it.
- Maintained data-importing scripts using Hive and MapReduce jobs.
- Orchestrated hundreds of Sqoop scripts, Pig scripts and Hive queries using Oozie workflows and sub-workflows.
- Loaded the load-ready files from mainframes to Hadoop, converting the files to ASCII format.
- Configured HiveServer2 (HS2) to enable analytical tools like Tableau, Qlikview and SAS to interact with Hive tables.
Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Cloudera Manager, Pig, Sqoop, ZooKeeper, Teradata, PL/SQL, MySQL, Windows, Hbase.
Confidential, Costa Mesa, CA
Hadoop Developer
Responsibilities:
- Involved in installation and configuration of JDK, Hadoop, Pig, Sqoop, Hive and HBase on a Linux environment. Assisted with performance tuning and monitoring.
- Worked on creating MapReduce programs to parse the data for claim report generation and running the JARs in Hadoop. Coordinated with the Java team in creating MapReduce programs.
- Worked on creating Pig scripts for most modules to give comparative effort estimates for code development.
- Created reports for the BI team, using Sqoop to export data into HDFS and Hive.
- Collaborated with BI teams to ensure data quality and availability with live visualization.
- Created Hive queries to process large sets of structured, semi-structured and unstructured data and store them in managed and external tables.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Involved in creating Hive tables, and loading and analyzing data using Hive queries.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Involved in running Hadoop jobs for processing millions of records of text data.
- Developed the application by using the Struts framework.
- Created connections through JDBC and used JDBC statements to call stored procedures.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Developed the Pig UDFs to pre-process the data for analysis.
- Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing.
- Moved all RDBMS data, in flat files generated from various channels, to HDFS for further processing.
- Developed job workflows in Oozie to automate the tasks of loading the data into HDFS.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from Teradata into HDFS using Sqoop.
- Wrote the script files for processing data and loading it to HDFS.
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Java (jdk1.7), Flat files, Oracle 11g/10g, PL/SQL, SQL*PLUS, Windows NT, Sqoop.
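The web-server-log extraction described above can be illustrated with a small, hypothetical Python sketch that parses Apache-style access-log lines and aggregates hits per HTTP status code, roughly what a Pig script with a regex loader and a GROUP BY would do. The log format and sample lines are assumptions:

```python
# Hypothetical sketch: parse Apache combined-log-style lines and count
# requests per HTTP status code before loading the summary into HDFS.
import re
from collections import Counter

# Matches: client IP, timestamp, request, status, bytes (simplified pattern).
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<req>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def status_counts(lines):
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m:  # skip malformed lines, as a Pig FILTER would
            counts[m.group("status")] += 1
    return counts

logs = [
    '10.0.0.1 - - [01/Jan/2015:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2015:10:00:01 +0000] "GET /missing HTTP/1.1" 404 128',
    '10.0.0.1 - - [01/Jan/2015:10:00:02 +0000] "POST /api HTTP/1.1" 200 64',
]
print(status_counts(logs))  # Counter({'200': 2, '404': 1})
```

In a cluster setting the same parse step would run as the map side of a MapReduce job or inside a Pig UDF, with the counts written back to HDFS.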
Confidential, Herndon VA
Hadoop Developer
Responsibilities:
- Developed solutions to process data into HDFS.
- Analyzed the data using MapReduce, Pig and Hive and produced summary results from Hadoop for downstream systems.
- Used Pig as an ETL tool to do transformations, event joins and some pre-aggregations before storing the data onto HDFS.
- Developed a data pipeline using Flume, Sqoop and Pig to extract the data from weblogs and store it in HDFS.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Created Hive tables and was involved in data loading and writing Hive UDFs.
- Exported the analyzed data to the relational database MySQL using Sqoop for visualization and to generate reports.
- Created HBase tables to load large sets of structured data.
- Managed and reviewed Hadoop log files.
- Involved in providing inputs for estimate preparation for teh new proposal.
- Worked extensively with Hive DDLs and Hive Query Language (HQL).
- Developed UDF, UDAF and UDTF functions and implemented them in Hive queries.
- Implemented Sqoop for large dataset transfers between Hadoop and RDBMSs.
- Created MapReduce jobs to convert the periodic XML messages into partitioned Avro data.
- Used Sqoop widely in order to import data from various systems/sources (like MySQL) into HDFS.
- Created components like Hive UDFs for missing functionality in Hive for analytics.
- Developed scripts and batch jobs to schedule a bundle (group of coordinators) consisting of various coordinator jobs.
- Used different file formats like text files, Sequence Files and Avro.
- Provided cluster coordination services through Zookeeper.
- Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Assisted in cluster maintenance, cluster monitoring, adding and removing cluster nodes, and troubleshooting.
- Installed and configured Hadoop, MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
Environment: Hadoop, HDFS, Map Reduce, Hive, Pig, Sqoop, HBase, Shell Scripting, Oozie, Oracle 11g.
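One common way to express the Hive UDF-style logic above in Python is Hive's TRANSFORM/streaming interface, where a script reads tab-separated rows on stdin and emits transformed rows on stdout. The sketch below is illustrative only; the column layout (user_id, amount) and the threshold are hypothetical, not the actual schema:

```python
# Illustrative Hive streaming "UDF": reads tab-separated (user_id, amount)
# rows, normalizes the amount to two decimal places, and tags rows over a
# threshold. Hive would invoke such a script via something like:
#   SELECT TRANSFORM(user_id, amount) USING 'python udf.py'
#          AS (user_id, amount, flag) FROM t;
import io
import sys

THRESHOLD = 100.0  # hypothetical business threshold

def transform_row(line):
    user_id, amount = line.rstrip("\n").split("\t")
    value = float(amount)
    flag = "HIGH" if value > THRESHOLD else "NORMAL"
    return f"{user_id}\t{value:.2f}\t{flag}"

def run(stdin=sys.stdin, stdout=sys.stdout):
    for line in stdin:
        if line.strip():  # skip blank lines
            print(transform_row(line), file=stdout)

# Demo with an in-memory stream instead of a live Hive job:
src = io.StringIO("u1\t150.5\nu2\t12\n")
out = io.StringIO()
run(src, out)
print(out.getvalue(), end="")
```

When deployed, the script is shipped to the cluster with `ADD FILE` and Hive handles the serialization of rows to and from the process.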
Confidential, Norristown, PA
Java Developer
Responsibilities:
- Excellent Java/J2EE application development skills with strong experience in Object Oriented Analysis; extensively involved throughout the Software Development Life Cycle (SDLC).
- Implemented various J2EE standards and an MVC framework involving the usage of Struts, JSP, AJAX and servlets for UI design.
- Used SOAP/REST for the data exchange between the backend and user interface.
- Utilized Java and MySQL day to day to debug and fix issues with client processes.
- Developed, tested, and implemented a financial-services application to bring multiple clients into a standard database format.
- Assisted in designing, building, and maintaining a database to analyze the life cycle of checking and debit transactions.
- Created web service components using SOAP, XML and WSDL to receive XML messages and apply business logic.
- Involved in configuring WebSphere variables, queues, data sources and servers, and deploying EARs onto servers.
- Involved in developing the business logic using Plain Old Java Objects (POJOs) and Session EJBs.
- Developed authentication through LDAP using JNDI.
- Developed and debugged the application using the Eclipse IDE.
- Involved in Hibernate mappings, configuration property set-up, creating sessions, transactions and second-level cache set-up.
- Involved in backing up the database, creating dump files and creating DB schemas from dump files. Wrote and executed developer test cases, and prepared the corresponding scope and traceability matrix.
- Implemented JUnit and JAD for debugging and to develop test cases for all the modules.
- Hands-on experience with Sun ONE Application Server, WebLogic Application Server, WebSphere Application Server, WebSphere Portal Server, and J2EE application deployment technology.
Environment: Java multithreading, JDBC, Hibernate, Struts, Collections, Maven, Subversion, JUnit, SQL, JSP, SOAP, Servlets, Spring, Oracle, XML, Putty and Eclipse.
Confidential
Java Developer
Responsibilities:
- Involved in analysis and design phase of Software Development Life cycle (SDLC).
- Used JMS to pass messages as payload to track statuses, milestones and states in the workflows.
- Involved in reading and generating PDF documents using iText, and also merging PDFs dynamically.
- Involved in the software development life cycle: coding, testing, and implementation.
- Worked in the health-care domain.
- Involved in using Java Message Service (JMS) for loosely coupled, reliable and asynchronous exchange of patient treatment information among J2EE components and legacy systems.
- Developed MDBs using JMS to exchange messages between different applications using MQ Series.
- Involved in working with J2EE design patterns (Singleton, Factory, DAO, and Business Delegate) and Model-View-Controller architecture with JSF and Spring DI.
- Involved in Content Management using XML.
- Developed a standalone module transforming XML 837 module to database using SAX parser.
- Installed, Configured and administered WebSphere ESB v6.x
- Worked on Performance tuning of WebSphere ESB in different environments on different platforms.
- Configured and implemented web services specifications in collaboration with the offshore team.
- Involved in creating dashboard charts (business charts) using Fusion Charts.
- Involved in creating reports for most of the business criteria.
- Involved in the configuration of WebLogic servers, data sources, JMS queues and the deployment.
- Involved in creating queues, MDBs and workers to accommodate the messaging to track the workflows.
- Created Hibernate mapping files, sessions, transactions, and Query and Criteria objects to fetch the data from the DB.
- Enhanced the design of an application by utilizing SOA.
- Generated unit test cases with the help of internal tools.
- Used JNDI for connection pooling.
- Developed ANT scripts to build and deploy projects onto teh application server.
- Involved in implementing Cruise Control as a continuous build tool using Ant.
- Used StarTeam as the version controller.
Environment: JAVA/J2EE, HTML, JS, AJAX, Servlets, JSP, XML, XSLT, XPATH, XQuery, WSDL, SOAP, REST, JAX-RS, JERSEY, JAX-WS, Web Logic server 10.3.3, JMS, ITEXT, Eclipse, JUNIT, Star Team, JNDI, Spring framework - DI, AOP, Batch, Hibernate.
Confidential
Jr. Java Developer
Responsibilities:
- Involved in the requirement analysis, design, and development of the new NCP project.
- Involved in the design and estimation of the various templates and components, which were developed using Day CMS (Communique).
- The CMS and server-side interaction was developed using web services and exposed to the CMS using JSON and jQuery.
- Designed and developed a Struts-like MVC 2 web framework using the front-controller design pattern, which is used successfully in a number of production systems.
- Worked on the Java Mail API. Involved in the development of a utility class to consume messages from the message queue and send the emails to customers.
- Normalized Oracle database, conforming to design concepts and best practices.
- Used JUnit framework for unit testing and Log4j to capture runtime exception logs.
- Performed dependency injection using the Spring framework and integrated it with the Hibernate and Struts frameworks.
- Hands-on experience creating shell and Perl scripts for project maintenance and software migration. Custom tags were developed to simplify JSP applications.
- Applied design patterns and OO design concepts to improve the existing Java/JEE-based code base.
- Identified and fixed transactional issues due to incorrect exception handling and concurrency issues due to unsynchronized block of code.
- Used the Validator framework of Struts for client-side and server-side validation.
- The UI was designed using JSP, Velocity templates, JavaScript, CSS, jQuery and JSON.
- Enhanced the FAS system using Struts MVC and iBatis.
- Involved in developing web services using Apache XFire and integrating them with action mappings.
- Developed Velocity templates for the various user-interactive forms that trigger email to an alias. Such forms largely reduced the amount of manual work involved and were highly appreciated.
- Used internationalization, localization, Tiles and tag libraries to accommodate different locales.
- Used JAXP for parsing and JAXB for binding.
- Coordinated application testing with the help of the testing team.
- Involved in writing services to write core logic for business processes.
- Involved in writing database queries, stored procedures, functions, etc.
- Deployed EJB components on WebLogic; used the JDBC API for interaction with the Oracle DB.
- Involved in transformations using XSLT to prepare HTML pages from XML files.
- Enhanced Ant scripts to build and deploy applications.
- Involved in unit testing and code review for the various enhancements.
- Followed coding guidelines while developing workflows.
- Effectively managed the quality deliverables to meet deadlines.
- Involved in end-to-end implementation of the application.
Environment: Java 1.4, J2EE (EJB, JSP/Servlets, JDBC, XML), Day CMS, XML, My Eclipse, Tomcat, Resin, Struts, iBatis, Web logic App server, DTD, XSD, XSLT, Ant, SVN.