
Hadoop Administrator Resume


Charleston, SC

SUMMARY:

  • Extensive experience of 8 years in all phases of the Software Development Life Cycle on Java/J2EE, Big Data and cloud-based applications spanning multiple technologies and business domains.
  • Over 4 years of design and development experience in Big Data Hadoop technologies, including building, loading and analyzing data on AWS - EMR, Spark, HDFS & YARN clusters and Redshift, with knowledge of Azure cluster setup.
  • Experience working with NoSQL databases such as Cassandra, HBase and MongoDB, and with Spark Streaming, Spark SQL, Flume, Scala, MapReduce, Hive, Impala, Pig, Sqoop, Apache Drill and Oozie.
  • Involved in the Software Development Life Cycle (SDLC) phases, which include Analysis, Design, Implementation, Testing and Maintenance.
  • Expertise in designing and deployment of Hadoop clusters and different Big Data analytic tools including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Spark, Impala and Cassandra.
  • Hands-on experience with major components in the Hadoop ecosystem such as Hadoop MapReduce, HDFS, Hive, Pig, Pentaho, HBase, ZooKeeper, Sqoop, Oozie, Cassandra, Flume and Avro.
  • Experienced in the deployment of Hadoop clusters using the Puppet tool.
  • Work experience with Amazon Web Services (AWS) cloud infrastructure.
  • Experience in importing and exporting data using Sqoop between HDFS and relational database systems/mainframes.
  • Installing, configuring and managing Hadoop clusters and Data Science tools.
  • Managing the Hadoop distribution with Cloudera Manager, Cloudera Navigator, Hue.
  • Setting up High Availability for Hadoop cluster components and edge nodes.
  • Experience in developing Shell scripts and Python Scripts for system management.
  • Well versed in software development methodologies such as Rapid Application Development (RAD), Agile and Scrum.
  • Experience with Object Oriented Analysis and Design (OOAD) methodologies.
  • Experience in installations of software, writing test cases, debugging, and testing of batch and online systems.
  • Experience in production, quality assurance (QA), system integration testing (SIT) and user acceptance testing (UAT).
  • Expertise in J2EE technologies like JSP, Servlets, EJB 2.0, JDBC, JNDI and AJAX.
  • Extensively worked on implementing SOA (Service Oriented Architecture) using XML Web services (SOAP, WSDL, UDDI and XML Parsers).
  • Worked with XML parsers like JAXP (SAX and DOM) and JAXB.
  • Expertise in applying Java Messaging Service (JMS) for reliable information exchange across Java applications.
  • Proficient with Core Java and AWT, as well as web technologies such as HTML 5.0, XHTML, DHTML, CSS, XML 1.1, XSL, XSLT, XPath, XQuery, Angular.js and Node.js.
  • Worked with version control systems like Subversion, Perforce and Git to provide a common platform for all developers.
  • Articulate in written and verbal communication along with strong interpersonal, analytical, and organizational skills.
  • Highly motivated team player with the ability to work independently and adapt quickly to new and emerging technologies.
  • Creatively communicate and present models to business customers and executives, utilizing a variety of formats and visualization methodologies.

TECHNICAL SKILLS:

Hadoop Ecosystem/ Big Data: HDFS, MapReduce, Mahout, HBase, Pig, Hive, Sqoop, Flume, PowerPivot, Puppet, Oozie, ZooKeeper, Apache Spark, Splunk, YARN, Falcon, Avro, Impala

Frameworks in Hadoop: Spark, Kafka, Storm, Cloudera CDHs, Hortonworks HDPs, Hadoop 1.0, Hadoop 2.0

Databases & NoSQL Databases: Oracle 8i/9i/10g/11i, PL/SQL, MySQL, DB2 8.x/9.x, MS Access, Microsoft SQL Server 2000, PostgreSQL, Teradata, Cassandra, MongoDB, HBase

JAVA & J2EE Technologies: Core Java, Hibernate, Spring Framework, JSP, Servlets, JavaBeans, JDBC, EJB 3.0, Java Sockets, JavaScript, jQuery, JSF, PrimeFaces, SOAP, XSLT, DHTML; Messaging Services: JMS, MQ Series, MDB; MVC: J2EE MVC, Struts 2.1, Spring 3.2, Spring MVC, Spring Web; Testing: JUnit, MRUnit

Amazon Web Services (AWS): Elastic MapReduce (EMR), Amazon EC2, Amazon S3, AWS CodeCommit, AWS CodeDeploy, AWS CodePipeline, Amazon CloudFront, AWS Import/Export.

Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripting, Perl, Sed, Awk, JavaScript, XML, HTML, XHTML, JNDI, Python, Scala, HTML5, AJAX, jQuery, CSS, AngularJS, VBScript, WSDL, ODBC; Architectures: REST

Source Code Control: GitHub, CVS, SVN, ClearCase

IDE & Build Tools: Eclipse, NetBeans, Spring Tool Suite, Hue (Cloudera specific), Toad, Maven, Ant, Gradle, Hudson, Sonar, JDeveloper, Assent PMD, DB Visualizer

Web/Application Servers: Apache Tomcat, WebLogic, JBoss, IBM WebSphere

Analysis/Reporting: Ganglia, Nagios, custom shell scripts, QlikView, Tableau, BOXI, ETL (Informatica)

Certifications: Sun Certified Java Programmer

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP

Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

SDLC: Agile, Scrum, Waterfall, V-model, Spiral, Iterative and Incremental Methods

PROFESSIONAL EXPERIENCE:

Confidential, Charleston, SC

Hadoop Administrator

Responsibilities:

  • Implemented a Python-based distributed random forest via Hive and Python streaming
  • Involved in implementing security on the Hortonworks Hadoop cluster using Kerberos, working with the operations team to move the non-secured cluster to a secured cluster.
  • Responsible for upgrading Hortonworks Hadoop HDP 2.2.0 and MapReduce 2.0 with YARN in a multi-node clustered environment. Handled importing of data from various data sources, performed transformations using Hive, MapReduce and Spark, and loaded data into HDFS.
  • Worked on Hadoop security setup using MIT Kerberos, AD integration (LDAP) and Sentry authorization (a keytab-based login sketch in Java follows this list).
  • Migrated services from a managed hosting environment to AWS including: service design, network layout, data migration, automation, monitoring, deployments and cutover, documentation, overall plan, cost analysis, and timeline.
  • Managed Amazon Web Services (AWS) infrastructure with automation and configuration management tools such as Chef, Ansible, Puppet or custom-built tooling; designed cloud-hosted solutions drawing on specific AWS product suite experience.
  • Performed a major upgrade in the production environment from HDP 1.3 to HDP 2.2. As an admin, followed standard backup policies to ensure high availability of the cluster.
  • Monitored multiple Hadoop clusters environments using Ganglia and Nagios. Monitored workload, job performance and capacity planning using Ambari.
  • Monitored the servers and Linux scripts regularly, performed troubleshooting steps, and tested and installed the latest software on servers for end users.
  • Responsible for Patching Linux Servers.
  • Installed and configured Hortonworks and Cloudera distributions on single node clusters for POCs.
  • Implemented a Continuous Delivery framework using Jenkins, Puppet, Maven & Nexus in a Linux environment. Integrated Maven/Nexus, Jenkins, UrbanCode Deploy with Patterns/Release, Git, Confluence, Jira and Cloud Foundry.
  • Involved in running Hadoop jobs for processing millions of records of text data. Troubleshot build issues during the Jenkins build process. Implemented Docker to create containers for Tomcat servers and Jenkins.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required
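
The sketch below illustrates the client-side change that moving to a Kerberized cluster implies: a minimal Java snippet that authenticates from a keytab before touching HDFS. It is a hedged example only; the principal, keytab path and directory are hypothetical placeholders, not values from the actual cluster.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;

    public class KerberizedHdfsClient {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Matches the cluster-side setting pushed out when Kerberos was enabled.
            conf.set("hadoop.security.authentication", "kerberos");
            UserGroupInformation.setConfiguration(conf);

            // Hypothetical principal and keytab; real values come from the KDC/AD setup.
            UserGroupInformation.loginUserFromKeytab(
                    "hdfs-client@EXAMPLE.COM", "/etc/security/keytabs/hdfs-client.keytab");

            // Smoke test: list a directory to confirm the secured cluster accepts the ticket.
            FileSystem fs = FileSystem.get(conf);
            for (FileStatus status : fs.listStatus(new Path("/user/hdfs-client"))) {
                System.out.println(status.getPath());
            }
        }
    }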

Environment: Hortonworks Hadoop, Cassandra, Python, Flat files, Oracle 11g/10g, MySQL, UNIX, Toad 9.6, Windows NT, Sqoop, Hive, Oozie, Ambari, SAS, SPSS, Unix Shell Scripts, ZooKeeper, SQL, MapReduce, Pig.

Confidential, Houston, TX

Hadoop Administrator

Responsibilities:
  • Involved in architecture design, development and implementation of Hadoop deployment, backup and recovery systems.
  • Developed Chef modules to automate the installation, configuration and deployment of ecosystem tools, operating systems and network infrastructure at the cluster level.
  • Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark using Scala.
  • Performed cluster co-ordination and assisted with data capacity planning and node forecasting using ZooKeeper.
  • Responsible for cluster maintenance, rebalancing blocks, commissioning and decommissioning of nodes, monitoring and troubleshooting, and managing and reviewing data backups and log files.
  • Wrote MapReduce jobs in Java to standardize and clean the data and calculate aggregates (see the sketch after this list).
  • Worked with ETL workflows and analysis of big data and loaded it into the Hadoop cluster; implemented Pig Latin scripts to sort, group, join and filter the data.
  • Implemented Pig UDFs for evaluation, filtering, loading and storing of data for functionality that cannot be achieved using built-in Pig functions.
  • Created internal and external Hive tables, defined static and dynamic partitions as per requirement for optimized performance.
  • Used Sqoop to transfer data between RDBMS and HDFS in both directions.
  • Created Hive based reports and wrote customized Hive UDFs in Java.
  • Created the HBase Tables and inserted data into it.
  • Worked on HBase NoSQL database architecture for data read/write.
  • Integrated the Hive warehouse with HBase.
  • Implemented test scripts to support test driven development and continuous integration.
  • Monitored system health and logs and responded accordingly to any warning or failure conditions.
  • Effectively used Oozie to develop automatic workflows of Sqoop, MapReduce and Hive jobs.
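
A minimal Java sketch of the standardize-clean-aggregate MapReduce pattern mentioned above. The comma-delimited record layout, field positions and job name are assumptions made for illustration, not the production job.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class StandardizeAndAggregate {

        // Mapper: drops malformed rows, normalizes the key field, emits (key, 1).
        public static class CleanMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            private final Text outKey = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(",");
                if (fields.length < 2 || fields[0].trim().isEmpty()) {
                    return; // clean step: skip records that cannot be standardized
                }
                outKey.set(fields[0].trim().toLowerCase());
                context.write(outKey, ONE);
            }
        }

        // Reducer (also used as combiner): sums occurrences per standardized key.
        public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                    throws IOException, InterruptedException {
                long sum = 0;
                for (LongWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new LongWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "standardize-and-aggregate");
            job.setJarByClass(StandardizeAndAggregate.class);
            job.setMapperClass(CleanMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }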

Environment: Hortonworks (HDP 2.2), HDFS, MapReduce, Apache Cassandra, YARN, Spark, Scala, Hive, Pig, Flume, Sqoop, Puppet, Oozie, ZooKeeper, Ambari, Oracle Database, MySQL, HBase, SparkSQL, Avro, Parquet, RCFile, JSON, UDF, Java (JDK 1.7), CentOS

Confidential, Norman, OK

Hadoop Administrator/Developer

Responsibilities:
  • Set up the cluster, handled configuration and maintenance, and installed components of the Hadoop ecosystem.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Stored data from HDFS to respective Hive tables for business analysts to conduct further analysis in identifying data trends.
  • Worked on managing VMs in Amazon AWS using EC2.
  • Developed CloudFormation scripts to build EC2 instances on demand.
  • Encrypted data on the server and client side and maintained edge locations to cache data with a CDN.
  • Deployed a distributed in-memory cache environment in the cloud using ElastiCache to deliver data with lower latency.
  • Configured and scheduled the scripts to automate the module installation in the environment.
  • Extracted data from MySQL, Oracle and SQL Server and loaded it into HDFS.
  • Applied redirection rules in Apache based on redirection conditions provided by developers.
  • Used Sqoop to dump data from relational databases into HDFS for processing.
  • Developed Hive ad-hoc queries and filtered data to increase the effectiveness of process execution, using constructs like JOIN, GROUP BY and HAVING (see the query sketch after this list).
  • Improved HiveQL execution time by partitioning the data, and further reduced run times by applying compression techniques like Snappy for MapReduce jobs.
  • Created Hive Partitions for storing data for different trends under different partitions.
  • Connected the Hive tables to data analysis tools like Tableau for graphical representation of the trends.
  • Assisted the project manager in troubleshooting issues relevant to Hadoop technologies for data integration between different platforms, such as Hive-Sqoop and Sqoop-Hive.
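
A hedged sketch of running this kind of ad-hoc Hive query from Java over the HiveServer2 JDBC driver. The host name, credentials, table names, partition column and thresholds are hypothetical; only the JOIN / GROUP BY / HAVING shape and the Snappy compression settings reflect the bullets above.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveAdHocQuery {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection con = DriverManager.getConnection(
                    "jdbc:hive2://hive-server:10000/default", "hive", "");
                 Statement stmt = con.createStatement()) {

                // Snappy-compress the output of the MapReduce job Hive launches for this query.
                stmt.execute("SET hive.exec.compress.output=true");
                stmt.execute("SET mapreduce.output.fileoutputformat.compress.codec="
                        + "org.apache.hadoop.io.compress.SnappyCodec");

                // The load_date predicate prunes partitions before the join and aggregation run.
                ResultSet rs = stmt.executeQuery(
                        "SELECT o.region, COUNT(*) AS orders "
                      + "FROM orders o JOIN customers c ON o.customer_id = c.id "
                      + "WHERE o.load_date = '2016-03-01' "
                      + "GROUP BY o.region HAVING COUNT(*) > 100");
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
                }
            }
        }
    }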

Environment: Java 7, HBase, HDFS, MapReduce, Hadoop 2.0, Hive, Pig, Eclipse, Linux, Sqoop, MySQL, Agile, Kafka, Cognos

Confidential, Durham, NC

Hadoop Developer

Responsibilities:
  • Implemented Hadoop scaling from 6 nodes in the POC environment to 10 nodes in development, and ended up with a 40-node cluster in the pilot (production) environment.
  • Involved in the complete implementation lifecycle; spent significant time composing customized MapReduce, Pig and Hive programs.
  • Solid experience with big data processing using Hadoop technologies HDFS, MapReduce, Crunch, Hive and Pig.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying on the log data.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs. Broadly used Hive/HQL queries to search for particular strings in Hive tables in HDFS.
  • Performed various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
  • Expert in developing customized user-defined functions (UDFs) in Java to extend Hive and Pig Latin functionality (see the Pig UDF sketch after this list).
  • Developed Pig programs for loading and filtering the streaming data ingested into HDFS using Flume.
  • Applied Pig to do transformations, event joins, filtering of bot traffic and some pre-aggregations before storing the data onto HDFS.
  • Experience in tuning the performance of Pig queries.
  • Involved in developing Pig Scripts for change data capture (CDC) and delta record processing between newly arrived data and already existing data in HDFS.
  • Scheduled and managed Oozie jobs to remove duplicate log data files in HDFS.
  • Used Flume extensively in gathering and moving log data files from Application Servers to a central location in Hadoop Distributed File System (HDFS).
  • Created Mappings using Talend Open Studio for Evaluation and POC.
  • Designed a Risk Audit process for a healthcare client, created a risk assessment database on Hive for performing risk assessments and audits, and used Tableau for visualization.
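
A minimal sketch of a Java Pig UDF of the kind described in this list; the class and its URL-cleaning logic are illustrative assumptions, not the project's actual functions.

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Strips query strings from URLs pulled out of web logs: simple per-field logic
    // that built-in Pig functions do not cover directly.
    public class StripQueryString extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            String url = input.get(0).toString();
            int idx = url.indexOf('?');
            return idx >= 0 ? url.substring(0, idx) : url;
        }
    }

After REGISTER of the jar holding the class (a placeholder name such as log-udfs.jar), the function is callable as StripQueryString(url) inside a FOREACH ... GENERATE statement.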

Environment: Hadoop 2.x, HDFS, MapReduce, Flume, Hive 0.10, Pig 0.11, Sqoop, HBase, YARN, Shell Scripting, Maven, GitHub, Ganglia, Apache Solr, AWS, Talend Open Studio for Big Data, Java and Cloudera.

Confidential, Philadelphia, PA

Java/Hadoop Developer

Responsibilities:
  • Designed and created GUI screens using JSP, Servlets and HTML based on the Struts MVC framework.
  • Used JDBC to access the database.
  • Used JavaScript for client-side validation.
  • Validations were performed using Struts Validation Framework.
  • Commit and Rollback methods were provided for transactions processing.
  • Designed and developed the action form beans and action classes and implemented MVC using Struts framework.
  • Wrote Oracle SQL stored procedures, functions and triggers.
  • Developed both Session and Entity beans representing different types of business logic abstractions.
  • Maintained the server log document.
  • Performed Unit /Integration testing for the test cases.
  • Implemented and designed user interface for web based customer application.
  • Understood business needs, analyzed functional specifications and mapped them to the design and development of MapReduce programs and algorithms.
  • Wrote Pig and Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying on the log data; also have hands-on experience with Pig and Hive user-defined functions (UDFs) (see the Hive UDF sketch after this list).
  • Executed Hadoop ecosystem components and applications through Apache Hue.
  • Optimizing Hadoop MapReduce code, Hive/Pig scripts for better scalability, reliability and performance.
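
A minimal sketch of a Java Hive UDF along the lines of the log-parsing work listed above; the class name and severity-extraction logic are illustrative assumptions.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Pulls the severity token out of a raw log line so HiveQL can group and filter by level.
    public class ExtractLogLevel extends UDF {
        public Text evaluate(Text logLine) {
            if (logLine == null) {
                return null;
            }
            for (String level : new String[] {"ERROR", "WARN", "INFO", "DEBUG"}) {
                if (logLine.toString().contains(" " + level + " ")) {
                    return new Text(level);
                }
            }
            return new Text("UNKNOWN");
        }
    }

Once the jar is added with ADD JAR and the function registered with CREATE TEMPORARY FUNCTION extract_level AS 'ExtractLogLevel', it can be used like any built-in function in HiveQL.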

Environment: Java, JSP, HTML, CSS, JavaScript, jQuery, Struts 2.0, MySQL, Oracle, Hibernate, JDBC, Eclipse, SQL Stored Procedures, Tomcat, Hive, Pig, Sqoop, Flume and Cloudera.

Confidential, Bothell, WA

Associate Java Developer

Responsibilities:

  • Developed web pages using the Struts framework, JSP, XML, JavaScript, HTML/DHTML and CSS; configured the Struts application and used tag libraries.
  • Embedded a custom-built Java application in Sales Cloud using JSON Web Token (JWT) as the security mechanism.
  • Developed the application using Spring, Hibernate and Spring Batch, and web services including SOAP and RESTful web services.
  • Used the Spring Framework at the business tier and Spring's BeanFactory for initializing services.
  • Used AJAX, JavaScript to create interactive user interface.
  • Implemented client side validations using JavaScript & server side validations.
  • Developed a single-page application using AngularJS and Backbone.js.
  • Developed the application using Front Controller, Business Delegate, DAO and Session Facade patterns.
  • Implemented Hibernate to persist the data into the database and wrote HQL-based queries to implement CRUD operations on the data (see the sketch after this list).
  • Used Hibernate annotations and created Hibernate POJOs.
  • Developed Web Services to communicate to other modules using XML based SOAP and WSDL.
  • Designed and implemented (SOA, SOAP) next generation system on distributed platform.
  • Designed and developed most of the application's GUI screens using GWT framework.
  • Used JAXP for XML parsing and JAXB for marshalling and unmarshalling.
  • Used SOAP-UI to test the Web Services using WSDL.
  • Involved in analyzing the DB schema as per the new design in DB2, migrated from Oracle.
  • Used the DOJO toolkit for UI development and for sending asynchronous AJAX requests to the server.
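
A minimal sketch of the Hibernate persistence and HQL CRUD pattern referenced above, assuming a hypothetical annotated Customer entity that is mapped in hibernate.cfg.xml; it is illustrative, not the application's actual data model.

    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import org.hibernate.Session;
    import org.hibernate.SessionFactory;
    import org.hibernate.Transaction;
    import org.hibernate.cfg.Configuration;

    // Hypothetical entity; assumed to be listed as a mapped class in hibernate.cfg.xml.
    @Entity
    class Customer {
        @Id @GeneratedValue
        private Long id;
        private String email;
        Customer() { }
        Customer(String email) { this.email = email; }
    }

    public class CustomerDao {
        private static final SessionFactory FACTORY =
                new Configuration().configure().buildSessionFactory();

        // Create: persist a new Customer inside a transaction.
        public void save(Customer customer) {
            Session session = FACTORY.openSession();
            Transaction tx = session.beginTransaction();
            try {
                session.save(customer);
                tx.commit();
            } catch (RuntimeException e) {
                tx.rollback();
                throw e;
            } finally {
                session.close();
            }
        }

        // Read: the HQL query targets the entity model rather than the table directly.
        public Customer findByEmail(String email) {
            Session session = FACTORY.openSession();
            try {
                return (Customer) session
                        .createQuery("from Customer c where c.email = :email")
                        .setParameter("email", email)
                        .uniqueResult();
            } finally {
                session.close();
            }
        }
    }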

Environment: Java/J2EE, JSP, Servlets, EJB, XML, XSLT, Struts, Rational Rose, Apache Struts Framework, Web Services, DB2, Beyond Compare, CVS, JUnit, Log4j, Windows XP, Red Hat Linux.

Confidential

Associate Java Developer

Responsibilities:

  • Responsible for the Requirement Analysis and Design of Smart Systems Pro (SSP).
  • Involved in Object Oriented Design (OOD) and Analysis (OOA).
  • Analysis and Design of the Object models using JAVA/J2EE Design Patterns in various tiers of the application.
  • Worked with Restful Web Services and WSDL.
  • Worked with Maven build tool to build the Project.
  • Involved in coding JavaScript for UI validation and worked on the Struts validation framework.
  • Analyzing the Client Requirements and designing the specification document based on the requirements.
  • Worked on implementing directives and scope values using AngularJS for an existing webpage.
  • Familiar with state-of-the-art standards, processes and design practices used in creating optimal UIs with Web 2.0 technologies like Ajax, JavaScript, CSS and XSLT.
  • Involved in the Preparation of Program Specification and Unit Test Case Document.
  • Designed the prototype according to the business requirements.
  • Developed the web tier using JSP and Struts MVC to show account details and summary (see the sketch after this list).
  • Used Struts Tiles Framework in the presentation tier.
  • Designed and developed the UI using Struts view component, JSP, HTML, CSS and JavaScript.
  • Used AJAX for asynchronous communication with server
  • Utilized Hibernate for Object/Relational Mapping purposes for transparent persistence onto the SQL Server database.
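
A minimal Struts 1-style sketch of an action behind an account-summary screen like the one described above; the class name, request parameter and Tiles forward name are hypothetical.

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    public class AccountSummaryAction extends Action {
        @Override
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                                     HttpServletRequest request, HttpServletResponse response)
                throws Exception {
            String accountId = request.getParameter("accountId");
            // Placeholder for the business-delegate/service call that builds the summary.
            request.setAttribute("accountSummary", "Summary for account " + accountId);
            // "success" maps to a Tiles definition that renders the JSP view.
            return mapping.findForward("success");
        }
    }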

Environment: Java, J2EE Servlet, JSP, JUnit, AJAX, XML, JSON, CSS, JavaScript, Spring, Struts, Hibernate, Eclipse, Apache Tomcat, and Oracle.
