- Over eight years of professional IT experience in requirements gathering, design, development, testing, implementation, and maintenance.
- Progressive experience in all phases of the iterative Software Development Life Cycle (SDLC).
- In-depth knowledge and understanding of Hadoop cluster architecture and monitoring.
- Experience in managing and reviewing Hadoop log files.
- Excellent understanding of and experience with NoSQL databases such as HBase, with good exposure to MongoDB.
- Experience in setting up standards and processes for Hadoop-based application design and implementation.
- Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
- Excellent understanding of Hadoop architecture and its components, including HDFS, ResourceManager, NodeManager, NameNode, DataNode, and the MapReduce/Tez programming paradigms.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as MapReduce, Tez, Apache Spark, HDFS, HBase, Hive, Sqoop, Pig, ZooKeeper, Storm, and Flume.
- Good exposure to Apache Hadoop MapReduce programming, Pig and Hive scripting, distributed applications, and HDFS.
- Deep experience managing Hadoop clusters with Hortonworks Ambari and Cloudera Manager.
- Very good experience across the complete project life cycle (design, development, testing, and implementation) of client-server and web applications.
- Experience in administering, installing, configuring, troubleshooting, securing, backing up, performance monitoring, and fine-tuning Linux (Red Hat and CentOS).
- In-depth understanding and experience using Splunk for monitoring and alerting.
- Extensive experience working with Oracle, DB2, SQL Server, and MySQL databases.
- Hands-on experience with VPN, Cygwin, WinSCP, VNC Viewer, etc.
- Scripting to deploy monitors and checks and to automate critical system administration functions.
- Ability to adapt to evolving technology; strong sense of responsibility and accomplishment.
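The Sqoop import/export work noted above typically comes down to a pair of commands like the following minimal sketch. The connection string, credentials, table names, and HDFS paths are all hypothetical placeholders, and the commands assume a running Hadoop cluster with Sqoop 1 installed.

```shell
# Sketch only: host, user, table, and path names are placeholders.
# Import a relational table into HDFS (Sqoop runs this as a MapReduce job).
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --num-mappers 4

# Export processed results from HDFS back into a relational table.
sqoop export \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table orders_summary \
  --export-dir /data/out/orders_summary
```

`--num-mappers` controls import parallelism; each mapper takes a slice of the source table, so the value is usually tuned to the database's tolerance for concurrent connections.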
Big Data Ecosystem: HDFS, HBase, MapReduce, Tez, ZooKeeper, Hive, Pig, Sqoop, Ranger
Databases: Oracle (SQL & PL/SQL), MS SQL Server, MySQL
Web Technologies: HTML, XML, AJAX, SOAP, ODBC, JDBC, JavaBeans, EJB, MVC, JSP, Servlets, JavaMail, Struts, JUnit
Frameworks: MVC, Spring, Struts, Hibernate, .NET
Configuration Management Tools: TFS, CVS
IDE / Testing Tools: Eclipse
Data Warehousing and NoSQL Database: HBase
Methodologies: Agile, V-model
Operating System: Windows, UNIX, Linux
Confidential, San Francisco, CA
Hadoop Systems Administrator
- Deployed new Hadoop infrastructure and performed Hadoop cluster upgrades and resource optimization.
- Performed Hadoop system monitoring, maintenance, troubleshooting, performance tuning, and capacity planning.
- Reviewed, developed, and implemented strategies that preserve the availability, stability, security, and scalability of the clusters.
- Interacted with developers, data scientists, and other operations teams to resolve cluster and job performance issues.
- Prepared architecture, design, and operational run book documentation.
- Participated in weekly on-call support.
- Handled versioning, change control, and problem management.
- Managed day-to-day access issues, creating permissions and adding users for HDFS access.
Environment: Hortonworks HDP 2.4.2, CentOS 6.5, HBase, HDFS, Hive, Oozie, Pig 0.15.0.2.4, Sqoop, Storm 0.10.0.2.4, Tez 0.7.0.2.4, YARN + MapReduce 2, ZooKeeper, Falcon 0.6.1.2.4, Flume.
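The day-to-day HDFS access administration described above usually looks something like the sketch below. The user, group, path, and quota values are hypothetical placeholders; the commands assume shell access to a node with the HDFS client configured and superuser (or delegated) rights.

```shell
# Sketch only: user, group, and path names are placeholders.
# Provision a home directory for a new user and hand over ownership.
hdfs dfs -mkdir -p /user/jsmith
hdfs dfs -chown jsmith:analysts /user/jsmith
hdfs dfs -chmod 750 /user/jsmith

# Grant a team read/traverse access to a shared dataset via an HDFS ACL.
hdfs dfs -setfacl -R -m group:analysts:r-x /data/warehouse/sales

# Cap the user's space usage with a quota (100 GB in this example).
hdfs dfsadmin -setSpaceQuota 100g /user/jsmith
```

ACLs require `dfs.namenode.acls.enabled=true` in `hdfs-site.xml`; on a Ranger-secured cluster (listed in the skills above), policy-based authorization would typically replace most ad hoc `setfacl` calls.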
Confidential, Foster City, CA
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Developed workflows using custom MapReduce, Pig, Hive, and Sqoop jobs.
- Tuned the cluster for optimal performance when processing these large data sets.
- Built reusable Hive UDF libraries for business requirements, enabling users to apply these UDFs in Hive queries.
- Preprocessed the logs and semi-structured content stored on HDFS using Pig and imported the processed data into the Hive warehouse, enabling business analysts to write Hive queries.
- Configured big data workflows to run on top of Hadoop using Control-M; these workflows comprised heterogeneous jobs such as Pig, Hive, Sqoop, and MapReduce.
- Developed a suite of unit test cases for Mapper, Reducer, and Driver classes using the MRUnit testing library.
- Developed workflows in Control-M to automate loading data into HDFS and preprocessing it with Pig.
- Used Maven extensively to build JAR files of MapReduce programs and deployed them to the cluster.
- Bug fixing and 24x7 production support.
Environment: CDH3, Pig 0.8.1, Hive 0.7.1, Sqoop v1, Oozie 2.3.2, Core Java, SQL Server 2008, HBase, Cloudera Hadoop Distribution, MapReduce, DataStax, IBM DataStage 8.1, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, Linux, UNIX Shell Scripting.
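Registering a Maven-built Hive UDF library for analysts, as described above, generally follows the pattern sketched here. The JAR path, function name, class name, and table are hypothetical placeholders; the snippet assumes the Hive CLI is on the PATH and the JAR is readable from the client node.

```shell
# Sketch only: JAR path, function name, class, and table are placeholders.
# Register a custom UDF from a Maven-built JAR, then use it in a query.
hive -e "
  ADD JAR /opt/udfs/business-udfs.jar;
  CREATE TEMPORARY FUNCTION normalize_sku AS 'com.example.hive.udf.NormalizeSku';
  SELECT normalize_sku(sku), COUNT(*) FROM orders GROUP BY normalize_sku(sku);
"
```

`CREATE TEMPORARY FUNCTION` scopes the UDF to the session, which is why shipping a shared JAR (plus a sourced init script) is the usual way to let many analysts reuse the same library.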
Confidential, New York, NY
- Involved in the review of functional and non-functional requirements.
- Facilitated knowledge transfer sessions.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Defined job flows.
- Managed and reviewed Hadoop log files.
- Extracted files from RDBMSs through Sqoop, placed them in HDFS, and processed them.
- Ran Hadoop streaming jobs to process terabytes of XML-format data.
- Gained good experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Loaded data from Linux and UNIX file systems into HDFS.
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Replaced Hive's default Derby metastore with MySQL.
- Executed queries using Hive and developed MapReduce jobs to analyze data.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Developed Pig UDFs to preprocess the data for analysis.
- Developed Hive queries for the analysts.
- Supported setting up the QA environment and updating configurations for implementing Pig scripts.
- Developed a custom file system plug-in for Hadoop so it can access files on the Data Platform. This plug-in allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
- Wrote a recommendation engine using Mahout.
Environment: Java (JDK 1.6), Eclipse, Subversion, Hadoop (Cloudera distribution), Hive, HBase, MapReduce, HDFS, Pig, Cassandra, IBM DataStage 8.1, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, Linux, UNIX Shell Scripting.
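Swapping Hive's embedded Derby metastore for MySQL, as noted above, is mainly a matter of pointing the metastore connection properties in `hive-site.xml` at the MySQL instance. This is a hedged configuration sketch: the host, database name, and credentials are placeholders, and the MySQL JDBC driver JAR must be on Hive's classpath.

```xml
<!-- Sketch only: host, database, and credentials are placeholders. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://metastore-db.example.com:3306/hive_metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>change_me</value>
</property>
```

Unlike Derby, which allows only one session at a time, a MySQL-backed metastore lets multiple Hive clients share table metadata concurrently, which is the usual motivation for the switch.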
Confidential, Albany, NY
- Involved in business requirements analysis.
- Built the application using the Struts framework with JSP as the view layer.
- Developed Dispatch Actions, Action Forms, and custom tag libraries in the Struts framework.
- Designed JSP pages as Struts views for front-end templates.
- Developed Session Beans to handle back-end business requirements.
- Used the RSD IDE for development and ClearCase for versioning.
- Involved in configuring resources and administering WebSphere Application Server 6.
- Built and deployed the application on WebSphere Application Server.
- Wrote stored procedures in DB2.
- Developed code to handle web requests involving Request Handlers, Business Objects, and Data Access Objects. Coded different package structures based on the purpose and security concerns handled by each package, which assists developers in future enhancements or modifications of the code.
- Involved in code reviews, system integration, and testing. Developed unit test cases using the JUnit framework.
- Involved in deploying the application on UNIX boxes (DEV, QA, and Prod environments).
- Used the change management tool Service Center to promote the WAR file from one environment to another.
- Involved in user acceptance testing, bug fixing, and production support.
Environment: Java, J2EE, Apache Struts, WebSphere 5 & 6, JNDI, JDBC, JSP, UNIX, Windows NT, DB2, SQL Server.
Confidential, Shelton, CT
- Involved in Analysis, Design, Implementation, and Testing of the project.
- Developed web components using JSP, Servlets, and JDBC.
- Implemented the database using SQL Server.
- Designed tables and indexes.
- Wrote complex T-SQL and stored procedures.
- Involved in fixing defects and unit testing with test cases using JUnit.
- Developed user and technical documentation.
- Created database tables and wrote queries and stored procedures.
- Coded Java, JSP, and Servlets using the extended Contata Struts framework.
- Used JNI to call libraries and other functionality implemented in C.
- Wrote programs for XA transaction management across the application's multiple databases.
- Wrote stored procedures and functions (T-SQL, the SQL Server counterpart of PL/SQL) in the SQL Server database.
- Used the StAX API / JAXP to read and manipulate the XML properties files.
- Performed code reviews and deployments.
- Performed JUnit testing.