- Software professional with 7+ years of industry experience as a Big Data/Oracle PL/SQL technical consultant, including 4 years in Big Data/Hadoop and 3 years with Oracle PL/SQL.
- In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce.
- Expertise in writing Hadoop jobs for analyzing data using MapReduce, Hive, Pig, and Spark.
- Experienced in administrative tasks such as installing, configuring, commissioning and decommissioning nodes, troubleshooting, and backup and recovery of Hadoop and its ecosystem components such as Hive, Pig, Sqoop, HBase, and Spark.
- Experienced in Amazon AWS cloud services (EC2, EBS, and S3).
- Worked on real-time, in-memory processing engines such as Spark and Impala, and on their integration with BI tools such as Tableau.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
- Experienced in extending Hive and Pig core functionality by writing custom UDFs using Java.
- Extensive experience in developing MySQL, DB2, and Oracle database triggers, stored procedures, and packages to quality standards using SQL and PL/SQL.
- Comprehensive knowledge of debugging, optimizing, and performance tuning DB2, Oracle, and MySQL databases.
- Strong expertise in big data modeling techniques with Hive and HBase.
- Hands-on experience working with source control tools such as CVS, SVN, and ClearCase.
- Proficiency in programming with Java IDEs such as Eclipse, Spring Tool Suite, and IBM RAD.
- Intermediate-level skills with application servers such as WebLogic, WebSphere, JBoss, and Oracle Application Server, and web servers such as Tomcat.
- Very good experience in knowledge transfer (KT) to support teams, providing pre- and post-deployment support.
- Designed use case diagrams, class diagrams, sequence diagrams, flow charts, and deployment diagrams using MS Visio and Rational Rose.
- Good hands-on experience in object-oriented analysis and design (OOA/D), modeling, and programming tools in conjunction with the Unified Modeling Language (UML).
- Experience working with Hadoop/HBase/Hive/MRV1/MRV2.
- Experience implementing Log4j for application logging and notification-tracing mechanisms.
- Good working knowledge of Maven and Ant for building applications, and Jenkins for continuous integration.
- Committed to excellence; a self-motivated, far-sighted team player with strong problem-solving skills and a zeal to learn new technologies.
- Strengths include excellent communication, interpersonal, and analytical skills, and the ability to work effectively in a fast-paced, high-volume, deadline-driven environment.
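To illustrate the custom Hive UDF work mentioned above, here is a minimal plain-Java sketch of a UDF's `evaluate()` logic. The class name and masking rule are hypothetical; a real UDF would extend `org.apache.hadoop.hive.ql.exec.UDF` and operate on `Text`, with Hive dependencies on the classpath.

```java
// Sketch of the core logic of a custom Hive UDF (hypothetical example).
// In a real deployment this would extend org.apache.hadoop.hive.ql.exec.UDF
// and operate on org.apache.hadoop.io.Text rather than String.
public class MaskEmailUdf {
    // Masks the local part of an e-mail address, keeping the domain
    // visible for aggregation: "john.doe@example.com" -> "***@example.com".
    public static String evaluate(String email) {
        if (email == null) return null;   // Hive UDFs must tolerate NULLs
        int at = email.indexOf('@');
        if (at < 0) return email;         // not an e-mail address; pass through
        return "***" + email.substring(at);
    }
}
```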
Languages: C, Java, SQL, PL/SQL
J2EE Technologies: Servlets, JSP, JDBC, JNDI, Hibernate
Technologies: Hadoop, HDFS, MapReduce, Hive, HBase, Pig, Cloudera Manager and Navigator
Database: Oracle 9i, 10g; IBM DB2 and MySQL
Tools: Oracle Apps, Oracle Forms, Oracle Reports, Eclipse, Maven, ANT, JUnit, TestNG, Jenkins, SoapUI, PuTTY, Log4j, Bugzilla
Web Services: SOAP, WSDL, REST
Servers: Apache Tomcat, WebLogic, WebSphere, Oracle Application Server, JBoss
Development Tools: Eclipse, RSA, RAD
Database Tools: Oracle SQL Developer, TOAD and PL/SQL Developer
Build and Log Tools: Build tools(ANT, MAVEN), Logging tool(Log4J), Version Control (CVS, SVN, Clear Case)
Methodologies & Standards: Software Development Lifecycle (SDLC), RUP, OOA/D, Waterfall Model and Agile
Operating Systems: Linux, Unix and Windows 8, 7, XP
Others: MS Office, Apache OpenOffice, PuTTY, WinSCP, MS Visio
Confidential, Boston, MA
Senior Big Data/Hadoop Engineer
- Worked on the BI team on Big Data Hadoop cluster implementation and data integration, developing large-scale system software.
- Assessed existing and available data warehousing technologies and methods to ensure the data warehouse/BI architecture met the needs of the business unit and the enterprise and allowed for business growth.
- Used HBase for real-time search on log data, and Pig, Hive, and MapReduce for analysis.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Captured data from existing databases that provide SQL interfaces using Sqoop.
- Worked extensively with Sqoop to import and export data between HDFS and relational database/mainframe systems, loading data into HDFS.
- Developed and maintained complex outbound notification applications running on custom architectures, using diverse technologies including Core Java, J2EE, SOAP, XML, JMS, JBoss, and web services.
- Involved in the design and implementation of a proof of concept for a system built on Hadoop with HBase, Hive, Pig, and Flume.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Worked with Apache Hadoop, Spark, and Scala.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into HDFS and Pig to pre-process the data.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
- Involved in installing Hive, HBase, Pig, Flume, and other Hadoop ecosystem software.
- Managed and reviewed Hadoop log files.
- Tested raw data and executed performance scripts.
- Shared responsibility for administration of Hadoop, Hive and Pig.
- Developed Hive queries for the analysts.
Environment: Hadoop, MapReduce, HDFS, Hive, HBase, Java (JDK 1.6), Hadoop distributions from Hortonworks, Cloudera, MapR, and DataStax, IBM DataStage 8.1 (Designer, Director, Administrator), PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX shell scripting.
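The MapReduce parsing jobs described above can be illustrated with a plain-Java sketch of the map-side logic: extracting a partition key (here a date) from a raw delimited record and aggregating by it. The field layout is hypothetical; a real job would emit (key, value) pairs through a Hadoop `Mapper`'s `Context` rather than return a `Map`.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the mapper logic behind raw-data parsing jobs.
// Each record is pipe-delimited: "date|customer|amount" (hypothetical layout).
public class RawLogParser {
    // Groups amounts by date, mimicking map-side partition-key extraction
    // before the refined data lands in date-partitioned EDW tables.
    public static Map<String, Double> totalByDate(String[] rawLines) {
        Map<String, Double> totals = new HashMap<>();
        for (String line : rawLines) {
            String[] f = line.split("\\|");
            if (f.length != 3) continue;   // skip malformed raw records
            totals.merge(f[0], Double.parseDouble(f[2]), Double::sum);
        }
        return totals;
    }
}
```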
Confidential, El Segundo, CA
Big Data/Hadoop Developer
- Helped business processes by developing, installing and configuring Hadoop ecosystem components that moved data from individual servers to HDFS.
- Installed and configured MapReduce, Hive, and HDFS; implemented a CDH3 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
- Integrated the Hive warehouse with HBase.
- Worked with Apache Hadoop, Spark and Scala.
- Supported code/design analysis, strategy development and project planning.
- Created reports for the BI team using Sqoop to export data into HDFS and Hive.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Assisted with data capacity planning and node forecasting.
- Maintained System integrity of all sub-components (primarily HDFS, MR, HBase, and Hive).
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
- Acted as administrator for Pig, Hive, and HBase, installing updates, patches, and upgrades.
- Handled structured and unstructured data and applied ETL processes.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Coded complex Oracle stored procedures, functions, packages, and cursors for client-specific applications.
- Provided production rollout support, including monitoring the solution post go-live and resolving issues discovered by the client and client services teams.
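Loading large data sets into HBase tables, as above, typically involves a row-key design decision. Here is a minimal sketch of a salted row-key scheme in plain Java; the key layout and bucket count are hypothetical, added only to illustrate the design choice.

```java
// Minimal sketch of a salted HBase row-key scheme (hypothetical layout).
// Salting spreads sequential writes (e.g., timestamp-led keys) across
// region servers instead of hot-spotting a single region.
public class SaltedRowKey {
    static final int BUCKETS = 8;   // assumed number of salt buckets

    // Prefixes the natural key with a hash-derived salt bucket.
    public static String build(String naturalKey) {
        int salt = Math.abs(naturalKey.hashCode() % BUCKETS);
        return salt + "|" + naturalKey;
    }
}
```

Scans then fan out one range per bucket and merge the results.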
Confidential, Houston, TX
- Installed and configured fully distributed Hadoop cluster.
- Performed Hadoop cluster environment administration, including adding and removing cluster nodes, cluster capacity planning, performance tuning, cluster monitoring, and troubleshooting.
- Extensively used Cloudera Manager to manage the Hadoop cluster.
- Used Oozie to automate/schedule business workflows which invoke Sqoop, MapReduce and Pig jobs as per the requirements.
- Developed Sqoop scripts to import and export the data from relational sources and handled incremental loading on the customer and transaction data by date.
- Worked with various HDFS file formats like Avro, SequenceFile and various compression formats like Snappy, bzip2.
- Developed efficient MapReduce programs for filtering out the unstructured data.
- Developed the Pig UDF's to pre-process the data for analysis.
- Developed Hive queries for data sampling and analysis to the analysts.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Developed custom Unix shell scripts to perform pre- and post-validations of master and slave nodes before and after configuring the NameNode and DataNodes, respectively.
- Involved in HDFS maintenance, administering it through the Hadoop Java API.
- Supported MapReduce programs running on the cluster.
- Developed Java MapReduce programs using Mahout and applied them to different datasets.
- Identified several PL/SQL batch applications in General Ledger processing and conducted performance comparison to demonstrate the benefits of migrating to Hadoop.
- Configured Sentry to secure access to purchase information stored in Hadoop.
- Involved in several POCs for different LOBs to benchmark the performance of data-mining using Hadoop.
Environment: RedHat Linux 5, MS SQL Server, MongoDB, Oracle, Hadoop CDH 3/4/5, Pig, Hive, ZooKeeper, Mahout, HDFS, HBase, Sqoop, Python, Java, Oozie, Hue, Tez, UNIX Shell Scripting, PL/SQL, Maven, Ant
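The incremental Sqoop loads by date described above boil down to watermark logic: only rows newer than the last imported value are picked up, analogous to `sqoop import --incremental lastmodified --last-value <watermark>`. A plain-Java sketch of that logic (record layout hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the watermark filter behind incremental loads: keep only
// rows whose date is strictly after the last-imported value.
public class IncrementalLoad {
    // Each record is "date,id" (hypothetical layout); ISO dates sort
    // lexically, so string comparison is a valid date comparison here.
    public static List<String> newRows(List<String> rows, String lastValue) {
        List<String> out = new ArrayList<>();
        for (String r : rows) {
            String date = r.split(",")[0];
            if (date.compareTo(lastValue) > 0) out.add(r);
        }
        return out;
    }
}
```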
Confidential, NYC, NY
- Set up an application for TCL used in service assurance for enterprise customers of TCL's Internet business.
- Enabled L1 agents to use the system to raise trouble tickets for customers upon request.
- Assisted L2 and L3 users to rectify the problem and resolved tickets in BMC Remedy application.
- Carried out analysis, design and development of system components.
- Utilized OO techniques and UML methodology (use cases, sequence diagrams and activity diagrams).
- Coordinated discussions with clients and team in preparing requirements and specification documents.
- Developed and deployed the project by following best practices of SDLC.
- Constructed server side applications using Servlets and JDBC.
- Designed tables, triggers, stored procedures, and packages in the Oracle database.
- Executed proofs of concept (POCs) and carried out development of integration modules for ticket data flow between Oracle and the BMC Remedy application using SOAP Web Services and JAX-WS.
- Established a POC on “consuming SOAP Web Services from Oracle DB”.
- Conducted development in the Eclipse integrated development environment (IDE).
- Utilized Ant for automating the build process of Java applications and components.
- Prepared test suites and recorded test results in a document.
- Automated test cases using JUnit and checked API performance.
- Employed EasyMock, which provides mock objects by generating them on the fly using the Java proxy mechanism.
- Deployed the application on Oracle 9i application server and provided post-production support.
- Monitored error logs using Log4J and fixed problems.
- Performed in the end-to-end phase of the project.
- Provided demos to application users and knowledge transfer to the operations (support) team.
Environment: J2EE, JDBC, SOAP Web Services, WSDL, JAX-WS, XML, Oracle DB, Oracle 9i Application Server, JBoss, JUnit, Log4j, SVN, Windows 7
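The "Java proxy mechanism" that EasyMock builds on, mentioned above, is the standard-library `java.lang.reflect.Proxy`. A minimal sketch of a hand-rolled mock generated on the fly (the `TicketService` interface is hypothetical, for illustration only):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Sketch of how dynamic-proxy-based mocking works: an implementation of
// an interface is generated at runtime with a canned answer, so the code
// under test can be exercised without the real backend.
public class ProxyMockDemo {
    public interface TicketService {        // hypothetical service interface
        String status(int ticketId);
    }

    // Returns a stub whose status() call always answers "RESOLVED".
    public static TicketService cannedMock() {
        InvocationHandler handler = (proxy, method, args) -> "RESOLVED";
        return (TicketService) Proxy.newProxyInstance(
                TicketService.class.getClassLoader(),
                new Class<?>[] {TicketService.class},
                handler);
    }
}
```

Libraries like EasyMock add expectation recording and verification on top of this mechanism.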
Confidential, NYC, NY
Junior Oracle/Java Programmer
- Carried out requirement analysis and implemented change requests.
- Wrote Servlets and JavaBeans containing validation and business logic.
- Created many utility methods, following factory and singleton design patterns.
- Developed student registration and account module.
- Prepared technical design and specification documents.
- Stored and retrieved data to and from MySQL database using JDBC connectivity.
- Designed the database and normalized tables.
- Prepared the unit test plan and unit test results documents.
- Worked with the Bugzilla tracker for bug fixing.
- Deployed the application on the Apache Tomcat web server.
- Utilized Log4j for logging.
- Used Oracle Apps, Oracle Forms and Reports.
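The singleton pattern used for the utility classes above can be sketched with the initialization-on-demand holder idiom, which gives lazy, thread-safe construction without explicit locking. The class name is illustrative, not from the original project.

```java
// Sketch of the singleton pattern (initialization-on-demand holder idiom).
// The JVM guarantees Holder is loaded lazily and exactly once, so no
// synchronization is needed.
public class ConfigRegistry {
    private ConfigRegistry() { }             // block external instantiation

    private static class Holder {
        static final ConfigRegistry INSTANCE = new ConfigRegistry();
    }

    public static ConfigRegistry getInstance() {
        return Holder.INSTANCE;
    }
}
```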