
Senior Big Data/Hadoop Engineer Resume

Boston, MA


  • Software professional with 7+ years of industry experience as a Big Data/Oracle PL/SQL Technical Consultant, including 4 years in Big Data/Hadoop and 3 years with Oracle PL/SQL.
  • In-depth understanding of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce.
  • Expertise in writing Hadoop jobs for analyzing data using MapReduce, Hive, Pig and Spark.
  • Experienced in administrative tasks such as installing, configuring, commissioning and decommissioning nodes, troubleshooting, and backup and recovery of Hadoop and its ecosystem components such as Hive, Pig, Sqoop, HBase and Spark.
  • Experienced in Amazon AWS cloud services (EC2, EBS, and S3).
  • Worked with real-time, in-memory processing engines such as Spark and Impala, and their integration with BI tools such as Tableau.
  • Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
  • Experienced in extending Hive and Pig core functionality by writing custom UDFs using Java.
  • Extensive experience in developing MySQL, DB2 and Oracle database triggers, stored procedures and packages within quality standards using SQL and PL/SQL.
  • Comprehensive knowledge of debugging, optimizing and performance tuning of DB2, Oracle and MySQL databases.
  • Strong expertise in Big Data data modeling techniques with Hive and HBase.
  • Hands-on experience working with source control tools such as CVS, SVN and ClearCase.
  • Proficient in programming with Java IDEs such as Eclipse, Spring Tool Suite and IBM RAD.
  • Intermediate-level skills with application servers such as WebLogic, WebSphere, JBoss and Oracle Application Server, and web servers such as Tomcat.
  • Very good experience in knowledge transfer (KT) to support teams, providing pre- and post-deployment support.
  • Designed use case diagrams, class diagrams, sequence diagrams, flow charts and deployment diagrams using MS Visio and Rational Rose.
  • Good hands-on experience in object-oriented analysis and design (OOA/D), modeling and programming tools in conjunction with Unified Modeling Language (UML).
  • Experience working with Hadoop/HBase/Hive/MRv1/MRv2.
  • Experience in implementing Log4J for application logging and notification tracing mechanisms.
  • Good working knowledge of Maven and ANT for building applications and Jenkins for continuous integration.
  • Committed to excellence; a self-motivated, far-sighted developer and team player with strong problem-solving skills and a zeal to learn new technologies.
  • Strengths include excellent communication, interpersonal and analytical skills and the ability to work effectively in a fast-paced, high-volume, deadline-driven environment.


Languages: C, Java, SQL, PL/SQL

J2EE Technologies: Servlets, JSP, JDBC, JNDI, Hibernate

Technologies: Hadoop, HDFS, MapReduce, Hive, HBase, Pig, Cloudera Manager and Navigator

Database: Oracle 9i, 10g; IBM DB2 and MySQL

Tools: Oracle Apps, Oracle Forms, Oracle Reports, Eclipse, Maven, ANT, JUnit, TestNG, Jenkins, SoapUI, PuTTY, Log4j, Bugzilla

Web Services: SOAP, WSDL, REST

Servers: Apache Tomcat, WebLogic, WebSphere, Oracle Application Server, JBoss

Development Tools: Eclipse, RSA, RAD

Database Tools: Oracle SQL Developer, TOAD and PLSQL Developer

Build and Log Tools: Build tools (ANT, Maven), Logging tool (Log4J), Version Control (CVS, SVN, ClearCase)

Methodologies & Standards: Software Development Lifecycle (SDLC), RUP, OOA/D, Waterfall Model and Agile

Hardware (Operating Systems): Linux, Unix and Windows 8, 7, XP

Others: MS Office, Apache OpenOffice, PuTTY, WinSCP, MS Visio


Confidential, Boston, MA

Senior Big Data/Hadoop Engineer


  • Worked in the BI team in the area of Big Data Hadoop cluster implementation and data integration in developing large-scale system software.
  • Assessed existing and available data warehousing technologies and methods to ensure the data warehouse/BI architecture met the needs of the business unit and enterprise and allowed for business growth.
  • Used HBase for real-time searching on log data, and Pig, Hive and MapReduce for analysis.
  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
  • Captured data from existing databases that provide SQL interfaces using Sqoop.
  • Worked extensively with Sqoop to import and export data between HDFS and relational database systems/mainframes, and to load data into HDFS.
  • Developed and maintained complex outbound notification applications that run on custom architectures, using diverse technologies including Core Java, J2EE, SOAP, XML, JMS, JBoss and Web Services.
  • Involved in the design and implementation of a proof of concept for the system to be developed on Big Data Hadoop with HBase, Hive, Pig and Flume.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
  • Worked with Apache Hadoop, Spark and Scala.
  • Enabled speedy reviews and first mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and PIG to pre-process the data.
  • Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
  • Involved in installing Hive, HBase, Pig, Flume and other Hadoop ecosystem software.
  • Managed and reviewed Hadoop log files.
  • Tested raw data and executed performance scripts.
  • Shared responsibility for administration of Hadoop, Hive and Pig.
  • Developed Hive queries for the analysts.

Environment: Hadoop, MapReduce, HDFS, Hive, HBase, Java (jdk1.6), Hadoop distribution of Hortonworks, Cloudera, MapR, DataStax, IBM DataStage 8.1 (Designer, Director, Administrator), PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX Shell Scripting.

Confidential, El Segundo, CA

Big Data/Hadoop Developer


  • Helped business processes by developing, installing and configuring Hadoop ecosystem components that moved data from individual servers to HDFS.
  • Installed and configured MapReduce, Hive and HDFS; implemented a CDH3 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
  • Integrated the Hive warehouse with HBase.
  • Worked with Apache Hadoop, Spark and Scala.
  • Supported code/design analysis, strategy development and project planning.
  • Created reports for the BI team, using Sqoop to import data into HDFS and Hive.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Assisted with data capacity planning and node forecasting.
  • Maintained system integrity of all sub-components (primarily HDFS, MapReduce, HBase and Hive).
  • Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
  • Administered Pig, Hive and HBase, installing updates, patches and upgrades.
  • Handled structured and unstructured data and applied ETL processes.
  • Developed workflows in Oozie to automate loading data into HDFS and pre-processing it with Pig.
  • Coded complex Oracle stored procedures, functions, packages and cursors for client-specific applications.
  • Provided production rollout support, including monitoring the solution post go-live and resolving issues discovered by the client and client services teams.

Environment: Hadoop, MapReduce, HDFS, Hive, Java (jdk1.6), Hadoop distribution of Hortonworks, Cloudera, MapR, DataStax, Spring 2.5, Hibernate 3.0, JSF, Servlets, JDBC, JSP, JSTL, JPA, JavaScript, Eclipse 3.4, Log4j, Oracle 10g, CVS, CSS, XML, XSLT, SMTP, Windows XP.

Confidential, Houston, TX

Hadoop Developer


  • Installed and configured a fully distributed Hadoop cluster.
  • Performed Hadoop cluster administration, including adding and removing cluster nodes, capacity planning, performance tuning, cluster monitoring and troubleshooting.
  • Extensively used Cloudera Manager to manage the Hadoop cluster.
  • Used Oozie to automate/schedule business workflows which invoke Sqoop, MapReduce and Pig jobs as per the requirements.
  • Developed Sqoop scripts to import and export the data from relational sources and handled incremental loading on the customer and transaction data by date.
  • Worked with various HDFS file formats such as Avro and SequenceFile, and compression formats such as Snappy and bzip2.
  • Developed efficient MapReduce programs for filtering out the unstructured data.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Developed Hive queries for data sampling and analysis for the analysts.
  • Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
  • Developed custom UNIX shell scripts to perform pre- and post-validation of master and slave nodes, before and after configuring the name node and data nodes respectively.
  • Performed HDFS maintenance and administration through the Hadoop Java API.
  • Supported MapReduce programs running on the cluster.
  • Developed Java MapReduce programs using Mahout to apply to different datasets.
  • Identified several PL/SQL batch applications in General Ledger processing and conducted performance comparison to demonstrate the benefits of migrating to Hadoop.
  • Configured Sentry to secure access to purchase information stored in Hadoop.
  • Involved in several POCs for different LOBs to benchmark the performance of data-mining using Hadoop.

Environment: Red Hat Linux 5, MS SQL Server, MongoDB, Oracle, Hadoop CDH 3/4/5, Pig, Hive, ZooKeeper, Mahout, HDFS, HBase, Sqoop, Python, Java, Oozie, Hue, Tez, UNIX Shell Scripting, PL/SQL, Maven, Ant

Confidential, NYC, NY

Oracle/Java Programmer


  • Set up an application for TCL that was used in service assurance for enterprise customers of TCL's Internet business.
  • Enabled L1 agents to use the system to raise trouble tickets for customers upon receiving requests from them.
  • Assisted L2 and L3 users in rectifying problems and resolving tickets in the BMC Remedy application.
  • Carried out analysis, design and development of system components.
  • Utilized OO techniques and UML methodology (use cases, sequence diagrams and activity diagrams).
  • Coordinated discussions with clients and team in preparing requirements and specification documents.
  • Developed and deployed the project by following best practices of SDLC.
  • Constructed server-side applications using Servlets and JDBC.
  • Designed tables, triggers, stored procedures and packages in the Oracle database.
  • Executed proofs of concept (POCs) and carried out development on integrating modules for ticket data flow between Oracle and the BMC Remedy application using SOAP Web Services and JAX-WS.
  • Established a POC on “consuming SOAP Web Services from Oracle DB”.
  • Conducted development in the Eclipse integrated development environment (IDE).
  • Utilized Ant for automating the build process of Java applications and components.
  • Prepared test suites and recorded test results in a document.
  • Automated test cases using JUnit and checked API performance.
  • Employed EasyMock, which provided mock objects by generating them on the fly using the Java proxy mechanism.
  • Deployed the application on Oracle 9i application server and provided post-production support.
  • Monitored error logs using Log4J and fixed problems.
  • Participated in all phases of the project, end to end.
  • Provided a demo to application users and knowledge transfer to the operations (support) team.

Environment: J2EE, JDBC, SOAP Web Services, WSDL, JAX-WS, XML, Oracle DB, Oracle 9i Application Server, JBoss, JUnit, Log4j, SVN, Windows 7

Confidential, NYC, NY

Junior Oracle/Java Programmer


  • Carried out requirement analysis and implemented change requests.
  • Utilized JSP, HTML and CSS for web development and JavaScript for user data validation.
  • Wrote different Servlets and Java Beans which contained validation and business logic.
  • Wrote many utility methods and followed the factory and singleton design patterns.
  • Developed student registration and account module.
  • Prepared technical design and specification documents.
  • Stored and retrieved data to and from MySQL database using JDBC connectivity.
  • Designed the database and normalized tables.
  • Prepared the unit test plan and unit test results documents.
  • Worked with the Bugzilla tracker for bug fixing.
  • Deployed the application on the Apache Tomcat web server.
  • Utilized Log4j for logging.
  • Used Oracle Apps, Oracle Forms and Reports.

Environment: Java, Servlets, JDBC, Java Beans, JSP, JavaScript, HTML, CSS, MySQL, Tomcat, Log4j, MS Visio, CVS, Windows XP, Oracle Apps, Developer 2K.
