Senior Big Data/Hadoop Engineer Resume Boston, MA - Hire IT People

PROFESSIONAL SUMMARY:

Software professional with 7+ years of industry experience as a Big Data/Oracle PL/SQL Technical Consultant, including 4 years in Big Data/Hadoop and 3 years in Oracle PL/SQL.
In-depth understanding of Hadoop architecture and its components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node and MapReduce.
Expertise in writing Hadoop jobs for analyzing data using MapReduce, Hive, Pig and Spark.
Experienced in administrative tasks such as installing, configuring, commissioning and decommissioning nodes, troubleshooting, and backup and recovery of Hadoop and its ecosystem components such as Hive, Pig, Sqoop, HBase and Spark.
Experienced in Amazon AWS cloud services (EC2, EBS, and S3).
Worked on real-time, in-memory processing engines such as Spark, Impala and integration with BI Tools such as Tableau.
Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
Experienced in extending Hive and Pig core functionality by writing custom UDFs using Java.
Extensive experience in developing MySQL, DB2 and Oracle database triggers, stored procedures and packages within quality standards using SQL and PL/SQL.
Comprehensive knowledge of debugging, optimizing and performance tuning of DB2, Oracle and MySQL databases.
Strong expertise in Big Data modeling techniques with Hive and HBase.
Hands-on experience working with source control tools such as CVS, SVN and ClearCase.
Proficient in programming with Java IDEs such as Eclipse, Spring Tool Suite and IBM RAD.
Intermediate-level skills with application servers such as WebLogic, WebSphere, JBoss and Oracle Application Server, and web servers such as Tomcat.
Strong experience in knowledge transfer (KT) to support teams and in providing pre- and post-deployment support.
Designed use case diagrams, class diagrams, sequence diagrams, flow charts and deployment diagrams using MS Visio and the Rational Rose UML tool.
Good hands-on experience in object-oriented analysis and design (OOA/D), modeling and programming tools in conjunction with Unified Modeling Language (UML).
Experience working with Hadoop/HBase/Hive/MRv1/MRv2.
Experience in implementing Log4J for application logging and notification tracing mechanisms.
Good working knowledge of Maven and Ant for building applications and Jenkins for continuous integration.
Committed to excellence, self-motivator, team-player, and a far-sighted developer with strong problem-solving skills and with zeal to learn new technologies.
Strengths include teamwork, excellent communication, interpersonal and analytical skills, and the ability to work effectively in a fast-paced, high-volume, deadline-driven environment.

TECHNICAL SKILLS:

Languages: C, Java, SQL, PL/SQL

J2EE Technologies: Servlets, JSP, JDBC, JNDI, Hibernate

Technologies: Hadoop, HDFS, MapReduce, Hive, HBase, Pig, Cloudera Manager and Navigator

Database: Oracle 9i, 10g; IBM DB2 and MySQL

Tools: Oracle Apps, Oracle Forms, Oracle Reports, Eclipse, Maven, Ant, JUnit, TestNG, Jenkins, SoapUI, PuTTY, Log4j, Bugzilla

Web Services: SOAP, WSDL, REST

Servers: Apache Tomcat, WebLogic, WebSphere, Oracle Application Server, JBoss

Development Tools: Eclipse, RSA, RAD

Database Tools: Oracle SQL Developer, TOAD and PLSQL Developer

Build and Log Tools: Build tools (Ant, Maven), Logging tool (Log4J), Version Control (CVS, SVN, ClearCase)

Methodologies & Standards: Software Development Lifecycle (SDLC), RUP, OOA/D, Waterfall Model and Agile

Hardware (Operating Systems): Linux, Unix and Windows 8, 7, XP

Others: MS Office, Apache Open Office, Putty, WinSCP, MS-Visio

PROFESSIONAL EXPERIENCE:

Confidential, Boston, MA

Senior Big Data/Hadoop Engineer

Responsibilities:

Worked in the BI team in the area of Big Data Hadoop cluster implementation and data integration in developing large-scale system software.
Assessed existing and available data warehousing technologies and methods to ensure the data warehouse/BI architecture met the needs of the business unit and enterprise and allowed for business growth.
Used HBase for real-time searching on log data and Pig, Hive and MapReduce for analysis.
Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
Captured data from existing databases that provide SQL interfaces using Sqoop.
Worked extensively with Sqoop to import and export data between HDFS and relational database systems/mainframes, and loaded data into HDFS.
Developed and maintained complex outbound notification applications that run on custom architectures, using diverse technologies including Core Java, J2EE, SOAP, XML, JMS, JBoss and Web Services.
Involved in the design and implementation of a proof of concept for the system to be developed on Hadoop with HBase, Hive, Pig and Flume.
Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
Worked with Apache Hadoop, Spark and Scala.
Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
Involved in installing Hive, HBase, Pig, Flume and other Hadoop ecosystem software.
Managed and reviewed Hadoop log files.
Tested raw data and executed performance scripts.
Shared responsibility for administration of Hadoop, Hive and Pig.
Developed Hive queries for the analysts.

Environment: Hadoop, MapReduce, HDFS, Hive, HBase, Java (JDK 1.6), Hadoop distribution of Hortonworks, Cloudera, MapR, DataStax, IBM DataStage 8.1 (Designer, Director, Administrator), PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX Shell Scripting.

Confidential, El Segundo, CA

Big Data/Hadoop Developer

Responsibilities:

Helped business processes by developing, installing and configuring Hadoop ecosystem components that moved data from individual servers to HDFS.
Installed and configured MapReduce, Hive and HDFS; implemented a CDH3 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
Integrated the Hive warehouse with HBase.
Worked with Apache Hadoop, Spark and Scala.
Supported code/design analysis, strategy development and project planning.
Created reports for the BI team using Sqoop to import data into HDFS and Hive.
Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
Assisted with data capacity planning and node forecasting.
Maintained system integrity of all sub-components (primarily HDFS, MapReduce, HBase and Hive).
Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
Administered Pig, Hive and HBase, installing updates, patches and upgrades.
Handled structured and unstructured data and applied ETL processes.
Developed an Oozie workflow to automate loading data into HDFS and pre-processing it with Pig.
Coded complex Oracle stored procedures, functions, packages and cursors for client-specific applications.
Provided production rollout support, including monitoring the solution post go-live and resolving issues discovered by the client and client services teams.

Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), Hadoop distribution of Hortonworks, Cloudera, MapR, DataStax, Spring 2.5, Hibernate 3.0, JSF, Servlets, JDBC, JSP, JSTL, JPA, JavaScript, Eclipse 3.4, Log4j, Oracle 10g, CVS, CSS, XML, XSLT, SMTP, Windows XP.

Confidential, Houston, TX

Hadoop Developer

Responsibilities:

Installed and configured fully distributed Hadoop cluster.
Performed Hadoop cluster environment administration, including adding and removing cluster nodes, cluster capacity planning, performance tuning, cluster monitoring and troubleshooting.
Extensively used Cloudera Manager to manage the Hadoop cluster.
Used Oozie to automate/schedule business workflows which invoke Sqoop, MapReduce and Pig jobs as per the requirements.
Developed Sqoop scripts to import and export the data from relational sources and handled incremental loading on the customer and transaction data by date.
Worked with various HDFS file formats like Avro, SequenceFile and various compression formats like Snappy, bzip2.
Developed efficient MapReduce programs for filtering out the unstructured data.
Developed Pig UDFs to pre-process the data for analysis.
Developed Hive queries for data sampling and analysis for the analysts.
Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
Developed custom Unix shell scripts for pre- and post-validation of master and slave nodes, before and after configuring the name node and data nodes respectively.
Involved in HDFS maintenance and administration through the Hadoop Java API.
Supported MapReduce programs running on the cluster.
Developed Java MapReduce programs using Mahout and applied them to different datasets.
Identified several PL/SQL batch applications in General Ledger processing and conducted performance comparison to demonstrate the benefits of migrating to Hadoop.
Configured Sentry to secure access to purchase information stored in Hadoop.
Involved in several POCs for different LOBs to benchmark the performance of data-mining using Hadoop.

Environment: Red Hat Linux 5, MS SQL Server, MongoDB, Oracle, Hadoop CDH 3/4/5, Pig, Hive, ZooKeeper, Mahout, HDFS, HBase, Sqoop, Python, Java, Oozie, Hue, Tez, UNIX Shell Scripting, PL/SQL, Maven, Ant

Confidential, NYC, NY

Oracle/Java Programmer

Responsibilities:

Set up an application for TCL that was used in service assurance for enterprise customers of the TCL Internet business.
Enabled L1 agents to use the system to raise trouble tickets for customers upon receiving requests from them.
Assisted L2 and L3 users in rectifying problems and resolving tickets in the BMC Remedy application.
Carried out analysis, design and development of system components.
Utilized OO techniques and UML methodology (use cases, sequence diagrams and activity diagrams).
Coordinated discussions with clients and team in preparing requirements and specification documents.
Developed and deployed the project by following best practices of SDLC.
Constructed server side applications using Servlets and JDBC.
Designed tables, triggers, stored procedures and packages in the Oracle database.
Executed proofs of concept (POCs) and carried out development on integrating modules for ticket data flow between Oracle and the BMC Remedy application using SOAP Web Services and JAX-WS.
Established a POC on “consuming SOAP Web Services from Oracle DB”.
Conducted development in the Eclipse integrated development environment (IDE).
Utilized Ant for automating the build process of Java applications and components.
Prepared test suites and recorded test results in a document.
Automated test cases using JUnit and checked API performance.
Employed EasyMock, which provides mock objects by generating them on the fly using the Java proxy mechanism.
Deployed the application on Oracle 9i application server and provided post-production support.
Monitored error logs using Log4J and fixed problems.
Participated in all phases of the project, end to end.
Provided a demo to application users and knowledge transfer to the operations (support) team.

Environment: J2EE, JDBC, SOAP Web Services, WSDL, JAX-WS, XML, Oracle DB, Oracle 9i Application Server, JBoss, JUnit, Log4j, SVN, Windows 7

Confidential, NYC, NY

Junior Oracle/Java Programmer

Responsibilities:

Carried out requirement analysis and implemented change requests.
Utilized JSP, HTML and CSS for web development and JavaScript for user data validation.
Wrote different Servlets and Java Beans which contained validation and business logic.
Wrote utility methods and applied the factory and singleton design patterns.
Developed student registration and account module.
Prepared technical design and specification documents.
Stored and retrieved data to and from MySQL database using JDBC connectivity.
Designed the database and normalized tables.
Prepared the unit test plan and unit test results documents.
Worked with the Bugzilla tracker for bug fixing.
Deployed the application on the Apache Tomcat web server.
Utilized Log4j for logging.
Used Oracle Apps, Oracle Forms and Reports.

Environment: Java, Servlets, JDBC, Java Beans, JSP, JavaScript, HTML, CSS, MySQL, Tomcat, Log4j, MS Visio, CVS, Windows XP, Oracle Apps, Developer 2K.
