Sr. Bigdata/hadoop Engineer Resume
Fort Washington, PA
SUMMARY
- Around 6 years of strong experience in software development using Big Data, Hadoop, Apache Spark, Java/J2EE, Scala and Python technologies.
- Solid foundation in mathematics, probability and statistics, with broad practical experience in statistical and data mining techniques cultivated through industry work and academic programs.
- Involved in all Software Development Life Cycle (SDLC) phases, including Analysis, Design, Implementation, Testing and Maintenance.
- Strong technical, administration, and mentoring knowledge in Linux and Big Data/Hadoop technologies.
- Hands-on experience with major components of the Hadoop ecosystem, including Hadoop MapReduce, HDFS, Hive, Pig, Pentaho, HBase, ZooKeeper, Sqoop, Oozie, Cassandra, Flume and Avro.
- Work experience with cloud infrastructure like Amazon Web Services (AWS).
- Experience in importing and exporting data using Sqoop from relational database systems/mainframes to HDFS and vice versa.
- Installing, configuring and managing Hadoop clusters and data science tools.
- Managing the Hadoop distribution with Cloudera Manager, Cloudera Navigator and Hue.
- Setting up High Availability for Hadoop cluster components and edge nodes.
- Experience in developing Shell scripts and Python Scripts for system management.
- Experience in profiling large data sets using Informatica BDM 10.
- Well versed in software development methodologies such as Rapid Application Development (RAD), Agile and Scrum.
- Experience with Object-Oriented Analysis and Design (OOAD) methodologies.
- Experience in installations of software, writing test cases, debugging, and testing of batch and online systems.
- Experience in production, quality assurance (QA), System Integration Testing (SIT) and User Acceptance Testing (UAT).
- Expertise in J2EE technologies like JSP, Servlets, EJB 2.0, JDBC, JNDI and AJAX.
- Extensively worked on implementing SOA (Service-Oriented Architecture) using XML Web services (SOAP, WSDL, UDDI and XML parsers).
- Worked with XML parsers like JAXP (SAX and DOM) and JAXB.
- Expertise in applying Java Message Service (JMS) for reliable information exchange across Java applications.
- Proficient with Core Java and AWT, as well as markup and web technologies such as HTML 5.0, XHTML, DHTML, CSS, XML 1.1, XSL, XSLT, XPath, XQuery, Angular.js and Node.js.
- Worked with version control systems like Subversion, Perforce and Git to provide a common platform for all developers.
- Highly motivated team player with the ability to work independently and adapt quickly to new and emerging technologies.
- Creatively communicate and present models to business customers and executives, utilizing a variety of formats and visualization methodologies.
TECHNICAL SKILLS
Big Data Frameworks: Hadoop, Spark, Scala, Hive, Kafka, AWS, Cassandra, HBase, Flume, Pig, Sqoop, MapReduce, Cloudera, MongoDB
Big data distribution: Cloudera, Amazon EMR
Programming languages: Core Java, Scala, Python, SQL, Shell Scripting
Operating Systems: Windows, Linux (Ubuntu)
Databases: Oracle, SQL Server
Designing Tools: Eclipse
Java Technologies: JSP, Servlets, Junit, Spring, Hibernate
Web Technologies: XML, HTML, JavaScript, JVM, JQuery, JSON
Linux Experience: System Administration Tools, Puppet, Apache
Web Services: RESTful and SOAP
Frame Works: Jakarta Struts 1.x, Spring 2.x
Development methodologies: Agile, Waterfall
Logging Tools: Log4j
Application / Web Servers: CherryPy, Apache Tomcat, WebSphere
Messaging Services: ActiveMQ, Kafka, JMS
Version Tools: Git, SVN and CVS
Analytics: Tableau, SPSS, SAS EM and SAS JMP
PROFESSIONAL EXPERIENCE
Confidential, Fort Washington, PA
Sr. Bigdata/Hadoop Engineer
Responsibilities:
- Worked on importing data from various sources and performed transformations using MapReduce and Hive to load data into HDFS.
- Worked on compression mechanisms to optimize MapReduce Jobs.
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Created scripts to automate the process of Data Ingestion.
- Performed joins, group-bys and other operations in MapReduce using Java and Pig.
- Configured Sqoop jobs to import data from RDBMS into HDFS using Oozie workflows.
- Worked on setting up Pig, Hive and HBase on multiple nodes and developed applications using Pig, Hive, HBase and MapReduce.
- Worked on the conversion of existing MapReduce batch applications for better performance.
- Created HBase tables to store variable data formats coming from different portfolios.
- Performed real-time analytics on HBase using the Java API and REST API.
- Implemented HBase coprocessors to notify the support team when data is inserted into HBase tables.
- Analyzed customer behavior by performing clickstream analysis and used Flume to ingest the data.
- Experienced in working with Avro data files using the Avro serialization system.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources (see the sketch after this list).
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
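As an illustration of the Pig UDF work mentioned above, the following is a minimal sketch of a Java EvalFunc-style UDF; the class name NormalizeCode and the field-cleaning use case are hypothetical and not taken from the actual project.

```java
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical Pig UDF: trims and upper-cases a string field before it is
// joined against reference data. Registered from a jar and called from a
// Pig script, e.g.:  REGISTER udfs.jar;  b = FOREACH a GENERATE NormalizeCode(code);
public class NormalizeCode extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;                     // propagate nulls unchanged
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}
```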
Environment: Hive, HDFS, MapReduce, Flume, Pig, Spark Core, Spark SQL, Oozie, Oracle, YARN, Netezza, GitHub, JUnit, Linux, HBase, Cloudera, Sqoop, Java, Scala, Maven, Splunk, Eclipse.
Confidential, Woodbridge NJ
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Good understanding of and hands-on experience with Hadoop stack internals, Hive, Pig and MapReduce.
- The system was initially developed in Java; the Java filtering program was restructured so that the business rule engine lives in a jar that can be called from both plain Java and Hadoop jobs.
- Wrote MapReduce jobs to discover trends in data usage by users.
- Involved in defining job flows
- Involved in managing and reviewing Hadoop log files
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Responsible for managing data coming from different sources.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hive and developed Hive UDFs to extend the core functionality of Hive (see the sketch after this list).
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Implemented partitioning, dynamic partitions and buckets in Hive.
- Monitored system health and logs and responded to any warning or failure conditions.
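As a hedged illustration of the Hive UDF work referenced above, here is a minimal sketch using the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name StripDomain and the email-cleaning use case are assumptions for demonstration, not details of the original project.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: returns the local part of an e-mail address,
// e.g. strip_domain('user@example.com') -> 'user'.
public class StripDomain extends UDF {
    public Text evaluate(Text email) {
        if (email == null) {
            return null;                      // keep NULLs as NULLs
        }
        String value = email.toString();
        int at = value.indexOf('@');
        return new Text(at >= 0 ? value.substring(0, at) : value);
    }
}
```

In Hive, a UDF like this would typically be packaged in a jar, added with ADD JAR, and exposed with CREATE TEMPORARY FUNCTION strip_domain AS 'StripDomain' before being used in queries.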
Environment: Apache Hadoop, HDFS, MapReduce, Pig, Hive, Hive UDFs, Linux, MySQL, HBase, UNIX, Java, ETL, Eclipse.
Confidential
Hadoop Developer
Responsibilities:
- Developed Java MapReduce jobs for the aggregation and interest-matrix calculation for users (a minimal sketch follows this list).
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Experienced in managing and reviewing application log files.
- Ingested the application logs into HDFS and processed the logs using MapReduce jobs.
- Created and maintained the Hive warehouse for Hive analysis.
- Generated test cases for the new MR jobs.
- Involved in the pilot of Hadoop cluster hosted on Amazon Web Services (AWS)
- Ran various Hive queries on the data dumps and generated aggregated datasets for downstream systems for further analysis.
- Developed dynamically partitioned Hive tables to store data by date and workflow id partition.
- Used Apache Sqoop to load incremental user data into HDFS on a daily basis.
- Ran clustering and user recommendation agents on the weblogs and profiles of the users to generate the interest matrix.
- Worked on installing and configuring EC2 instances on Amazon Web Services (AWS) for establishing clusters on cloud.
- Installed and configured Hive and wrote Hive UDFs in Java and Python.
- Prepared the data for consumption by formatting it for upload to the UDB system.
- Led and programmed the recommendation logic for various clustering and classification algorithms using Java.
- Involved in migrating Hadoop jobs to higher environments such as SIT, UAT and Prod.
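As a hedged illustration of the aggregation-style MapReduce jobs referenced above, the following is a minimal Java sketch that counts events per user from tab-delimited logs; the class name UserEventCount, the input layout and the field positions are assumptions for demonstration only.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class UserEventCount {

    // Emits (userId, 1) for every log line; assumes the user id is the first
    // tab-delimited column of each line.
    public static class EventMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text userId = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                userId.set(fields[0]);
                context.write(userId, ONE);
            }
        }
    }

    // Sums the counts per user; also reused as a combiner.
    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable v : values) {
                sum += v.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "user event count");
        job.setJarByClass(UserEventCount.class);
        job.setMapperClass(EventMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

A job like this would be packaged into a jar and submitted with hadoop jar, passing the input and output HDFS paths as arguments.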
Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Cloudera Manager, Scala, Cassandra, Pig, Sqoop, Oozie, ZooKeeper, Teradata, PL/SQL, MySQL, Windows, Hortonworks, HBase
Confidential
Java Developer
Responsibilities:
- Communicated with clients for requirements gathering and explained the requirements to team members.
- Analyzed the requirements and designed screen prototypes.
- Involved in Project Documentation.
- Involved in creation of Basic DB Architecture for the application.
- Involved in adding solution to VSS.
- Designing & Development of Screens.
- Coded JS functions for client validations.
- Created user Controls for reusability.
- Creation of Tables, Views, Packages, Sequences, Functions for all the modules of the project.
- Developed Crystal Reports.
- Integrating the functionality of all modules.
- Involved in deploying the application.
- Unit testing & integration testing.
- Designing test plan, test cases and checking the validation.
- Test whether the application meets the business requirements.
- Implementation of the system at the client location.
- Gave training to application users, interacted with the client and handled change requests, if any, from the client.
- Responsible for immediate error resolution.
Environment: Core Java, JavaScript, J2EE, Servlets, JSP, Design Patterns, JDBC, HTML, CSS, AJAX, Hibernate, WebLogic, Oracle 8i, ANT, LINUX, SVN, Windows XP