Big Data Developer Resume
Warren, NJ
PROFESSIONAL SUMMARY:
- Around 8 years of experience in the IT industry, including 3+ years in Big Data implementing complete Hadoop solutions alongside Java.
- Good working experience using Apache Hadoop ecosystem components like MapReduce, HDFS, Hive, Sqoop, Pig, Oozie, Flume, HBase and ZooKeeper.
- Experience writing UDFs and integrating them with Hive and Pig.
- Experience with SequenceFile, Avro and ORC file formats and compression.
- Experience with Hadoop distributions: Cloudera and Hortonworks.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experience with job workflow scheduling and coordination tools like Oozie and ZooKeeper.
- Extensive knowledge in using SQL Queries for backend database analysis.
- Strong knowledge of NoSQL column-oriented databases like HBase and Cassandra and their integration with Hadoop clusters.
- Experience importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and vice versa.
- Led many data analysis and integration efforts involving Hadoop along with ETL.
- Hands-on experience with an enterprise data lake supporting various use cases including analytics, processing, storage and reporting of voluminous, rapidly changing structured and unstructured data.
- Extensive experience with SQL, PL/SQL and database concepts.
- Transferred bulk data from RDBMS sources like Teradata into HDFS using Sqoop.
- Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
- Well-versed in Agile and other SDLC methodologies; able to coordinate with owners and SMEs.
- Worked on different operating systems like UNIX, Linux and Windows.
- Diverse experience utilizing Java tools in business, web and client-server environments, including Java Platform, Enterprise Edition (Java EE), Enterprise JavaBeans (EJB), JavaServer Pages (JSP), Java Servlets (including JNDI), Struts and Java Database Connectivity (JDBC) technologies.
- Solid understanding of multiple programming languages, including C#, C, C++, JavaScript, HTML and XML.
- Experience in web application design using open-source MVC frameworks such as Spring and Struts.
TECHNICAL SKILLS:
Hadoop Core Services: HDFS, MapReduce, Spark, YARN
Hadoop Distributions: Hortonworks, Cloudera, Apache
NoSQL Databases: HBase, Cassandra
Hadoop Data Services: Hive, Pig, Sqoop, Flume
Hadoop Operational Services: Zookeeper, Oozie
Monitoring Tools: Cloudera Manager
Cloud Computing Tools: Amazon AWS
Languages: C, Java/J2EE, Python, SQL, PL/SQL, Pig Latin, HiveQL, Unix Shell Scripting
Java & J2EE Technologies: Core Java, Servlets, Hibernate, Spring, Struts
Application Servers: WebLogic, WebSphere, JBoss, Tomcat
Databases: Oracle, MySQL, PostgreSQL, Teradata
Operating Systems: UNIX, Windows, LINUX
Build Tools: Jenkins, Maven, ANT
Development Tools: Microsoft SQL Studio, Toad, Eclipse, NetBeans
Development methodologies: Agile/Scrum
Visualization and Analytics Tools: Tableau, QlikView
PROFESSIONAL EXPERIENCE:
Confidential, Warren, NJ
Big Data Developer
Responsibilities:
- Involved in the complete Big Data flow of the application: data ingestion from upstream into HDFS, processing the data in HDFS and analyzing it using several tools.
- Imported data in various formats like JSON, SequenceFile, text, CSV, Avro and Parquet into the HDFS cluster, with compression applied for optimization.
- Ingested data from RDBMS sources like Oracle, SQL Server and Teradata into HDFS using Sqoop.
- Configured Hive and wrote Hive UDFs and UDAFs; also created static and dynamic partitions with bucketing (a Hive UDF sketch in Java follows this list).
- Managed and scheduled jobs on a Hadoop cluster.
- Created Hive external tables, loaded data into them and queried the data using HiveQL.
- Wrote Hive jobs to parse logs and structure them in a tabular format to facilitate effective querying of the log data.
- Developed Pig Latin scripts for the analysis of semi-structured data and for extracting data from web server output files to load into HDFS.
- Developed Pig UDFs for manipulating data according to business requirements and also worked on developing custom Pig loaders.
- Developed Oozie workflows for scheduling and orchestrating the ETL process.
- Managed and reviewed the Hadoop log files using shell scripts.
- Migrated ETL jobs to Pig scripts to perform transformations, event joins and some pre-aggregations before storing the data in HDFS.
- Used Hive join queries to join multiple tables of a source system and loaded the results into Elasticsearch.
- Collected log data from web servers and ingested it into HDFS using Flume.
- Designed and created various analytical reports and automated dashboards to help users identify critical KPIs and facilitate strategic planning in the organization.
- Involved in cluster maintenance, monitoring and troubleshooting.
- Created data pipelines per the business requirements and scheduled them using Oozie coordinators.
- Maintained technical documentation for every step of the development environment and for launching Hadoop clusters.
- Worked on different file formats like SequenceFiles, XML files and MapFiles using MapReduce programs.
- Worked with the Avro data serialization system to handle JSON data formats.
- Used Amazon Web Services S3 to store large amounts of data in a common repository.
- Worked with the Data Science team to gather requirements for various data mining projects.
- Wrote shell scripts to automate rolling day-to-day processes.
- Built applications using Maven and integrated them with continuous integration servers like Jenkins to run build jobs.
- Used an enterprise data warehouse to store information and make it accessible across the organization.
- Worked with BI tools such as Tableau to create weekly, monthly and daily dashboards and reports in Tableau Desktop, published against data on the HDFS cluster.
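The Hive UDF work above follows Hive's classic reflection-based UDF API. Below is a minimal sketch in Java, assuming the org.apache.hadoop.hive.ql.exec.UDF base class; the class name and the normalization rule are hypothetical examples, not the actual project code.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: strips non-digit characters so phone numbers
    // join cleanly across source systems.
    public final class NormalizePhoneUDF extends UDF {

        // Hive locates this evaluate() signature by reflection at query time.
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().replaceAll("[^0-9]", ""));
        }
    }

Such a UDF would be packaged in a JAR, registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION, and then called like any built-in function in HiveQL.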
Environment: Hadoop, HDFS, Hive, Oozie, Pig, Sqoop, Shell Scripting, HBase, Jenkins, Tableau, Oracle, MySQL, Teradata and AWS.
Confidential, Florham Park, NJ
Big Data Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Used Sqoop to load data from MySQL and other sources into HDFS on a regular basis.
- Wrote multiple MapReduce programs for data extraction, transformation and aggregation from multiple file formats including XML, JSON, CSV and other compressed file formats (see the MapReduce sketch after this list).
- Reviewed the HDFS usage and system design for future scalability and fault-tolerance.
- Involved in loading data from the Linux file system to HDFS.
- Loaded and transformed large sets of structured, semi-structured and unstructured data in various formats like text, zip, XML and JSON.
- Defined job flows and developed simple to complex MapReduce jobs as per the requirements.
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Developed Pig UDFs for manipulating data according to business requirements and also worked on developing custom Pig loaders.
- Hands-on experience setting up an HBase column-based storage repository for archival and historical data.
- Responsible for creating Hive tables based on business requirements.
- Used an enterprise data lake to support various use cases including analytics, processing, storage and reporting of voluminous, rapidly changing structured and unstructured data.
- Along with the infrastructure team, involved in the design and development of a Kafka- and Storm-based data pipeline.
- Implemented partitioning, dynamic partitions and buckets in Hive for efficient data access.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Involved in data modeling and in sharding and replication strategies in Cassandra.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Knowledgeable in handling Hive queries using Spark SQL integrated with the Spark environment.
- Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports for the BI team.
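One of the MapReduce aggregation programs mentioned above can be sketched with the standard Hadoop 2.x Java API. This is a minimal illustration rather than the actual project code; the class names and the CSV column being counted are hypothetical placeholders.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CsvAggregationJob {

        // Emits (category, 1) per CSV record; column 2 is a hypothetical category field.
        public static class CategoryMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(",");
                if (fields.length > 2) {
                    ctx.write(new Text(fields[2]), ONE);
                }
            }
        }

        // Sums counts per category; also reused as a combiner for map-side pre-aggregation.
        public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                long sum = 0;
                for (LongWritable v : values) {
                    sum += v.get();
                }
                ctx.write(key, new LongWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "csv-aggregation");
            job.setJarByClass(CsvAggregationJob.class);
            job.setMapperClass(CategoryMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }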
Environment: Apache Hadoop 2.x, Cloudera, HDFS, MapReduce, Hortonworks, Hive, Pig, HBase, Spark, Scala, Sqoop, Kafka, Flume, Cassandra, Oracle 11g/10g, Linux, XML, MySQL.
Confidential
Hadoop Developer
Responsibilities:
- Understood business needs, analyzed functional specifications and mapped them to the design and development of MapReduce programs and algorithms.
- Optimized Hadoop MapReduce code and Hive and Pig scripts for better scalability, reliability and performance.
- Developed Oozie workflows for application execution.
- Performed data migration from legacy RDBMS databases to HDFS using Sqoop.
- Wrote Pig scripts for data processing.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios (an HBase table-creation sketch follows this list).
- Implemented Hive tables and HiveQL queries for the reports.
- Imported data from Cassandra into HDFS using an export utility.
- Developed shell scripts and automated data management for end-to-end integration work.
- Performed data validation using Hive dynamic partitioning and bucketing.
- Wrote and used complex data types to store and retrieve data using HiveQL in Hive.
- Developed Hive queries to analyze reducer output data.
- Implemented ETL code to load data from multiple sources into HDFS using Pig scripts.
- Highly involved in designing the next-generation data architecture for unstructured data.
- Developed Pig Latin scripts to extract data from source systems.
- Created and maintained technical documentation for executing Hive queries and Pig scripts.
- Involved in extracting data from Hive and loading it into an RDBMS using Sqoop.
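Creating an HBase table as described above is normally done through the HBase Java Admin API. A minimal sketch, assuming the HBase 1.x client; the table and column family names are hypothetical.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class CreatePortfolioTable {
        public static void main(String[] args) throws Exception {
            // Picks up hbase-site.xml (ZooKeeper quorum, etc.) from the classpath.
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Admin admin = connection.getAdmin()) {
                TableName table = TableName.valueOf("portfolio_events"); // hypothetical name
                if (!admin.tableExists(table)) {
                    HTableDescriptor descriptor = new HTableDescriptor(table);
                    // A single column family keeps related columns stored together on disk.
                    descriptor.addFamily(new HColumnDescriptor("d"));
                    admin.createTable(descriptor);
                }
            }
        }
    }

Structured, semi-structured and unstructured payloads can then be written as cells under the one column family, with the row key designed around the dominant read pattern.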
Environment: HDFS, MapReduce, MySQL, Cassandra, Hive, HBase, Oozie, Pig, ETL, Hortonworks (HDP 2.0), Shell Scripting, Linux, Sqoop, Flume and Oracle 11g.
Confidential
Hadoop Developer
Responsibilities:
- Imported data from Microsoft SQL Server, MySQL and Teradata into HDFS using Sqoop.
- Developed Oozie workflows to automate the tasks of loading data into HDFS.
- Used Hive to analyze partitioned and bucketed data to compute various reporting metrics.
- Involved in creating Hive tables, loading data and writing queries that run internally as MapReduce jobs.
- Involved in creating Hive External tables for HDFS data.
- Solved performance issues in Hive and Pig scripts with an understanding of joins, grouping and aggregation and how they translate into MapReduce jobs.
- Used Spark for transformations, event joins and some aggregations before storing the data into HDFS.
- Troubleshot and resolved data quality issues and maintained a high level of accuracy in the data being reported.
- Analyzed large data sets to determine the optimal way to aggregate them.
- Worked on the Oozie workflow to run multiple Hive and Pig jobs.
- Involved in creating Hive UDFs.
- Developed automated shell scripts to execute Hive queries.
- Involved in processing ingested raw data using Apache Pig.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Worked on different file formats like JSON, Avro, ORC and Parquet and compression codecs like Snappy, zlib and LZ4.
- Executed HiveQL in Spark using Spark SQL (see the Spark SQL sketch after this list).
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Gained knowledge in creating Tableau dashboards for reporting on analyzed data.
- Expertise with NoSQL databases like HBase.
- Experienced in managing and reviewing the Hadoop log files.
- Used GitHub as the repository for committing and retrieving code, and Jenkins for continuous integration.
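Executing HiveQL through Spark SQL, as mentioned above, can be sketched with Spark's Java API. This is a minimal illustration assuming Spark 2.x with Hive support enabled; the database, table, columns and output path are hypothetical.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class DailyEventCounts {
        public static void main(String[] args) {
            // enableHiveSupport() lets Spark SQL resolve tables from the Hive metastore.
            SparkSession spark = SparkSession.builder()
                    .appName("daily-event-counts")
                    .enableHiveSupport()
                    .getOrCreate();

            // The same HiveQL that ran on MapReduce now runs as Spark transformations.
            Dataset<Row> daily = spark.sql(
                    "SELECT event_date, COUNT(*) AS events "
                  + "FROM logs.web_events GROUP BY event_date");

            daily.write().mode("overwrite").parquet("/data/reports/daily_events");
            spark.stop();
        }
    }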
Environment: HDFS, MapReduce, Sqoop, Hive, Pig, Oozie, MySQL, Eclipse, Git, GitHub, Jenkins.
Confidential
Java Developer
Responsibilities:
- Involved in various stages of Enhancements in the Application by doing the required analysis, development, and testing.
- Prepared high-level and low-level design documents and worked on digital signature generation.
- Created use case, class and sequence diagrams for the analysis and design of the application.
- Developed the logic and code for the registration and validation of enrolling customers.
- Developed web-based user interfaces using the Struts framework.
- Handled client-side validations using JavaScript.
- Wrote SQL queries, stored procedures and enhanced performance by running explain plans.
- Involved in integration of various Struts actions in the framework.
- Used the validation framework for server-side validations.
- Created test cases for the Unit and Integration testing.
- The front end was integrated with the Oracle database using the JDBC API through the JDBC-ODBC bridge driver on the server side.
- Designed project related documents using MS Visio which includes Use case, Class and Sequence diagrams.
- Wrote the end-to-end flow, i.e. controller, service and DAO classes, per the Spring MVC design, and wrote business logic using the core Java API and data structures (a controller sketch follows this list).
- Used Spring JMS message-driven beans to receive messages from other teams, with IBM MQ for queuing.
- Developed presentation-layer code using JSP, HTML, AJAX and jQuery.
- Developed the business layer using Spring (IoC, AOP), DTOs and JTA.
- Developed application service components and configured beans using Spring IoC; implemented the persistence layer and configured Ehcache to load static tables into a secondary storage area.
- Involved in the development of the user interfaces using HTML, JSP, JavaScript, Dojo Toolkit, CSS and AJAX.
- Created tables, triggers, stored procedures, SQL queries, joins, integrity constraints and views for multiple Oracle 11g databases using Toad.
- Developed the project using industry-standard design patterns like Singleton, Business Delegate and Factory for better maintenance and reusability of code.
- Developed unit test cases using the JUnit framework to test the accuracy of code, with logging via SLF4J and Log4j.
- Worked with the ClearQuest defect tracking system.
- Worked with the Spring STS IDE, deployed to Tomcat and WebSphere servers, and used Maven as the build tool.
- Responsible for code sanity in the integration stream, using ClearCase as the version control tool.
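The controller/service/DAO flow described above can be sketched as a typical Spring MVC controller. All names are hypothetical, and the service interface and form bean are stubbed inline only to keep the example self-contained.

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.ModelAttribute;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    @Controller
    public class EnrollmentController {

        // Service-layer bean injected by the Spring IoC container.
        @Autowired
        private EnrollmentService enrollmentService;

        @RequestMapping(value = "/enroll", method = RequestMethod.POST)
        public String enroll(@ModelAttribute("customer") CustomerForm form, Model model) {
            // Business logic stays in the service layer, which delegates to a DAO.
            enrollmentService.register(form);
            model.addAttribute("customerName", form.getName());
            return "enrollConfirmation"; // logical view name resolved to a JSP
        }
    }

    // Minimal stubs so the sketch compiles; real types would live in their own files.
    interface EnrollmentService {
        void register(CustomerForm form);
    }

    class CustomerForm {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }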
Environment: Java, J2EE, Spring, Spring Batch, Spring JMS, MyBatis, HTML, CSS, AJAX, jQuery, JavaScript, JSP, XML, UML, JUnit, IBM WebSphere, Maven, ClearCase, SoapUI, Oracle 11g, IBM MQ.
