
Sr. Big Data Consultant Resume


Basking Ridge, NJ

SUMMARY

  • Around 7.5 years of IT experience in software development and support, including developing strategic methods for deploying Big Data technologies to efficiently solve Big Data processing requirements.
  • Expertise in Hadoop ecosystem components HDFS, MapReduce, YARN, HBase, Pig, Sqoop and Hive for scalability, distributed computing and high-performance computing.
  • Experience in Apache Spark, Spark Streaming, Spark SQL and NoSQL databases such as Cassandra and HBase.
  • Experienced in integrating Hadoop with Apache Storm and Kafka. Expertise in uploading clickstream data from Kafka to HDFS, HBase and Hive by integrating with Storm.
  • Experience in using Hive Query Language (HQL) for data analytics.
  • Experienced in installing, maintaining and configuring Hadoop clusters.
  • Strong knowledge of creating and monitoring Hadoop clusters on Amazon EC2, VMs, Hortonworks Data Platform 2.1 and 2.2, and CDH3/CDH4 with Cloudera Manager on Linux and Ubuntu.
  • Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
  • Experience in importing and exporting data using Sqoop between HDFS and relational database systems.
  • Expertise in the Scala programming language and Spark Core.
  • Experienced in job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
  • Good knowledge of Amazon EMR, S3 buckets, DynamoDB and Redshift.
  • Analyze data, interpret results and convey findings in a concise and professional manner.
  • Partner with the Data Infrastructure team and business owners to implement new data sources and ensure consistent definitions are used in reporting and analytics.
  • Promote a full-cycle approach including request analysis, creating/pulling datasets, report creation and implementation, and providing the final analysis to the requestor.
  • Very good understanding of SQL, ETL and data warehousing technologies.
  • Knowledge of MS SQL Server 2012/2008/2005, Oracle 11g/10g/9i and E-Business Suite.
  • Expert in T-SQL, creating and using stored procedures, views and user-defined functions, and implementing business intelligence solutions using SQL Server 2000/2005/2008.
  • Developed web services modules for integration using SOAP and REST.
  • Flexible with Unix/Linux and Windows environments, working with operating systems such as CentOS 5/6, Ubuntu 13/14 and Cosmos.
  • Knowledge of the Java Virtual Machine (JVM) and multithreaded processing.
  • Strong programming skills in the design and implementation of applications using Core Java, J2EE, JDBC, JSP, HTML, Spring Framework, Spring Batch, Spring AOP, Struts, JavaScript and Servlets.
  • Java developer with extensive experience with various Java libraries, APIs and frameworks.
  • Hands-on development experience with RDBMSs, including writing complex SQL queries, stored procedures and triggers.
  • Sound knowledge of designing data warehousing applications using tools such as Teradata, Oracle and SQL Server.
  • Experience using the Talend ETL tool.
  • Strong understanding of Agile Scrum and Waterfall SDLC methodologies.
  • Strong communication, collaboration and team-building skills, with proficiency at grasping new technical concepts quickly and utilizing them productively.
  • Adept in analyzing information system needs, evaluating end-user requirements, custom designing solutions and troubleshooting information systems.
  • Strong analytical and problem-solving skills.

TECHNICAL SKILLS:

Big Data Platforms: Apache Spark, Spark SQL, Spark Streaming, Amazon EMR, Redshift, Cloudera, Hadoop, YARN, MapReduce, Pig, Hive, HBase, Storm, Kafka, Impala, MongoDB and Cassandra

Languages and Frameworks: Java, J2EE, JSP, Servlets, Spring MVC, Spring MVC Portlet, Struts

Databases: Oracle 10g/9i/8i/8.0/7.0, MS SQL Server 6.5/7.0/2000/2003.

Tools and Products: Eclipse, Vignette Content Management System, Documentum, ATG eCommerce and TeamConnect.

Web: HTML, DHTML, JavaScript, JSP, XSL and XML.

Build Tools: Maven and Ant

Version Control: ClearCase, StarTeam, Serena and SVN

Operating Systems: UNIX, Linux, Microsoft Windows 95/98/2000/NT/XP, MS-DOS.

PROFESSIONAL EXPERIENCE:

Confidential, Basking Ridge, NJ

Sr. Big Data Consultant.

Responsibilities:

  • Ingested data from multiple sources into the Hive warehouse tenant space for report generation.
  • Worked on HL7 data and parsed it using the Spark RDD and DataFrame APIs (a minimal sketch follows this list).
  • Created Oozie workflows for automation and scheduling that run independently based on time and data availability.
  • Created an incremental ingestion framework with an HBase control table and entity instance table.
  • Involved in extracting transactional data from various policies of Confidential by writing MapReduce jobs and automating them with UNIX shell scripts.
  • Handled large datasets during the ingestion process itself using partitioning, Spark in-memory capabilities, broadcast variables in Spark, and effective and efficient joins and transformations.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig and Sqoop.
  • Integrated Spark with Kafka to perform web analytics; uploaded clickstream data from Kafka to HDFS, HBase and Hive.
  • Improved the performance of jobs running on large data by using Spark optimization techniques.
  • Used HBase efficiently with Spark, reading HBase data into Spark RDDs and performing computations on them.
  • Implemented Kafka messaging services to stream large volumes of data and insert them into the database.
  • Used HBase to store the majority of the data, which needed to be divided based on region.
  • Built reusable Hive UDF libraries for business requirements, enabling users to call these UDFs in Hive queries.
  • Created a streaming pipeline with RabbitMQ and service calls.
  • Supported the automation testing team in integrating Spark with Cucumber.
  • Deployed code with CI/CD tools such as Jenkins and GitHub.
  • Streamed HL7 messages to RabbitMQ using Spark and Scala.
  • Ingested a huge volume of XML data into the data lake with the incremental framework.
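
A minimal Scala/Spark sketch of the HL7 parsing and Hive loading described above, assuming pipe-delimited HL7 messages land as text files on HDFS; the path, field positions and Hive table name are illustrative placeholders rather than the actual project code.

    import org.apache.spark.sql.SparkSession

    object Hl7IngestSketch {
      def main(args: Array[String]): Unit = {
        // Hive-enabled session; cluster settings come from spark-submit in practice.
        val spark = SparkSession.builder()
          .appName("hl7-ingest-sketch")
          .enableHiveSupport()
          .getOrCreate()

        // Read raw HL7 messages as lines of text (path is a placeholder).
        val raw = spark.sparkContext.textFile("hdfs:///landing/hl7/")

        // Keep MSH segments and pull out message type (MSH-9) and sending facility (MSH-4).
        val parsed = raw
          .filter(_.startsWith("MSH"))
          .map { line =>
            val fields   = line.split('|')
            val msgType  = if (fields.length > 8) fields(8) else ""
            val facility = if (fields.length > 3) fields(3) else ""
            (msgType, facility)
          }

        // Promote the RDD to a DataFrame and append into the tenant Hive table.
        import spark.implicits._
        parsed.toDF("message_type", "sending_facility")
          .write.mode("append")
          .saveAsTable("tenant_space.hl7_messages")
      }
    }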

Environment: Hadoop, HBase, MapR, ORC, MapReduce, RabbitMQ, Spark, Spark SQL, Kafka, Storm, HDFS, Hive, Sqoop, Oozie, Java, SQL, Shell script.

Confidential, Atlanta, GA

Big Data Consultant

Responsibilities:

  • Involved in the requirements and design phases to implement the DMF (Data Movement Flow) application to ingest data from many sources into Hadoop.
  • Developed export jobs to move IDW data into Teradata for BI reports.
  • Worked on the AWS platform for a real-time streaming data pipeline.
  • Designed utility jobs to move data into Amazon Redshift from the in-house Hortonworks platform.
  • Used the Spark DataFrame API to process structured and semi-structured files and load them back into an S3 bucket (see the sketch after this list).
  • Ingested a huge volume of JSON files into Hadoop within Spark jobs; extracted daily sales, hourly sales and product mix of offers and loaded them into the global data warehouse.
  • Used Oozie to automate data loading into the Hadoop Distributed File System and Control-M for job scheduling.
  • Involved in creating workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Processed large data sets on the Hadoop cluster: data stored on HDFS was preprocessed and validated using Pig, then loaded into the Hive warehouse, enabling business analysts to get the required data from Hive.
  • Developed Hive queries to join clickstream data with relational data to determine how search guests interact with the website.
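
A minimal sketch of the Spark DataFrame flow described above (JSON offer files in, curated daily sales out to S3); the input path, column names (offer_id, sale_ts, amount) and bucket name are assumptions for illustration only.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object JsonToS3Sketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("json-to-s3-sketch")
          .getOrCreate()

        // Read semi-structured JSON offer files; Spark infers the schema.
        val offers = spark.read.json("hdfs:///landing/offers/")

        // Derive a calendar date and roll up daily sales per offer.
        val dailySales = offers
          .withColumn("sale_date", to_date(col("sale_ts")))
          .groupBy("offer_id", "sale_date")
          .agg(sum("amount").as("daily_sales"))

        // Write the curated result back to S3 as columnar Parquet.
        dailySales.write
          .mode("overwrite")
          .parquet("s3a://example-curated-bucket/daily_sales/")
      }
    }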

Environment: Spark, Spark SQL, Kafka, ActiveMQ, Hadoop, Hortonworks, ORC, Parquet, MapReduce, Storm, HDFS, Hive, Sqoop, Oozie, Scala, Shell script.

Confidential, Basking Ridge, NJ

Sr. Hadoop/Spark Developer.

Responsibilities:

  • Involved in designing, implementing and maintaining applications that receive transaction-based and product mix data generated from insurance policies.
  • Job duties involved the design and development of various modules on the Hadoop Big Data platform, processing data using Spark Streaming, Spark SQL, MapReduce, Hive, Pig, Sqoop and Talend.
  • Designed, developed and tested a Spark application named ECI Builder, used across many applications, automated with shell scripts and scheduled using Talend TAC.
  • Involved in migrating MapReduce jobs to Spark jobs, and used the Spark SQL and DataFrame APIs to load structured and semi-structured data into Spark clusters.
  • Involved in developing shell scripts and automating end-to-end data management and integration work.
  • Involved in extracting transactional data from various policies of Confidential by writing MapReduce jobs and automating them with UNIX shell scripts.
  • Involved in the requirements and design phases to implement a streaming Lambda architecture for real-time streaming using Spark and Kafka (a minimal sketch of the speed layer follows this list).
  • Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark.
  • Handled large datasets during the ingestion process itself using partitioning, Spark in-memory capabilities, broadcast variables in Spark, and effective and efficient joins and transformations.
  • Involved in creating Hive tables and loading and analyzing data using Hive queries.
  • Transformed and loaded data into HBase tables using Pig scripts.
  • Coordinated and participated in client meetings to clarify the requirements for ingesting customer data for various policies.
  • Worked with ORC Hive tables in a MapR environment.
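
A minimal sketch of the Spark Streaming speed layer mentioned above, using the spark-streaming-kafka-0-10 direct stream; the broker address, topic name and record layout are placeholders, not the project's actual configuration.

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010._

    object PolicyStreamSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("policy-stream-sketch")
        val ssc  = new StreamingContext(conf, Seconds(30))

        // Kafka consumer settings; broker list and group id are placeholders.
        val kafkaParams = Map[String, Object](
          "bootstrap.servers"  -> "broker1:9092",
          "key.deserializer"   -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id"           -> "policy-ingest",
          "auto.offset.reset"  -> "latest"
        )

        // Direct stream over a hypothetical policy-transactions topic.
        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          LocationStrategies.PreferConsistent,
          ConsumerStrategies.Subscribe[String, String](Seq("policy-transactions"), kafkaParams)
        )

        // Speed layer: count transactions per policy id in each 30-second batch.
        stream
          .map(record => (record.value().split('|')(0), 1L))
          .reduceByKey(_ + _)
          .print()

        ssc.start()
        ssc.awaitTermination()
      }
    }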

Environment: Hadoop, HBase, MapR, ORC, MapReduce, Spark, Spark SQL, Kafka, Storm, HDFS, Hive, Sqoop, Oozie, Java, SQL, Shell script.

Confidential, Austin, TX

Sr. Big Data Developer.

Responsibilities:

  • Imported and exported data into HDFS and Hive using Sqoop.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig and Sqoop.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Integrated Apache Storm with Kafka to perform web analytics; uploaded clickstream data from Kafka to HDFS, HBase and Hive by integrating with Storm.
  • Migrated MapReduce jobs to Spark jobs to achieve better performance.
  • Created Hive external tables, loaded data into the tables and queried the data using HQL.
  • Responsible for managing data coming from different sources.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data, including joins and some pre-aggregations, before storing the data in HDFS.
  • Involved in the requirements and design phases to implement a streaming Lambda architecture for real-time streaming using Spark and Kafka.
  • Used the Spark DataFrame API to process structured and semi-structured files and load them back into an S3 bucket.
  • Automated and scheduled Sqoop jobs using Unix shell scripts.
  • Developed Scala and Python scripts and UDFs, using both DataFrames/SQL and RDD/MapReduce in Spark 1.3+, for data aggregation, queries and writing data back to the OLTP system directly or through Sqoop (see the sketch after this list).
  • Involved in creating workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
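
A minimal sketch of the Spark aggregate-and-write-back pattern described above; the Hive table, column names, JDBC URL and target table are placeholders, and the JDBC write stands in for the direct OLTP load (Sqoop export being the alternative path).

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object SalesAggregateSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("sales-aggregate-sketch")
          .enableHiveSupport()
          .getOrCreate()

        // Aggregate a hypothetical Hive sales table by store and day.
        val aggregated = spark.table("warehouse.sales")
          .groupBy("store_id", "sale_date")
          .agg(sum("amount").as("total_amount"), count(lit(1)).as("txn_count"))

        // Write the result straight back to the OLTP system over JDBC
        // (URL, table and credentials are placeholders).
        aggregated.write
          .format("jdbc")
          .option("url", "jdbc:oracle:thin:@//oltp-host:1521/ORCL")
          .option("driver", "oracle.jdbc.OracleDriver")
          .option("dbtable", "SALES_DAILY_AGG")
          .option("user", "report_user")
          .option("password", sys.env.getOrElse("OLTP_PASSWORD", ""))
          .mode("append")
          .save()
      }
    }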

Environment: Hadoop, MapReduce, Spark, Spark SQL, Kafka, Storm, HDFS, Hive, Sqoop, Oozie, Java, SQL, Shell script.

Confidential, New York, NY

Java/Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing (a minimal sketch follows this list).
  • Performed data backup and synchronization using Amazon Web Services.
  • Designed utility jobs to move data into Amazon Redshift from the in-house Hortonworks platform.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Configured Flume to transport web server logs into HDFS.
  • Extracted files from CouchDB and MongoDB through Sqoop and placed them in HDFS for processing.
  • Used Flume to collect, aggregate and store web log data from different sources such as web servers, mobile and network devices, and pushed it to HDFS.
  • Worked on Amazon Web Services as the primary cloud platform.
  • Migrated legacy and monolithic systems to Amazon Web Services using Packer, Terraform and Ansible.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Supported MapReduce programs running on the cluster.
  • Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
  • Worked on loading data from several flat-file sources into staging using Teradata MultiLoad and FastLoad.
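
A minimal sketch of the map-only data-cleansing pattern described above. The original jobs were written in Java against the same Hadoop MapReduce API; this version is in Scala to match the other sketches, and the assumed record layout (five tab-separated fields) is illustrative only.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.io.{LongWritable, NullWritable, Text}
    import org.apache.hadoop.mapreduce.{Job, Mapper}
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

    // Map-only cleansing job: drops malformed log lines and trims whitespace.
    class CleanseMapper extends Mapper[LongWritable, Text, NullWritable, Text] {
      override def map(key: LongWritable, value: Text,
                       context: Mapper[LongWritable, Text, NullWritable, Text]#Context): Unit = {
        val line = value.toString.trim
        // Keep only lines with the expected number of tab-separated fields (assumed: 5).
        if (line.nonEmpty && line.split("\t", -1).length == 5) {
          context.write(NullWritable.get(), new Text(line))
        }
      }
    }

    object CleanseJobSketch {
      def main(args: Array[String]): Unit = {
        val job = Job.getInstance(new Configuration(), "log-cleanse-sketch")
        job.setJarByClass(classOf[CleanseMapper])
        job.setMapperClass(classOf[CleanseMapper])
        job.setNumReduceTasks(0) // map-only: cleansed lines go straight to the output files
        job.setOutputKeyClass(classOf[NullWritable])
        job.setOutputValueClass(classOf[Text])
        FileInputFormat.addInputPath(job, new Path(args(0)))
        FileOutputFormat.setOutputPath(job, new Path(args(1)))
        System.exit(if (job.waitForCompletion(true)) 0 else 1)
      }
    }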

Environment: Hadoop, MapReduce, HDFS, Hive, Apache Spark, Kafka, CouchDB, Flume, AWS, Cassandra, Java, Struts, Servlets, HTML, XML, SQL, J2EE, MRUnit, JUnit, JDBC, Eclipse.

Confidential

Software Engineer

Responsibilities:

  • Involved in designing the shares and cash modules using UML.
  • Effectively used the iterative waterfall software development methodology during this time-constrained project.
  • Used HTML and JSP for the web pages and JavaScript for client-side validation.
  • Created XML pages with DTDs for front-end functionality and information exchange.
  • Responsible for writing Java SAX parser programs.
  • Developed ANT build scripts to build and deploy the application in enterprise archive (.ear) format.
  • Performed unit testing using JUnit as well as functional testing.
  • Used the JSON response format to retrieve data from web servers.
  • Used JDBC 2.0 extensively and was involved in writing several SQL queries for data retrieval (a minimal sketch follows this list).
  • Prepared program specifications for the loans module and was involved in database design.
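
A minimal sketch of JDBC-based data retrieval in the spirit of the bullets above; the connection URL, credentials and loans schema are placeholders, and the original code was plain Java (shown here in Scala for consistency with the other sketches).

    import java.sql.DriverManager

    object LoanLookupSketch {
      def main(args: Array[String]): Unit = {
        // Connection details and schema are placeholders.
        val url  = "jdbc:oracle:thin:@//db-host:1521/ORCL"
        val conn = DriverManager.getConnection(url, "app_user", sys.env.getOrElse("DB_PASSWORD", ""))
        try {
          // Parameterised query for one customer's loans.
          val stmt = conn.prepareStatement(
            "SELECT loan_id, principal, status FROM loans WHERE customer_id = ?")
          stmt.setLong(1, 12345L)
          val rs = stmt.executeQuery()
          while (rs.next()) {
            println(s"${rs.getLong("loan_id")} ${rs.getBigDecimal("principal")} ${rs.getString("status")}")
          }
        } finally {
          conn.close()
        }
      }
    }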

Environment: Java, J2EE, EJB 2.0, Servlets, JavaScript, OO, JSP, JNDI, JavaBeans, WebLogic, XML, XSL, Eclipse, PL/SQL, Oracle 8i, HTML, DHTML, UML.
