Big Data Consultant Resume Weehawken, NJ - Hire IT People

SUMMARY:

Over 6 years of professional IT experience, including 3 years in the Hadoop ecosystem and continuous working experience in Java
Cloudera Certified Spark and Hadoop Developer (CCA175)
Technical experience in the manufacturing, finance and internet industries
Proficient in Java, Python, Scala and R
Working knowledge of NoSQL stores and SQL-on-Hadoop engines such as HBase, Cassandra, Redis, MongoDB, Hive and Impala
Exposure to setting up and maintaining Hadoop clusters on YARN
Expert in importing and exporting data between HDFS and relational database systems (Oracle/MySQL) using Sqoop
Moved log files generated from various sources to HDFS through Flume for further processing
Experienced in building real-time, high-throughput streaming services to transport high-volume data using Kafka
Strong expertise in building traditional ETL pipelines using Informatica best practices
Capable of writing custom UDFs in Java for Hive and Pig to extend functionality
Experienced with job and workflow scheduling and monitoring tools such as Oozie and Appworx
Developed data analysis and visualization using SQL, R, HiveQL, Spark SQL and Tableau.
Experienced in working with the Apache Spark Streaming API for near-real-time data processing
Skilled in data validation, cleansing, verification and identifying data mismatches
Experienced in writing custom MapReduce programs in Java
Familiar with Machine Learning and Statistical Analysis using R, Python and Spark
Knowledge of machine learning frameworks including scikit-learn, NLTK and MLlib, and of algorithms including K-Means, KNN, regression, SVM and neural networks
Extensive experience writing complex SQL queries using Oracle analytic functions
Strong database experience in PL/SQL programming to create packages, stored procedures, functions, triggers, indexes, materialized views and cursors
Clear understanding of ER modeling for OLTP and dimensional modeling for OLAP
Strong in Core Java, data structures, algorithm design, Object-Oriented Design (OOD) and Java components such as the Collections Framework, exception handling, the I/O system and multithreading
Hands-on experience with MVC architecture and Java EE frameworks such as Struts, Spring MVC and Hibernate
Experienced in interacting with Clients, Business Analysts, IT leads, UAT Users and developers
Exposed to Agile Development environment, tools and methodologies
Authorized to work in the US for any employer

TECHNICAL SKILLS:

Hadoop Ecosystem: Hadoop 2.0, Spark 2.0, MapReduce, Pig 0.15+, Hive, Sqoop, Flume, Kafka 1.0+, Zookeeper 3.0+, HBase, Oozie, Storm 1.0

Web Development: Hibernate, HTML, CSS, AJAX, Bootstrap, J2EE, Spring MVC, Node.js, Django

Programming Language: Java, Scala, Python, JavaScript, PL/SQL

Data Analysis & Visualization: Python, R, SQL, Tableau, D3.js

Cloud Platform: Amazon Web Services, Heroku

Scripting Language: UNIX Shell, HTML, XML, CSS, JSP

Operating Systems: Mac OS, Ubuntu, CentOS, Windows

Environment: Agile, Scrum, TDD, JIRA, Confluence, Jenkins

Machine Learning Algorithm: Linear Regression, Logistic Regression, Decision Tree, Neural Network, K-Means, KNN, Support Vector Machine

Database: MySQL, Oracle 11g, Exadata, PostgreSQL 9.x, SQL Server 2012, 2016, MongoDB 3.2, HBase 0.98, Cassandra 3.0, Redis 3.2

Machine Learning Framework: Spark MLlib, SciPy, Matplotlib, Pandas, NumPy

Others: Docker, Informatica 9.0, SSIS

PROFESSIONAL EXPERIENCE:

Confidential, Weehawken, NJ

Big Data Consultant

Responsibilities:

Integrated data from relational databases (Oracle, MySQL, SQL Server) into HDFS using Sqoop
Configured Flume agents to collect real-time log data from application servers
Implemented a reliable and scalable Kafka messaging system for high-throughput data ingestion
Wrote MapReduce programs in Java for offline batch processing
Created multiple Hive tables with partitioning and bucketing for efficient data access
Processed streaming data using Spark Streaming for risk evaluation and product recommendation
Cached key results from the streaming and batch processing systems in Redis for fast access
Analyzed time series data using Spark MLlib, SciPy and Matplotlib
Evaluated model accuracy and tuned parameters with offline simulation data
Automated workflows using Oozie and shell scripts

Environment: Hadoop 2.6, Cloudera CDH 5.x, Kafka, Sqoop, Flume, Zookeeper, Spark 2.0, Scala, Redis, Oozie, Shell script, Oracle, MySQL, SQL Server, Python, Java

Confidential, Erie, PA

Senior Data Warehouse Consultant

Responsibilities:

Integrated structured data from a variety of sources into Hive using Sqoop
Ingested large amounts of semi-structured application data in real time using Flume
Used a Kafka cluster as a central buffer for incoming data
Developed MapReduce jobs for data cleaning, validation and categorization
Built an operational data store in Hive for raw data
Created Fact/Dim/Bridge tables in a star schema following the Kimball approach
Developed periodic analytic/aggregation queries and saved results in HBase for fast access
Wrote Hive UDFs for data transformation and aggregation
Used Informatica and SSIS to integrate Oracle E-Business Suite with Exadata and SQL Server
Helped build business intelligence dashboards using tools such as Tableau and OBIEE
Drafted shell scripts for job execution and scheduling using Appworx and Oozie
Supported the production Data Lake in terms of data accuracy, consistency and performance
Worked under Agile/Scrum methodologies

Environment: Hadoop 2.4, Hive 0.12, Sqoop, Flume, Kafka, HBase, Oozie, Appworx, Tableau, OBIEE, MySQL, SQL Server 2012, 2016, Informatica 9.0, Java, SQL, PL/SQL, Shell script, Exadata, Agile

Confidential

Hadoop developer

Responsibilities:

Wrote MapReduce programs in Java for data cleaning and categorization
Wrote shell scripts to manipulate files on application servers
Used Flume to collect, aggregate and store log data from different sources
Built Informatica workflows to capture changes in the application database
Captured streaming data with Kafka and performed real-time analysis using Storm
Stored streaming analysis results in HBase for responsive ad hoc queries
Created thousands of Fact/Dim tables in Hive with partitioning and bucketing for efficient access
Wrote HiveQL scripts for data analysis and exploration
Exported data from HDFS to MySQL using Sqoop for the business intelligence team
Worked with the analytics team to prepare and visualize results in Tableau for reporting
Used Oozie to orchestrate MapReduce jobs and set up automated workflows

Environment: Hadoop, Java, HDFS, Flume, Hive, MapReduce, Sqoop, HQL, Eclipse, MySQL, Tableau, D3, HBase, Kafka, Spark

Confidential

Hadoop developer

Responsibilities:

Used Sqoop to import data from Oracle and MySQL to Hive
Wrote HiveQL queries to retrieve and analyze data stored in Hive
Used Flume to stream log data and JSON-formatted social media data from sources
Developed MapReduce programs and Hive SerDes to clean and parse data in HDFS obtained from various data sources
Used Oozie to orchestrate MapReduce jobs and set up automated workflows
Collected high-throughput data using Kafka and analyzed it with Spark
Applied logistic regression to build the model
Visualized the data with Python Matplotlib, Tableau and D3.js

Environment: Hadoop 2.2, Spark, MapReduce, MySQL, Oracle SQL, Hive, Sqoop, Tableau, Matplotlib, D3.js

Confidential

Oracle E-Business-Suite Developer

Responsibilities:

Developed various Oracle Applications using PL/SQL, SQL*Plus, Forms, Reports, Workflow Builder and Application Object Library
Used Oracle E-Business Suite applications for accounts payable/accounts receivable, general ledger and cash management, supporting data extraction, integration, filtering and validation
Integrated data from legacy systems into Oracle E-Business Suite 11i
Maintained and built workflow extensions to support new business processes
Built customized data quality checking jobs using Informatica to maintain data quality between the data warehouse and source systems in the ETL process
Improved production performance by identifying bottlenecks and addressing them through database partitioning, increased block size, data cache size and sequence buffer length, target-based commit intervals and SQL overrides

Environment: Linux, Java, Shell script, SQL, PL/SQL, Oracle Forms/Reports Builder, Oracle E-Business Suite 11i, Informatica

Confidential

Backend Developer

Responsibilities:

Developed a data parsing system for XML/JSON
Took part in system design based on the Spring, Struts and Hibernate frameworks
Used Spring HibernateTemplate to access the MySQL database
Unit tested components, created unit test cases and reviewed unit tests
Developed an online data analysis system using IBM Cognos
Developed and fine-tuned the MySQL database
Optimized existing data models and design diagrams
Ensured data integrity and detected data errors and misuse

Environment: J2SE/J2EE 5.0, JSP, HTML, JavaScript, JDBC, Eclipse, IBM Cognos, IBM DataStage, MySQL, MySQL Workbench, Toad, Linux, shell script
