
Big Data Developer/Administrator Resume


NJ

SUMMARY

  • Around 7 years of professional IT experience, including Big Data and cloud-based applications spanning multiple technologies and business domains
  • Certified in Big Data/NoSQL technologies as a Developer/Administrator
  • Hands-on experience with Hadoop ecosystem components - Hadoop MapReduce, HDFS, Oozie, HiveQL, Sqoop, HBase, MongoDB, ZooKeeper, Pig, and Flume - on M5 and CDH3/CDH4 clusters, plus EMR cloud computing with Amazon Web Services (AWS)
  • Excellent understanding of Hadoop 1 and 2 architecture; hands-on experience with Hadoop ecosystem components and knowledge of the Mapper/Reducer/HDFS framework for scalability, distributed computing (e.g., CDN), and high-performance computing
  • Strong understanding of Storm architecture and its processing model for real-time processing
  • Experience in analyzing data using Pig Latin, HiveQL, and custom MapReduce programs in Java and Python (Hadoop Streaming), using development tools such as Eclipse and Visual Studio
  • Worked on NoSQL databases including Cassandra, MongoDB, MarkLogic, and HBase; managed and reviewed Hadoop log files; worked with HCatalog to open up access to the Hive Metastore
  • Experience in importing/exporting data with Sqoop between HDFS and relational database systems/mainframes, and in using the Hadoop Streaming utility to run MapReduce jobs (a minimal streaming sketch follows this summary)
  • Good working knowledge of Hadoop administration activities such as cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration; installed and configured HBase, HDFS, Pig, Hive, and Hadoop MapReduce
  • Expertise in Tableau Server configuration and dashboard building
  • Experience in data cleansing and visualization using Paxata and Tableau
  • Hands-on experience configuring fully automated Microsoft cloud solutions such as Microsoft System Center, utilizing Hyper-V and VMware virtualization solutions
  • Performed database administration activities such as backup, recovery, integrity checks, and index reorganization; worked with disaster recovery solutions such as replication and log shipping
  • Hands-on experience in application development using Java, Python, C, COBOL, JCL, RDBMS, DB2, and Linux shell scripting
  • Expertise in DB2 UDB, Oracle, SQL Server 2000/2005/2008, PL/SQL, and MySQL, with Oracle Workforce certifications
  • Hands-on experience in implementing the Lambda Architecture
  • Experience working with Flume, Kafka, and Storm to handle large volumes of streaming data
  • Hands-on experience in developing Sqoop jobs to import data from RDBMS sources such as MySQL, Oracle, and PostgreSQL into HDFS, and to export data back out
  • Experience in writing Apache Oozie workflows with Hive and Sqoop actions
  • Expertise in handling multiple relational databases like SQL Server, PostgreSQL, MySQL and Oracle
  • Expertise in NoSQL databases like MarkLogic, Cassandra, MongoDB, HBase
  • Highly motivated, with strong analytical skills and excellent communication and interpersonal skills
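
A minimal sketch, assuming Python 3 and a standard Hadoop Streaming setup, of the kind of streaming MapReduce job referenced above; the job command in the docstring and all paths are illustrative, not taken from an actual engagement.

    #!/usr/bin/env python3
    """Minimal Hadoop Streaming word count (illustrative only).

    Example invocation (jar location and HDFS paths are placeholders):
      hadoop jar hadoop-streaming.jar \
        -input /data/raw -output /data/wordcount \
        -mapper "wordcount.py map" -reducer "wordcount.py reduce" \
        -file wordcount.py
    """
    import sys

    def mapper():
        # Emit "word<TAB>1" for every token read from stdin.
        for line in sys.stdin:
            for word in line.strip().split():
                print(f"{word}\t1")

    def reducer():
        # Streaming delivers keys sorted, so equal words arrive contiguously.
        current, total = None, 0
        for line in sys.stdin:
            word, count = line.rstrip("\n").split("\t")
            if word != current:
                if current is not None:
                    print(f"{current}\t{total}")
                current, total = word, 0
            total += int(count)
        if current is not None:
            print(f"{current}\t{total}")

    if __name__ == "__main__":
        mapper() if sys.argv[1] == "map" else reducer()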

TECHNICAL SKILLS

PROGRAMMING LANGUAGES: JAVA/J2EE | JAVASCRIPT | PYTHON | C | PL/SQL

NOSQL DATASTORE: MARKLOGIC | CASSANDRA | MONGODB | HBASE

BIG DATA TECHNOLOGIES: HADOOP ECOSYSTEM | SOLR

HADOOP DISTRIBUTIONS: CLOUDERA | HORTONWORKS | MAPR

HADOOP ECOSYSTEM: HDFS | MAPREDUCE | PIG | HIVE | OOZIE | SQOOP | ZOOKEEPER | MAHOUT | FLUME | SPARK | KAFKA

CLOUD PLATFORMS: AWS | MICROSOFT SYSTEM CENTER | AZURE

DATA PREPARATION/BI TOOLS: PAXATA | TABLEAU


AMAZON WEB SERVICES: EC2 | S3 | RDS | EMR | VPC | GLACIER | CLOUDWATCH | CLOUDTRAIL | IDENTITY AND ACCESS MANAGEMENT | DATA PIPELINE | CLOUDFORMATION | SES | SNS | REDSHIFT

NETWORKING: VIRTUALIZATION | RIP V2 | VLANS | NETWORK ADMINISTRATION - UNIX | WINDOWS

DATABASE: SQL SERVER | ORACLE 9i/10g

CONFIGURATION MANAGEMENT: CHEF | PUPPET

WEB TECHNOLOGY: RESTFUL SERVICES | HTML | XHTML | CSS | JAVASCRIPT

METHODOLOGY: AGILE SCRUM

IDES, FRAMEWORKS, TOOLS: JMS | MAVEN | ECLIPSE IDE | MS OFFICE | ADOBE APPS

SERVERS: APACHE TOMCAT

MAINFRAME APPLICATION: OS/360 | Z/OS | SQL | ORACLE 9i/10g | COBOL | JCL | IMS DB | DB2 | TSO/ISPF | SDF | VSAM | FILE-MANAGER

SOFTWARE DEVELOPMENT: SOFTWARE PROJECT MANAGEMENT | SOFTWARE QUALITY ASSURANCE

OPERATING SYSTEMS: CENTOS | FEDORA | UBUNTU | Z/OS | OS/360 | WINDOWS 2008/2012

PROFESSIONAL EXPERIENCE

Confidential, NJ

Big Data Developer/Administrator

Responsibilities:

  • Involved in analysis, design, and development of technical specifications incorporating various AWS cloud technologies, making use of parallel transformation frameworks such as MapReduce (EMR)
  • Wrote JSON templates to launch infrastructure through AWS CloudFormation
  • Moved centralized in-house department store data to S3 for transformation and analytics
  • Used Amazon Elastic MapReduce to transform the data stored in S3 and store the cleansed data back in S3
  • Wrote custom MapReduce jobs along with Hive and Pig scripts to perform these transformations in EMR (see the EMR step sketch after this section)
  • Loaded the transformed data from S3 into Redshift on a scheduled basis for warehousing
  • Used AWS Data Pipeline for scheduling and orchestrating the complete ETL workflow
  • Installed and configured Tableau Server on an AWS EC2 instance connecting to Redshift for BI analytics by different departments
  • Used Active Directory through AWS Directory Service to authenticate Tableau Server users

Environment: Amazon EMR, S3, Data Pipeline, Redshift, CloudFormation, Hive, Pig, Tableau Server
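
To illustrate the S3-to-S3 transformation step described above, here is a minimal Python sketch that submits a Hive step to an existing EMR cluster with boto3; the cluster ID, bucket names, and script path are hypothetical, and the actual project may have launched the same work through CloudFormation or Data Pipeline instead.

    import boto3

    # Hypothetical identifiers; substitute a real cluster ID and S3 locations.
    CLUSTER_ID = "j-XXXXXXXXXXXXX"
    HIVE_SCRIPT = "s3://example-etl-bucket/scripts/cleanse_store_data.q"
    INPUT_PATH = "s3://example-etl-bucket/raw/store_data/"
    OUTPUT_PATH = "s3://example-etl-bucket/cleansed/store_data/"

    emr = boto3.client("emr", region_name="us-east-1")

    # Submit a Hive step that reads raw data from S3, cleanses it,
    # and writes the result back to S3 ahead of the Redshift load.
    response = emr.add_job_flow_steps(
        JobFlowId=CLUSTER_ID,
        Steps=[{
            "Name": "Cleanse store data",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": [
                    "hive-script", "--run-hive-script",
                    "--args", "-f", HIVE_SCRIPT,
                    "-d", "INPUT=" + INPUT_PATH,
                    "-d", "OUTPUT=" + OUTPUT_PATH,
                ],
            },
        }],
    )
    print("Submitted step:", response["StepIds"][0])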

Confidential, IL

Big Data Engineer

Responsibilities:

  • Designed, developed, and deployed a Lambda Architecture
  • Designed and implemented the persistence layer for sensor data collected on the app server over the MQTT and HTTP protocols
  • IoT devices send real-time events through their built-in protocols, either MQTT or HTTP
  • All events are captured by the respective brokers, which run behind a load balancer
  • Implemented the messaging system using a distributed Kafka cluster; all brokers (or the app server, for HTTP events) push incoming events into Kafka
  • Designed and implemented the processing layer (speed layer) for these events using Apache Storm, consuming them from the Kafka cluster
  • Implemented complex business rules in Storm and stored the processed data in Cassandra as well as Solr (a simplified Kafka-to-Cassandra sketch follows this section)
  • Storm also handles the ingestion of raw data from Kafka into HDFS
  • HDFS serves the same data for batch analytics
  • Designed the retrieval layer for the enterprise app, obtaining data from Cassandra as well as Solr
  • Designed the data model in Cassandra for faster query retrieval and the search mechanism in Apache Solr
  • Built scheduled reporting on the raw data store for BI analytics

Environment: Hadoop, Hive, Zookeeper, Storm, Cassandra, Java, Spring MVC, Apache Tomcat, Apache SOLR, Kafka, HortonWorks
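
The speed layer above ran as a Storm topology; the Python snippet below is only a simplified stand-in showing the same Kafka-to-Cassandra flow, assuming the kafka-python and cassandra-driver packages and hypothetical topic, host, keyspace, and table names.

    import json
    from kafka import KafkaConsumer
    from cassandra.cluster import Cluster

    # Hypothetical topic, hosts, keyspace, and table.
    consumer = KafkaConsumer(
        "sensor-events",                       # topic fed by the MQTT/HTTP brokers
        bootstrap_servers=["kafka1:9092", "kafka2:9092"],
        group_id="speed-layer",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )

    session = Cluster(["cassandra1"]).connect("iot")
    insert = session.prepare(
        "INSERT INTO sensor_readings (device_id, event_time, value) VALUES (?, ?, ?)"
    )

    # Consume events the brokers push into Kafka and persist them for
    # low-latency queries; the raw stream also lands in HDFS for batch use.
    for msg in consumer:
        event = msg.value
        session.execute(insert, (event["device_id"], event["ts"], event["value"]))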

Confidential

Responsibilities:

  • Coded a Java Maven module to perform various Amazon S3 operations
  • Wrote custom MapReduce programs to cleanse and enrich data in EMR
  • Loaded the transformed data back to S3
  • Loaded data into the MarkLogic server on AWS
  • Coded JavaScript search operations to build a search app on MarkLogic
  • Implemented keyword search queries with facets such as years of experience and URIs matched against skill sets
  • Coded a Python module for URI-based analysis (see the sketch after this section)
  • Built a Python dictionary of predefined keywords from URI-crawl data
  • Ingested the above data into MarkLogic
  • Built a faceted search on top of the data stored in MarkLogic

Environment: Amazon EMR, S3, Java, JavaScript, Python, MarkLogic (NoSQL)
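
A compact sketch of the URI-analysis step mentioned above: counting predefined skill keywords in crawled text and writing the resulting JSON document to MarkLogic over its REST API. The endpoint, port, credentials, and keyword list are assumptions for illustration; the original module may have loaded documents differently.

    import json
    import requests
    from requests.auth import HTTPDigestAuth

    # Hypothetical keyword list and MarkLogic REST endpoint.
    KEYWORDS = ["hadoop", "hive", "pig", "spark", "cassandra"]
    ML_URL = "http://marklogic-host:8000/v1/documents"
    AUTH = HTTPDigestAuth("rest-writer", "password")

    def keyword_counts(text):
        """Build a dictionary of predefined-keyword counts from crawled text."""
        words = text.lower().split()
        return {kw: words.count(kw) for kw in KEYWORDS}

    def ingest(uri, text):
        """Store the per-URI keyword profile as a JSON document in MarkLogic."""
        doc = {"source_uri": uri, "skills": keyword_counts(text)}
        doc_uri = "/analysis/" + uri.strip("/").replace("/", "_") + ".json"
        resp = requests.put(
            ML_URL,
            params={"uri": doc_uri},
            auth=AUTH,
            headers={"Content-Type": "application/json"},
            data=json.dumps(doc),
        )
        resp.raise_for_status()

    ingest("example.com/resumes/123", "Experienced Hadoop and Hive developer ...")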

Confidential, MN

Big Data Developer/Analyst

Responsibilities:

  • Involved in analysis, design, and development of technical specifications incorporating various Big Data technologies
  • Involved in the design and development phases of the Software Development Life Cycle (SDLC) using the Scrum methodology
  • Managed the existing cluster; worked on commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration
  • Developed the data pipeline using Flume and Sqoop to ingest customer behavioral data and purchase histories into HDFS for analysis
  • Used Pig to perform data validation on the data ingested through Sqoop and Flume; the cleansed data set was pushed into HBase
  • Used Cassandra integrations and computed various metrics for reporting on the Tableau dashboard
  • Extensively used the Cloudera CDH distribution of Hadoop
  • Developed job flows in Oozie to automate the workflow for Pig and Hive jobs
  • Loaded the aggregated data into DB2 from the Hadoop environment using Sqoop for reporting on the dashboard
  • Used Impala to connect Tableau for reporting (see the Impala query sketch after this section)
  • Participated in the requirement gathering and analysis phase of the project, documenting business requirements by conducting workshops/meetings with various business users

Environment: Red Hat Linux, HDFS, Cloudera CDH, Map-Reduce, Hive, Java JDK1.6, Pig, Sqoop, Flume, Zookeeper, Oozie, DB2, Impala, Tableau
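
The resume names Impala as the Tableau data source; the short Python check below uses the impyla driver (an assumption, not stated in the original work) to query a hypothetical aggregated table of the kind a dashboard would read.

    from impala.dbapi import connect

    # Hypothetical Impala daemon host and aggregated reporting table.
    conn = connect(host="impala-daemon", port=21050)
    cur = conn.cursor()

    # Sanity-check the aggregates the Tableau dashboard reads.
    cur.execute(
        "SELECT purchase_date, COUNT(*) AS orders, SUM(amount) AS revenue "
        "FROM analytics.customer_purchases "
        "GROUP BY purchase_date ORDER BY purchase_date DESC LIMIT 7"
    )
    for row in cur.fetchall():
        print(row)

    cur.close()
    conn.close()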

Confidential, NY

No-SQL/ DB2 BI Developer

Responsibilities:

  • Loaded structured data from different applications, stored mainly in DB2, into MongoDB using JDBC-ODBC connectivity
  • Data extracted from different RDBMSs was converted to JSON objects and pushed into MongoDB (see the pymongo sketch after this section)
  • Unstructured files such as XML and JSON files were processed using a custom-built Java API and pushed into MongoDB
  • Responsible for data modeling in MongoDB to load data arriving in both structured and unstructured form
  • Developed a unified Turnover system that accepts both structured and unstructured data and inserts it into MongoDB according to the rules defined in the system
  • Performed data validation in the Turnover system while ingesting data into MongoDB
  • Developed data export options in the Turnover system so that data can be loaded from the data warehouse into any RDBMS on demand or exported as flat files

Environment: Linux, Java JDK1.6, MongoDB, Java J2EE, various RDBMS
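
The actual Turnover system was Java-based; as a rough Python sketch of the same relational-to-JSON flow, the snippet below reads rows from a local SQLite stand-in for the DB2 source and bulk-inserts them into MongoDB with pymongo. The connection string, table, and collection names are assumptions.

    import sqlite3                      # local stand-in for the DB2/JDBC source
    from pymongo import MongoClient

    # Hypothetical source table and target collection.
    src = sqlite3.connect("turnover.db")
    src.row_factory = sqlite3.Row
    rows = src.execute("SELECT emp_id, dept, hire_date, status FROM turnover")

    client = MongoClient("mongodb://mongo-host:27017")
    collection = client["hr"]["turnover_events"]

    # Convert each relational row into a JSON-style document and bulk insert.
    docs = [dict(row) for row in rows]
    if docs:
        collection.insert_many(docs)

    print("Inserted", len(docs), "documents")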

Confidential

Mainframe/ DB2 Developer

Responsibilities:

  • Coded JCL with changes to storage disks
  • Coded new COBOL modules
  • Analyzed the structural schema for improvements
  • Integrated new DB2 queries for module improvements
  • Performed tool-based analytics with SPUFI, SDF, etc.

Environment: Z/OS, JCL, COBOL, DB2, CICS, SPUFI, QMF, SDF

Confidential

Responsibilities:

  • Tracked modules for various DML operations on DB2
  • Altered DB2 modules to improve efficiency
  • Performed tool-based analytics with SPUFI, SDF, etc.
  • Performed CICS operations on various modules for additions, deletions, updates, etc.
  • Performed tuning operations on recovery modules

Environment: Z/OS, JCL, COBOL, DB2, CICS, SPUFI, QMF, SDF

Confidential

Responsibilities:

  • Performed day-to-day maintenance on data retrieval
  • Tracked modules implementing various hierarchical relationships in IMS DB
  • Analyzed the structural schema for improvements
  • Coded modules for fetch operations with IMS DB
  • Performed tuning operations on recovery modules

Environment: Z/OS, JCL, COBOL, VSAM, IMS DB
