
Big Data Developer/Administrator Resume


NJ

SUMMARY

  • Around 7 years of professional IT experience, including Big Data and cloud-based applications spanning multiple technologies and business domains
  • Certified in Big Data/NoSQL technologies as a Developer/Administrator
  • Hands-on experience with Hadoop ecosystem components - Hadoop MapReduce, HDFS, Oozie, HiveQL, Sqoop, HBase, MongoDB, ZooKeeper, Pig, and Flume - on M5 and CDH3/CDH4 clusters, plus EMR cloud computing with Amazon Web Services (AWS)
  • Excellent understanding of Hadoop 1 and 2 architecture; hands-on experience with Hadoop ecosystem components and knowledge of the Mapper/Reducer/HDFS framework for scalability, distributed computing (e.g., CDN), and high-performance computing
  • Strong understanding of Storm architecture and its processing model for real-time processing
  • Experience in analyzing data using Pig Latin, HiveQL, and custom MapReduce programs in Java and Python (Hadoop Streaming), using development tools such as Eclipse and Visual Studio
  • Worked on NoSQL databases including Cassandra, MongoDB, MarkLogic, and HBase; managed and reviewed Hadoop log files; worked with HCatalog to open up access to the Hive Metastore
  • Experience in importing/exporting data with Sqoop between HDFS and relational database systems/mainframes, and in using the Hadoop Streaming utility to run MapReduce jobs (a minimal streaming sketch follows this summary)
  • Good working knowledge of Hadoop administration activities such as cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration; installed and configured HBase, HDFS, Pig, Hive, and Hadoop MapReduce
  • Expertise in Tableau Server configuration and dashboard building
  • Experience in data cleansing and visualization using Paxata and Tableau
  • Hands-on experience configuring fully automated Microsoft cloud solutions such as Microsoft System Center, utilizing Hyper-V and VMware virtualization solutions
  • Performed database administration activities such as backup, recovery, integrity checks, and index reorganization; worked with disaster recovery solutions such as replication and log shipping
  • Hands-on experience in application development using Java, Python, C, COBOL, JCL, RDBMS, DB2, and Linux shell scripting
  • Expertise in DB2 UDB, Oracle, SQL Server 2000/2005/2008, PL/SQL, and MySQL, with Oracle Workforce certifications
  • Hands-on experience in implementing the Lambda Architecture
  • Experience working with Flume, Kafka, and Storm to handle large volumes of streaming data
  • Hands-on experience in developing Sqoop jobs to import data from RDBMS sources such as MySQL, Oracle, and PostgreSQL into HDFS, and to export data back out
  • Experience in writing Apache Oozie workflows with Hive and Sqoop actions
  • Expertise in handling multiple relational databases like SQL Server, PostgreSQL, MySQL and Oracle
  • Expertise in NoSQL databases like MarkLogic, Cassandra, MongoDB, HBase
  • Highly motivated, with strong analytical skills and excellent communication and interpersonal skills
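
A minimal sketch, assuming Python 3 and a standard Hadoop Streaming setup, of the kind of streaming MapReduce job referenced above; the job command in the docstring and all paths are illustrative, not taken from an actual engagement.

    #!/usr/bin/env python3
    """Minimal Hadoop Streaming word count (illustrative only).

    Example invocation (jar location and HDFS paths are placeholders):
      hadoop jar hadoop-streaming.jar \
        -input /data/raw -output /data/wordcount \
        -mapper "wordcount.py map" -reducer "wordcount.py reduce" \
        -file wordcount.py
    """
    import sys

    def mapper():
        # Emit "word<TAB>1" for every token read from stdin.
        for line in sys.stdin:
            for word in line.strip().split():
                print(f"{word}\t1")

    def reducer():
        # Streaming delivers keys sorted, so equal words arrive contiguously.
        current, total = None, 0
        for line in sys.stdin:
            word, count = line.rstrip("\n").split("\t")
            if word != current:
                if current is not None:
                    print(f"{current}\t{total}")
                current, total = word, 0
            total += int(count)
        if current is not None:
            print(f"{current}\t{total}")

    if __name__ == "__main__":
        mapper() if sys.argv[1] == "map" else reducer()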

TECHNICAL SKILLS

PROGRAMMING LANGUAGES: JAVA/J2EE | JAVASCRIPT | PYTHON | C | PL/SQL

NOSQL DATASTORE: MARKLOGIC | CASSANDRA | MONGODB | HBASE

BIG DATA TECHNOLOGIES: HADOOP ECOSYSTEM | SOLR

HADOOP DISTRIBUTIONS: CLOUDERA | HORTONWORKS | MAPR

HADOOP ECOSYSTEM: HDFS | MAPREDUCE | PIG | HIVE | OOZIE | SQOOP | ZOOKEEPER | MAHOUT | FLUME | SPARK | KAFKA

CLOUD PLATFORMS: AWS | MICROSOFT SYSTEM CENTER | AZURE

DATA PREPARATION/BI TOOLS: PAXATA | TABLEAU


AMAZON WEB SERVICES: EC2 | S3 | RDS | EMR | VPC | GLACIER | CLOUDWATCH | CLOUDTRAIL | IDENTITY AND ACCESS MANAGEMENT | DATA PIPELINE | CLOUDFORMATION | SES | SNS | REDSHIFT

NETWORKING: VIRTUALIZATION | RIP V2 | VLANS | NETWORK ADMINISTRATION - UNIX | WINDOWS

DATABASE: SQL SERVER | ORACLE 9i/10g

CONFIGURATION MANAGEMENT: CHEF | PUPPET

WEB TECHNOLOGY: RESTFUL SERVICES | HTML | XHTML | CSS | JAVASCRIPT

METHODOLOGY: AGILE SCRUM

IDES, FRAMEWORKS, TOOLS: JMS | MAVEN | ECLIPSE IDE | MS OFFICE | ADOBE APPS

SERVERS: APACHE TOMCAT

MAINFRAME APPLICATION: OS/360 | Z/OS | SQL | ORACLE 9i/10g | COBOL | JCL | IMS DB | DB2 | TSO/ISPF | SDF | VSAM | FILE-MANAGER

SOFTWARE DEVELOPMENT: SOFTWARE PROJECT MANAGEMENT | SOFTWARE QUALITY ASSURANCE

OPERATING SYSTEMS: CENTOS | FEDORA | UBUNTU | Z/OS | OS/360 | WINDOWS 2008/2012

PROFESSIONAL EXPERIENCE

Confidential, NJ

Big Data Developer/Administrator

Responsibilities:

  • Involved in analysis, design, and development of technical specifications incorporating various AWS cloud technologies, making use of parallel transformation frameworks such as MapReduce (EMR)
  • Wrote JSON templates to launch infrastructure through AWS CloudFormation
  • Moved centralized in-house department store data to S3 for transformation and analytics
  • Used Amazon Elastic MapReduce to transform the data stored in S3 and store the cleansed data back in S3
  • Wrote custom MapReduce jobs along with Hive and Pig scripts to perform these transformations in EMR (see the EMR step sketch after this section)
  • Loaded the transformed data from S3 into Redshift on a scheduled basis for warehousing
  • Used AWS Data Pipeline for scheduling and orchestrating the complete ETL workflow
  • Installed and configured Tableau Server on an AWS EC2 instance connecting to Redshift for BI analytics by different departments
  • Used Active Directory through AWS Directory Service to authenticate Tableau Server users

Environment: Amazon EMR, S3, Data Pipeline, Redshift, CloudFormation, Hive, Pig, Tableau Server
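
To illustrate the S3-to-S3 transformation step described above, here is a minimal Python sketch that submits a Hive step to an existing EMR cluster with boto3; the cluster ID, bucket names, and script path are hypothetical, and the actual project may have launched the same work through CloudFormation or Data Pipeline instead.

    import boto3

    # Hypothetical identifiers; substitute a real cluster ID and S3 locations.
    CLUSTER_ID = "j-XXXXXXXXXXXXX"
    HIVE_SCRIPT = "s3://example-etl-bucket/scripts/cleanse_store_data.q"
    INPUT_PATH = "s3://example-etl-bucket/raw/store_data/"
    OUTPUT_PATH = "s3://example-etl-bucket/cleansed/store_data/"

    emr = boto3.client("emr", region_name="us-east-1")

    # Submit a Hive step that reads raw data from S3, cleanses it,
    # and writes the result back to S3 ahead of the Redshift load.
    response = emr.add_job_flow_steps(
        JobFlowId=CLUSTER_ID,
        Steps=[{
            "Name": "Cleanse store data",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": [
                    "hive-script", "--run-hive-script",
                    "--args", "-f", HIVE_SCRIPT,
                    "-d", "INPUT=" + INPUT_PATH,
                    "-d", "OUTPUT=" + OUTPUT_PATH,
                ],
            },
        }],
    )
    print("Submitted step:", response["StepIds"][0])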

Confidential, IL

Big Data Engineer

Responsibilities:

  • Designed, developed, and deployed a Lambda Architecture
  • Designed and implemented the persistence layer for sensor data collected on the app server over the MQTT and HTTP protocols
  • IoT devices send real-time events through their built-in protocols, either MQTT or HTTP
  • All events are captured by the respective brokers, which run behind a load balancer
  • Implemented the messaging system using a distributed Kafka cluster; all brokers (or the app server, for HTTP events) push incoming events into Kafka
  • Designed and implemented the processing layer (speed layer) for these events using Apache Storm, consuming them from the Kafka cluster
  • Implemented complex business rules in Storm and stored the processed data in Cassandra as well as Solr (a simplified Kafka-to-Cassandra sketch follows this section)
  • Storm also handles the ingestion of raw data from Kafka into HDFS
  • HDFS serves the same data for batch analytics
  • Designed the retrieval layer for the enterprise app, obtaining data from Cassandra as well as Solr
  • Designed the data model in Cassandra for faster query retrieval and the search mechanism in Apache Solr
  • Built scheduled reporting on the raw data store for BI analytics

Environment: Hadoop, Hive, Zookeeper, Storm, Cassandra, Java, Spring MVC, Apache Tomcat, Apache SOLR, Kafka, HortonWorks
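
The speed layer above ran as a Storm topology; the Python snippet below is only a simplified stand-in showing the same Kafka-to-Cassandra flow, assuming the kafka-python and cassandra-driver packages and hypothetical topic, host, keyspace, and table names.

    import json
    from kafka import KafkaConsumer
    from cassandra.cluster import Cluster

    # Hypothetical topic, hosts, keyspace, and table.
    consumer = KafkaConsumer(
        "sensor-events",                       # topic fed by the MQTT/HTTP brokers
        bootstrap_servers=["kafka1:9092", "kafka2:9092"],
        group_id="speed-layer",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )

    session = Cluster(["cassandra1"]).connect("iot")
    insert = session.prepare(
        "INSERT INTO sensor_readings (device_id, event_time, value) VALUES (?, ?, ?)"
    )

    # Consume events the brokers push into Kafka and persist them for
    # low-latency queries; the raw stream also lands in HDFS for batch use.
    for msg in consumer:
        event = msg.value
        session.execute(insert, (event["device_id"], event["ts"], event["value"]))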

Confidential

Responsibilities:

  • Coded a Java Maven module to perform various Amazon S3 operations
  • Wrote custom MapReduce programs to cleanse and enrich data in EMR
  • Loaded the transformed data back to S3
  • Loaded data into the MarkLogic server on AWS
  • Coded JavaScript search operations to build a search app on MarkLogic
  • Implemented keyword search queries with facets such as years of experience and URIs matched against skill sets
  • Coded a Python module for URI-based analysis (see the sketch after this section)
  • Built a Python dictionary of predefined keywords from URI-crawl data
  • Ingested the above data into MarkLogic
  • Built a faceted search on top of the data stored in MarkLogic

Environment: Amazon EMR, S3, Java, JavaScript, Python, MarkLogic (NoSQL)
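
A compact sketch of the URI-analysis step mentioned above: counting predefined skill keywords in crawled text and writing the resulting JSON document to MarkLogic over its REST API. The endpoint, port, credentials, and keyword list are assumptions for illustration; the original module may have loaded documents differently.

    import json
    import requests
    from requests.auth import HTTPDigestAuth

    # Hypothetical keyword list and MarkLogic REST endpoint.
    KEYWORDS = ["hadoop", "hive", "pig", "spark", "cassandra"]
    ML_URL = "http://marklogic-host:8000/v1/documents"
    AUTH = HTTPDigestAuth("rest-writer", "password")

    def keyword_counts(text):
        """Build a dictionary of predefined-keyword counts from crawled text."""
        words = text.lower().split()
        return {kw: words.count(kw) for kw in KEYWORDS}

    def ingest(uri, text):
        """Store the per-URI keyword profile as a JSON document in MarkLogic."""
        doc = {"source_uri": uri, "skills": keyword_counts(text)}
        doc_uri = "/analysis/" + uri.strip("/").replace("/", "_") + ".json"
        resp = requests.put(
            ML_URL,
            params={"uri": doc_uri},
            auth=AUTH,
            headers={"Content-Type": "application/json"},
            data=json.dumps(doc),
        )
        resp.raise_for_status()

    ingest("example.com/resumes/123", "Experienced Hadoop and Hive developer ...")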

Confidential, MN

Big Data Developer/Analyst

Responsibilities:

  • Involved in analysis, design, and development of technical specifications incorporating various Big Data technologies
  • Involved in the design and development phases of the Software Development Life Cycle (SDLC) using the Scrum methodology
  • Managed the existing cluster; worked on commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration
  • Developed the data pipeline using Flume and Sqoop to ingest customer behavioral data and purchase histories into HDFS for analysis
  • Used Pig to perform data validation on the data ingested through Sqoop and Flume; the cleansed data set was pushed into HBase
  • Used Cassandra integrations and computed various metrics for reporting on the Tableau dashboard
  • Extensively used the Cloudera CDH distribution of Hadoop
  • Developed job flows in Oozie to automate the workflow for Pig and Hive jobs
  • Loaded the aggregated data into DB2 from the Hadoop environment using Sqoop for reporting on the dashboard
  • Used Impala to connect Tableau for reporting (see the Impala query sketch after this section)
  • Participated in the requirement gathering and analysis phase of the project, documenting business requirements by conducting workshops/meetings with various business users

Environment: Red Hat Linux, HDFS, Cloudera CDH, Map-Reduce, Hive, Java JDK1.6, Pig, Sqoop, Flume, Zookeeper, Oozie, DB2, Impala, Tableau
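
The resume names Impala as the Tableau data source; the short Python check below uses the impyla driver (an assumption, not stated in the original work) to query a hypothetical aggregated table of the kind a dashboard would read.

    from impala.dbapi import connect

    # Hypothetical Impala daemon host and aggregated reporting table.
    conn = connect(host="impala-daemon", port=21050)
    cur = conn.cursor()

    # Sanity-check the aggregates the Tableau dashboard reads.
    cur.execute(
        "SELECT purchase_date, COUNT(*) AS orders, SUM(amount) AS revenue "
        "FROM analytics.customer_purchases "
        "GROUP BY purchase_date ORDER BY purchase_date DESC LIMIT 7"
    )
    for row in cur.fetchall():
        print(row)

    cur.close()
    conn.close()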

Confidential, NY

No-SQL/ DB2 BI Developer

Responsibilities:

  • Loaded structured data from different applications, stored mainly in DB2, into MongoDB using JDBC-ODBC connectivity
  • Data extracted from different RDBMSs was converted to JSON objects and pushed into MongoDB (see the pymongo sketch after this section)
  • Unstructured files such as XML and JSON files were processed using a custom-built Java API and pushed into MongoDB
  • Responsible for data modeling in MongoDB to load data arriving in both structured and unstructured form
  • Developed a unified Turnover system that accepts both structured and unstructured data and inserts it into MongoDB according to the rules defined in the system
  • Performed data validation in the Turnover system while ingesting data into MongoDB
  • Developed data export options in the Turnover system so that data can be loaded from the data warehouse into any RDBMS on demand or exported as flat files

Environment: Linux, Java JDK1.6, MongoDB, Java J2EE, various RDBMS
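
The actual Turnover system was Java-based; as a rough Python sketch of the same relational-to-JSON flow, the snippet below reads rows from a local SQLite stand-in for the DB2 source and bulk-inserts them into MongoDB with pymongo. The connection string, table, and collection names are assumptions.

    import sqlite3                      # local stand-in for the DB2/JDBC source
    from pymongo import MongoClient

    # Hypothetical source table and target collection.
    src = sqlite3.connect("turnover.db")
    src.row_factory = sqlite3.Row
    rows = src.execute("SELECT emp_id, dept, hire_date, status FROM turnover")

    client = MongoClient("mongodb://mongo-host:27017")
    collection = client["hr"]["turnover_events"]

    # Convert each relational row into a JSON-style document and bulk insert.
    docs = [dict(row) for row in rows]
    if docs:
        collection.insert_many(docs)

    print("Inserted", len(docs), "documents")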

Confidential

Mainframe/ DB2 Developer

Responsibilities:

  • Coded JCL with changes to storage disks
  • Coded new COBOL modules
  • Analyzed the structural schema for improvements
  • Integrated new DB2 queries for module improvements
  • Performed tool-based analytics with SPUFI, SDF, etc.

Environment: Z/OS, JCL, COBOL, DB2, CICS, SPUFI, QMF, SDF

Confidential

Responsibilities:

  • Tracked modules for various DML operations on DB2
  • Altered DB2 modules to improve efficiency
  • Performed tool-based analytics with SPUFI, SDF, etc.
  • Performed CICS operations on various modules for additions, deletions, updates, etc.
  • Performed tuning operations on recovery modules

Environment: Z/OS, JCL, COBOL, DB2, CICS, SPUFI, QMF, SDF

Confidential

Responsibilities:

  • Performed day-to-day maintenance on data retrieval
  • Tracked modules implementing various hierarchical relationships in IMS DB
  • Analyzed the structural schema for improvements
  • Coded modules for fetch operations with IMS DB
  • Performed tuning operations on recovery modules

Environment: Z/OS, JCL, COBOL, VSAM, IMS DB
