Sr. Cassandra Administrator Resume
Tampa, FL
SUMMARY
- Around 4.5 years of IT experience, including 4+ years with excellent knowledge of Cassandra Administration and Development (NoSQL)
- Hands-on experience in development and design of applications in Java and its related frameworks.
- Experience in designing, planning, administering, installing, configuring, troubleshooting, performance monitoring and fine-tuning of Cassandra clusters.
- Experience in RESTful services.
- Server-side scripting engine development and deployment with JavaScript and Node.js.
- Service-Oriented Architecture (SOA).
- Web services design, coding and deployment.
- Experience in developing responsive, feature-rich web applications using front-end and back-end technologies like Object-Oriented (OO) JavaScript, React JS, Angular JS and Node.js.
- Extensive experience with client- and server-side JavaScript frameworks such as React JS and Angular JS, and building AJAX-driven Single Page Applications (SPA) with accessible web applications.
- Superior knowledge of Cassandra architecture, with a strong understanding of read and write processes, including SSTables, memtables and the commit log.
- Good Understanding of Distributed Systems and Parallel Processing architecture.
- Excellent knowledge of CQL (Cassandra Query Language) for retrieving data from Cassandra by running CQL queries.
- Experience with the cassandra-stress tool for benchmarking Cassandra clusters.
- Good knowledge of Cassandra cluster topology and virtual nodes.
- Experience in installing multi-datacenter and multi-rack Cassandra clusters.
- Two projects' worth of experience exporting data from RDBMS into DataStax Cassandra clusters using the Java driver or Sqoop.
- Experience in Cassandra data modeling, along with managing and scheduling data backup and restore operations.
- Experience in Tuning the JVM.
- Good knowledge of implementing the DataStax Java driver to connect to, load and retrieve data from a Cassandra database.
- Experience in setting up alerts in OpsCenter.
- Excellent knowledge of size-tiered and leveled compaction strategies.
- Involved in designing various stages of migrating data from RDBMS to Cassandra.
- Experience deploying Cassandra clusters in the cloud and on premises, and working on data storage and disaster recovery for Cassandra.
- Experience in development using the Cassandra Java driver (2.x).
- Designed data models in Cassandra and worked with Cassandra Query Language.
- In-depth knowledge of Cassandra read and write paths and internal architecture.
- Implemented multi-datacenter and multi-rack Cassandra clusters.
- Experience in design, development, maintenance and support of Big Data Analytics using Hadoop Ecosystem components like HDFS, Hive, Pig, HBase, Sqoop, Flume, Zookeeper, MapReduce, and Oozie.
- Strong working experience with ingestion, storage, querying, processing and analysis of big data.
- Experience in installation, configuration, supporting and managing Hadoop clusters.
- Worked in multiple environments on installation and configuration of Hadoop clusters.
- Experience with SQL, PL/SQL and database concepts.
- Good understanding of NoSQL databases.
- Hands-on experience with Amazon Web Services (AWS), Amazon EC2 and EMR.
- Experience creating databases, tables and views in Hive and Impala.
- Experience working with different data sources like flat files, XML files and databases.
- Experience in database design, entity relationships, database analysis, SQL programming, stored procedures, PL/SQL, packages and triggers in Oracle and MongoDB on Unix/Linux.
- Experience in various phases of the Software Development Life Cycle (analysis, requirements gathering, design), with expertise in documenting requirement specifications, functional specifications, test plans, source-to-target mappings and SQL joins.
- Worked on different operating systems like UNIX/Linux, Windows XP and Windows 2K.
- Goal oriented self-starter, quick learner, team player and proficient in handling multiple projects simultaneously.
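The read/write internals called out above (commit log, memtable, SSTable) can be illustrated with a minimal, purely illustrative Python sketch; the class name and flush threshold are invented for the example, not production code:

```python
# Illustrative sketch of Cassandra's write path: a write is appended to the
# commit log for durability, applied to the in-memory memtable, and flushed
# to an immutable SSTable once the memtable fills up. Reads check the
# memtable first, then SSTables from newest to oldest.

class WritePathSketch:
    def __init__(self, memtable_limit=3):
        self.commit_log = []          # durable append-only log
        self.memtable = {}            # in-memory map of recent writes
        self.sstables = []            # immutable "on-disk" tables
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. durability first
        self.memtable[key] = value             # 2. then the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Flush the memtable to a new immutable SSTable and start fresh.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            if key in table:
                return table[key]
        return None

db = WritePathSketch()
for i in range(4):
    db.write(f"k{i}", i)
print(len(db.sstables), db.read("k1"))  # one flushed SSTable; k1 served from it
```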
TECHNICAL SKILLS
Cassandra: DataStax Enterprise Cassandra, open-source Cassandra; cluster management tools: OpsCenter, ccm; Kafka, Spark, cassandra-stress, Sqoop, Cloudera Manager
Server Automation Tools: Chef, Puppet
Databases: DynamoDB, Microsoft SQL Server, MySQL, Oracle
Languages: C, Java, Python, JavaScript, HTML, CSS
Operating Systems: Linux (Red Hat, CentOS, Ubuntu), Windows
Version Control and Related Tools: Git, SVN, Gerrit, Jenkins
Bug Tracking Tools: QC, Bugzilla
Job Scheduling: Autosys, Cron
PROFESSIONAL EXPERIENCE
Confidential, Tampa, FL
Sr. Cassandra Administrator
Responsibilities:
- Maintained a multi-datacenter Cassandra cluster.
- Performance-tuned the Cassandra cluster to optimize writes and reads.
- Involved in data modeling of the Cassandra schema.
- Developed designs securing the application with form-based authentication using HTML5, XHTML, JavaScript, jQuery and CSS3.
- Created a Node.js backend for RESTful web services using the Express framework and Mongoose to connect with MongoDB.
- Wrote AJAX-driven, JSON-consuming JavaScript functions to save user selections, such as radio button and drop-down menu choices, into a cookie.
- Installed and configured DataStax OpsCenter for Cassandra cluster maintenance and alerts.
- Benchmarked the Cassandra cluster based on expected traffic for the use case and optimized for low latency.
- Built Cassandra clusters on both physical machines and AWS.
- Automated Cassandra builds, installation and monitoring.
- Involved in requirements gathering and capacity planning for a multi-datacenter (four) Cassandra cluster.
- Administered and maintained a multi-rack Cassandra cluster using OpsCenter; implemented consistency levels for reads and writes based on the use case.
- Automated and deployed Cassandra environments using Chef recipes.
- Optimized the Cassandra cluster by adjusting the Cassandra configuration file and Linux OS settings.
- Set up, upgraded and maintained Cassandra DSE clusters.
- Tuned databases, provided design changes and supported stress tests to proactively fix problems.
- Worked as Cassandra admin (DataStax DSE, DevOps, NoSQL DB) on a 39-node cluster.
- Created tables in Hive and performed a one-time data load into the Hive environment.
- Converted existing SQL queries/stored procedures into the Hive environment.
- Used Apache Kafka for asynchronous exchange of information between different business applications.
- Worked with Docker containers for testing.
- Administered and maintained the cluster using OpsCenter, DevCenter, Linux, nodetool, etc.
- Migrated data from Teradata to Cassandra using Teradata FastExport and the Cassandra loader.
- Installed and configured Cassandra on the AWS platform.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Developed Spark Streaming applications for Real Time Processing.
- Worked with OpsCenter (monitoring), DevCenter and nodetool.
- Worked on Cassandra data modeling, NoSQL architecture and DSE Cassandra database administration: keyspace creation, table creation, secondary and Solr index creation, user creation and access administration.
- Worked closely with DataStax to resolve cluster issues through their ticketing mechanism.
- Resolved nodetool repair, compaction and secondary index issues.
- Performed query tuning and performance tuning on the cluster and suggested best practices for developers.
- Worked closely on Cassandra loading activity for history and incremental loads from Teradata and Oracle databases, resolving loading issues and tuning the loader for optimal performance.
Environment: Cassandra 2.1, AWS, Teradata, SQL, DataStax 4.7, DevCenter, cqlsh, OpsCenter, shell scripting, Hive, Apache Kafka, Docker, Stack trace.
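The consistency-level work above follows a simple arithmetic rule that can be sketched in Python; the function names and sample values are illustrative only:

```python
# Hedged sketch of how read/write consistency levels interact with the
# replication factor (RF): QUORUM is floor(RF/2) + 1, and a read is
# guaranteed to see the latest write when R + W > RF.

def quorum(rf: int) -> int:
    """Number of replicas that must respond for QUORUM at this RF."""
    return rf // 2 + 1

def is_strongly_consistent(read_nodes: int, write_nodes: int, rf: int) -> bool:
    """True when every read quorum overlaps every write quorum (R + W > RF)."""
    return read_nodes + write_nodes > rf

rf = 3
print(quorum(rf))                                          # 2
print(is_strongly_consistent(quorum(rf), quorum(rf), rf))  # True: 2 + 2 > 3
print(is_strongly_consistent(1, 1, rf))                    # False: ONE/ONE can miss a write
```

QUORUM/QUORUM is the common choice when both strong consistency and tolerance of one node failure (at RF=3) are needed.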
Confidential, Dallas, TX
Sr. Cassandra Administrator
Responsibilities:
- Responsible for building scalable distributed data solutions using DataStax Cassandra.
- Involved in business requirement gathering and proof of concept creation.
- Created data models in CQL for customer data.
- Involved in Hardware installation and capacity planning for cluster setup.
- Involved in the hardware decisions like CPU, RAM and disk types and quantities.
- Used the Spark-Cassandra Connector to load data to and from Cassandra.
- Worked with the data architect and the Linux admin team to set up, configure, initialize and troubleshoot an experimental cluster of 12 nodes with 3 TB of RAM and 60 TB of disk space.
- Ran many performance tests using the cassandra-stress tool to measure and improve the read and write performance of the cluster.
- Wrote and modified YAML scripts to set configuration properties such as node addresses, replication factors, client storage space, memtable size and flush times.
- Used DataStax OpsCenter for maintenance operations and keyspace and table management.
- Loaded and transformed large sets of structured, semi structured and unstructured data in various formats like text, zip, XML, YAML and JSON.
- Created data models for customer data using the Cassandra Query Language.
- Used collections like lists, sets and maps to create data models highly optimized for reads and writes.
- Created user-defined types to store specialized data structures in Cassandra.
- Developed Pig UDFs for manipulating data and extracting useful information according to business requirements, and implemented them using the DataStax Pig functionality.
- Responsible for creating Hive tables based on business requirements.
- Implemented advanced procedures like text analytics and processing using the in-memory computing capabilities like Spark.
- Enhanced and optimized product Spark code to aggregate, group and run data mining tasks using the Spark framework.
- Implemented clustering algorithms in Mahout to cluster consumers by purchase location and general purchase category, in order to create specialized and targeted credit and foreign-exchange products.
- Implemented a distributed messaging queue integrated with Cassandra using Apache Kafka and ZooKeeper.
- Involved in a POC to implement a failsafe distributed data storage and computation system using Apache YARN.
- Involved in the implementation of a POC using the OpenStack Cloud Computing Framework.
- Tuned and recorded performance of Cassandra clusters by altering JVM parameters such as -Xmx and -Xms. Changed garbage-collection cycles to align them with backups/compactions to mitigate disk contention.
- Queried and analyzed data from DataStax Cassandra for quick searching, sorting and grouping.
- Implemented partitioning, dynamic partitions and buckets in Hive for efficient data access.
- Participated in NoSQL database integration and implementation.
- Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports.
- Gathered the business requirements from the Business Partners and Subject Matter Experts like Data Scientists.
Environment: Apache Hadoop 2.2.0, Cloudera 4.5, HDP 1.2, Apache Kafka, Cassandra, MapReduce, Spark, Hive 0.12, Pig 0.11, HBase, Linux, XML.
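The data distribution underlying the cluster work above can be sketched as a toy token ring. Real Cassandra uses the Murmur3 partitioner; this dependency-free sketch substitutes MD5 (closer to the old RandomPartitioner), and the node names and vnode count are invented for illustration:

```python
# Toy token ring: each node owns several tokens (vnodes) on a hash ring, and
# a partition key belongs to the first token clockwise from its hash.
import bisect
import hashlib

class TokenRing:
    def __init__(self, nodes, tokens_per_node=8):
        # Assign tokens_per_node pseudo-random tokens to each node.
        self.ring = sorted(
            (self._hash(f"{node}-{i}"), node)
            for node in nodes
            for i in range(tokens_per_node)
        )
        self.tokens = [token for token, _ in self.ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, partition_key: str) -> str:
        # Find the first token >= hash(key), wrapping around the ring.
        idx = bisect.bisect(self.tokens, self._hash(partition_key)) % len(self.ring)
        return self.ring[idx][1]

ring = TokenRing(["node1", "node2", "node3"])
owners = {ring.node_for(f"customer-{i}") for i in range(100)}
print(len(owners), "node(s) own the 100 sample keys")
```

Because placement depends only on the key's hash, any coordinator can route a request without a central lookup table.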
Confidential
Cassandra Administrator
Responsibilities:
- Responsible for the build-out, day-to-day management and support of Cassandra clusters.
- Configured backups, alerts, repairs and monitoring of Cassandra clusters using OpsCenter.
- Troubleshot performance issues.
- Troubleshot read/write latency and timeout issues using nodetool cfstats, tpstats and cfhistograms.
- Involved in migrating data from Oracle to Cassandra.
- Created the upgrade plans for DSE upgrades.
- Designed and developed an API for riders' preferences with full CRUD capabilities.
- Installed DataStax Cassandra 4.5.1 in production and testing environments per best practices.
- Installed DataStax OpsCenter for monitoring purposes.
- Administered, monitored and maintained a multi-datacenter Cassandra cluster in production using OpsCenter and Nagios.
- Worked closely with developers on choosing the right compaction strategies and consistency levels.
- Involved in Cassandra cluster environment administration, including commissioning and decommissioning nodes, cluster capacity planning, performance tuning, cluster monitoring and troubleshooting.
- Performed daily administrative tasks: cluster health checks, balancing and NameNode metadata backup.
- Performed backups, and added libraries and JARs to successfully migrate from the existing infrastructure to the latest releases.
Environment: DataStax 4.7, Cassandra 2.1, DevCenter, cqlsh, OpsCenter, shell scripting, Oracle 11g, Eclipse, SQL, Windows 7, Log4j, Git, AWS.
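The latency troubleshooting above leans on percentile reporting in the spirit of nodetool cfhistograms; here is a small sketch of that calculation with made-up sample data:

```python
# Nearest-rank percentile over latency samples (microseconds), the kind of
# summary nodetool cfhistograms reports per table. Sample data is invented;
# the single 9000 us outlier shows why high percentiles matter more than means.

def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples."""
    ordered = sorted(samples)
    rank = max(0, int(round(pct / 100 * len(ordered))) - 1)
    return ordered[rank]

# Hypothetical read latencies with one slow outlier (e.g. a GC pause).
latencies = [120, 135, 150, 160, 180, 200, 210, 250, 300, 9000]

for pct in (50, 95, 99):
    print(f"p{pct}: {percentile(latencies, pct)} us")
```

A healthy p50 alongside a huge p99, as here, usually points at intermittent stalls (GC, compaction, a hot partition) rather than uniformly slow reads.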
Confidential
Big data Developer
Responsibilities:
- Installed and configured Apache Hadoop clusters for application development, along with Hadoop tools like Hive, Pig, Oozie, ZooKeeper, HBase, Flume and Sqoop.
- Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing.
- Responsible for cluster maintenance, adding and removing cluster nodes, and cluster monitoring.
- Hands-on experience writing Linux/Unix shell scripts and Python scripts.
- Developed a data cleaner in Python for IMDB data files for a research project.
- Wrote Python scripts for internal testing that read data from a file and push it into a Kafka queue, which in turn is consumed by the Storm application.
- Troubleshot, managed and reviewed data backups and log files.
- Responsible for managing data coming from various sources.
- Worked with Python to create UDFs used in Pig scripts, and used Python to create graphs for data analysis.
- Involved in importing the real-time data to Hadoop using Kafka and implemented the Oozie job for daily imports.
- Managed and scheduled Jobs on a Hadoop cluster.
- Developed REST APIs in Python for video streaming interfaces.
- Involved in defining job flows, managing and reviewing log files.
- Developed an analytics data store in Python and MongoDB for data analysis.
- Installed the Oozie workflow engine to run multiple MapReduce, HiveQL and Pig jobs.
- Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
- Participated in requirements gathering from the experts and business partners, and converted the requirements into technical specifications.
- Created Hive and Impala tables to store the processed results in tabular format.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Supported and maintained the HDFS architecture.
- Collaborated with application teams to install operating system and Hadoop updates, patches and version upgrades when required.
Environment: Hadoop, MapReduce, HDFS, Java, Sqoop, Flume, Kafka, Linux, Oozie, Python, Pig, AWS, Scala, ETL, MySQL, JIRA, Hive, Jenkins, HBase, Oracle.
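The MapReduce jobs above follow the classic map/shuffle/reduce pattern. A minimal in-process Python sketch (the real jobs ran in Java on Hadoop; input lines here are illustrative):

```python
# Word count as map -> shuffle -> reduce: map emits (key, 1) pairs with basic
# cleansing, shuffle groups values by key, reduce sums each group.
from collections import defaultdict

def map_phase(lines):
    for line in lines:
        for word in line.lower().split():
            yield word.strip(".,"), 1   # cleansing: lowercase + strip punctuation

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

lines = ["Hadoop cleans data.", "hadoop processes data, big data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["hadoop"], counts["data"])  # 2 3
```

On a real cluster the shuffle is the distributed sort/partition step between mappers and reducers; here it is a simple in-memory grouping.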
