Sr. Cassandra Developer/Admin Resume
Mt Laurel, NJ
SUMMARY
- Over 7 years of professional IT experience in Big Data technologies and data analysis, with 5+ years of hands-on experience in the design and development of applications in Java and its related frameworks.
- Experienced in installing, configuring and monitoring DataStax Cassandra clusters, DevCenter and OpsCenter.
- Excellent understanding of Cassandra architecture and management tools such as OpsCenter.
- Strong knowledge of the Cassandra read and write processes, including SSTables, MemTables and the commit log.
- Experience querying data in Cassandra clusters using CQL (Cassandra Query Language).
- Used DataStax OpsCenter and nodetool utilities to monitor the cluster.
- Experience taking data backups with nodetool snapshots.
- Experience loading SSTable data onto a live cluster.
- Experience using Sqoop to import data into Cassandra tables from different relational databases.
- Tested the application and the cluster with different consistency levels to measure how read and write performance varies with the consistency level.
- Experience importing data from various sources into the Cassandra cluster using Java APIs.
- Experience in Cassandra data modeling.
- Experience creating tables involving collections, TTLs, counters and UDTs as part of data modeling.
- Knowledge of the Hadoop ecosystem.
- Good experience using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud to support data analysis projects.
- Basic knowledge of Spark and Scala programming.
- Knowledge of managing and scheduling backup and restore operations.
TECHNICAL SKILLS
- Cassandra, DataStax, DevCenter and OpsCenter, nodetool, Spark on Cassandra
- Cassandra, HBase.
- SQL Server, MySQL, Oracle.
- HDFS, MapReduce, HBase, YARN, Hive, Spark, Oozie, ZooKeeper, Sqoop, and Pig.
- Apache Tomcat, WebLogic, WebSphere, JBoss.
- Eclipse, Visual Studio, NetBeans, PyCharm, Git, Spark Streaming and Batch, Power BI, MemSQL, Caching Frameworks (ElastiCache), Maven, R Programming Language, OpsCenter, DevCenter, nodetool, JIRA, Ant, MATLAB.
PROFESSIONAL EXPERIENCE
Sr. Cassandra Developer/Admin
Confidential, Mt. Laurel, NJ
Responsibilities:
- Involved in capacity planning and requirements gathering for a multi-datacenter Cassandra cluster.
- Involved in designing the Cassandra architecture.
- Involved in NoSQL database design, integration, and implementation.
- Involved in conceptual and physical data modeling.
- Installed, configured and tested a DataStax Enterprise Cassandra multi-node cluster with 4 datacenters of 5 nodes each.
- Installed and configured the Cassandra cluster and CQL on the cluster.
- Involved in migrating data from one database to another.
- Involved in Cassandra data modeling and building an efficient distributed schema.
- Involved in the data mover process for disaster recovery platforms, covering backup and recovery.
- Experienced in upgrading the existing Cassandra cluster to the latest releases.
- Experienced in OpsCenter and its capabilities.
- Experienced in Apache Spark with Scala.
- Experienced in storing the analyzed results into the Cassandra cluster.
- Experienced in provisioning and managing a multi-datacenter Cassandra cluster in a public cloud environment: Amazon Web Services (AWS) EC2.
- Imported data from various sources into the Cassandra cluster using Java APIs.
- Enabled security properties such as the authenticator and authorizer to protect data from unauthorized users.
- Implemented consistency levels for read and write queries depending on the use case.
- Optimized the Cassandra cluster by changing Cassandra properties and Linux OS configurations.
- Worked closely with DataStax to resolve cluster issues through the ticketing system.
- Configured performance tuning and monitoring for Cassandra read and write processes to achieve fast I/O operations and low latency.
- Performed stress and performance testing to benchmark the cluster.
- Administered the Cassandra cluster using DataStax OpsCenter and monitored CPU usage, memory usage and health of nodes in the cluster.
- Configured the cluster based on the benchmarking results to achieve maximum throughput and minimum execution time.
- Loaded and transformed large sets of structured, semi-structured and unstructured data in various formats like text, zip, XML, YAML and JSON.
- Experience using DataStax Pig functionality to develop Pig UDFs for manipulating the data and extracting useful information according to business requirements.
- Experienced in performance tuning a Cassandra cluster to optimize writes and reads using the cassandra-stress tool.
- Worked extensively on the Java persistence layer during application migration to Cassandra, using Spark to load data to and from the Cassandra cluster.
- Used GitHub version control for tagging new versions.
- Configured, documented and demonstrated internode communication between Cassandra nodes and clients using SSL encryption.
- Knowledge of applying updates and maintenance patches to existing clusters.
- Scheduled repair and cleanup processes in the production environment.
Environment: Cassandra 2.1, DevCenter, Cqlsh, OpsCenter, Nodetool, UNIX, Cassandra-stress, Shell Scripting, GitHub, Maven, Solr, Sqoop, Spark and Scala.
Sr. Cassandra Consultant
Confidential, St. Louis, MO
Responsibilities:
- Installed, configured, administered and monitored a multi-datacenter Cassandra cluster.
- Modified the cassandra.yaml file to set configuration properties such as cluster name, node addresses, seed provider, replication factors, memtable size and flush times.
- Tuned the cassandra.yaml and cassandra-env.sh files to improve performance.
- Analyzed the performance of the Cassandra cluster using nodetool tpstats and cfstats for thread analysis and latency analysis.
- Used data modeling best practices such as the partition-per-query strategy and denormalizing data for better read performance of the Cassandra cluster.
- Used Apache Spark to analyze data.
- Designed and automated the installation and configuration of secure DataStax Enterprise Cassandra using Chef recipes.
- Prepared Chebotko diagrams for logical and physical models during the data modeling phase.
- Produced item frequency plots and graphs using toolkits such as R and MATLAB.
- Hands-on experience using Solr for extensive search over the Cassandra database cluster built on DataStax, using dynamic fields and faceting.
- Created the required keyspaces and column families based on the queries and application workflow.
- Collaborated with application developers to decide on compaction strategies, replication factors and consistency levels.
- Worked with Cassandra Query Language (CQL) to execute queries on the data stored in the Cassandra cluster.
- Designed and implemented a strategy to securely move production data to the development environment for testing using sstableloader.
- Worked on major and minor cluster upgrades; knowledge of applying updates and maintenance patches to existing clusters.
- Involved in benchmarking the Cassandra cluster for performance using the cassandra-stress tool.
- Suggested data modeling and performance tuning techniques.
- Used AWS ElastiCache to eliminate database data access bottlenecks and speed up application data delivery.
- Configured internode communication between Cassandra nodes and clients using SSL encryption.
- Administered the Cassandra cluster using DataStax OpsCenter and monitored CPU usage, memory usage and health of nodes in the cluster.
- Gained knowledge of Solr.
- Queried and analyzed data from DataStax Cassandra for quick searching, sorting and grouping.
- Used the Spark Cassandra Connector to load data to and from Cassandra.
- Fixed bugs and troubleshot operational issues as they occurred.
Environment: Cassandra 2.1, DSE 4.0, Solr, Spark, Scala, AWS-EC2, AWS-S3, AWS-IAM, Visio, OpsCenter, DevCenter, Nodetool, Linux, Chef, Cassandra-stress.
Cassandra Admin
Confidential
Responsibilities:
- Benchmarked the Cassandra cluster based on the expected traffic for the use case and optimized it for low latency.
- Involved in a PoC to migrate from the existing repair process to incremental repairs.
- Extensively worked on implementing multiple interfaces (private/public) for C* communication.
- Excellent knowledge and understanding of Cassandra architecture.
- Excellent knowledge of tuning compaction parameters to reduce disk space usage.
- Excellent knowledge of deploying and working with multi-datacenter clusters.
- Excellent knowledge of configuring JVM-level settings and making efficient use of gc_grace_seconds.
- Actively involved in migrating Cassandra to a higher version (from 2.0 to 2.2).
- Involved in business requirement gathering and proof of concept creation.
- Involved in conceptual and physical data modeling.
- Modified cassandra.yaml files to set configuration properties such as cluster name, node addresses, seed provider, replication factors, memtable size and flush times.
- Used the SoapUI tool to test REST web service operations.
- Used DataStax OpsCenter for maintenance operations and for keyspace and table management.
- Involved in automating the movement of snapshots from C* disk to tape drive.
- Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark.
- Built and maintained abstraction between SOA services and underlying technologies.
- Developed and supported SOA processes.
- Participated in NoSQL database integration and implementation.
- Tuned and recorded performance of Cassandra clusters by altering the JVM parameters.
- Queried and analyzed data from DataStax Cassandra for quick searching, sorting and grouping.
Environment: Cassandra 2.1, DevCenter, Cqlsh, Java, JDK 1.6.0_25, J2EE, Eclipse v3.5, OpsCenter, Nodetool, UNIX, Cassandra-stress, Shell Scripting, GitHub, Maven.