Sr. Cassandra Developer/Admin Resume

Mt Laurel, NJ

PROFESSIONAL SUMMARY:

  • Over 7 years of professional IT experience in Big Data technologies and data analysis, with 4+ years of hands-on experience in the design and development of applications in Java and its related frameworks.
  • Experienced in installing, configuring and monitoring DataStax Cassandra clusters, DevCenter and OpsCenter.
  • Excellent understanding of Cassandra architecture and management tools such as OpsCenter.
  • Strong knowledge of the read and write processes, including SSTables, MemTables and the commit log.
  • Experience querying data in Cassandra clusters using CQL (Cassandra Query Language).
  • Used DataStax OpsCenter and nodetool utilities to monitor the cluster.
  • Experience in taking data backups with nodetool snapshots.
  • Experience in loading SSTable data onto a live cluster.
  • Experience in using Sqoop to import data into Cassandra tables from different relational databases.
  • Tested the application and the cluster at different consistency levels to measure read and write performance at each level.
  • Experience in importing data from various sources into the Cassandra cluster using Java APIs.
  • Experience in Cassandra data modeling.
  • Experience in creating tables involving collections, TTLs, counters and UDTs as part of data modeling.
  • Knowledge of Hadoop Ecosystem.
  • Good experience in using Elastic Map Reduce on Amazon Web Services (AWS) cloud for supporting data analysis projects.
  • Basic knowledge of Spark and Scala programming.
  • Knowledge of managing and scheduling backup and restore operations
  • Experience in benchmarking Cassandra clusters using the cassandra-stress tool.
  • Developed wrapper applications involving file I/O processing and data mining using Python.
  • Read data from local files, XML files and Excel files, and handled input/output processing using different Python packages.
  • Experience in Python, Jupyter and the scientific computing stack (NumPy, SciPy, pandas and Matplotlib).
  • Good experience in all the phases of Software Development Life Cycle (Analysis of requirements, Design, Development, Verification and Validation, Deployment)
  • Hands-on experience in application development using Java, .NET, RDBMS, and Linux shell scripting.
  • Experience working with Java, J2EE, JDBC, ODBC and Servlets.
  • Experience using Eclipse, Visual Studio and DBMSs such as Oracle, MySQL and SQL Server.
  • Evaluate and propose new tools and technologies to meet the needs of the organization.
  • Good knowledge of Unified Modeling Language (UML) and Agile methodologies.
  • Excellent team player, self-starter with effective communication skills.

TECHNICAL SKILLS:

Apache Cassandra: Cassandra, DataStax, DevCenter and OpsCenter, nodetool, Spark on Cassandra

NoSQL Database: Cassandra, HBase.

Relational Database: SQL Server, MySQL, Oracle.

Hadoop System: HDFS, MapReduce, HBase, YARN, Hive, Spark, Oozie, ZooKeeper, Sqoop, and Pig.

Servers: Apache Tomcat, WebLogic, WebSphere, JBoss.

Others: Eclipse, Visual Studio, NetBeans, PyCharm, Git, Maven, OpsCenter, DevCenter, nodetool, JIRA, Ant.

Operating Systems: Windows, Macintosh, Linux.

PROFESSIONAL EXPERIENCE:

Sr. Cassandra Developer/Admin

Confidential, Mt. Laurel, NJ

Responsibilities:

  • Involved in capacity planning and requirements gathering for a multi-datacenter Cassandra cluster.
  • Involved in designing the Cassandra architecture.
  • Involved in NoSQL database design, integration, and implementation.
  • Involved in conceptual and physical data modeling.
  • Installed, configured and tested a DataStax Enterprise Cassandra multi-node cluster with 4 datacenters of 5 nodes each.
  • Installed and configured the Cassandra cluster and CQL on the cluster.
  • Involved in migrating data from one database to another.
  • Involved in Cassandra data modeling and building an efficient distributed schema.
  • Involved in the data mover process for backup and recovery on disaster recovery platforms.
  • Experienced in upgrading the existing Cassandra cluster to the latest releases.
  • Experienced in OpsCenter and its capabilities.
  • Experienced in Apache Spark with Scala.
  • Experienced in storing the analyzed results into the Cassandra cluster.
  • Experienced in provisioning and managing a multi-datacenter Cassandra cluster in a public cloud environment: Amazon Web Services (AWS) EC2.
  • Imported data from various sources into the Cassandra cluster using Java APIs.
  • Enabled security properties such as the authenticator and authorizer to protect data from unauthorized users.
  • Implemented consistency levels for read and write queries depending on the use case.
  • Optimized the Cassandra cluster by changing Cassandra properties and Linux OS configurations.
  • Worked closely with DataStax to resolve cluster issues through the ticketing system.
  • Tuned and monitored Cassandra read and write processes for fast I/O operations and low latency.
  • Performed stress and performance testing to benchmark the cluster.
  • Administered the Cassandra cluster using DataStax OpsCenter and monitored CPU usage, memory usage and health of nodes in the cluster.
  • Tuned the configuration based on the benchmarking results to maximize throughput and minimize execution time.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data in various formats like text, zip, XML, YAML and JSON.
  • Experience using DataStax Pig functionality to develop Pig UDFs for manipulating the data and extracting useful information according to business requirements.
  • Experienced in performance tuning a Cassandra cluster to optimize writes and reads using the cassandra-stress tool.
  • Worked extensively on the Java persistence layer during application migration to Cassandra, using Spark to load data to and from the Cassandra cluster.
  • Used GitHub version control for tagging new versions.
  • Configured, documented, and demonstrated communication between Cassandra nodes and clients using SSL encryption.
  • Knowledge of applying updates and maintenance patches to existing clusters.
  • Scheduled repair and cleanup processes in the production environment.

Environment: Cassandra 2.1, DevCenter, cqlsh, OpsCenter, nodetool, UNIX, cassandra-stress, Shell Scripting, GitHub, Maven, Solr, Sqoop, Spark and Scala.

Sr. Cassandra Consultant

Confidential, St. Louis, MO

Responsibilities:

  • Installed, configured, administered and monitored a multi-datacenter Cassandra cluster.
  • Modified the cassandra.yaml file to set configuration properties such as cluster name, node addresses, seed provider, replication factors, memtable size and flush times.
  • Tuned the cassandra.yaml and cassandra-env.sh files to improve performance.
  • Analyzed the performance of the Cassandra cluster using nodetool tpstats and cfstats for thread analysis and latency analysis.
  • Applied data modeling best practices such as the partition-per-query strategy and denormalizing data for better read performance of the Cassandra cluster.
  • Used Apache Spark to analyze data.
  • Designed and automated the installation and configuration of secure DataStax Enterprise Cassandra using Chef recipes.
  • Prepared Chebotko diagrams for logical and physical models during the data modeling phase.
  • Good hands-on experience with Solr for extensive search over the Cassandra database cluster built on DataStax, using dynamic fields and faceting.
  • Created the required keyspaces and column families based on the queries and application workflow.
  • Collaborated with application developers to decide on topics such as compaction strategies, replication factors and consistency levels.
  • Worked with Cassandra Query Language (CQL) to execute queries on the data persisted in the Cassandra cluster.
  • Designed and implemented a strategy to securely move production data to Development for testing using sstableloader.
  • Worked on major and minor cluster upgrades, including applying updates and maintenance patches to existing clusters.
  • Involved in benchmarking the Cassandra cluster for performance using the cassandra-stress tool.
  • Suggested data modeling performance and tuning techniques.
  • Configured communication between Cassandra nodes and clients using SSL encryption.
  • Administered the Cassandra cluster using DataStax OpsCenter and monitored CPU usage, memory usage and health of nodes in the cluster.
  • Gained knowledge of Solr.
  • Queried and analyzed data from DataStax Cassandra for quick searching, sorting and grouping.
  • Used the Spark Cassandra Connector to load data to and from Cassandra.
  • Fixed bugs and troubleshot operational issues as they occurred.

Environment: Cassandra 2.1, DSE 4.0, Solr, Spark, Scala, AWS-EC2, AWS-S3, AWS-IAM, Visio, OpsCenter, DevCenter, nodetool, Linux, Chef, cassandra-stress.

Cassandra Admin

Confidential, Manchester, NH

Responsibilities:

  • Benchmarked the Cassandra cluster based on the expected traffic for the use case and optimized it for low latency.
  • Involved in a PoC to migrate from the existing repair process to incremental repairs.
  • Extensively worked on implementing multiple interfaces (private/public) for C* communication.
  • Excellent knowledge and understanding of Cassandra architecture.
  • Excellent knowledge of tweaking compaction-level parameters to reduce disk space.
  • Excellent knowledge of deploying and working in multi-DC clusters.
  • Excellent knowledge of configuring JVM-level settings for efficient use of GC Grace Seconds.
  • Actively involved in migrating Cassandra to a higher version (from 2.0 to 2.2).
  • Involved in business requirement gathering and proof of concept creation.
  • Involved in the process of Conceptual and Physical Data Modeling techniques.
  • Modified cassandra.yaml files to set configuration properties such as cluster name, node addresses, seed provider, replication factors, memtable size and flush times.
  • Used DataStax OpsCenter for maintenance operations and for keyspace and table management.
  • Involved in automating the movement of snapshots from C* disk to tape drive.
  • Implemented advanced procedures like text analytics and processing using the in-memory computing capabilities of Spark.
  • Participated in NoSQL database integration and implementation.
  • Tuned and recorded performance of Cassandra clusters by altering the JVM parameters.
  • Queried and analyzed data from DataStax Cassandra for quick searching, sorting and grouping.

Environment: Cassandra 2.1, DevCenter, cqlsh, OpsCenter, nodetool, UNIX, cassandra-stress, Shell Scripting, GitHub, Maven.

BigData Consultant

Confidential, TX

Responsibilities:

  • Involved in designing the Cassandra architecture, including data modeling.
  • Automated the installation and configuration of the nodes using Puppet.
  • Administered and monitored the cluster using the nodetool utility and DataStax OpsCenter.
  • Involved in writing a Java client program to connect to the Cassandra cluster.
  • Scheduled repair and cleanup processes in the production environment during off-peak times.
  • Involved in the design, installation and configuration of a Cloudera Hadoop cluster (CDH3) using Cloudera Manager.
  • Installed and configured Hive and the Hive remote metastore using MySQL services.
  • Benchmarked the Hadoop cluster with TestDFSIO, TeraSort, NNBench, and MRBench.
  • Tuned the configuration based on the benchmarking results to maximize throughput and minimize execution time.
  • Administered and monitored HDFS for efficient functioning of the cluster, including space remaining, memory and CPU utilization, replicas, throughput, and network connectivity.
  • Automated periodic backups of NameNode metadata to multiple disks and nodes.
  • Backed up data from the active cluster to a backup cluster using distcp.
  • Loaded web log data into HDFS using Apache Flume for analysis.
  • Improved the efficiency of MapReduce jobs by tuning the configuration parameters and using a custom partitioner.
  • Used Hive and Pig UDFs for ad hoc queries.
  • Troubleshot and debugged MapReduce job failures and issues with Hive and Pig scripts.
  • Experience in storing the analyzed results back into the Cassandra cluster.

Environment: Hadoop, HBase, Hive, HDFS, Pig, MapReduce, Cassandra 1.1, Puppet, Spark, AWS, Linux.

Hadoop Developer

Confidential, New York, NY

Responsibilities:

  • Expertise in designing and deploying Hadoop clusters and Big Data analytics tools including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Kafka, Spark and Cassandra with Hortonworks and Cloudera.
  • Installed Hadoop, MapReduce, HDFS, AWS and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Analyzed business needs and functional specifications and mapped them to the design and development of MapReduce programs and algorithms.
  • Wrote Pig and Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data. Also have hands-on experience with Pig and Hive User Defined Functions (UDFs).
  • Executed Hadoop ecosystem tools and applications through Apache Hue.
  • Optimized Hadoop MapReduce code and Hive/Pig scripts for better scalability, reliability and performance.
  • Developed Oozie workflows for application execution.
  • Performed feasibility analysis for the deliverables, evaluating the requirements against complexity and timelines.
  • Performed data migration from legacy RDBMS databases to HDFS using Sqoop.
  • Wrote Pig scripts for data processing.
  • Implemented Hive tables and HQL queries for the reports. Wrote and used complex data types in Hive. Stored and retrieved data using HQL in Hive. Developed Hive queries to analyze reducer output data.
  • Highly involved in designing the next-generation data architecture for unstructured data.
  • Managed a 4-node Hadoop cluster for a client conducting a Hadoop proof of concept. The cluster had 12 cores and 3 TB of installed storage.
  • Developed Pig Latin scripts to extract data from the source system.
  • Involved in extracting data from Hive and loading it into an RDBMS using Sqoop.
  • Integrated four square monitoring and production system with Kafka.
  • Documented operational problems following standards and procedures, using JIRA as the reporting tool.
  • Worked on data modeling during application software design.

Environment: HDFS, MapReduce, Hive, Oozie, Java, Pig, Shell Scripting, Kafka, Linux, Hue, Sqoop, Flume, DB2, Oracle 11g, and data modeling.

Java Developer

Confidential

Responsibilities:

  • Involved in developing solutions to requirements, enhancements and defects.
  • Involved in requirements design, development, and system testing.
  • Implemented Action class to encapsulate the business logic.
  • Used frameworks for developing applications.
  • Used various design patterns using Core Java techniques.
  • Used Object-Oriented Analysis and Design (OOA/D) for deriving objects and classes.
  • Used stored procedures and database triggers at all levels.
  • Communicated across the team about the processes, goals, guidelines and delivery of items.
  • Developed the Java code using Eclipse as the IDE.
  • Configured Tomcat 4.1 for the application on a Windows NT server.
  • Used JavaScript for validation of page data in the JSP pages.
  • Responsible for code version management and unit test plans.

Environment: Java 1.3, Tomcat, Eclipse, SQL and Windows.
