Cassandra Consultant Resume
NY
PROFESSIONAL SUMMARY:
- 8+ years of experience in IT and 3+ years of experience in Hadoop technologies
- Experience in deploying Cassandra clusters in the cloud and on premises, and in data storage and disaster recovery for Cassandra
- Experience in designing, configuring and installing DataStax Cassandra
- Hands-on experience with the Cassandra Hector API and Cassandra data modeling
- Hands on experience in monitoring the cluster with Zabbix
- Designed and automated the installation and configuration of secure DataStax Enterprise Cassandra using Puppet
- Experience in configuring, installing, benchmarking and managing Apache Hadoop and Cloudera Hadoop distributions
- Experience in using automation tools like Puppet for deploying Hadoop clusters
- Hands-on experience in developing distributed programs using the MapReduce framework
- Experience working with Flume to load data from web logs
- Excellent understanding of performance tuning, commissioning and decommissioning, log rotation, Fair scheduler and Capacity scheduler
- Experience in working with BI teams to translate big data requirements into Hadoop-centric solutions
- Experience in performance tuning the Hadoop cluster by gathering and analyzing the existing infrastructure.
- Experience in setting up the Hadoop cluster for the Environment
- Experience in automating Hadoop installation, configuration and cluster maintenance using tools like Puppet
- Experience in troubleshooting and maintaining clusters
- Experience in setting up monitoring infrastructure for Hadoop cluster using Nagios and Ganglia.
- Working experience in designing and implementing complete end-to-end Hadoop infrastructure including Pig, Hive, Sqoop, Oozie, Flume and ZooKeeper
- Experience in configuring ZooKeeper to coordinate the servers in clusters and to maintain data consistency
- Experience in data migration from existing data stores and mainframe NDM (Network Data Mover) to Hadoop
- Basic Knowledge of ETL tools
- Experience in upgrading the existing Hadoop cluster to latest releases
- Experience in using NFS (Network File System) for NameNode metadata backup
- Experience in monitoring the cluster around the clock using Ganglia
- Experience in using Cloudera Manager 4.0 for installation and management of Hadoop cluster
- Experience in supporting data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud, including exporting and importing data to and from S3
- Experience in designing both time driven and data driven automated workflows using Oozie
- Experience in supporting analysts by administering and configuring Hive
- Experience in providing support to data analysts in running Pig and Hive queries
- Developed MapReduce programs to perform analysis
- Imported and exported data to and from HDFS and Hive using Sqoop
- Experience in writing shell scripts to dump sharded data from MySQL servers to HDFS
- Good knowledge of Core Java and the Collections framework
- Excellent knowledge of OOP (object-oriented programming)
TECHNICAL SKILLS:
Hadoop Ecosystem: HDFS, Hive, Pig, Flume, Oozie, ZooKeeper, HBase and Sqoop
Operating System: Unix, Linux, Windows
Databases: MySQL, Oracle, MS SQL Server
Languages: C, Java, Python, SQL, Pig Latin, Unix shell scripting
PROFESSIONAL EXPERIENCE:
Confidential - NY
Cassandra Consultant
Responsibilities:
- Administered the Cassandra cluster using DataStax OpsCenter and monitored CPU usage, memory usage and health of nodes in the cluster
- Monitored the cluster with Zabbix
- Deployed and maintained PROD, client test, QA and Dev clusters
- Migrated 60 TB of data from one datacenter to another datacenter
- Created and managed VMs in an OpenStack environment for testing purposes
- Designed and automated the installation and configuration of secure DataStax Enterprise Cassandra using Puppet
- Configured SSL encryption for internode and client-to-node communication in Cassandra
- Configured authentication for the Cassandra cluster using PasswordAuthenticator, and Kerberos for the Hadoop cluster
- Designed and configured gateway node to the cluster
- Documented and demonstrated various ways to securely connect to the cluster
- Designed and implemented a strategy to securely move production data to Development for testing purposes using sstableloader
- Designed and implemented a strategy to upgrade the DataStax Enterprise cluster
- Worked on major and minor upgrades of the cluster
- Worked with other teams in the company to decide the hardware configurations for the cluster
- Involved in data modeling design for various use cases
- Applied updates and maintenance patches to the existing clusters
Confidential, Atlanta, GA
Hadoop Administrator / Cassandra Engineer
Responsibilities:
- Installed, configured and deployed a 50-node Cloudera Hadoop cluster for development and production
- Set up high availability for the major production cluster and designed automatic failover
- Maintained and troubleshot the cluster
- Involved in designing the Cassandra architecture, including data modeling
- Automated the process of installation and configuration of the nodes using Puppet
- Upgraded the Cassandra cluster
- Benchmarked the Hadoop cluster with TestDFSIO, TeraSort, NNBench, and MRBench
- Configured rack awareness for the Hadoop cluster by writing a topology script
- Administered and monitored HDFS for efficient functioning of the cluster, including space remaining, memory and CPU utilization, replicas, throughput, and network connectivity
- Backed up data from the active cluster to a backup cluster using DistCp
- Stored the analyzed results back into the Cassandra cluster
- Configured Hive metastore, which stores the metadata for Hive tables and partitions in a relational database
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data
- Configured security for the Hadoop cluster (Kerberos, Active Directory)
- Performed Cloudera administration (performance tuning, commissioning and decommissioning)
- Managed data coming from different sources
- Installed and configured ZooKeeper for the Hadoop cluster
- Tuned MapReduce programs running on the Hadoop cluster
- Involved in HDFS maintenance and in upgrading the cluster to the latest versions of CDH
- Imported and exported data between RDBMS and HDFS using Sqoop
- Wrote Hive queries for data analysis to meet the business requirements
Confidential, Atlanta, GA
Hadoop DevOps Consultant
Responsibilities:
- Developed MapReduce programs in Java for parsing the raw data and populating staging tables
- Created Hive queries to compare the raw data with EDW reference tables and perform aggregations
- Developed workflows in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig
- Developed UDFs for Pig and Hive using Java
- Defined workflows using Oozie
- Developed Hive tables to transform and analyze the data in HDFS
- Moved data from HDFS to RDBMS and vice versa using Sqoop
- Coordinated with other programmers in the team to ensure that all the modules complement each other well
- Installed and configured Hadoop clusters in Test and Production environments
- Performed both major and minor upgrades to the existing CDH cluster
- Commissioned new nodes into, and decommissioned nodes from, the existing cluster
Confidential
Linux Administrator
Responsibilities:
- Set up and configured a VMware-based Linux environment
- Designed, deployed, and administered production server environments
- Scheduled data backups using crontab
- Developed and maintained automation systems to improve administrative efficiency
- Installed, configured, and maintained Red Hat Enterprise Linux distributions
- Good understanding of operating system updates, patches and configuration changes
- Added, removed and updated user account information and reset passwords
- Created and managed logical volumes; installed and updated packages using YUM
- Installed and maintained software packages (.rpm) as necessary
