Cassandra Database Analyst Resume
NC
SUMMARY
- 9 years of experience in all aspects of software development, including requirements analysis, development, implementation, documentation, and maintenance of web applications using Cassandra and Big Data technologies.
- Strong knowledge and understanding of Cassandra, Hadoop HDFS and MapReduce concepts, and the Hadoop ecosystem.
- Knowledge of both development and administration of the Cassandra framework.
- Experience installing, configuring, supporting, and managing Cassandra clusters.
- In-depth knowledge of JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
- Experience in Big Data analysis using Pig and Hive, and understanding of Sqoop and Puppet.
- Experience analyzing data using HiveQL and Pig Latin.
- Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
- Experience in managing Hadoop clusters using Cloudera Manager Tool.
- Experience working on NoSQL databases including HBase.
- Experience importing and exporting data with Sqoop between HDFS and relational database systems.
- Experience in database design using PL/SQL to write stored procedures, functions, and triggers, and strong experience writing complex queries for Oracle 8i/9i/10g.
- Good experience in database performance tuning.
- Hands on experience with Shell Scripting and UNIX.
- Experience with deployment automation using Jenkins, Git, and Maven.
- Experience in production and application support, including bug fixing.
- Very good communication and interpersonal skills; capable of working well within a team.
- Project management skills, including schedule planning, offshore team management, and design presentations.
- Experience working with Agile methodology.
TECHNICAL SKILLS
Big Data / Hadoop: Apache Hadoop, MapReduce, HDFS, HBase, Hive, Oozie, Sqoop, Cloudera Distribution of Apache Hadoop, IBM InfoSphere, IBM BigInsights, Cassandra
Other Technology: XML, XSLT, Maven, Jenkins
Languages: Java, C, C++, SQL, PL/SQL
Databases: MySQL, MS Access, Oracle
Testing: REST Client, Postman
PROFESSIONAL EXPERIENCE
Confidential
Cassandra Database Analyst
Responsibilities:
- Configured and maintained Cassandra clusters.
- Added and removed nodes from the cluster using an ATT-specific client.
- Regular monitoring of the Cassandra cluster to verify that all nodes are up and in a normal state.
- Starting, shutting down, and bouncing the applications and APIs of the cluster.
- Experience using the nodetool utility.
- Backing up data before any upgrades and restoring it when needed.
- Upgrading servers from Cassandra 2.0.8 to Cassandra 2.1.2.
- Checking logs for errors that occurred during application launch and debugging them.
- Checking connectivity to a node using cqlsh.
- Using cqlsh commands to retrieve data.
- Programming in Java to connect to Cassandra for fixes and to create access layers (a minimal sketch follows this job entry).
- Creating Maven projects for connecting to Cassandra.
- Testing the APIs using REST Client and Postman.
- Creating documentation in Markdown and generating HTML files using Aglio.
- Uploading code using Git.
Environment: Cassandra, Java, Eclipse, Maven, Git, Markdown, HTML, Aglio, PuTTY, Linux, Oracle, MySQL, Hive, Pig, Sqoop, Oozie.
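A minimal sketch of the kind of Java code used to connect to Cassandra and retrieve data over CQL, assuming the DataStax Java driver 2.x (in line with the Cassandra 2.0.8/2.1.2 versions above); the contact point, keyspace, table, and column names are hypothetical placeholders, not the actual project code.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class CassandraReader {
    public static void main(String[] args) {
        // Contact point and keyspace are hypothetical placeholders.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .build();
        try {
            Session session = cluster.connect("demo_keyspace");
            // The same query could first be verified interactively in cqlsh.
            ResultSet rs = session.execute("SELECT id, name FROM users LIMIT 10");
            for (Row row : rs) {
                System.out.println(row.getUUID("id") + " " + row.getString("name"));
            }
        } finally {
            cluster.close();
        }
    }
}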
Confidential, NC
Consultant Hadoop Developer
Responsibilities:
- Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Installed and configured Hadoop, MapReduce, and HDFS (Hadoop Distributed File System); developed multiple MapReduce jobs in Java for data cleaning.
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging the data in HDFS for further analysis.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Worked on installing cluster, commissioning & decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Implemented NameNode backup using NFS for high availability.
- Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS.
- Used Pig as an ETL tool to do transformations, event joins, and some pre-aggregations before storing the data in HDFS.
- Responsible for developing a data pipeline using HDInsight, Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Used Hive, created Hive tables, and was involved in data loading and writing Hive UDFs (a minimal UDF sketch follows this job entry).
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports.
- Involved in migrating ETL processes from Oracle to Hive to test ease of data manipulation.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using Sqoop.
- Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
- Created Hive external tables, loaded data into them, and queried the data using HQL.
- Wrote shell scripts to automate rolling day-to-day processes.
- Automated workflows using shell scripts that pull data from various databases into Hadoop.
- Deployed Hadoop Cluster in Fully Distributed and Pseudo-distributed modes.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Experience with AWS (Amazon Web Services).
Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH3, MongoDB, Cassandra, Oracle, NoSQL, and Unix/Linux.
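A minimal sketch of a Hive UDF of the kind referenced above, assuming the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name, normalization logic, and registration commands are illustrative assumptions rather than the actual project code.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Illustrative UDF: trims and lower-cases a string column.
// Hypothetical registration in Hive:
//   ADD JAR my-udfs.jar;
//   CREATE TEMPORARY FUNCTION normalize_str AS 'NormalizeString';
public class NormalizeString extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}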
Confidential, Dallas, TX
Big Data Hadoop Developer/Administrator
Responsibilities:
- Installed and configured Hadoop and responsible for maintaining cluster and managing and reviewing Hadoop log files.
- Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, the HBase NoSQL database, and Sqoop.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing (a minimal mapper sketch follows this job entry).
- Applied MapReduce framework jobs in Java for data processing after installing and configuring Hadoop and HDFS.
- Performed data analysis in Hive by creating tables, loading them with data, and writing Hive queries, which run internally as MapReduce jobs.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Developed MapReduce programs to apply business rules to the data.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Created HBase tables to store various formats of PII data coming from different portfolios; implemented MapReduce for loading data from an Oracle database into the NoSQL database.
- Exported data from DB2 to HDFS using Sqoop and NFS mount approach.
- Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON, Compressed CSV, etc.
- Used Cloudera Manager for installation and management of Hadoop Cluster.
- Wrote Pig scripts to run ETL jobs on the data in HDFS.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Moved data from Hadoop to Cassandra using the bulk output format class.
- Used Sqoop to import data into HDFS and Hive from other data systems.
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
- Developed and executed Hive queries for denormalizing the data.
- Involved in regular Hadoop cluster maintenance such as patching security holes and updating system packages.
- Automated the workflow using shell scripts.
- Performance tuning of Hive queries written by other developers.
Environment: Hadoop, MapReduce, Pig, Hive, HBase, Oozie, HDFS, Sqoop, Cloudera, Cassandra, NoSQL, DB2, and UNIX.
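A minimal sketch of a map-only data cleaning step of the kind described above, using the standard Hadoop org.apache.hadoop.mapreduce API; the field layout, delimiter, and validation rule are illustrative assumptions, not the actual project logic.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Illustrative cleaning mapper: drops malformed records and re-emits valid ones.
public class CleaningMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        // Keep only records with the expected field count
        // and a non-empty identifier in the first column (assumed layout).
        if (fields.length == 5 && !fields[0].trim().isEmpty()) {
            context.write(NullWritable.get(), value);
        }
    }
}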
Confidential, TX
SAS Programmer/Analyst
Responsibilities:
- Worked closely with a research team of scientists and biostatisticians to analyze data and summarize table and listing specifications.
- Developed project analysis plans, including table specifications, statistical analyses and report formats using BI Solutions
- Reviewed study Protocol, Annotated Case Report Form (ACRF), and performed validation of clinical trial data to identify illogical data entries.
- Reviewed specifications, mock tables, and listings.
- Performed statistical analysis and generated reports using SAS/MACRO, SAS/ODS, Proc Report, Proc Print, Proc Summary, Proc Freq, Proc Means, Proc Tabulate, and Proc SQL.
- Used Base SAS to perform sorting, indexing, merging of datasets and generated reports.
- Modified existing SAS programs using SAS macro variables to improve the ease, speed and consistency of the results.
- Extensively used SAS DICTIONARY tables to get updated information on datasets.
- Extensively used BASE SAS functions like ROUND, SCAN, INDEX, SUBSTR, TRIM, LENGTH, PUT, INPUT, DATE, MAX, MIN and MEAN.
- Used SAS/ODS to generate the Statistical reports in HTML format.
- Used SAS macros functions to simplify the process and to get consistent results.
- Developed routine SAS macros to create tables, graphs, and listings for inclusion in clinical study reports and regulatory submissions, and maintained existing ones.
Environment: SAS, SQL, SAS MACROS, STAT
Confidential
SAS Programmer/Analyst
Responsibilities:
- Analyzed Phase II and III Clinical Trials through SAS programming and by providing statistical support to statisticians and Biostatisticians.
- Created CRT (Case Report Tabulations) datasets using ODM model of CDISC standards for submissions to the FDA.
- Successfully created Tables, Listings and Graphs using various procedures like Proc Report, Proc Tabulate, Proc Plot, and Proc Gplot.
- Extensively used the SAS BI tool for generating BI solutions.
- Created SAS Macros and modified the existing ones relating to multiple studies.
- Produced Tables, Listings and Graphs from Integrated Summaries of Efficacy (ISE) and Safety (ISS).
- Contacted the data management head of the respective study in a timely and effective manner regarding various data issues and resolved queries through meetings.
- Maintained appropriate study application documentation.
- Performed Program Documentation on all programs, files and variables for accurate historical record and for future reference.
- Optimized performance using Data Validation and Data Cleaning on Clinical Trial Data.
- Involved in writing SAS code to support the quality control process by implementing statistical procedures such as Proc Freq, Proc Means, and Proc Univariate, and other procedures such as Proc Summary, Proc Transpose, Proc SQL, and Proc Print.
- Successfully validated study TLGs and CRTs through independent validation using Proc Compare and departmental standard macros.
Environment: SAS/BASE, SAS/MACROS, SAS/ACCESS, SAS/STAT, SAS/GRAPH, SAS ODS, Windows.
Confidential
Jr. Java Developer
Responsibilities:
- Gathered requirements for the project and involved in analysis phase.
- Developed a quick prototype for the project to help the business decide on the necessary changes to the requirements.
- Created UML class and sequence diagrams using Rational Rose.
- Designed and created user-interactive front-end screens using JavaScript, HTML, and JSPs.
Environment: Java, HTML, Oracle, SQL