Big Data Engineer Resume
PROFESSIONAL SUMMARY:
- 6+ years of experience in the IT industry, including analysis, design, development, maintenance, and enhancement of software.
- 5+ years of experience in physical design, planning, data modeling, administration, and performance tuning across Big Data technologies and the Hadoop ecosystem.
- Experience in physical design, planning, data modeling, administration, and performance tuning of Cassandra and Hadoop clusters.
- Excellent hands-on experience importing and exporting data between relational database systems such as MySQL and Oracle and HDFS/Hive, and vice versa, using Sqoop.
- Hands-on experience writing Pig Latin scripts, working with the Grunt shell, and scheduling jobs with Oozie.
- Experience analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
- Expert in understanding data and designing/implementing enterprise platforms such as Hadoop data lakes and large-scale data warehouses.
- Capable of managing multi-tenant Cassandra clusters in a public cloud environment on Amazon Web Services (AWS) EC2 instances.
- Efficient in designing data models in Cassandra and Hive and working with Cassandra Query Language (CQL) and Hive Query Language (HiveQL).
- Hands-on experience with the Hortonworks platform and Cloudera Distribution (CDH 5), and in developing automation scripts in Bash.
- Working experience with Spark RDDs, DataFrames, and Datasets, as well as Python.
- Develop effective Spark and MapReduce programs per business requirements.
- Hands-on with Spark and Cassandra integration, with knowledge of read and write operations and the internal architecture.
- Designed custom Hive UDFs to meet user requirements while working with datasets in Hive (see the UDF sketch after this list).
- Developed Scala scripts using DataFrames/SQL/Datasets and RDD/MapReduce in Spark for data aggregation and queries, writing data back into the OLTP system through Sqoop (see the aggregation sketch after this list).
- Used different Spark modules such as Spark Core, Spark RDDs, Spark DataFrames, and Spark SQL.
- Experience troubleshooting Hadoop service issues by analyzing various service log files.
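The aggregation bullet above refers to jobs of the following shape. This is a minimal sketch, not original production code: the Hive table warehouse.orders, its columns, and the HDFS staging path for a downstream Sqoop export are all hypothetical.

```scala
// Minimal sketch: aggregate with the DataFrame API, then stage the result
// on HDFS for export to an OLTP system (e.g., via a Sqoop export job).
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DailyAggregation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-aggregation")
      .enableHiveSupport()
      .getOrCreate()

    // Aggregate order amounts per customer (hypothetical Hive table/columns).
    val totals = spark.table("warehouse.orders")
      .groupBy("customer_id")
      .agg(sum("amount").as("total_amount"), count("*").as("order_count"))

    // Write tab-delimited files to a staging directory a Sqoop export can read.
    totals.write.mode("overwrite")
      .option("sep", "\t")
      .csv("hdfs:///staging/daily_totals") // hypothetical export path

    spark.stop()
  }
}
```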
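The UDF bullet refers to patterns like the following: a minimal sketch using Hive's classic org.apache.hadoop.hive.ql.exec.UDF API, written in Scala for consistency with the other examples here. The class name and masking rule are illustrative, not from an actual project.

```scala
// Minimal sketch of a custom Hive UDF. Hive locates UDF logic by reflection
// on a method named `evaluate`.
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

class MaskEmail extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) return null
    val s = input.toString
    val at = s.indexOf('@')
    // Keep the first character and the domain; mask the rest of the local part.
    if (at > 1) new Text(s.charAt(0) + "***" + s.substring(at))
    else new Text(s)
  }
}
```

Once packaged into a jar, such a class would be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION mask_email AS 'MaskEmail'.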
TECHNICAL SKILLS:
Big Data Ecosystem: HDFS, Pig, MapReduce, Hive, Sqoop, Spark, Kafka
Databases: MySQL, Oracle, PostgreSQL
Languages: Java, Scala, Python, SQL/PLSQL, HiveQL, Spark SQL, Linux shell scripting (Bash)
IDEs/Tools: Eclipse, IntelliJ
Version Control: Git
Operating Systems: Linux (RHEL, Debian), Windows XP/7/8/10
Monitoring Tools: DataStax OpsCenter, Splunk, Hubble
NoSQL Databases: Cassandra, HBase
Application Servers: Apache Tomcat, JBoss
PROFESSIONAL EXPERIENCE:
Confidential
Big Data Engineer
Responsibilities:
- Worked on batch processing of data using Apache Spark.
- Good experience with and understanding of Spark programming in Scala and its in-memory processing capability.
- Worked extensively on the Spark Core and Spark SQL modules using Scala.
- Created RDDs, DataFrames, and Datasets for the required input data and performed data transformations and actions using Spark with Scala.
- Created Kafka producers, consumers, brokers, topics, and partitions.
- Experience writing Scala functions, procedures, constructors, and traits.
- Streamed data in real time using Spark with Kafka.
- Integrated Apache Kafka with Apache Spark for data processing (see the streaming sketch after this role).
- Created a Hive data warehouse and loaded data with Apache Spark.
- Created both external and internal Hive tables, and partitioned and bucketed them based on the requirements.
- Used Apache Kafka to develop a data pipeline carrying logs as a stream of messages via producers and consumers.
- Fine-tuned Spark programs and Hive queries.
- Resolved long-running Spark jobs.
Technology: Spark, Scala, Hive, Spark SQL, HDFS, Hadoop
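As referenced above, a minimal sketch of the Kafka-to-Spark wiring. It assumes Spark Structured Streaming with the spark-sql-kafka connector (the actual project may have used DStreams); the broker address and topic name are hypothetical placeholders, and the console sink stands in for a Hive/HDFS sink.

```scala
// Minimal sketch: read a Kafka log topic as a stream and count lines by level.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object LogStreamJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-log-stream")
      .getOrCreate()
    import spark.implicits._

    // Subscribe to the log topic as an unbounded stream of messages.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker-1:9092") // hypothetical broker
      .option("subscribe", "app-logs")                    // hypothetical topic
      .load()

    // Kafka delivers key/value as binary; cast the value to a text line.
    val lines = raw.selectExpr("CAST(value AS STRING) AS line")

    // Count lines by their leading level token (e.g., INFO, WARN, ERROR).
    val counts = lines
      .withColumn("level", substring_index($"line", " ", 1))
      .groupBy($"level")
      .count()

    // Console sink for the sketch; a real job would write to Hive/HDFS.
    val query = counts.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```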
Confidential
Big Data Engineer
Responsibilities:
- Involved in Cassandra data modeling and building efficient data structures.
- Responsible for installation and configuration of the Cassandra cluster, and for Hadoop file system maintenance and alerts.
- Provided security to the cluster by implementing Kerberos for Cassandra and Hadoop clusters.
- Responsible for Backup and recovery, security and maintenance and performance tuning.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data into Hive through Sqoop.
- Extracted tables and exported data from Teradata through Sqoop and placed it in Cassandra.
- Loaded data from the Linux file system into Cassandra and the Hadoop file system.
- Created Hive tables with schemas and loaded the data using Spark SQL and DataFrames (see the sketch after this role).
- Responsible for managing data from multiple sources
- Worked closely on Cassandra loading activity for history and incremental loads from Teradata and Oracle databases, resolving loading issues and tuning the loader for optimal performance.
Technology: HDFS, Hive, Cassandra, Teradata, Java, Spark
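A minimal sketch of the Hive create-and-load step described above. The staging database, table, columns, and HDFS landing path (e.g., files landed by a Sqoop import) are hypothetical, not original production code.

```scala
// Minimal sketch: define a Hive table, then load landed files into it
// through Spark SQL and the DataFrame API.
import org.apache.spark.sql.SparkSession

object HiveLoadJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-load")
      .enableHiveSupport()
      .getOrCreate()

    // Define the target table up front so repeated loads are predictable.
    spark.sql("CREATE DATABASE IF NOT EXISTS staging")
    spark.sql(
      """CREATE TABLE IF NOT EXISTS staging.customer (
        |  id BIGINT,
        |  name STRING,
        |  updated_at TIMESTAMP
        |) STORED AS PARQUET""".stripMargin)

    // Read files landed on HDFS (e.g., by Sqoop) as a DataFrame.
    val df = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs:///landing/customer/") // hypothetical landing path

    // insertInto matches columns by position against the table definition.
    df.select("id", "name", "updated_at")
      .write.mode("append")
      .insertInto("staging.customer")

    spark.stop()
  }
}
```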
Confidential
Software Engineer
Responsibilities:
- Optimized the Cassandra cluster by making changes in Cassandra and Oracle configurations.
- Performed daily administrative tasks such as Cassandra cluster health checks, balancing, and NameNode metadata backups.
- Prepared test cases and documentation, and performed unit and integration testing.
- Involved in supporting daily operations, including monitoring and troubleshooting the databases and fixing issues.
- Managed multi-tenant Cassandra and Hadoop clusters in a public cloud environment on Amazon Web Services (AWS) EC2 instances.
- Involved in creating, upgrading, and decommissioning Cassandra clusters.
- Worked on the Cassandra database to analyze how the data gets stored (see the sketch after this role).
- Automated operational scenarios using shell scripting.
- Resolved urgent production issues.
Technology: Cassandra, Java, Hadoop Ecosystems
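A minimal sketch of the kind of storage-layout inspection mentioned above, using the DataStax Java driver from Scala (4.x API shown; the project may have used an earlier driver). The keyspace, table, contact point, and datacenter name are hypothetical.

```scala
// Minimal sketch: query token(id) to see which partition each row hashes to,
// i.e., how Cassandra physically distributes the data.
import com.datastax.oss.driver.api.core.CqlSession
import java.net.InetSocketAddress

object CassandraInspect {
  def main(args: Array[String]): Unit = {
    val session = CqlSession.builder()
      .addContactPoint(new InetSocketAddress("10.0.0.1", 9042)) // hypothetical node
      .withLocalDatacenter("dc1")                               // hypothetical DC
      .build()

    val rs = session.execute(
      "SELECT id, token(id) AS part_token FROM demo_ks.events LIMIT 10")
    rs.forEach { row =>
      println(s"id=${row.getLong("id")} token=${row.getLong("part_token")}")
    }
    session.close()
  }
}
```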
Confidential
Software Engineer
Responsibilities:
- Worked as a developer creating complex stored procedures, cursors, tables, views, and other SQL joins and statements for applications.
- Created stored procedures to transform the data and worked extensively in SQL on the various transformations needed while loading the data (see the sketch after this role).
- Coordinated with business customers to gather business requirements and interacted with technical peers to derive technical requirements.
- Developed code and performed assembly and unit testing, code integration, builds, and deployment activities.
- Designed changes and presented them to the lead for confirmation.
- Programmed, performed unit testing, and interacted with QA to understand and fix the defects they raised.
- Met client expectations through quick, accurate, and timely resolution of issues/defects in the application.
- Participated in telephone calls with client partners and vendors to meet day-to-day deliverables.
Technology: Java, JavaCV, OpenCV
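A minimal sketch of invoking a data-transformation stored procedure over JDBC, shown in Scala for consistency with the other examples (the role itself used Java; the JDBC calls are identical). The connection URL, credentials, procedure name, and parameter are hypothetical.

```scala
// Minimal sketch: run a hypothetical transformation stored procedure via JDBC.
import java.sql.DriverManager

object RunTransform {
  def main(args: Array[String]): Unit = {
    // Hypothetical connection details; real values come from configuration.
    val conn = DriverManager.getConnection(
      "jdbc:oracle:thin:@db-host:1521:ORCL", "app_user", "app_password")
    try {
      // CallableStatement is the standard JDBC mechanism for stored procedures.
      val stmt = conn.prepareCall("{call transform_daily_orders(?)}")
      stmt.setString(1, "2014-06-01") // hypothetical load-date parameter
      stmt.execute()
      stmt.close()
    } finally {
      conn.close()
    }
  }
}
```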