Big Data Engineer Resume
PROFESSIONAL SUMMARY:
- 6+ years of experience in the IT industry, including analysis, design, development, maintenance, and enhancement of software.
- 5+ years of experience in physical design, planning, data modeling, administration, and performance tuning across Big Data technologies and the Hadoop ecosystem.
- Experience in physical design, planning, data modeling, administration, and performance tuning of Cassandra and Hadoop clusters.
- Excellent hands-on experience importing and exporting data between relational database systems such as MySQL and Oracle and HDFS/Hive, and vice versa, using Sqoop.
- Hands-on experience writing Pig Latin scripts, working with the Grunt shell, and scheduling jobs with Oozie.
- Experience analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
- Expert in understanding data and designing/implementing enterprise platforms such as Hadoop data lakes and large-scale data warehouses.
- Capable of managing multi-tenant Cassandra clusters in a public cloud environment on Amazon Web Services (AWS) EC2 instances.
- Efficient in designing data models in Cassandra and Hive and working with Cassandra Query Language (CQL) and Hive Query Language (HiveQL).
- Hands-on experience with the Hortonworks platform and Cloudera Distribution (CDH 5), and in developing automation scripts in Bash.
- Working experience with Spark RDDs, DataFrames, and Datasets, as well as Python.
- Develop effective Spark and MapReduce programs per business requirements.
- Hands-on with Spark and Cassandra integration, with knowledge of read and write operations and the internal architecture.
- Designed custom Hive UDFs to meet user requirements while working with datasets in Hive (see the UDF sketch after this list).
- Developed Scala scripts using DataFrames/SQL/Datasets and RDD/MapReduce in Spark for data aggregation and queries, writing data back into the OLTP system through Sqoop (see the aggregation sketch after this list).
- Used different Spark modules such as Spark Core, Spark RDDs, Spark DataFrames, and Spark SQL.
- Experience troubleshooting Hadoop service issues by analyzing various service log files.
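The aggregation bullet above refers to jobs of the following shape. This is a minimal sketch, not original production code: the Hive table warehouse.orders, its columns, and the HDFS staging path for a downstream Sqoop export are all hypothetical.

```scala
// Minimal sketch: aggregate with the DataFrame API, then stage the result
// on HDFS for export to an OLTP system (e.g., via a Sqoop export job).
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DailyAggregation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-aggregation")
      .enableHiveSupport()
      .getOrCreate()

    // Aggregate order amounts per customer (hypothetical Hive table/columns).
    val totals = spark.table("warehouse.orders")
      .groupBy("customer_id")
      .agg(sum("amount").as("total_amount"), count("*").as("order_count"))

    // Write tab-delimited files to a staging directory a Sqoop export can read.
    totals.write.mode("overwrite")
      .option("sep", "\t")
      .csv("hdfs:///staging/daily_totals") // hypothetical export path

    spark.stop()
  }
}
```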
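The UDF bullet refers to patterns like the following: a minimal sketch using Hive's classic org.apache.hadoop.hive.ql.exec.UDF API, written in Scala for consistency with the other examples here. The class name and masking rule are illustrative, not from an actual project.

```scala
// Minimal sketch of a custom Hive UDF. Hive locates UDF logic by reflection
// on a method named `evaluate`.
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

class MaskEmail extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) return null
    val s = input.toString
    val at = s.indexOf('@')
    // Keep the first character and the domain; mask the rest of the local part.
    if (at > 1) new Text(s.charAt(0) + "***" + s.substring(at))
    else new Text(s)
  }
}
```

Once packaged into a jar, such a class would be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION mask_email AS 'MaskEmail'.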
TECHNICAL SKILLS:
Big Data Ecosystem: HDFS, Pig, MapReduce, Hive, Sqoop, Spark, Kafka
Databases: MySQL, Oracle, PostgreSQL
Languages: Java, Scala, Python, SQL/PLSQL, HiveQL, Spark SQL, Linux shell scripting (Bash)
IDEs/Tools: Eclipse, IntelliJ
Version Control: Git
Operating Systems: Linux (RHEL, Debian), Windows XP/7/8/10
Monitoring Tools: DataStax OpsCenter, Splunk, Hubble
NoSQL Databases: Cassandra, HBase
Application Servers: Apache Tomcat, JBoss
PROFESSIONAL EXPERIENCE:
Confidential
Big Data Engineer
Responsibilities:
- Worked on batch processing of data using Apache Spark.
- Good experience with and understanding of Spark programming in Scala and its in-memory processing capability.
- Worked extensively on the Spark Core and Spark SQL modules using Scala.
- Created RDDs, DataFrames, and Datasets for the required input data and performed data transformations and actions using Spark with Scala.
- Created Kafka producers, consumers, brokers, topics, and partitions.
- Experience writing Scala functions, procedures, constructors, and traits.
- Streamed data in real time using Spark with Kafka.
- Integrated Apache Kafka with Apache Spark for data processing (see the streaming sketch after this role).
- Created a Hive data warehouse and loaded data with Apache Spark.
- Created both external and internal Hive tables, and partitioned and bucketed them based on the requirements.
- Used Apache Kafka to develop a data pipeline carrying logs as a stream of messages via producers and consumers.
- Fine-tuned Spark programs and Hive queries.
- Resolved long-running Spark jobs.
Technology: Spark, Scala, Hive, Spark SQL, HDFS, Hadoop
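As referenced above, a minimal sketch of the Kafka-to-Spark wiring. It assumes Spark Structured Streaming with the spark-sql-kafka connector (the actual project may have used DStreams); the broker address and topic name are hypothetical placeholders, and the console sink stands in for a Hive/HDFS sink.

```scala
// Minimal sketch: read a Kafka log topic as a stream and count lines by level.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object LogStreamJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-log-stream")
      .getOrCreate()
    import spark.implicits._

    // Subscribe to the log topic as an unbounded stream of messages.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker-1:9092") // hypothetical broker
      .option("subscribe", "app-logs")                    // hypothetical topic
      .load()

    // Kafka delivers key/value as binary; cast the value to a text line.
    val lines = raw.selectExpr("CAST(value AS STRING) AS line")

    // Count lines by their leading level token (e.g., INFO, WARN, ERROR).
    val counts = lines
      .withColumn("level", substring_index($"line", " ", 1))
      .groupBy($"level")
      .count()

    // Console sink for the sketch; a real job would write to Hive/HDFS.
    val query = counts.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```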
Confidential
Big Data Engineer
Responsibilities:
- Involved in Cassandra data modeling and building efficient data structures.
- Responsible for installation and configuration of the Cassandra cluster, and for Hadoop file system maintenance and alerts.
- Provided security to the cluster by implementing Kerberos for Cassandra and Hadoop clusters.
- Responsible for Backup and recovery, security and maintenance and performance tuning.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data into Hive through Sqoop.
- Extracted tables and exported data from Teradata through Sqoop and placed it in Cassandra.
- Loaded data from the Linux file system into Cassandra and the Hadoop file system.
- Created Hive tables with schemas and loaded the data using Spark SQL and DataFrames (see the sketch after this role).
- Responsible for managing data from multiple sources
- Worked closely on Cassandra loading activity for history and incremental loads from Teradata and Oracle databases, resolving loading issues and tuning the loader for optimal performance.
Technology: HDFS, Hive, Cassandra, Teradata, Java, Spark
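A minimal sketch of the Hive create-and-load step described above. The staging database, table, columns, and HDFS landing path (e.g., files landed by a Sqoop import) are hypothetical, not original production code.

```scala
// Minimal sketch: define a Hive table, then load landed files into it
// through Spark SQL and the DataFrame API.
import org.apache.spark.sql.SparkSession

object HiveLoadJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-load")
      .enableHiveSupport()
      .getOrCreate()

    // Define the target table up front so repeated loads are predictable.
    spark.sql("CREATE DATABASE IF NOT EXISTS staging")
    spark.sql(
      """CREATE TABLE IF NOT EXISTS staging.customer (
        |  id BIGINT,
        |  name STRING,
        |  updated_at TIMESTAMP
        |) STORED AS PARQUET""".stripMargin)

    // Read files landed on HDFS (e.g., by Sqoop) as a DataFrame.
    val df = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs:///landing/customer/") // hypothetical landing path

    // insertInto matches columns by position against the table definition.
    df.select("id", "name", "updated_at")
      .write.mode("append")
      .insertInto("staging.customer")

    spark.stop()
  }
}
```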
Confidential
Software Engineer
Responsibilities:
- Optimized the Cassandra cluster by making changes in Cassandra and Oracle configurations.
- Performed daily administrative tasks such as Cassandra cluster health checks, balancing, and NameNode metadata backups.
- Prepared test cases and documentation, and performed unit and integration testing.
- Involved in supporting daily operations, including monitoring and troubleshooting the databases and fixing issues.
- Managed multi-tenant Cassandra and Hadoop clusters in a public cloud environment on Amazon Web Services (AWS) EC2 instances.
- Involved in creating, upgrading, and decommissioning Cassandra clusters.
- Worked on the Cassandra database to analyze how the data gets stored (see the sketch after this role).
- Automated operational scenarios using shell scripting.
- Resolved urgent production issues.
Technology: Cassandra, Java, Hadoop Ecosystems
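A minimal sketch of the kind of storage-layout inspection mentioned above, using the DataStax Java driver from Scala (4.x API shown; the project may have used an earlier driver). The keyspace, table, contact point, and datacenter name are hypothetical.

```scala
// Minimal sketch: query token(id) to see which partition each row hashes to,
// i.e., how Cassandra physically distributes the data.
import com.datastax.oss.driver.api.core.CqlSession
import java.net.InetSocketAddress

object CassandraInspect {
  def main(args: Array[String]): Unit = {
    val session = CqlSession.builder()
      .addContactPoint(new InetSocketAddress("10.0.0.1", 9042)) // hypothetical node
      .withLocalDatacenter("dc1")                               // hypothetical DC
      .build()

    val rs = session.execute(
      "SELECT id, token(id) AS part_token FROM demo_ks.events LIMIT 10")
    rs.forEach { row =>
      println(s"id=${row.getLong("id")} token=${row.getLong("part_token")}")
    }
    session.close()
  }
}
```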
Confidential
Software Engineer
Responsibilities:
- Worked as a developer creating complex stored procedures, cursors, tables, views, and other SQL joins and statements for applications.
- Created stored procedures to transform the data and worked extensively in SQL on the various transformations needed while loading the data (see the sketch after this role).
- Coordinated with business customers to gather business requirements and interacted with technical peers to derive technical requirements.
- Developed code and performed assembly and unit testing, code integration, builds, and deployment activities.
- Designed changes and presented them to the lead for confirmation.
- Programmed, performed unit testing, and interacted with QA to understand and fix the defects they raised.
- Met client expectations through quick, accurate, and timely resolution of issues/defects in the application.
- Participated in telephone calls with client partners and vendors to meet day-to-day deliverables.
Technology: Java, JavaCV, OpenCV
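A minimal sketch of invoking a data-transformation stored procedure over JDBC, shown in Scala for consistency with the other examples (the role itself used Java; the JDBC calls are identical). The connection URL, credentials, procedure name, and parameter are hypothetical.

```scala
// Minimal sketch: run a hypothetical transformation stored procedure via JDBC.
import java.sql.DriverManager

object RunTransform {
  def main(args: Array[String]): Unit = {
    // Hypothetical connection details; real values come from configuration.
    val conn = DriverManager.getConnection(
      "jdbc:oracle:thin:@db-host:1521:ORCL", "app_user", "app_password")
    try {
      // CallableStatement is the standard JDBC mechanism for stored procedures.
      val stmt = conn.prepareCall("{call transform_daily_orders(?)}")
      stmt.setString(1, "2014-06-01") // hypothetical load-date parameter
      stmt.execute()
      stmt.close()
    } finally {
      conn.close()
    }
  }
}
```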