
Hadoop and Spark Developer Resume

New York

PROFESSIONAL SUMMARY:

  • Over 5+ years of professional experience as a Hadoop Developer using the Apache Spark framework and as an Oracle Database Administrator.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as Apache Spark, HDFS, HBase, Spark SQL, Sqoop, ZooKeeper, Kafka, and Flume.
  • Good knowledge of Apache Cassandra and MongoDB.
  • Hands-on experience with the fundamental building blocks of Spark (RDDs) and the transformations, actions, and functions performed on them to implement business logic; a minimal sketch follows this list.
  • In-depth understanding of DataFrames and Datasets in Spark SQL.
  • Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
  • Good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance.
  • Created Hive external tables, views, and scripts for transformations such as filtering, aggregation, and table partitioning.
  • Expert in writing business-analytics scripts in Hive SQL.
  • Worked with IDEs such as Eclipse and IntelliJ IDEA for developing, deploying, and debugging applications.
  • Good knowledge of data warehousing, ETL development, distributed computing, and large-scale data processing.
  • Experienced in working with different file formats such as text, SequenceFile, XML, and JSON.
  • Expertise in working with relational databases such as Oracle 10g and SQL Server 2012.
  • Good knowledge of stored procedures, functions, and other database objects using SQL and PL/SQL.
  • Configured Oracle Data Guard for disaster recovery implementations.
  • Planned and supported Oracle upgrades from 10g to 11g to 12c; RMAN backup and recovery strategies; Oracle high-availability solutions.
  • Hands-on experience with data analysis, logical and physical design, backup and recovery, performance tuning, database installation, and upgrades.
  • Collaborated with the infrastructure, network, database, application, and BI teams to ensure data quality and availability.
  • Strong knowledge of the Software Development Life Cycle and expertise in detailed design documentation.
  • Excellent communication skills and the ability to perform at a high level and meet deadlines.
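
A minimal Scala sketch of the Spark fundamentals above, assuming a local Spark session and a hypothetical (category, amount) dataset; it shows RDD transformations and actions alongside the equivalent DataFrame operations in Spark SQL:

```scala
import org.apache.spark.sql.SparkSession

object RddBasicsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-basics-sketch")
      .master("local[*]")            // local mode for illustration only
      .getOrCreate()
    val sc = spark.sparkContext

    // Transformations are lazy (filter, reduceByKey); the action (collect) triggers execution
    val orders = sc.parallelize(Seq(("books", 12.50), ("music", 7.99), ("books", 3.25)))
    val totals = orders
      .filter { case (_, amount) => amount > 5.0 }   // transformation
      .reduceByKey(_ + _)                            // transformation on a pair RDD
    totals.collect().foreach(println)                // action

    // The same logic expressed with the DataFrame API of Spark SQL
    import spark.implicits._
    val df = orders.toDF("category", "amount")
    df.filter($"amount" > 5.0)
      .groupBy("category")
      .agg(org.apache.spark.sql.functions.sum("amount").as("total"))
      .show()

    spark.stop()
  }
}
```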

TECHNICAL SKILLS:

Big Data: HDFS, Apache Spark, Spark SQL, Spark Streaming, ZooKeeper, Hive, Sqoop, HBase, Kafka, Flume, YARN, Cassandra, MongoDB

Languages: Java, Scala, SQL/PLSQL, Shell Scripting.

Java Technologies: JSP, Servlets, JDBC, OOP concepts

Database: MySQL, MongoDB, Cassandra, Oracle 10g/11g, Microsoft SQL Server 2014

IDE / Testing Tools: Eclipse, IntelliJ IDEA

Operating System: Windows, UNIX, Linux

Tools: SQL Developer, Maven, Hue, TOAD

PROFESSIONAL EXPERIENCE:

Confidential, New York

Hadoop and Spark Developer

Responsibilities:

  • Involved in requirement gathering in coordination with business analysts.
  • Responsible for creating technical documents such as high-level design and low-level design specifications.
  • Installed and configured Cloudera Manager for easier management of the existing Hadoop cluster.
  • Configured property files such as core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, and hadoop-env.sh based on job requirements.
  • Used Sqoop to transfer data between RDBMS and HDFS.
  • Worked with the business functional lead to review and finalize requirements and data profiling analysis.
  • Implemented complex Spark programs to perform joins across different tables (see the sketch after this list).
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Scala, Spark SQL, DataFrames, and pair RDDs.
  • Responsible for creating tables based on business requirements.
  • Produced data visualizations and generated reports to present results clearly.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data in formats such as text, XML, and JSON.
  • Utilized the Agile Scrum methodology to help manage and organize the project, with regular code review sessions.
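
A minimal Scala sketch of the join work described above, assuming hypothetical HDFS paths and column names for two Sqoop-imported tables:

```scala
import org.apache.spark.sql.SparkSession

object JoinSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("join-sketch").getOrCreate()

    // Hypothetical HDFS locations where the Sqoop-imported tables land
    val customers = spark.read.option("header", "true").csv("/data/customers")
    val orders    = spark.read.option("header", "true").csv("/data/orders")

    // Join the two tables on a shared key and aggregate orders per customer
    val orderCounts = customers
      .join(orders, customers("customer_id") === orders("customer_id"), "left")
      .groupBy(customers("customer_id"), customers("name"))
      .count()

    // Persist the result for downstream reporting
    orderCounts.write.mode("overwrite").parquet("/data/reports/order_counts")
    spark.stop()
  }
}
```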

Environment: Hadoop HDFS, Apache Spark, Spark Core, Spark SQL, Scala, JDK 1.8, CDH 5, Sqoop, MySQL, CentOS Linux

Confidential

Hadoop and Spark Developer

Responsibilities:

  • Worked with Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Developed iterative algorithms using Spark Streaming in Scala for near-real-time dashboards.
  • Developed custom aggregate functions using Spark SQL and performed interactive querying (see the sketch after this list).
  • Good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance.
  • Created Hive external tables, views, and scripts for transformations such as filtering, aggregation, and table partitioning.
  • Handled importing of data from various data sources, performed transformations using Hive, and loaded data from Teradata into HDFS.
  • Expert in writing business-analytics scripts in Hive SQL.
  • Responsible for building and scheduling automation jobs using the atomic scheduler with the aorta framework.
  • Worked with data in multiple file formats, including Parquet, SequenceFile, and text/CSV.
  • Expertise in creating ThoughtSpot pinboards and bringing in data for reports as per business requirements.
  • Participated in meetings with internal and external clients and assisted in framing projects and designing solutions based on client needs and the problems to be solved.
  • Followed Agile methodology and Scrum meetings to track, optimize, and tailor features to customer needs.
  • Gained very good business knowledge of different product categories and their designs.
  • Involved in developing ThoughtSpot reports and automating workflows to load data.
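
A minimal Scala sketch of a custom aggregate function registered for interactive Spark SQL querying, assuming a hypothetical sales dataset; the UserDefinedAggregateFunction API shown is the Spark 2.x style:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._

// A simple average-amount aggregate that ignores null inputs
class AverageAmount extends UserDefinedAggregateFunction {
  def inputSchema: StructType = StructType(StructField("amount", DoubleType) :: Nil)
  def bufferSchema: StructType =
    StructType(StructField("sum", DoubleType) :: StructField("count", LongType) :: Nil)
  def dataType: DataType = DoubleType
  def deterministic: Boolean = true

  def initialize(buffer: MutableAggregationBuffer): Unit = { buffer(0) = 0.0; buffer(1) = 0L }
  def update(buffer: MutableAggregationBuffer, input: Row): Unit =
    if (!input.isNullAt(0)) {
      buffer(0) = buffer.getDouble(0) + input.getDouble(0)
      buffer(1) = buffer.getLong(1) + 1
    }
  def merge(b1: MutableAggregationBuffer, b2: Row): Unit = {
    b1(0) = b1.getDouble(0) + b2.getDouble(0)
    b1(1) = b1.getLong(1) + b2.getLong(1)
  }
  def evaluate(buffer: Row): Double =
    if (buffer.getLong(1) == 0) 0.0 else buffer.getDouble(0) / buffer.getLong(1)
}

object UdafSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("udaf-sketch").getOrCreate()
    spark.udf.register("avg_amount", new AverageAmount)   // make it available to SQL
    import spark.implicits._
    Seq(("books", 12.5), ("books", 3.5), ("music", 8.0)).toDF("category", "amount")
      .createOrReplaceTempView("sales")
    spark.sql("SELECT category, avg_amount(amount) AS avg_amt FROM sales GROUP BY category").show()
    spark.stop()
  }
}
```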

Environment: Hadoop HDFS, Apache Spark, Spark-Core, Spark-SQL, Scala, JDK 1.7, Sqoop, Eclipse, MySQL, AWS EC2, HBase, CentOS Linux and ZooKeeper

Confidential

Hadoop and Spark Developer

Responsibilities:

  • Involved in requirement gathering in coordination with business analysts (BAs).
  • Worked closely with BAs and the client to create technical documents such as high-level design and low-level design specifications.
  • Implemented best-income logic using Spark SQL.
  • Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Used Sqoop to import data from MySQL into HDFS on a regular basis.
  • Developed RDDs for various scheduled Hadoop programs.
  • Wrote Spark SQL queries for data analysis to meet business requirements.
  • Experienced in defining job flows.
  • Used ZooKeeper for cluster coordination services alongside Kafka.
  • Serialized JSON data and stored it in tables using Spark SQL.
  • Wrote shell scripts to automate the process flow.
  • Stored the extracted data in HDFS using Flume.
  • Experienced with multiple file formats, including XML, JSON, CSV, and other compressed formats.
  • Experience with Kafka and Spark integration for real-time data processing (see the sketch after this list).
  • Developed Kafka producer and consumer components for real-time data processing.
  • Experienced in writing Spark SQL queries using Scala.
  • Communicated all issues and participated in weekly strategy meetings.
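
A minimal Scala sketch of Kafka and Spark Streaming integration for real-time processing, assuming a hypothetical broker, topic, and consumer group, and the spark-streaming-kafka-0-10 connector:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object KafkaStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-stream-sketch")
    val ssc  = new StreamingContext(conf, Seconds(10))   // 10-second micro-batches

    // Hypothetical broker address, consumer group, and topic name
    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "events-consumer",
      "auto.offset.reset"  -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    // Count occurrences of each message value per micro-batch for a near-real-time view
    stream.map(record => (record.value, 1L))
          .reduceByKey(_ + _)
          .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```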

Environment: Hadoop HDFS, Apache Spark, Spark Core, Spark SQL, Scala, JDK 1.7, Sqoop, Eclipse, MySQL, CentOS Linux, ZooKeeper

Confidential

Software Specialist - Oracle Database Administrator

Responsibilities:

  • Managed over 20 critical applications single-handedly.
  • Configured Data Guard for OLTP databases.
  • Upgraded databases to 12c.
  • Worked with the application team on performance-related issues.
  • Rebuilt indexes for better performance and maintained the Oracle databases.
  • Generated performance reports and performed daily database health checks, using utilities such as AWR and Statspack to gather performance statistics.
  • Identified and tuned poorly performing SQL statements using EXPLAIN PLAN, SQL Trace, and TKPROF; analyzed tables and indexes to improve query performance.
  • Troubleshot various issues such as user database connectivity and privilege problems.
  • Created users and allocated appropriate tablespace quotas with the necessary privileges and roles for all databases.
  • Wrote database monitoring scripts in shell, PL/SQL, and SQL, including procedures, functions, and packages.
  • Created and cloned Oracle instances and databases on ASM; performed database cloning and relocation activities.
  • Managed tablespaces, data files, redo logs, tables, and their segments.
  • Maintained data integrity; managed profiles, resources, password security, users, privileges, and roles.
  • Performed RMAN backups and restores, and cloned or refreshed databases and applications.
  • Monitored and planned ASM storage across all databases.

Environment: Oracle 11g/12c, TOAD, Linux, UNIX, PuTTY, E-Manager, SQL Server, Windows Server, Web services, WebLogic
