Cube and Spark/Hive Developer Resume
SUMMARY
- Cloudera Certified Spark and Hadoop Developer with 6+ years of experience in Hadoop, Spark, and Confidential technologies.
- Experience in machine learning, using random forests and deep learning to understand shopping trends.
- Experience in installing, configuring, and testing Hadoop ecosystem components.
- Experience in migrating data using Sqoop from HDFS to relational database systems and vice versa, according to client requirements.
- Experience in managing different file formats and compression codecs.
- Extensive knowledge of Spark Core APIs, DataFrames, and Spark SQL.
- Very good understanding of partitioning and bucketing in Hive (see the sketch after this list).
- Knowledge of ingesting real-time or near-real-time streaming data into HDFS using Flume, Kafka, and Spark Streaming.
- Knowledge of integrating Flume with Kafka, Flume with Spark Streaming, and Kafka with Spark Streaming.
- Designed Hive queries and Pig scripts to perform data analysis, data transfer, and table design.
- Expertise in Hive queries and extensive knowledge of joins.
- Extensive knowledge of Sqoop imports and exports.
- Created Hive scripts to extract, transform, and load (ETL) data.
- Managed and reviewed Hadoop log files.
- Hands-on knowledge of core Java concepts such as exceptions, collections, data structures, I/O, multi-threading, and serialization/deserialization in streaming applications.
- Experience in software design, development, and implementation of client/server web-based applications using JSTL, jQuery, JavaScript, JavaBeans, JDBC, Struts, PL/SQL, SQL, HTML, CSS, XML, and AJAX, with high-level familiarity with the React JavaScript library.
- Experience in mentoring team members by training, guiding, and monitoring their tasks.
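A minimal sketch of the Hive partitioning and bucketing pattern referenced above, created through Spark's Hive support; all table and column names are hypothetical:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-partitioning-example")
         .enableHiveSupport()
         .getOrCreate())

# Hypothetical table: partitioned by load date so queries can prune
# whole directories, and bucketed by customer_id to speed up joins
# and sampling on that key.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_orders (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DOUBLE
    )
    PARTITIONED BY (load_date STRING)
    CLUSTERED BY (customer_id) INTO 32 BUCKETS
    STORED AS ORC
""")
```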
TECHNICAL SKILLS
Languages: Python, SQL, Shell scripting, Java, C/C++, HTML
Big Data Technologies: HDFS, Hive, Pig, Spark, Kafka, Sqoop, Flume
Design Tools: MATLAB, Arduino, NetBeans
Java Technologies: Core Java, JSP, JDBC, Eclipse, JBoss
Cloud Technologies: Confidential AWS
NoSQL Databases: HBase
IDEs & Tools: Confidential, Eclipse, IntelliJ IDEA, PuTTY, Visual Studio
Databases: MySQL, Oracle
Operating Systems: Unix, Windows, Linux
Business Intelligence Tools: Tableau
Modelling Language: UML
PROFESSIONAL EXPERIENCE
Confidential
Cube and Spark/Hive Developer
Responsibilities:
- The primary aim of the project is to demonstrate how a consumption strategy can be developed on top of Hadoop. Key roles and responsibilities in the project are:
- Installation and setup of a multi-node Cloudera cluster on the AWS cloud
- Installation and setup of Confidential on top of the Hadoop cluster, using Hive and Impala as the SQL engines
- Development of cubes involving multiple facts and dimensions
- Development of calculations, leveraging Query Datasets
- Infrastructure-related activities such as managing the Confidential database instance, backing up and upgrading the instance, reviewing logs, and troubleshooting the Hadoop and AtScale environments
- Defining and managing aggregates (see the sketch after this list)
- Developing fine-grained access and data-level security for the region, branch, and main-branch hierarchy
- Development of access patterns for data access from Excel and Tableau.
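A minimal Spark SQL sketch of the kind of aggregate defined above, assuming hypothetical fact and dimension tables (sales_fact, branch_dim); AtScale manages such aggregates automatically, so this hand-rolled equivalent is for illustration only:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("aggregate-example")
         .enableHiveSupport()
         .getOrCreate())

# Pre-computed aggregate at the (region, branch, month) grain --
# the typical shape of a cube-backing aggregate table.
spark.sql("""
    CREATE TABLE IF NOT EXISTS agg_sales_by_branch_month
    STORED AS PARQUET AS
    SELECT d.region,
           d.branch,
           date_format(f.txn_date, 'yyyy-MM') AS txn_month,
           SUM(f.amount)                      AS total_amount,
           COUNT(*)                           AS txn_count
    FROM   sales_fact f
    JOIN   branch_dim d ON f.branch_id = d.branch_id
    GROUP  BY d.region, d.branch, date_format(f.txn_date, 'yyyy-MM')
""")
```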
Confidential
Hadoop Developer
Responsibilities:
- Developed the framework for multiple data-source ingestion capabilities. Configured the framework with the required metadata for data ingestion for all the data sources.
- Developed a mapping from data-ingestion type to the corresponding ingestion program (Sqoop, Pig, or Kafka).
- Developed ingestion capabilities using Sqoop, Kafka, and Pig. Leveraged Spark for data processing and transformation.
- Developed the real-time / near-real-time framework using Kafka and Flume capabilities (see the streaming sketch after this list).
- Developed a framework to decide on data formats such as Parquet, Avro, and ORC.
- Developed Spark code using Python and Spark SQL for faster processing and testing.
- Worked on Spark SQL to join multiple Hive tables, write the results to a final Hive table, and store them on S3 (see the first sketch after this list).
- Implemented Spark RDD transformations to map business logic and applied actions on top of those transformations.
- Performed querying of both managed and external tables created in Hive.
- Used Flume to ingest streaming data into HDFS or Kafka topics, where it acted as a Kafka producer. Also used multiple Flume agents to collect data from multiple sources into a Flume collector.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Responsible for creating Hive tables, loading them with data, and writing Hive queries.
- Set up Cloudera clusters and added nodes/hosts.
- Performed performance tuning of Cloudera clusters in terms of resource allocation across YARN, HDFS, and Impala.
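A minimal PySpark sketch of the Spark SQL join-and-persist pattern above; the database, table, column, and bucket names are hypothetical:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-join-to-s3")
         .enableHiveSupport()
         .getOrCreate())

# Join two Hive tables and persist the result as a Parquet-backed
# Hive table whose data lives on S3.
orders    = spark.table("warehouse.orders")
customers = spark.table("warehouse.customers")

joined = (orders.join(customers, "customer_id")
                .select("order_id", "customer_id", "name", "amount"))

(joined.write
       .mode("overwrite")
       .format("parquet")
       .option("path", "s3a://example-bucket/warehouse/orders_enriched")
       .saveAsTable("warehouse.orders_enriched"))
```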
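And a minimal sketch of the streaming ingestion path, written against Spark Structured Streaming rather than the original DStream/Flume setup; the broker, topic, and HDFS paths are hypothetical, and the spark-sql-kafka connector is assumed to be on the classpath:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

# Read a Kafka topic as an unbounded DataFrame and land the raw
# events on HDFS as Parquet files.
events = (spark.readStream
               .format("kafka")
               .option("kafka.bootstrap.servers", "broker1:9092")
               .option("subscribe", "clickstream")
               .load()
               .select(col("key").cast("string"),
                       col("value").cast("string"),
                       col("timestamp")))

query = (events.writeStream
               .format("parquet")
               .option("path", "hdfs:///data/raw/clickstream")
               .option("checkpointLocation", "hdfs:///checkpoints/clickstream")
               .start())
query.awaitTermination()
```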
Confidential
Risk Data Analyst
Responsibilities:
- Analyzed purchase patterns to improve the buyer and seller experience using random forests and deep learning (see the sketch after this list).
- Developed reports by performing Spark actions and transformations on the available data and connecting to Tableau for better business decisions.
- Addressed alleged intellectual-property infringements reported by rights owners to ensure Amazon identifies and protects original works.
- Monitored merchant transactions, performance, and other parameters to make the Amazon marketplace a safe place to make transactions.
- Created Hive scripts for ETL, created Hive tables, and wrote Hive queries.
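A minimal PySpark MLlib sketch of the random-forest analysis referenced above; the Hive table, feature columns, and label are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import RandomForestClassifier

spark = (SparkSession.builder
         .appName("purchase-patterns")
         .enableHiveSupport()
         .getOrCreate())

# Hypothetical transactions table with numeric features and a
# binary is_risky label.
df = spark.table("risk.transactions")

assembler = VectorAssembler(
    inputCols=["order_value", "item_count", "account_age_days"],
    outputCol="features")

train, test = assembler.transform(df).randomSplit([0.8, 0.2], seed=42)

rf = RandomForestClassifier(labelCol="is_risky", featuresCol="features",
                            numTrees=100)
model = rf.fit(train)
model.transform(test).select("is_risky", "prediction").show(5)
```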
Environment: AWS, Hadoop, Spark, Hive, Sqoop, Tableau, Jupyter Notebook, Investigator Workbench, LexisNexis.
Confidential
Java Developer
Responsibilities:
- Actively involved in the analysis, design, implementation, and deployment phases of the full software development life cycle (SDLC) of the project.
- Designed and developed user interface using JSP, HTML and JavaScript.
- Defined search criteria to pull customer records from the database, applied the required changes, and saved the updated records back to the database.
- Validated the fields of user registration screen and login screen by writing JavaScript and jQuery validations.
- Used DAO and JDBC for database access.
- Developed stored procedures and triggers using PL/SQL in order to calculate and update the tables to implement business logic.
- Designed and developed XML processing components for dynamic menus in the application.
- Involved in postproduction support and maintenance of the application.
- Involved in the analysis, design, implementation, and testing of the project modules.
- Implemented the presentation layer with HTML, XHTML and JavaScript.
- Developed web components using JSP and JDBC.
- Deployed the application to the JBoss application server.
- Gathered requirements from various stakeholders of the project.
- Estimated effort and timelines for development tasks.
- Used J2EE and EJB to handle business flow and functionality.
- Implemented database using SQL Server.
- Designed tables and indexes.
- Wrote complex SQL queries and stored procedures.
- Involved in fixing bugs and unit testing with test cases using JUnit.
- Created user and technical documentation.
Environment: Java, Oracle, HTML, XML, SQL, J2EE, JUnit, JDBC, JSP, Tomcat, SQL Server, MongoDB, JavaScript, GitHub, SourceTree, NetBeans.