Cube and Spark/Hive Developer Resume
SUMMARY
- Cloudera Certified Spark and Hadoop Developer with 6+ years of experience in Hadoop, Spark, and Confidential technologies.
- Experience in machine learning, using random forests and deep learning to understand shopping trends.
- Experience in installing, configuring, and testing Hadoop ecosystem components.
- Experience in migrating data using Sqoop from HDFS to relational database systems and vice versa, according to client requirements.
- Experience in managing different file formats and compression codecs.
- Extensive knowledge of Spark Core APIs, DataFrames, and Spark SQL.
- Very good understanding of partitioning and bucketing in Hive (see the sketch after this list).
- Knowledge of ingesting real-time or near-real-time streaming data into HDFS using Flume, Kafka, and Spark Streaming.
- Knowledge of integrating Flume with Kafka, Flume with Spark Streaming, and Kafka with Spark Streaming.
- Designed Hive queries and Pig scripts to perform data analysis, data transfer, and table design.
- Expertise in Hive queries and extensive knowledge of joins.
- Extensive knowledge of Sqoop imports and exports.
- Created Hive scripts to extract, transform, and load (ETL) data.
- Managed and reviewed Hadoop log files.
- Hands-on knowledge of core Java concepts such as exceptions, collections, data structures, I/O, multi-threading, and serialization/deserialization in streaming applications.
- Experience in software design, development, and implementation of client/server web-based applications using JSTL, jQuery, JavaScript, JavaBeans, JDBC, Struts, PL/SQL, SQL, HTML, CSS, XML, and AJAX, with high-level familiarity with the React JavaScript library.
- Experience in mentoring team members by training, guiding, and monitoring their tasks.
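A minimal sketch of the Hive partitioning and bucketing pattern referenced above, created through Spark's Hive support; all table and column names are hypothetical:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-partitioning-example")
         .enableHiveSupport()
         .getOrCreate())

# Hypothetical table: partitioned by load date so queries can prune
# whole directories, and bucketed by customer_id to speed up joins
# and sampling on that key.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_orders (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DOUBLE
    )
    PARTITIONED BY (load_date STRING)
    CLUSTERED BY (customer_id) INTO 32 BUCKETS
    STORED AS ORC
""")
```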
TECHNICAL SKILLS
Languages: Python, SQL, Shell scripting, Java, C/C++, HTML
Big Data Technologies: HDFS, Hive, Pig, Spark, Kafka, Sqoop, Flume
Design Tools: MATLAB, Arduino, NetBeans
Java Technologies: Core Java, JSP, JDBC, Eclipse, JBoss
Cloud Technologies: Confidential AWS
NoSQL Databases: HBase
IDEs & Tools: Confidential, Eclipse, IntelliJ IDEA, PuTTY, Visual Studio
Databases: MySQL, Oracle
Operating Systems: Unix, Windows, Linux
Business Intelligence Tools: Tableau
Modelling Language: UML
PROFESSIONAL EXPERIENCE
Confidential
Cube and Spark/Hive Developer
Responsibilities:
- The primary aim of the project is to demonstrate how a consumption strategy can be developed on top of Hadoop. Key roles and responsibilities in the project are:
- Installation and setup of a multi-node Cloudera cluster on the AWS cloud
- Installation and setup of Confidential on top of the Hadoop cluster, using Hive and Impala as the SQL engines
- Development of cubes involving multiple facts and dimensions
- Development of calculations, leveraging Query Datasets
- Infrastructure-related activities such as managing the Confidential database instance, backing up and upgrading the instance, reviewing logs, and troubleshooting the Hadoop and AtScale environments
- Defining and managing aggregates (see the sketch after this list)
- Developing fine-grained access and data-level security for the region, branch, and main-branch hierarchy
- Development of access patterns for data access from Excel and Tableau.
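A minimal Spark SQL sketch of the kind of aggregate defined above, assuming hypothetical fact and dimension tables (sales_fact, branch_dim); AtScale manages such aggregates automatically, so this hand-rolled equivalent is for illustration only:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("aggregate-example")
         .enableHiveSupport()
         .getOrCreate())

# Pre-computed aggregate at the (region, branch, month) grain --
# the typical shape of a cube-backing aggregate table.
spark.sql("""
    CREATE TABLE IF NOT EXISTS agg_sales_by_branch_month
    STORED AS PARQUET AS
    SELECT d.region,
           d.branch,
           date_format(f.txn_date, 'yyyy-MM') AS txn_month,
           SUM(f.amount)                      AS total_amount,
           COUNT(*)                           AS txn_count
    FROM   sales_fact f
    JOIN   branch_dim d ON f.branch_id = d.branch_id
    GROUP  BY d.region, d.branch, date_format(f.txn_date, 'yyyy-MM')
""")
```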
Confidential
Hadoop Developer
Responsibilities:
- Developed the framework for multiple data-source ingestion capabilities. Configured the framework with the required metadata for data ingestion for all the data sources.
- Developed a mapping from data-ingestion type to the corresponding ingestion program (Sqoop, Pig, or Kafka).
- Developed ingestion capabilities using Sqoop, Kafka, and Pig. Leveraged Spark for data processing and transformation.
- Developed the real-time / near-real-time framework using Kafka and Flume capabilities (see the streaming sketch after this list).
- Developed a framework to decide on data formats such as Parquet, Avro, and ORC.
- Developed Spark code using Python and Spark SQL for faster processing and testing.
- Worked on Spark SQL to join multiple Hive tables, write the results to a final Hive table, and store them on S3 (see the first sketch after this list).
- Implemented Spark RDD transformations to map business logic and applied actions on top of those transformations.
- Performed querying of both managed and external tables created in Hive.
- Used Flume to ingest streaming data into HDFS or Kafka topics, where it acted as a Kafka producer. Also used multiple Flume agents to collect data from multiple sources into a Flume collector.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Responsible for creating Hive tables, loading them with data, and writing Hive queries.
- Set up Cloudera clusters and added nodes/hosts.
- Performed performance tuning of Cloudera clusters in terms of resource allocation across YARN, HDFS, and Impala.
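A minimal PySpark sketch of the Spark SQL join-and-persist pattern above; the database, table, column, and bucket names are hypothetical:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-join-to-s3")
         .enableHiveSupport()
         .getOrCreate())

# Join two Hive tables and persist the result as a Parquet-backed
# Hive table whose data lives on S3.
orders    = spark.table("warehouse.orders")
customers = spark.table("warehouse.customers")

joined = (orders.join(customers, "customer_id")
                .select("order_id", "customer_id", "name", "amount"))

(joined.write
       .mode("overwrite")
       .format("parquet")
       .option("path", "s3a://example-bucket/warehouse/orders_enriched")
       .saveAsTable("warehouse.orders_enriched"))
```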
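And a minimal sketch of the streaming ingestion path, written against Spark Structured Streaming rather than the original DStream/Flume setup; the broker, topic, and HDFS paths are hypothetical, and the spark-sql-kafka connector is assumed to be on the classpath:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

# Read a Kafka topic as an unbounded DataFrame and land the raw
# events on HDFS as Parquet files.
events = (spark.readStream
               .format("kafka")
               .option("kafka.bootstrap.servers", "broker1:9092")
               .option("subscribe", "clickstream")
               .load()
               .select(col("key").cast("string"),
                       col("value").cast("string"),
                       col("timestamp")))

query = (events.writeStream
               .format("parquet")
               .option("path", "hdfs:///data/raw/clickstream")
               .option("checkpointLocation", "hdfs:///checkpoints/clickstream")
               .start())
query.awaitTermination()
```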
Confidential
Risk Data Analyst
Responsibilities:
- Analyzed purchase patterns to improve the buyer and seller experience using random forests and deep learning (see the sketch after this list).
- Developed reports by performing Spark actions and transformations on the available data and connecting to Tableau for better business decisions.
- Addressed alleged intellectual-property infringements reported by rights owners to ensure Amazon identifies and protects original works.
- Monitored merchant transactions, performance, and other parameters to make the Amazon marketplace a safe place to make transactions.
- Created Hive scripts for ETL, created Hive tables, and wrote Hive queries.
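A minimal PySpark MLlib sketch of the random-forest analysis referenced above; the Hive table, feature columns, and label are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import RandomForestClassifier

spark = (SparkSession.builder
         .appName("purchase-patterns")
         .enableHiveSupport()
         .getOrCreate())

# Hypothetical transactions table with numeric features and a
# binary is_risky label.
df = spark.table("risk.transactions")

assembler = VectorAssembler(
    inputCols=["order_value", "item_count", "account_age_days"],
    outputCol="features")

train, test = assembler.transform(df).randomSplit([0.8, 0.2], seed=42)

rf = RandomForestClassifier(labelCol="is_risky", featuresCol="features",
                            numTrees=100)
model = rf.fit(train)
model.transform(test).select("is_risky", "prediction").show(5)
```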
Environment: AWS, Hadoop, Spark, Hive, Sqoop, Tableau, Jupyter Notebook, Investigator Workbench, LexisNexis.
Confidential
Java Developer
Responsibilities:
- Actively involved in the analysis, design, implementation, and deployment phases of the full software development life cycle (SDLC) of the project.
- Designed and developed user interface using JSP, HTML and JavaScript.
- Defined search criteria to pull customer records from the database, applied the required changes, and saved the updated records back to the database.
- Validated the fields of user registration screen and login screen by writing JavaScript and jQuery validations.
- Used DAO and JDBC for database access.
- Developed stored procedures and triggers using PL/SQL in order to calculate and update the tables to implement business logic.
- Designed and developed XML processing components for dynamic menus in the application.
- Involved in postproduction support and maintenance of the application.
- Involved in the analysis, design, implementation, and testing of the project modules.
- Implemented the presentation layer with HTML, XHTML and JavaScript.
- Developed web components using JSP and JDBC.
- Deployed the application to the JBoss application server.
- Gathered requirements from various stakeholders of the project.
- Estimated effort and timelines for development tasks.
- Used J2EE and EJB to handle business flow and functionality.
- Implemented database using SQL Server.
- Designed tables and indexes.
- Wrote complex SQL queries and stored procedures.
- Involved in fixing bugs and unit testing with test cases using JUnit.
- Created user and technical documentation.
Environment: Java, Oracle, HTML, XML, SQL, J2EE, JUnit, JDBC, JSP, Tomcat, SQL Server, MongoDB, JavaScript, GitHub, SourceTree, NetBeans.