AWS/Hadoop Developer Resume
New York
SUMMARY
- Over 8 years of experience in system requirements analysis, design, development, testing and implementation using tools and technologies such as HDFS, MapReduce, Oracle, JDK, J2EE, Apache Kafka, JavaScript, Java, JDBC, JSP, Servlets, XML, Maven, SQL Server and MongoDB.
- Worked with most JDK versions over 6 years, across many modules.
- Good working knowledge of Spark Core and Spark SQL with Scala and Java.
- Capable of processing large sets of structured, semi-structured and unstructured data and of supporting systems application architecture.
- Developed several programs using UNIX shell scripts.
- Extensive working knowledge of YARN and Kafka.
- Worked extensively on cloud platforms such as Amazon Web Services (AWS).
- Excellent hands-on experience with Hive UDFs, UDAFs and UDTFs.
- Extensively involved in all phases of the software development life cycle, including analysis, design, development, implementation, testing and support.
- Regular use of connectivity hosts and authentication protocols.
- Loaded large data sets into Spark RDDs and performed in-memory computation to generate the output response.
- Good work experience with Amazon EC2 clusters, working with several instances.
- Good hands-on experience with Python; comfortable working on Mac, Windows and Linux.
- Exposure to Waterfall and Agile software methodologies.
- Expert in problem solving, with excellent analytical, troubleshooting and debugging skills.
- Good domain knowledge in the supply chain, financial and insurance domains.
- Used various compression techniques such as LZO, Gzip and Snappy.
- Developed MapReduce programs and libraries using Java 8, and extracted several abstract methods.
- Strong experience in web-based application design, development and implementation.
- Excellent team player with good communication and presentation skills; highly motivated.
TECHNICAL SKILLS
Big Data Ecosystems: Hadoop, MapReduce, HDFS, MongoDB, HBase, Hive, Pig, ZooKeeper, Oozie, Flume, Sqoop, Java 8.
Programming Languages: C/C++, Java (Ant, Spring, Maven).
Scripting Languages: JavaScript, HTML, Python, XML, JSP & Servlets, PHP and Bash
Databases: Oracle, NoSQL
UNIX Tools: Yum, RPM, Apache, Red Hat Linux
Tools: Eclipse, Cloudera, Hortonworks, NetBeans, CVS, Ant.
Platforms: Windows (2000/XP), Linux, Solaris, AIX, AWS Platform.
Application Servers: Apache Tomcat 5.x, 6.0
Methodologies: Agile, UML, Design Patterns
PROFESSIONAL EXPERIENCE
Confidential, New York
AWS/Hadoop Developer
Responsibilities:
- Developed Python scripts to run jobs in PySpark, transforming large TSV files into compressed Parquet files.
- Gathered the requirements for every sprint, assigned priorities and estimated the time each user story would take.
- Set up Apache Presto and Apache Drill on an AWS EMR (Elastic MapReduce) cluster to combine multiple databases such as MySQL and Hive.
- This made it possible to run and compare operations such as joins and inserts across various data sources from a single platform.
- Developed shell scripts that collect user-generated logs and store them in AWS S3 (Simple Storage Service) buckets.
- These logs trace all user activity, which supports security by helping to identify cluster terminations and protect data integrity.
- Worked with CloudFormation, designing a prototype JSON template that automatically creates the services declared in it.
- This approach is preferred in production environments over performing the tasks manually.
- Configured the Apache Hue console and its hive-site.xml property files.
- Applied partitioning and bucketing in Apache Hive to improve retrieval speed for queries.
- Created an AWS RDS (Relational Database Service) instance to migrate the Hive metastore outside the EMR cluster.
- Worked on the DynamoDB NoSQL database as part of scheduling logs using Python scripts.
- Used AWS CodeCommit repositories to store program logic and scripts so they could be reused on new clusters.
Confidential
Hadoop Developer
Responsibilities:
- Worked on several big data ecosystem tools such as Oozie, Sqoop, Spark, Kafka, Flume, Pig, HBase and Hive with CDH5.
- Worked on a Hadoop cluster of 20-30 nodes during the pre-production stage, which was sometimes extended further in production.
- Worked on Spark Streaming and Spark SQL to run sophisticated applications on Hadoop.
- Good hands-on experience with Python; comfortable working on Mac, Windows and Linux.
- Worked on multi-node installation with ZooKeeper ensembles.
- Applied Elastic IP addresses and automated scaling concepts on AWS (Amazon) EC2.
- Developed UDFs using DataFrames/SQL, RDDs/MapReduce and Scala scripts in Spark 1.6 for data aggregation, queries and writing data back into the OLTP system.
- Worked with the encrypted zone directory in HDFS Confidential distribution to generate Data Encryption Keys.
- Wrote Python scripts in Oozie workflows to automate various jobs.
- Developed Spark scripts using Scala shell commands as per requirements.
- Worked on data serialization and Hive serialization formats, converting complex objects into sequences of bits using CSV, Parquet, JSON and Avro formats.
- Created Hive aggregator to update the Hive table after running the data profiling job.
- Implemented bucketing and both dynamic and regular partitioning in Hive.
- Designed and developed POCs in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
- Granted file permissions for access to encrypted files and metadata, controlled through HDFS.
- Documented, developed and captured architectural best practices for building systems on AWS.
- Developed an ETL workflow that pushes web server logs to an Amazon S3 bucket.
Confidential, Nashville, Tennessee
Hadoop Developer
Responsibilities:
- Analyzed business requirements for big data and translated them into Hadoop-centric technologies.
- Implemented custom UDFs for Hive to achieve comprehensive data analysis, including several Java UDFs. Good understanding of Apache Kafka.
- Worked on FastExport, FastLoad and MultiLoad, and transferred data from MySQL into HDFS and Hive using Sqoop.
- Launched different AWS instances such as EC2-Classic and EC2-VPC with CloudFormation templates.
- Developed custom Pig UDFs for performing various levels of optimization in custom input formats.
- Worked on streaming log data into HDFS from web servers using Flume.
- Used Python to embed several applications. Worked with various standard libraries in Python.
- Used Hive and Pig to analyze data in HDFS and identify issues and behavioral patterns.
- Defined static and dynamic partitions and created internal and external Hive tables for optimized performance.
- Created Hive tables to store the processed results in a tabular format.
- Developed Pig scripts on the collected data to run advanced analytics.
- Configured daily workflows for data extraction, processing and analysis using the Oozie Scheduler.
- Designed and implemented a MapReduce-based, large-scale, parallel relation-learning system.
- Created various utility, helper and reusable classes used across all modules of the application.
Environment: Hadoop, MapReduce, MongoDB, HDFS, Hive, Pig, Java, SQL, Sqoop, Oozie, NoSQL, JDK 8, J2EE, JDBC, Java 1.4, Java Spring, Servlets, JSP, Web services, Flume, MVC, HTML, JavaScript 1.2.
Confidential
Java Developer
Responsibilities:
- Involved in resolving business and technical issues.
- Involved in coding, designing, debugging, documenting and maintaining applications.
- Developed web service APIs using Java and Java Spring.
- Developed the front-end GUI with JavaServer Faces.
- Good understanding of distributed computing principles.
- Used Servlets and JSPs to implement the controller and view layers, with EL, JSTL and custom JSP tags.
- Implemented UIs using HTML, CSS and JavaScript.
- Executed test cases manually to verify expected results.
- Implemented various aspects at Service layer using Spring AOP.
- Used Test Director, added test categories and test details.
- Worked on user/business requirements and developed System test plans.
Environment: Servlets, Java (JDK 1.6), JSPs, HTML, Java Beans, JavaScript, CSS, JDBC, SQL, Windows 98, J2EE, Java 1.4, C, C++, PHP, Multi-threading.
Confidential
Java Developer
Responsibilities:
- Created/modified shell scripts for scheduling and automating tasks.
- Used JDBC to establish connection between the database and the application.
- Executed test cases manually to verify expected results.
- Implemented various aspects at Service layer using Spring AOP.
- Used Test Director, added test categories and test details.
- Implemented controller and view modules using Servlets and JSPs respectively.
- Wrote unit test cases using JUnit framework.
- Involved in designing, coding, debugging, documenting and maintaining many applications.
- Created the user interface using HTML, CSS and JavaScript.
Environment: JavaScript, Java Spring, JUnit, JSP, Java Beans, HTML, CSS, Oracle 9i, Java, Servlets.
