Hadoop/Java Developer Resume
Sunnyvale, CA
PROFESSIONAL SUMMARY:
- 7+ years of IT experience covering analysis, design, development, implementation, and maintenance of Big Data projects using the Apache Hadoop/Spark ecosystems, along with design and development of web applications using Java technologies.
- Experience in analysis, design, development, and integration using Big Data Hadoop ecosystem components on Cloudera, working with various file formats such as Avro and Parquet.
- Worked with various compression codecs such as Snappy, LZO, and GZip.
- Experience in developing custom partitioners and combiners for effective data distribution.
- Expertise in tuning Impala queries to cope with highly concurrent jobs and out-of-memory errors for various analytics use cases.
- Rigorously applied transformations in Spark and R programs.
- Worked with an AWS cloud-based CDH 5.13 cluster and developed merchant campaigns using Hadoop.
- Developed and maintained ETL (Data Extraction, Transformation and Loading) mappings using Informatica Designer 8.6 to extract the data from multiple source systems that comprise databases like Oracle 10g, SQL Server 7.2, flat files to the Staging area, EDW and then to the Data Marts.
- Expertise in using built-in Hive SerDes and developing custom SerDes.
- Developed multiple internal and external Hive tables using dynamic partitioning and bucketing (see the sketch at the end of this summary).
- Designed and developed a full-text search feature with multi-tenant Elasticsearch after collecting real-time data through Spark Streaming.
- Used Erwin for logical and physical database modelling of the warehouse; responsible for creating database schemas based on the logical models.
- Wrote multiple customized MapReduce programs for various input file formats.
- Experience in developing NoSQL applications using MongoDB, HBase, and Cassandra.
- Tuned multiple Spark applications for better performance.
- Developed data pipelines for real-time use cases using Kafka, Flume, and Spark Streaming.
- Experience in importing and exporting multi-Terabytes of data using Sqoop from HDFS, Hive to Relational Database Systems (RDBMS) and vice-versa.
- Developed multiple Hive views for accessing HBase table data.
- Wrote complex Spark SQL programs for efficient joins and displayed the results on a Kibana dashboard.
- Expertise in using various formats like Text, Parquet while creating Hive Tables.
- Experience in analysing large scale data to identify new analytics, insights, trends, and relationships with a strong focus on data clustering.
- Expertise in collecting data from various source systems such as social media and databases.
- End-to-end hands-on involvement in the ETL process and automation setup to load terabytes of data into HDFS.
- Good experience in developing applications using core Java, Collections, Threads, JDBC, Servlets, JSP, Struts, Hibernate, and XML components, using various IDEs such as Eclipse 6.0 and MyEclipse.
- Experience in SQL programming: writing queries using joins, stored procedures, triggers, and functions, and applying query optimization techniques with Oracle, SQL Server, and MySQL.
- Excellent team worker with good interpersonal skills and leadership qualities.
- Excellent organizational and communication skills.
- Excellent understanding of Agile and Scrum methodologies.
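Below is a minimal illustrative sketch of the dynamic partitioning and bucketing approach mentioned above, expressed as HiveQL run over the standard Hive JDBC driver; the table names, column names, and HiveServer2 endpoint are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HivePartitioningSketch {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC driver (hive-jdbc must be on the classpath)
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://hive-host:10000/default", "etl_user", "");
             Statement stmt = conn.createStatement()) {

            // Allow partitions to be created dynamically at insert time
            stmt.execute("SET hive.exec.dynamic.partition=true");
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");

            // External table over raw files already landed in HDFS
            stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS sales_raw ("
                    + " customer_id BIGINT, amount DOUBLE, load_dt STRING)"
                    + " ROW FORMAT DELIMITED FIELDS TERMINATED BY ','"
                    + " LOCATION '/data/raw/sales'");

            // Managed table, partitioned by load date and bucketed by customer id
            stmt.execute("CREATE TABLE IF NOT EXISTS sales ("
                    + " customer_id BIGINT, amount DOUBLE)"
                    + " PARTITIONED BY (load_dt STRING)"
                    + " CLUSTERED BY (customer_id) INTO 32 BUCKETS"
                    + " STORED AS PARQUET");

            // Dynamic-partition load from the external staging table
            stmt.execute("INSERT OVERWRITE TABLE sales PARTITION (load_dt)"
                    + " SELECT customer_id, amount, load_dt FROM sales_raw");
        }
    }
}
```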
TECHNICAL SKILLS:
Programming Languages: C, C++, Java, Scala, R, Python, UNIX
Distributed Computing: Apache Hadoop, HDFS, MapReduce, Pig, Hive, Oozie, Hue, Kerberos, Sentry, Zookeeper, Kafka, Flume, Impala, HBase and Sqoop
AWS Components: EC2, S3, RDS, Redshift, EMR, DynamoDB, Lambda, SNS, SQS
Web Development: HTML, JSP, XML, Java Script and AJAX
Web Application Server: Tomcat 6.0, JBoss 4.2 and WebLogic 8.1
Operating Systems: Windows, Unix, iOS, Ubuntu and RedHat Linux
Tools: Eclipse, NetBeans, Visual Studio, Agitator, Bugzilla, ArcStyler (MDA), Rational Rose, Enterprise Architect and Rational Software Architect
Source Control Tools: VSS, Rational Clear Case, Subversion
Application Framework: Struts 1.3, Spring 2.5, Hibernate 3.3, Jasper Reports, JUnit and JAXB
RDBMS: Oracle and SQL Server 2016
NOSQL: MongoDB, Cassandra and HBase
PROFESSIONAL EXPERIENCE:
Confidential, Sunnyvale, CA
Hadoop/Java Developer
Responsibilities:
- Involved in creating Hive tables, loading them with data, and writing Hive queries to process the data.
- Developing and maintaining Workflow Scheduling Jobs in Oozie for importing data from RDBMS to Hive.
- Implemented Partitioning, Bucketing in Hive for better organization of the data.
- Worked with the team on fetching live stream data from Kafka into HBase tables using Spark Streaming.
- Developed a Spark Streaming program for importing data from Kafka topics into HBase tables (see the sketch below).
- Involved in the Hadoop Configuration changes.
- Involved in the Design Phase for getting live event data from the database to the front-end application.
- Made backend code changes and handled deployment to production.
- Analysed live stream data from Kafka and monitored consumer lag.
- Resolved UI issues in the front-end application, such as missing or duplicated data.
- Used Apache Phoenix for queries.
- Managed the flow of data through Kafka and Flume.
- Analysed data at different stages in SAP and HBase.
- Imported data from Hive tables and ran SQL queries over the imported data and existing RDDs using Spark SQL.
- Responsible for loading and transforming large sets of structured, semi-structured, and unstructured data.
- Used Eclipse IDE for development; configured and deployed the application to the web server.
- Handled Flume deployment and troubleshooting.
- Used Jira to track the user stories and defects with agile technology.
- Worked on NoSQL (MongoDB) concepts such as transactions, indexes, replication, and schema design.
- Wrote application-level code to perform client-side validation using jQuery and JavaScript.
- Performed bulk loads using MapReduce jobs.
- Worked on the existing Apache Solr implementation.
- Collected the log data from web servers and integrated into HDFS using Flume.
- Responsible for managing data coming from different sources.
- Used log4j for logging errors in the application.
Environment: Eclipse, JDK 1.8.0, Cloudera Manager 5.14, HDFS, MapReduce, Hive 2.0, HBase, Apache Maven 3.0.3, MongoDB, Splunk 6.0, SAP, JIRA, Kubernetes, Microservices.
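A minimal sketch of the Kafka-to-HBase streaming ingest described above, using the Spark Streaming Kafka 0-10 direct API and the HBase client. The broker address, topic, table, and column family names are hypothetical, and messages are assumed to carry a non-null key usable as the HBase row key.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHBaseSketch {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-to-hbase");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");   // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "event-ingest");

        JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("events"), kafkaParams));

        // Write each micro-batch to HBase; one connection per partition
        stream.foreachRDD(rdd -> rdd.foreachPartition(records -> {
            try (Connection hbase = ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Table table = hbase.getTable(TableName.valueOf("events"))) {
                while (records.hasNext()) {
                    ConsumerRecord<String, String> rec = records.next();
                    Put put = new Put(Bytes.toBytes(rec.key()));   // message key assumed non-null
                    put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"),
                                  Bytes.toBytes(rec.value()));
                    table.put(put);
                }
            }
        }));

        jssc.start();
        jssc.awaitTermination();
    }
}
```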
Confidential, Atlanta, GA
Hadoop/Spark Developer
Responsibilities:
- End-to-end involvement in data ingestion, cleansing, and transformation in Hadoop.
- Created Hive tables and loaded and transformed large sets of structured and semi-structured data.
- Logical implementation and interaction with HBase.
- Developed multiple Scala/Spark jobs for data transformation and aggregation.
- Implemented Pig scripts to load data from and store data into Hive using HCatalog.
- Wrote scripts to automate application deployments and configurations; monitored YARN applications; troubleshot and resolved cluster-related system problems.
- Optimized existing Word2Vec algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs while developing a chatbot with OpenNLP and Word2Vec.
- Produced unit tests for Spark transformations and helper methods.
- Implemented various output formats such as SequenceFile and Parquet in MapReduce programs; also implemented multiple output formats in the same program to match the use cases.
- Designed and developed a data pipeline using Kafka, Flume, and Spark Streaming.
- Performed benchmarking of the NoSQL databases Cassandra and HBase.
- Hands on experience with Lambda architectures.
- Created data model for structuring and storing the data efficiently. Implemented partitioning and bucketing of tables in Cassandra.
- Implemented test scripts to support test driven development and continuous integration.
- Converted text files to Avro and then to Parquet format so the files could be used with other Hadoop ecosystem tools.
- Experienced in handling large datasets using partitions, Spark in-memory capabilities, broadcast variables, and effective and efficient joins and transformations during the ingestion process itself (see the sketch below).
- Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Created HBase tables to load large sets of structured, semi-structured, and unstructured data.
- Used Impala to read, write and query the data in HDFS from Cassandra and configured Kafka to read and write messages from external programs.
- Handled importing data from various data sources (media, MySQL) and performed transformations using Hive and MapReduce.
- Ran Pig scripts in local, pseudo-distributed, and fully distributed modes during various stages of testing.
- Performed Importing and exporting data from SQL server to HDFS and Hive using Sqoop.
- Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Created a complete processing engine based on the Cloudera distribution, with enhanced performance.
- Developed a data pipeline using Flume, Spark, and Hive to ingest, transform, and analyse data.
- Wrote Scaladoc-style documentation with all code.
- Designed and modified database tables and used HBase queries to insert and fetch data from the tables.
- Developed and supported multiple Spark programs running on the cluster.
- Prepared technical architecture and low-level design documents.
- Tested raw data and executed performance scripts.
Environment: Linux, Eclipse, JDK 1.8.0, Hadoop 2.9.0, Flume 1.7.0, HDFS, MapReduce, Pig 0.16.0, Spark 2.0, Hive 2.0, Apache Maven 3.0.3
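A brief sketch of the ingestion-time join and Parquet layout described above. The paths and the merchant_id/event_date columns are hypothetical, and the dimension table is assumed small enough to broadcast.

```java
import static org.apache.spark.sql.functions.broadcast;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class IngestJoinSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("ingest-join").getOrCreate();

        // Large fact data already converted to Parquet during ingestion
        Dataset<Row> events = spark.read().parquet("/data/parquet/events");

        // Small dimension table: broadcast it so the join avoids a shuffle
        Dataset<Row> merchants = spark.read().parquet("/data/parquet/merchants");

        Dataset<Row> enriched = events.join(broadcast(merchants), "merchant_id");

        // Repartition by date before writing partitioned Parquet output
        enriched.repartition(enriched.col("event_date"))
                .write()
                .partitionBy("event_date")
                .parquet("/data/parquet/enriched_events");

        spark.stop();
    }
}
```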
Confidential, Bloomington, IL
Hadoop Developer
Responsibilities:
- Worked on improving the performance of existing Pig and Hive Queries.
- Developed Oozie workflow engines to automate Hive and Pig jobs.
- Worked on performing Join operations.
- Exported the result set from Hive to MySQL using Sqoop after processing the data.
- Analysed the data by performing Hive queries and running Pig scripts to study customer behaviour.
- Used Hive to partition and bucket data.
- Performed various source data ingestions, cleansing, and transformation in Hadoop.
- Designed and developed many Spark programs using PySpark.
- Produced unit tests for Spark transformations and helper methods.
- Created RDDs and pair RDDs for Spark programming.
- Debugged existing ETL processes and did performance tuning to fix bugs.
- Interacted with third party vendors and identified different external and internal homogenous and heterogeneous sources and extracted and integrated data from flat files, Oracle, SQL Server sources and loaded to staging area and database/Datamart tables.
- Coordinated monthly roadmap releases to push enhanced/new Informatica code to production.
- Integrated data quality plans as a part of ETL processes.
- Experienced in performance tuning of Spark applications: setting the right batch interval, the correct level of parallelism, and memory tuning (see the sketch below).
- Developed Pig Scripts to perform ETL procedures on the data in HDFS.
- Analysed the partitioned and bucketed data and computed various metrics for reporting.
- Created HBase tables to store various data formats of data coming from different systems.
- Advanced knowledge in performance troubleshooting and tuning Cassandra clusters.
- Analysed the source data to assess data quality using Talend Data Quality.
- Created Scala/Spark jobs for data transformation and aggregation.
- Involved in creating Hive tables, loading with data, and writing hive queries.
- Used Impala to read, write and query the Hadoop data in HDFS from Cassandra and configured Kafka to read and write messages from external programs.
- Preparation of Technical architecture and Low-level design documents
- Tested raw data and executed performance scripts.
Environment: Eclipse, JDK 1.8.0, Hadoop 2.8, HDFS, MapReduce, Spark 2.0, Pig 0.15.0, Hive 2.0, HBase, Apache Maven 3.0.3
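A minimal sketch of the Spark Streaming tuning knobs mentioned above (batch interval, parallelism, memory). The specific values are illustrative only, and the stream sources and output operations are omitted.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class StreamingTuningSketch {
    public static void main(String[] args) {
        // Values below are illustrative; in practice they were sized to the cluster
        SparkConf conf = new SparkConf()
                .setAppName("streaming-tuning-sketch")
                // Parallelism roughly matched to total executor cores
                .set("spark.default.parallelism", "48")
                .set("spark.sql.shuffle.partitions", "48")
                // Fraction of heap shared by execution and storage memory
                .set("spark.memory.fraction", "0.6")
                // Let Spark throttle Kafka ingest so batches finish within the interval
                .set("spark.streaming.backpressure.enabled", "true")
                .set("spark.streaming.kafka.maxRatePerPartition", "1000");

        // Batch interval chosen so average batch processing time stays below it
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        // ... stream sources and output operations would be registered here,
        //     followed by jssc.start() and jssc.awaitTermination() ...

        jssc.stop();
    }
}
```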
Confidential
Java Developer
Responsibilities:
- Actively involved in writing SQL using SQL Query Builder.
- Used JAXB to read and manipulate XML properties (see the sketch at the end of this project).
- Used JNI for calling libraries and other functions implemented in C.
- Handled server-related issues, new requirements, changes, and patch movements.
- Involved in developing business logic for skip pay module.
- Understanding the requirements based on the design.
- Coded the business logic methods in core Java.
- Involved in development of the Action classes and ActionForms based on the Struts framework.
- Participated in client-side validation and server-side validation.
- Involved in creation of the Struts configuration file and validation file for the skip module using the Struts framework.
- Developed Java programs, JSP pages, and servlets using the Spring framework.
- Involved in creating database tables and writing complex T-SQL queries and stored procedures in SQL Server.
- Worked with AJAX framework to get the asynchronous response for the user request and used JavaScript for the validation.
- Used EJBs in the application and developed Session beans to implement business logic at the middle tier level.
- Developed the Restful Web Services for various XSD schemas.
- Used Servlets to implement Business components.
- Designed and developed required Manager Classes for database operations.
- Developed various Servlets for monitoring the application.
- Designed the UML class diagram, Sequence diagrams for Trade Services.
- Designed the complete Hibernate mapping for SQL Server for PDM.
- Designed the complete JAXB classes mapping for various XSD schemas.
- Involved in writing JUnit test Classes for performing Unit testing.
- Developed a uniform invoice standard across various customers.
- Automated generation of invoices for fixed-price contracts once deliverables have been transmitted to sponsor agencies.
- Preparation of Technical architecture and Low-level design documents.
- Tested raw data and executed performance scripts.
Environment: Eclipse Neon, JDK 1.8.0, Java, Servlets, JSP, EJB, XML, SQL Server, Spring, JUnit, SQL, UNIX, UML, Apache Maven 3.0.3
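A minimal sketch of the JAXB read/manipulate/write cycle described above; the Invoice class is a hypothetical stand-in for the classes generated from the XSD schemas.

```java
import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.annotation.XmlRootElement;

public class JaxbSketch {

    // Hypothetical stand-in for a class generated from one of the XSD schemas
    @XmlRootElement
    public static class Invoice {
        private String status;
        public String getStatus() { return status; }
        public void setStatus(String status) { this.status = status; }
    }

    public static void main(String[] args) throws Exception {
        JAXBContext ctx = JAXBContext.newInstance(Invoice.class);

        // Read (unmarshal) the XML document into the bound Java object
        Unmarshaller unmarshaller = ctx.createUnmarshaller();
        Invoice invoice = (Invoice) unmarshaller.unmarshal(new File("invoice.xml"));

        // Manipulate the bound properties
        invoice.setStatus("TRANSMITTED");

        // Write (marshal) the updated object back out as formatted XML
        Marshaller marshaller = ctx.createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
        marshaller.marshal(invoice, new File("invoice-updated.xml"));
    }
}
```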
Confidential
Java Developer
Responsibilities:
- Improved code quality to a commendable level.
- Interacted with business and reporting teams for requirement gathering, configuration walkthroughs, and UAT.
- Developed the Restful Web Services for various XSD schemas.
- Used Servlets to implement Business components.
- Designed and developed required Manager Classes for database operations.
- Developed various Servlets for monitoring the application.
- Designed and developed the front-end using HTML and JSP.
- Involved in eliciting the requirement, use case modelling, design, leading the development team and tracking the project status.
- Understanding Functional and Technical Requirement Specification of the customer’s need for the product.
- Impact analysis of the application under development.
- Creating UML Diagrams like Class Diagram, Sequence Diagram and Activity Diagram.
- Developing application as per given specifications.
- Created user-based modules implementing different levels of security.
- Involved in designing of Front-end, Implementing Functionality with Business Logic.
- Mentoring and grooming juniors technically as well as professionally on Agile practices and Java/J2EE development issues.
- Allowed department access to accounts receivable information, subject to security and authorization constraints.
- Involved in designing and Code Reviews.
- Assisted in troubleshooting architectural problems.
- Designed and developed report generation using the Velocity framework.
- Generating monthly Sales report, Collection Report and Receivable-aging reports.
- Prepared use cases designed and developed object models and class diagrams.
- Worked on one of the most critical modules for project, right from the beginning phase which included requirement gathering, analysis, design, review, and development.
- Had roughly two weeks of knowledge transfer from the module lead, who was based at another location and was later absorbed by the client.
- Took the initiative in building a new team of more than 6 members, running proper knowledge transfer sessions and assigning and managing tasks with JIRA.
- Learned Backbone.js and worked with the UI team on UI enhancements.
- Worked with BA and QA teams to identify and fix bugs and to raise new features and enhancements.
- Greatly appreciated by the client, receiving an appreciation certificate and client bonuses of 10k and 50k, respectively.
- Analysed the generated JUnit tests, added proper asserts, and made them more code-specific while increasing code coverage; this helped boost my product knowledge as well as my JUnit writing skills (see the sketch below).
- Addressed issues related to application integration and compatibility.
- Performed enhancements to the User Group Tree using DOJO and JSON to meet business needs.
Environment: Eclipse Neon, JDK 1.8.0, Java, Servlets, JSP, EJB, XML, SQL Server, Spring, JUnit, SQL, UNIX, UML, Apache Maven 3.0.3
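A minimal sketch of tightening a generated JUnit test with specific, value-based asserts as described above; the InvoiceCalculator class under test is hypothetical and shown inline to keep the sketch self-contained.

```java
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class InvoiceCalculatorTest {

    @Test
    public void totalIncludesTaxForFixedPriceContract() {
        InvoiceCalculator calc = new InvoiceCalculator(0.10); // 10% tax rate

        double total = calc.total(1000.00);

        // Specific, value-based asserts instead of a bare "no exception thrown" check
        assertEquals(1100.00, total, 0.001);
        assertTrue(total > 0);
    }

    // Hypothetical class under test
    static class InvoiceCalculator {
        private final double taxRate;
        InvoiceCalculator(double taxRate) { this.taxRate = taxRate; }
        double total(double amount) { return amount * (1 + taxRate); }
    }
}
```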