Java/J2EE Developer Resume
San Diego, CA
SUMMARY
- 8+ years of experience in analysis, design, software development, and support, with comprehensive experience in Big Data/Hadoop development and ecosystem analytics across industry verticals such as Aviation, Oil & Gas, Pharmaceuticals, and Banking.
- 4+ years of experience with the Hadoop framework and its ecosystem, including HDFS, MapReduce, Spark, Pig, Hive, Oozie, Flume, HCatalog, Sqoop, Impala, Storm, ZooKeeper, and NoSQL stores such as Cassandra and HBase.
- Strong knowledge of Apache Hive development, including writing UDFs, UDAFs, and UDTFs in Java for Hive (an illustrative UDF sketch follows this summary).
- Expert in preparing structured and unstructured data using Big Data ecosystem tools (Pig, Hive, HAWQ, MapReduce).
- Proficient in big data ingestion and streaming tools like Flume, Sqoop, Kafka and Storm
- Strong understanding of and hands-on experience with Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Well versed in MapReduce MRv1 and MRv2 (YARN).
- Hands-on experience developing ETL processes that load data from multiple sources into HDFS using Flume and Sqoop, perform structural transformations using MapReduce and Hive, and analyze data with visualization/reporting tools.
- Broad knowledge of data serialization formats and techniques such as JSON, Hive SerDes, SequenceFiles, and Avro.
- Implemented proofs of concept for running Hadoop MapReduce programs with custom partitioners and combiners, and for migrating data from multiple databases (SQL Server, MySQL) to Hadoop.
- Experience working with Spark Streaming and Spark SQL, and in tuning and debugging Spark applications using Python and Scala.
- Hands-on experience installing, configuring, and using ecosystem components such as Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, Cassandra, Sqoop, Pig, Flume, Avro, and Thrift.
- Documented the data flow from Application > Kafka > Spark > HDFS > Hive tables.
- Experience implementing open-source frameworks such as Struts, Spring, Hibernate, and web services.
- Experience in understanding Hadoop security requirements and integrating with Kerberos authentication and authorization infrastructure.
- Converted XML and JSON files using UDFs, added the required JARs to the Hive library, and imported data into Hive SerDe tables.
- Worked with efficient storage formats such as Parquet, Avro, and ORC, integrated them with Hadoop and the ecosystem (Hive, Impala, and Spark), and used Snappy and Zlib compression.
- Extensive experience in data analysis using tools such as Syncsort and HZ, along with shell scripting and UNIX.
- Good understanding of data mining and machine learning techniques, with experience implementing them using Big Data tools such as Mahout, Spark MLlib, and H2O.
- Familiar with popular frameworks like Struts, Hibernate, Spring MVC and AJAX.
- Well experienced with application servers such as WebLogic and WebSphere, and with Java tools in client/server environments.
- Work experience with cloud infrastructure like Amazon Web Services (AWS).
- Experience with Hadoop shell commands, writing MapReduce programs, and verifying, managing, and reviewing Hadoop log files.
- Solid experience with Agile and Scrum software development methodologies.
- Key strengths include familiarity with multiple software systems, the ability to learn new technologies quickly and adapt to new environments, and being a self-motivated, focused, and adaptive team player with excellent interpersonal, technical, and communication skills.
- Strong communication skills, a solid work ethic, and the ability to work efficiently in a team, with good leadership skills.
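The Hive UDF bullet above refers to the sketch below: a minimal old-style (pre-GenericUDF) Java UDF built on the public org.apache.hadoop.hive.ql.exec.UDF API. The package, class, and function names are hypothetical placeholders, not taken from any project on this resume.

```java
package com.example.hive.udf; // hypothetical package

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Minimal Hive UDF that upper-cases a string column.
 * Registered per session with (names are illustrative):
 *   ADD JAR /path/to/udf.jar;
 *   CREATE TEMPORARY FUNCTION to_upper AS 'com.example.hive.udf.ToUpper';
 */
public class ToUpper extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                                  // propagate NULLs as Hive expects
        }
        return new Text(input.toString().toUpperCase());
    }
}
```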
TECHNICAL SKILLS
Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Impala, Spark, Scala, Avro and Oozie
NoSQL Databases: HBase, Cassandra, MongoDB
Java & J2EE Technologies: Java Servlets, JUnit, Java Database Connectivity (JDBC), J2EE, JSP
IDE Tools: Eclipse, Cygwin, PuTTY
Programming languages: C, C++, Java, Python, Linux shell scripts
Databases: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server, Teradata
Operating Systems: Windows, Macintosh, Ubuntu (Linux), RedHat
Web Technologies: HTML, XML, JavaScript, JSP, JDBC
Testing: Hive testing, Hadoop testing, Quality Center (QC), MRUnit testing, JUnit testing
ETL Tools: Informatica, Pentaho
PROFESSIONAL EXPERIENCE
Big Data Engineer
Confidential, San Ramon, CA
Responsibilities:
- Ingested data using Sqoop and HDFS put/copyFromLocal, and worked on MapReduce jobs.
- Expertise with tools in the Hadoop ecosystem including Pig, Hive, HDFS, MapReduce, Sqoop, Kafka, YARN, Oozie, and ZooKeeper, as well as Hadoop architecture and its components.
- Used Pig for transformations, event joins, filtering bot traffic, and pre-aggregations before storing the data in HDFS.
- Experience with different Hadoop distributions such as Cloudera (CDH4 & CDH5), Hortonworks Data Platform (HDP), and Elastic MapReduce (EMR).
- Worked extensively on creating MapReduce jobs to power data for search and aggregation; designed a data warehouse using Hive.
- Experience writing MapReduce programs with the Java API to cleanse structured and unstructured data.
- Developed multiple POCs using PySpark, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
- Monitored log input from several data centers via Spark Streaming; the data was analyzed in Apache Storm, then parsed and saved into Cassandra.
- Created ETL jobs to load JSON and server data into MongoDB and to move data from MongoDB into the data warehouse.
- Involved in ETL code deployment, Performance Tuning of mappings in Informatica.
- Experienced with performing analytics on Time Series data using HBase.
- Experienced in building and deploying multi-module applications using Maven, integrated with CI servers such as Jenkins, across the Telecom, Medical, and Manufacturing domains/sectors.
- Expertise in cleansing and analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
- Improved stability and performance of the Scala plug-in for Eclipse, using product feedback from customers and internal users.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend.
- Experience with the AWS cloud environment, including S3 storage and EC2 instances.
- Processed real-time data using Spark and connected it to Hive tables to store the real-time data.
- Created ETL Mapping with Talend Integration Suite to pull data from Source, apply transformations, and load data into target database.
- Played a key role in dynamic partitioning and bucketing of the data stored in Hive.
- Implemented a distributed messaging queue to integrate with Cassandra using Apache Kafka and ZooKeeper.
- Analyzed the Cassandra/SQL scripts and designed the solution to be implemented in Scala.
- Installed Kafka on the Hadoop cluster and wrote producer and consumer code in Java to stream data from a Twitter source to HDFS, tracking popular hashtags (a producer sketch follows this section).
- Developed and managed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig.
- Responsible for automating a number of Sqoop, Hive, and Pig scripts using the Oozie workflow scheduler.
- Created a Scala job as a POC to migrate a MapReduce job to Spark RDDs (a Java version of the same idea is sketched after this section).
- Monitored system health, reports, and logs in order to act swiftly in the event of failures and to alert the team about them.
- Utilized Agile Scrum Methodology to help manage and organize developers with regular code review sessions.
Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH3, MongoDB, Cassandra, Oracle, NoSQL, Unix/Linux, AWS, S3, EC2, Talend, Spark, Kafka, SQL, Spark SQL.
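A minimal sketch of the Kafka producer side referenced above, using the standard org.apache.kafka.clients.producer API. The broker address, topic name, and message content are hypothetical; in the actual pipeline the records came from the Twitter stream.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

/** Publishes tweet text keyed by hashtag to a Kafka topic. Broker and topic names are assumptions. */
public class HashtagProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // In the real pipeline the key/value would come from the Twitter source.
            producer.send(new ProducerRecord<>("tweets", "#hadoop", "sample tweet text"));
        }
    }
}
```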
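A sketch of the MapReduce-to-Spark-RDD POC mentioned above, written here with the Spark 2.x Java API for consistency with the other examples (the original job was in Scala). The word-count logic and the HDFS input/output paths are illustrative assumptions.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

/** Word-count POC: the same map/reduce aggregation expressed as Spark RDD transformations. */
public class WordCountRdd {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("mr-to-rdd-poc");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> lines = sc.textFile("hdfs:///data/input");             // hypothetical input path
        JavaPairRDD<String, Integer> counts = lines
                .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator()) // map side: tokenize
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey((a, b) -> a + b);                                  // reduce side: sum counts
        counts.saveAsTextFile("hdfs:///data/output");                           // hypothetical output path

        sc.stop();
    }
}
```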
Big Data Developer & Analyst
Confidential, Baltimore, MD
Responsibilities:
- Worked in an Agile methodology, participated in daily/weekly team meetings, peer-reviewed development work, and provided technical solutions.
- Proposed ETL strategies based on requirements.
- Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON, Compressed CSV, etc.
- Implemented secondary sorting to sort reducer output globally in MapReduce (a composite-key sketch follows this section).
- Implemented a data pipeline by chaining multiple mappers using ChainMapper.
- Created Hive Dynamic partitions to load time series data. Wrote complex Hive queries and UDFs.
- Experienced in handling different types of joins in Hive, such as map joins, bucket map joins, and sorted bucket map joins.
- Involved in configuring a multi-node, fully distributed Hadoop cluster.
- Installed and configured Hive and wrote Hive UDFs that helped spot market trends.
- Extracted and parsed RDF data from an ontology system called Semaphore using a Java API called Sesame, which was used to process unstructured resources and build NLP capabilities.
- Loaded the final data from both structured and unstructured resources into a Neo4j graph database to facilitate search capabilities on a graph data store.
- Worked on a variety of file formats such as Avro, RCFile, Parquet, and SequenceFile, as well as compression techniques.
- Wrote Python scripts to parse XML documents and load the data into HBase.
- Installed, configured, and administered Hadoop clusters on major distributions such as Cloudera Enterprise (CDH3 and CDH4) and Hortonworks Data Platform (HDP1 and HDP2).
- Trained and mentored the analyst and test teams on the Hadoop framework, HDFS, MapReduce concepts, and the Hadoop ecosystem.
- Loaded data from various data sources into HDFS using Kafka.
- Gained strong business knowledge of the different categories of products and designs.
- Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data coming from various sources (an HBase write sketch follows this section).
Environment: MapReduce, HDFS, Sqoop, Flume, Linux, Oozie, Hadoop, Pig, Hive, HBase, Cassandra, Hadoop Cluster, Amazon Web Services, Unix
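A sketch of the secondary-sort pattern referenced above: a composite key plus a natural-key partitioner and grouping comparator, using the standard org.apache.hadoop.mapreduce API. The field names (naturalKey, timestamp) are illustrative assumptions, not taken from the actual job.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.mapreduce.Partitioner;

public class SecondarySort {

    /** Composite key: the natural key plus the field the reducer input should be sorted on. */
    public static class CompositeKey implements WritableComparable<CompositeKey> {
        private String naturalKey = "";
        private long timestamp;

        public void set(String key, long ts) { naturalKey = key; timestamp = ts; }
        public String getNaturalKey() { return naturalKey; }

        @Override public void write(DataOutput out) throws IOException {
            out.writeUTF(naturalKey);
            out.writeLong(timestamp);
        }
        @Override public void readFields(DataInput in) throws IOException {
            naturalKey = in.readUTF();
            timestamp = in.readLong();
        }
        // Full ordering: natural key first, then timestamp, so each group arrives sorted.
        @Override public int compareTo(CompositeKey o) {
            int cmp = naturalKey.compareTo(o.naturalKey);
            return cmp != 0 ? cmp : Long.compare(timestamp, o.timestamp);
        }
    }

    /** Partition on the natural key only, so every record for a key reaches the same reducer. */
    public static class NaturalKeyPartitioner extends Partitioner<CompositeKey, Text> {
        @Override public int getPartition(CompositeKey key, Text value, int numPartitions) {
            return (key.getNaturalKey().hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    /** Group reducer input on the natural key only, ignoring the timestamp. */
    public static class NaturalKeyGroupingComparator extends WritableComparator {
        public NaturalKeyGroupingComparator() { super(CompositeKey.class, true); }
        @Override public int compare(WritableComparable a, WritableComparable b) {
            return ((CompositeKey) a).getNaturalKey().compareTo(((CompositeKey) b).getNaturalKey());
        }
    }

    // Job wiring (in the driver):
    //   job.setPartitionerClass(NaturalKeyPartitioner.class);
    //   job.setGroupingComparatorClass(NaturalKeyGroupingComparator.class);
}
```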
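A minimal sketch of writing one semi-structured record into HBase with the HBase 1.x client API, as referenced in the last bullet of this section. The table name, column family, row key, and values are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

/** Writes one record into an HBase table; all names are placeholder assumptions. */
public class HBaseWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("events"))) {
            Put put = new Put(Bytes.toBytes("row-20160101-0001"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("source"), Bytes.toBytes("feed-a"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes("{\"k\":\"v\"}"));
            table.put(put);
        }
    }
}
```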
Big Data Engineer
Confidential, Cincinnati, OH
Responsibilities:
- Played a key role in requirements discussions and in analysis of the entire system, along with estimation, development, and testing, keeping BI requirements in mind.
- Extensive working experience with Hadoop ecosystem components such as MapReduce (MRv1 and MRv2), Hive, Pig, Sqoop, Oozie, Kafka, and ZooKeeper.
- Installed and configured MapReduce, Hive, and HDFS; implemented a CDH4 Hadoop cluster on CentOS and assisted with performance tuning and monitoring.
- Strong experience working with BI tools such as Tableau, QlikView, and Pentaho, applying them in the Big Data ecosystem as requirements demanded.
- Knowledge of tools such as Spark, Kafka, Storm, Hue, ZooKeeper, and the Solr search engine.
- Created Hive tables to load large sets of structured, semi-structured, and unstructured data coming from UNIX, NoSQL, and a variety of portfolios.
- Created custom functions for several data type conversions and for handling errors in the data provided by the vendor.
- Good experience developing web application front ends/UI using web technologies such as HTML, XHTML, CSS, JavaScript, jQuery, JSON, XML, and AJAX.
- Wrote Hive queries joining multiple tables based on business requirements.
- Developed and implemented MapReduce programs to analyze Big Data in different file formats, both structured and unstructured.
- Good experience working with analysis tools such as Tableau for regression analysis, pie charts, and bar graphs.
- Analyzed data through Tableau Server, connecting directly to the Hive tables in HDFS.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Involved in building applications using Maven and integrating with CI servers such as Jenkins to build jobs.
- Involved in Agile methodologies, daily scrum meetings, and sprint planning.
- Imported log data from different servers into HDFS using Flume and developed MapReduce programs to analyze the data (a log-analysis MapReduce sketch follows this section).
- Used different file formats such as text files, SequenceFiles, and Avro.
- Implemented Storm integration with Kafka and ZooKeeper for the processing of real time data.
- Worked with efficient storage formats such as Parquet, Avro, and ORC, integrated them with Hadoop and the ecosystem (Hive, Impala, and Spark), and used compressions such as Snappy and Zlib.
- Excellent understanding and knowledge of NoSQL databases such as HBase, Cassandra, and MongoDB, as well as Teradata and data warehousing.
- Expertise in working with different databases, like Oracle, MS-SQL Server, Postgres, and MS Access 2000 along with exposure to Hibernate for mapping an object-oriented domain model to a traditional relational database.
- Responsible for writing Pig scripts and Hive queries for data processing.
- Used the Oozie workflow engine to run multiple Hive and Pig jobs that run independently based on time and data availability.
Environment: Hadoop, HDFS, Hive, Flume, Sqoop, HBase, PIG, Eclipse, MySQL and Ubuntu, Zookeeper, Java (JDK 1.6)
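A sketch of the kind of log-analysis MapReduce job referenced above, counting lines per log level with the standard org.apache.hadoop.mapreduce API. The assumed log layout (the level as the first whitespace-separated token) and the command-line I/O paths are illustrative, not taken from the actual project.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Counts log lines per level (INFO/WARN/ERROR), assuming "LEVEL rest-of-line" records. */
public class LogLevelCount {

    public static class LevelMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text level = new Text();

        @Override protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] parts = value.toString().split("\\s+", 2);
            if (parts.length > 0 && !parts[0].isEmpty()) {
                level.set(parts[0]);              // first token assumed to be the log level
                ctx.write(level, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log-level-count");
        job.setJarByClass(LogLevelCount.class);
        job.setMapperClass(LevelMapper.class);
        job.setCombinerClass(SumReducer.class);    // safe combiner: the sum is associative
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));     // e.g. the Flume landing directory
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```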
Java/J2EE Developer
Confidential, San Diego, CA
Responsibilities:
- Designed the application using J2EE design patterns such as Session Facade, Business Delegate, Service Locator, Value Object, Value List Handler, and Singleton (a Service Locator/Singleton sketch follows this section).
- Developed Use case diagrams, Object diagrams, Class diagrams, and Sequence diagrams using UML.
- Developed the presentation tier as HTML and JSPs using the Struts framework.
- Developed the middle tier using EJBs.
- Developed session, entity, and message-driven beans.
- Used entity beans to access data from the SQL Server database.
- Prepared high- and low-level design documents for the business modules for future reference and updates.
- Deployed the application on the WebSphere application server in development and production environments.
- Undertook the Integration and testing of the different parts of the application.
- Developed automated Build files using ANT.
- Used Subversion for version control and log4j for logging errors.
- Performed code walkthroughs and prepared test cases and test plans.
Environment: Java, Java Swing, JSP, Servlets, JDBC, Applets, JCE 1.2.1, RMI, EJB, XML/XSL, VisualAge for Java (VAJ), Visual C++, J2EE
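A minimal sketch combining the Singleton and Service Locator patterns listed above: a singleton that caches JNDI lookups of EJB references. The lookup names would be whatever the beans were bound to on WebSphere; nothing here reflects the actual application's names.

```java
import java.util.HashMap;
import java.util.Map;

import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

/** Singleton Service Locator that caches JNDI lookups; JNDI names are hypothetical. */
public final class ServiceLocator {
    private static final ServiceLocator INSTANCE = new ServiceLocator();

    private final Map<String, Object> cache = new HashMap<String, Object>();
    private final Context context;

    private ServiceLocator() {
        try {
            context = new InitialContext();   // container-provided naming context
        } catch (NamingException e) {
            throw new IllegalStateException("Unable to create InitialContext", e);
        }
    }

    public static ServiceLocator getInstance() { return INSTANCE; }

    /** Looks up a service by JNDI name, caching the result to avoid repeated lookups. */
    public synchronized Object lookup(String jndiName) throws NamingException {
        Object service = cache.get(jndiName);
        if (service == null) {
            service = context.lookup(jndiName);
            cache.put(jndiName, service);
        }
        return service;
    }
}
```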
Java Developer
Confidential
Responsibilities:
- Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development, and unit testing.
- Developed and deployed UI layer logics of sites using JSP, XML, JavaScript, HTML/DHTML, and Ajax.
- CSS (Cascading style sheet) and JavaScript were used to build rich internet pages.
- Created design specifications for application development, covering the front end and back end, using design patterns.
- Developed prototype test screens in HTML and JavaScript.
- Involved in developing JSPs for client data presentation and client-side data validation within the forms.
- Developed the application by using the Spring MVC framework.
- Used the Java Collections framework to transfer objects between the different layers of the application.
- Developed data mapping to create a communication bridge between various application interfaces using XML and XSL.
- Used Spring IoC to inject values for dynamic parameters.
- Developed JUnit tests for unit-level testing.
- Actively involved in code review and bug fixing for improving the performance.
- Documented application for its functionality and its enhanced features.
- Created connections through JDBC and used JDBC callable statements to invoke stored procedures (a sketch follows this section).
Environment: Spring MVC, Oracle 9i, J2EE, Java, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, MS SQL Server 2008.
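A minimal sketch of calling a stored procedure through a JDBC CallableStatement, as mentioned above. The connection URL, credentials, procedure name, and parameters are hypothetical placeholders.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

/** Calls a stored procedure over JDBC; all connection details and names are assumptions. */
public class StoredProcedureClient {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@dbhost:1521:orcl";       // assumed connection string
        try (Connection conn = DriverManager.getConnection(url, "appuser", "secret");
             CallableStatement stmt = conn.prepareCall("{call UPDATE_ACCOUNT_STATUS(?, ?)}")) {
            stmt.setLong(1, 1001L);                              // IN parameter: account id
            stmt.registerOutParameter(2, Types.VARCHAR);         // OUT parameter: resulting status
            stmt.execute();
            System.out.println("New status: " + stmt.getString(2));
        }
    }
}
```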