Hadoop Engineer/ Java Developer Resume
Ridgeland, MS
SUMMARY
- Overall 8+ years of experience in IT, including 4+ years of experience using Apache Hadoop and Spark to analyze Big Data as per requirements.
- In-depth understanding of Hadoop architecture and its components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, Resource Manager, Node Manager, MapReduce programs and the YARN paradigm.
- Working experience building and supporting large-scale Hadoop environments, including design, configuration, installation, performance tuning and monitoring.
- Good exposure to Apache Hadoop MapReduce programming, Hive, Pig scripting and HDFS.
- Hands-on experience in installing, configuring, monitoring and using Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Hive, Sqoop, Pig, ZooKeeper, Hortonworks, Flume, Kafka, Oozie, Elasticsearch, Apache Spark, Impala, R, QlikView.
- Hands-on experience in the testing and implementation phases of big data technologies.
- Strong experience in writing MapReduce programs for data analysis. Hands-on experience in writing custom partitioners for MapReduce (see the sketch after this list).
- Experience with distributed systems, large-scale non-relational data stores, RDBMS, NoSQL map-reduce systems, data modeling, database performance, and multi-terabyte data warehouses.
- Experience in Software Development Life Cycle (Requirements Analysis, Design, Development, Testing, Deployment and Support).
- Experience in Hadoop administration; responsibilities include software installation, configuration, upgrades, backup and recovery, cluster setup, daily cluster performance monitoring, and keeping the cluster up and healthy.
- Efficient in writing MapReduce Programs and using Apache Hadoop API for analyzing the structured and unstructured data.
- Worked on developing ETL processes to load data from multiple data sources to HDFS using Flume and Sqoop, performed structural modifications using MapReduce and Hive, and analyzed data using visualization/reporting tools.
- Hands on experience in application development and database management using the technologies JAVA, RDBMS, Linux/Unix shell scripting and Linux internals.
- Good knowledge of the real-time data feed platform Kafka.
- Experience with integration software such as Talend.
- Experience with Amazon Redshift, which analyzes data sets of virtually any size using the same SQL-based business intelligence tools in use today.
- Solid understanding of open source monitoring tools: Nagios, Ganglia, Cloudera Manager
- Analyze and develop Transformation logic for handling large sets of structured, semi structured and unstructured data using Hive.
- Experience with ETL tools, like Informatica, Talend.
- Developed Scala programs to perform data scrubbing for unstructured data.
- Good understanding of HDFS Designs, Daemons, HDFS high availability (HA).
- Experience in using and understanding Pig, Hive and HBase, including Hive built-in functions.
- Excellent understanding and knowledge of NoSQL databases like MongoDB, HBase and Cassandra.
- Having Big Data related technology experience in Storage, Querying, Processing and analysis of data.
- Good experience in installing, configuring, and using Hadoop ecosystem components like HDFS, MapReduce, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, Flume, Kafka and Apache Spark.
- Experience in understanding and managing log files, and in managing the Hadoop infrastructure with Cloudera Manager.
- Good experience in Hive partitioning and bucketing and in performing different types of joins on Hive tables.
- Capable of building Hive, Pig and MapReduce scripts, and of adapting to and learning new tools, techniques, and approaches.
- Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modeling and data mining, machine learning and advanced data processing. Experience optimizing ETL workflows.
- Evaluated ETL and OLAP tools and recommended the most suitable solutions based on business needs.
- Working knowledge of machine learning.
- Knowledge of Splunk.
- Familiar with data warehousing and ETL tools like Informatica and Pentaho.
- Established and maintained comprehensive data model documentation including detailed descriptions of business entities, attributes, and data relationships.
- Familiar with Core Java, with a strong understanding and working knowledge of object-oriented concepts like Collections, Multithreading, Data Structures, Algorithms, Exception Handling and Polymorphism, as well as data mining tools including Eclipse, Weka, R and NetBeans.
- Good knowledge and experience in Core Java, JSP, Servlets, Multi-Threading, JDBC, HTML.
- Experience in using Sqoop to import and export data between HDFS and RDBMS.
- Experience working in 24x7 support; used to meeting deadlines and adaptable to ever-changing priorities.
- Excellent interpersonal and communication skills; creative, research-minded, technically competent and results-oriented, with strong problem-solving and leadership skills and the ability to work well with people and maintain good relationships within the organization.
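As a minimal sketch of the custom partitioner work mentioned above, the Java class below routes records to reducers by a key prefix; the class, key format and field names are illustrative assumptions, not taken from any specific project on this resume.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

/**
 * Hypothetical custom partitioner: keys that share a region prefix
 * (e.g. "US-", "EU-") are sent to the same reducer, so each reducer
 * processes one region's data. Names are illustrative only.
 */
public class RegionPartitioner extends Partitioner<Text, IntWritable> {

    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        // Use the text before the first '-' as the partitioning prefix.
        String k = key.toString();
        int dash = k.indexOf('-');
        String prefix = dash > 0 ? k.substring(0, dash) : k;

        // Spread prefixes evenly across the configured number of reducers.
        return (prefix.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

In a MapReduce driver this would be registered with job.setPartitionerClass(RegionPartitioner.class) alongside the usual mapper, reducer and output settings.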
TECHNICAL SKILLS
Software Programming: C, C++, Java.
Frameworks: Spring, Hibernate, Struts.
Databases: SQL, MySQL, HBase, MongoDB, Cassandra.
Operating Systems: Windows, and different distributions of Linux/Unix (including Ubuntu).
Script: JavaScript, Shell Scripting.
Web Technology: HTML, CSS, JSP, Web Services, XML, JavaScript.
IDEs & Tools: Eclipse, NetBeans, Microsoft Visual Studio, MS Office.
Methodologies: Agile, Waterfall (worked in most phases of both).
Web/Application Servers: Apache Tomcat, WebLogic.
Domain Experience: Banking and financial services, Manufacturing.
Cluster Monitoring Tools: Cloudera Manager, Nagios, Ganglia.
Big Data: Hive, MapReduce, HDFS, Sqoop, R, Flume, Spark, Apache Kafka, HBase, Pig, Elasticsearch, AWS, Oozie, ZooKeeper, YARN, Talend, Storm, Impala, and QlikView.
PROFESSIONAL EXPERIENCE
Confidential - Birmingham, AL
Hadoop Developer
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, HBase, SQL, Cloudera Manager, Sqoop, Oozie, Java, Eclipse, Weka, R, Flume, Apache Kafka, Storm, Hortonworks, Talend, Web Services.
Responsibilities:
- Worked on evaluation and analysis of the Hadoop cluster and different big data analytic tools like HBase and Sqoop.
- Developed MapReduce programs to perform data filtering for unstructured data.
- Wrote MapReduce jobs using Java API.
- Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, Hive and Impala.
- Successfully loaded files into Hive and HDFS from MongoDB, Cassandra and HBase.
- Installed a 100-node multi-node cluster on the Hortonworks platform.
- Prepared datasets with Hadoop and imported them into Neo4j to query and visualize the data.
- Used Flume to channel data from different sources to HDFS, migrated data from RDBMS to Hadoop using Sqoop for analysis, and implemented Oozie jobs for automatic data imports from the source.
- Knowledge of Splunk.
- Created HBase tables to store data depending on column families.
- Helped troubleshoot Scala problems while working with MicroStrategy to produce illustrative reports and dashboards along with ad-hoc analysis.
- Integrated Kafka and Storm, using Avro for serializing and deserializing data and Kafka producers and consumers (see the Kafka producer sketch after this list).
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Utilized high-level information architecture to design modules for complex programs.
- Utilized the Apache Hadoop environment provided by Hortonworks.
- Managed Nagios and Ganglia monitoring and wrote MapReduce jobs using Scala.
- Wrote Hive queries for data analysis to meet the business requirements.
- Used Spark for the transformation of data in storage and for fast processing of data.
- Used the open-source MapReduce framework to allow users to run native C and C++ code in their Hadoop environments.
- Used Amazon Redshift to solve challenging database-computing problems in the cloud.
- Responsible for cluster stability and availability of various Hadoop components.
- Involved in Hadoop platform upgrades.
- Defined enterprise architecture roadmap to realign IT with rating business strategy - balance strategic vision with tactics.
- Involved with Amazon Redshift to build a product that leverages the scale of resources available in the cloud.
- Developed and maintained HiveQL, Pig Latin scripts, Scala and MapReduce code.
- Worked on building BI reports in Tableau with Spark using SparkSQL.
- Implemented Spark using Scala and SparkSQL for faster testing and processing of data.
- Implemented data ingestion and handling clusters in real time processing using Kafka.
- Experience with Core Distributed computing and Data Mining Library using Apache Spark.
- Performed a comparative analysis of Hive vs. Impala.
- Used Storm to perform real time processing of unbounded data.
- Used Kafka for stream processing, tracking website activities, monitoring and aggregation of logs.
- Completed testing of Integration and tracked and solved defects.
- Created a virtual server (an EC2 instance) and used it as an application server in the cloud.
- Optimized Hive joins for large tables and developed MapReduce code for the full outer join of two large tables.
- Responsible for developing efficient MapReduce programs on the AWS cloud to process more than 20 years' worth of claim data and to detect and separate fraudulent claims.
- Uploaded and processed more than 30 terabytes of data from various structured and unstructured sources into HDFS (AWS cloud) using Sqoop and Flume.
- Worked with Spark to quickly write applications in Java and Python.
- Configured Kafka, Storm and Hive to receive and load real-time messages.
- Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
- Involved in running Hadoop streaming jobs to process terabytes of text data.
- Worked on Integration of Big data and cloud platforms Using Talend.
- Installed and utilized the Talend utility tool.
- Supported MapReduce programs running on the cluster.
- Responsible for managing data coming from different and multiple sources.
- Extracted files from MySQL through Sqoop, placed them in HDFS and processed them.
- Familiar with Hadoop data modeling and data mining, machine learning and advanced data processing. Experience optimizing ETL workflows.
- Experience in using Sqoop to migrate data to and from HDFS and MySQL, and deployed Hive and HBase integration to perform OLAP operations on HBase data.
- Created HBase tables for random reads/writes by the MapReduce programs.
- Worked on CSV files while getting input from the MySQL database, and worked with CSV files in Weka and R code.
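A minimal sketch of the producer side of the Kafka messaging work described above, assuming a hypothetical broker address, topic name and JSON payload; these are placeholders, not the project's actual configuration.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ClickStreamProducer {

    public static void main(String[] args) {
        // Broker address and topic name are placeholders for illustration.
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");   // wait for the full commit
        props.put("retries", "3");  // retry transient send failures

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Website events keyed by user id, so events for the same user
            // land in the same partition and keep their relative order.
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("site-activity", "user-123",
                            "{\"page\":\"/home\",\"ts\":1500000000}");
            producer.send(record);
            producer.flush();
        }
    }
}
```

A matching consumer (or a Storm spout) would subscribe to the same topic to perform the real-time processing and log aggregation mentioned above.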
Confidential - Atlanta, GA
Hadoop Developer
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, HBase, Java, SQL, Cloudera Manager, Sqoop, Eclipse, Weka, R, Apache Kafka, Storm, Web Services.
Responsibilities:
- Involved in Design, Architecture and Installation of Big Data andHadoopecosystem components.
- Worked on analyzing and writing Hadoop MapReduce jobs using the Java API, Pig and Hive.
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in loading data from edge node to HDFS.
- Worked on installing the cluster, commissioning and decommissioning of data nodes, NameNode high availability, capacity planning, and slots configuration.
- Created Custom Hive Queries as per Business requirements to perform analysis on Marketing and Sales Data.
- Designed, documented and implemented ETL standards to be followed by developers to maintain consistency in the code, and performed code reviews in an HA environment with multiple available nodes.
- Performed complex data set processing and multi-dataset operations with Pig and Hive.
- Used indexing and bucketing to improve Hive query performance (see the Hive JDBC sketch after this list).
- Used shell and Python scripts to automate daily jobs.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Used SQL queries, Stored Procedures, User Defined Functions (UDF), Database Triggers, using tools like SQL Profiler and Database Tuning Advisor (DTA).
- Leveraged robust image-processing libraries written in C and C++.
- Worked with business stakeholders, application developers, DBAs and production teams to identify business needs and discuss solution options.
- Performed joins, group-by and other operations in MapReduce using Java and Pig.
- Assisted in managing and reviewing Hadoop log files.
- Assisted in loading large sets of data (Structure, Semi Structured, and Unstructured).
- Managed Hadoop clusters, including adding and removing cluster nodes for maintenance and capacity needs.
- Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts.
- Involved in writing Pig Scripts for Cleansing the data and implemented Hive tables for the processed data in tabular format.
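A minimal sketch of how a custom Hive query like the marketing/sales analysis above could be issued from Java over the HiveServer2 JDBC driver; the host, database, table and column names are assumed placeholders, not the project's actual schema.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SalesReportQuery {

    public static void main(String[] args) throws Exception {
        // Load the HiveServer2 driver (needed on older JDBC setups).
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Placeholder HiveServer2 URL; real host/port/database would differ.
        String url = "jdbc:hive2://hiveserver:10000/sales_db";

        try (Connection conn = DriverManager.getConnection(url, "hive", "");
             Statement stmt = conn.createStatement()) {

            // Aggregate marketing spend per region from a (hypothetical) table.
            ResultSet rs = stmt.executeQuery(
                    "SELECT region, SUM(spend) AS total_spend " +
                    "FROM marketing_events WHERE year = 2016 " +
                    "GROUP BY region");

            while (rs.next()) {
                System.out.println(rs.getString("region") + "\t"
                        + rs.getDouble("total_spend"));
            }
        }
    }
}
```

In practice the underlying table would be partitioned (e.g. by year) and bucketed so that filters and joins touch only the relevant slices of data.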
Confidential - Ridgeland, MS
Hadoop Engineer/ Java developer
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, HBase, Java/J2EE, SQL, Cloudera Manager, Sqoop, Eclipse, Weka, R.
Responsibilities:
- Created Hive tables and wrote Hive queries for data analysis to meet business requirements.
- Used Sqoop to import and export data to and from MySQL.
- Involved in processing unstructured health care records using Pig.
- Integrated health care entities, including nursing facilities and hospitals.
- Involved in analyzing the medical billing scenarios patterned after the Client’s electronic logic library.
- Experience in installation, configuration, management and deployment of Big Data solutions and the underlying infrastructure of the Hadoop cluster.
- Experience in importing and exporting terabytes of data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Created HBase tables to store variable formats of data coming from different portfolios (see the HBase sketch after this list).
- Created workflow and coordinator using Oozie for regular jobs and to automate the tasks of loading the data into HDFS.
- Developed business components using core java concepts and classes like Inheritance, Polymorphism, Collections, Serialization, and Multithreading.
- Implemented JavaScript for client-side validations.
- Designed and developed user interface static and dynamic web pages using JSP, HTML and CSS.
- Involved in generating screens and reports in JSP, Servlets, HTML, and JavaScript for the business users.
- Provided support and maintenance after deploying the web application.
- Wrote Hive queries for joining multiple tables based on business requirements.
- Used complex data types like bags, tuples and maps in Pig for handling data.
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Developed Simple to complex MapReduce Jobs using Hive and Pig.
- Experience in implementing data transformation and processing solutions (ETL) using Hive.
- Experience in creating Oozie workflow jobs for MapReduce, Hive and Sqoop actions.
- Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts.
- Loaded files to HDFS and wrote Hive queries to process the required data.
- Loaded data into Hive tables and wrote queries to process it.
- Involved in loading data from the Linux file system to HDFS.
- Experience in developing Java MapReduce jobs.
- Good knowledge of NoSQL databases such as HBase.
- Proficient in adapting to the new Work Environment and Technologies.
- Experience in managing and reviewing Hadoop log files.
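As a sketch of the HBase table work referenced above (storing variable data formats per portfolio), the following uses the HBase 1.x Java client to create a table with one column family and write a row; the table, family and row names are illustrative assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PortfolioTableLoader {

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();  // reads hbase-site.xml

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {

            // Hypothetical table keyed by record id, one column family per feed.
            TableName name = TableName.valueOf("portfolio_records");
            if (!admin.tableExists(name)) {
                HTableDescriptor desc = new HTableDescriptor(name);
                desc.addFamily(new HColumnDescriptor("feed"));
                admin.createTable(desc);
            }

            try (Table table = connection.getTable(name)) {
                Put put = new Put(Bytes.toBytes("rec-0001"));
                put.addColumn(Bytes.toBytes("feed"), Bytes.toBytes("source"),
                        Bytes.toBytes("claims_csv"));
                table.put(put);
            }
        }
    }
}
```

Because HBase stores arbitrary byte arrays per column qualifier, a single column family can absorb the differing record layouts coming from each portfolio without schema changes.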
Confidential - Indianapolis, IN
Java Developer
Environment: Windows, Linux, Java, HTML, CSS, Eclipse, Java Beans, SQL, XML, Multi-Threading.
Responsibilities:
- Used Exception handling and Multi-threading for the optimum performance of the application.
- Used the Core Java concepts to implement the Business Logic.
- Key responsibilities included requirements gathering, designing and developing the applications.
- Developed and maintained the necessary Java components, Enterprise Java Beans, Java Beans and Servlets.
- Developed business components using core java concepts and classes like Inheritance, Polymorphism, Collections, Serialization and Multithreading.
- Implemented JavaScript for client-side validations.
- Used MyEclipse as the IDE for all development and debugging purposes.
- Developed Proof of Concepts and provided work/time estimates for design and development efforts.
- Coordinated with the QA lead for development of test plan, test cases, test code and actual testing, was responsible for defects allocation and ensuring that the defects are resolved.
- Coordinated with the offshore team to provide requirements, resolve issues and review deliverables.
- Developed the application under the J2EE architecture; designed and developed dynamic, browser-compatible user interfaces using JSP, Custom Tags, HTML, CSS, and JavaScript.
- Involved in the analysis, design, development and testing phases of the Software Development Lifecycle (SDLC) using the agile development methodology.
- Designed and implemented the UI using HTML, JSP, JavaScript and Java.
- Used JDBC to connect the web applications to databases (see the JDBC sketch after this list).
- Designed and developed user interface using JSP, HTML and JavaScript.
- Developed The UI using JavaScript, JSP, HTML, and CSS for interactive cross browser functionality and complex user interface.
- Created complex SQL Queries, PL/SQL Stored procedures, Functions for back end.
- Developed UI using HTML, CSS.
- Involved in system, Unit and Integration testing.
- Provided technical support for production environments, resolving issues, analyzing defects, and providing and implementing solutions to defects.
- Resolved high-priority defects as per the schedule.
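A minimal sketch of the JDBC access pattern mentioned above, assuming a MySQL data source; the connection URL, credentials, table and column names are placeholders rather than the application's actual schema.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CustomerDao {

    // Placeholder connection details for illustration only.
    private static final String URL  = "jdbc:mysql://dbhost:3306/appdb";
    private static final String USER = "appuser";
    private static final String PASS = "secret";

    /** Looks up a customer's display name by primary key. */
    public String findCustomerName(int customerId) throws Exception {
        String sql = "SELECT name FROM customers WHERE id = ?";

        try (Connection conn = DriverManager.getConnection(URL, USER, PASS);
             PreparedStatement ps = conn.prepareStatement(sql)) {

            ps.setInt(1, customerId);  // bind the parameter safely
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }
}
```

In a deployed web application the DriverManager call would typically be replaced by a container-managed connection pool, but the PreparedStatement pattern stays the same.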
Confidential
System Administrator
Environment: LINUX, Tomcat, UNIX, Shell scripts.
Responsibilities:
- Worked in Mission Critical Production environment that deals with OS level issues.
- Regular Admin Tasks include building Solaris, Red-Hat, and Linux servers for Production, Development and Test Environments and supported the ones under production.
- Installing new software and upgrading machines on the servers.
- Responsible for evaluating Storage Foundation Cluster File System for possible production use.
- Performed routine backups and established baseline images using VMware Replication Appliance.
- Utilized Microsoft Word, Excel, and PowerPoint to produce software package installation instructions and supporting documentation.
- Modified system images to support legacy and state-of-the-art hardware using a single system image.
- Served as Systems Administrator and Software Maintenance Engineer, maintaining, troubleshooting, patching, testing and developing software upgrade/update packages.
- Evaluated and configured, based on government security standards, multiple devices including Windows, Signal Converters and network devices.
- Implemented Windows OS deployment standards and configuration across the program and updated current security standards.
- Performed day-to-day health checks and maintained the servers by scheduling downtime with the users.
- Performed HA for UNIX servers.
Confidential
Assistant System Engineer
Environment: Java, HTML, CSS, jQuery, JavaScript, UML, Servlets, JSP, Eclipse, Apache Tomcat, MySQL.
Responsibilities:
- Worked with a requirement analysis team to gather the client side requirements for application development.
- Designed UML and entity-relationship diagrams for the process flow and database design.
- Developed java programs to implement the computational logic for the web applications.
- Implemented the model-view-controller architecture with the help of JSPs, servlets and Java (see the servlet sketch after this list).
- Designed and implemented the database which serves as the backend for the web application.
- Provided support and maintenance after deploying the web application.
- Designed the static web user interface with HTML and CSS.
- Administered and supported the backend databases for the web application.
- Developed custom packages to connect to standard data sources and retrieve data efficiently eliminating the need for each team to rewrite the same set of code multiple times.
- Worked on JavaScript, jQuery for data validation on client side.
- Worked on product deployment, documentation and support.
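As a sketch of the servlet/JSP MVC flow described above, the controller below reads a request parameter, prepares the model, and forwards to a JSP view; the class name, parameter and page path are illustrative assumptions, not the application's actual code.

```java
import java.io.IOException;
import javax.servlet.RequestDispatcher;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

/** Controller servlet in a simple JSP/Servlet MVC layout (illustrative names). */
public class CustomerController extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {

        String id = request.getParameter("id");

        // Model lookup; in the real application this would call a DAO/service.
        String customerName = "Customer " + id;

        // Expose the model to the view and forward to the JSP page.
        request.setAttribute("customerName", customerName);
        RequestDispatcher view = request.getRequestDispatcher("/customer.jsp");
        view.forward(request, response);
    }
}
```

The servlet would be mapped to a URL in web.xml (or via an annotation), and customer.jsp would render the customerName attribute, keeping presentation separate from the Java logic.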