Hadoop Developer Resume
Minneapolis, MN
PROFESSIONAL SUMMARY:
- Experienced Hadoop developer with a strong foundation in distributed storage systems such as HDFS and HBase in big data environments.
- Excellent understanding of the complexities associated with big data, with experience in developing modules and code in MapReduce, Hive, Pig and Spark to address those complexities.
- 8 years of professional experience in the IT industry, including 4 years in development, implementation and configuration of large-scale Hadoop ecosystem components such as Hive, HBase, Pig, Sqoop, Flume, Oozie, Kafka and Apache Spark in Linux environments.
- Experience in developing strategies for deploying big data technologies to efficiently meet big data processing requirements.
- 4 years of experience in development and maintenance of various applications using Java and J2EE.
- Experience in using Hadoop in standalone, pseudo-distributed and fully distributed modes.
- Experience in designing technical architecture and developing various big data workflows using custom MapReduce, Pig, Hive and Sqoop.
- Experience in implementing and troubleshooting MapReduce jobs.
- Utilized Maven extensively to build JAR files of MapReduce projects and deployed them to the cluster.
- Experience in YARN environments with Storm, Spark, Kafka and Avro.
- Performed transformations using Hive on imported data.
- Developed UDFs in Java and Scala for use in Pig and Hive queries.
- Built reusable Hive UDF libraries for business requirements, enabling business analysts to use these UDFs in Hive queries.
- Performed Cassandra cluster setup, design and replication.
- Exposure to file formats such as SequenceFile, Avro, Parquet and JSON.
- Used various compression codecs such as Gzip and Snappy.
- Experience in working with NoSQL databases such as HBase and MongoDB for real-time data analytics using Apache Spark with Scala.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
- Experience in working with Flume/Kafka to load log data from different sources into HDFS.
- Experience in configuring ZooKeeper to coordinate servers in clusters and maintain data consistency.
- Experience in designing both time-driven and data-driven automated workflows using Oozie and Python.
- Experienced in using Spark to improve the performance of existing Hadoop algorithms, working with SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
- Very good understanding and working knowledge of object-oriented programming (OOP), Python and Scala.
TECHNICAL SKILLS:
Languages: C, C++, Java, Python, Scala, Visual Basic, SQL, HQL.
Big Data technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Impala, HBase, Oozie, ZooKeeper, YARN, Storm, Kafka, Spark.
J2EE Technologies: JSP, Servlets, EJB, JMS, JNDI, LDAP, JPA, JDBC, Annotations.
Web Servers: Apache Tomcat 6.4, JBoss, WebSphere, WebLogic.
Databases: Oracle, MySQL, MongoDB, Cassandra.
IDEs and Tools: Eclipse, SQL Server, Notepad++, NetBeans, MS Office suite.
WORK EXPERIENCE:
Hadoop Developer
Confidential, Minneapolis, MN
Responsibilities:
- Deployed Flume to load data from various sources for analysis.
- Designed and developed batch processes for cleansing, normalizing and standardizing big data from multiple sources using Apache Hadoop (Cloudera's CDH) MapReduce, loading the resulting data into HBase and HDFS to support front-end dashboard applications and services for end clients.
- Involved in implementing real-time MapReduce pipelines using user-defined functions, joins and data aggregations to process the data into the cloud.
- Strong experience in developing, debugging and tuning MapReduce jobs in a Hadoop environment.
- Created Spark SQL queries for faster request processing.
- Developed and implemented core API services using Spark for real-time processing.
- Utilized Spark Streaming for speedy migration and used it to create a very reliable environment.
- Hands-on experience in using Hive partitioning and bucketing, executing different types of joins on Hive tables, and implementing Hive SerDes such as JSON and Avro.
- Developed custom MapReduce programs and user-defined functions (UDFs) in Hive to transform large volumes of data according to business requirements.
- Maintained data import scripts using Hive and MapReduce jobs. Performed data design and analysis to handle large amounts of data.
- Cross-checked data loaded in Hive tables against the source data.
- Worked closely with QA and Operations teams to understand, design and develop end-to-end data flow requirements.
- Developed structured, efficient and error-free code for big data requirements. Stored, processed and analyzed large datasets to extract valuable insights.
- Worked on tuning the performance of Hive queries.
- Created a data platform for structured datasets and ran ETL processing using the Spark engine and Hive.
- Built time-driven and message-driven workflows using Oozie with the help of Python.
- Involved in building data structures and designed Avro schemas for serialization.
- Experience in writing ETL jobs using Pig Latin and HiveQL.
Environment: Hadoop, HDFS, Pig, Hive, Oozie, HBase, Flume, Kafka, MapReduce, Sqoop, Linux, Cloudera, Big Data, Spark, Windows 2008.
Hadoop Developer
Confidential, Tampa, FL
Responsibilities:
- Developed and ran MapReduce jobs on YARN and Hadoop clusters to produce daily and monthly reports according to user needs.
- Debugged and troubleshot issues with UDFs in Hive.
- Scheduled and managed jobs on a Hadoop cluster using Oozie workflows.
- Developed multiple MapReduce programs in Java for data extraction, transformation and aggregation from multiple file formats, including XML, JSON and CSV.
- Experienced in loading and transforming large sets of structured, semi-structured and unstructured data.
- Transformed unstructured data into structured data using Pig.
- Imported data from MySQL to HDFS on a regular basis using Sqoop.
- Designed and developed Pig Latin scripts to process data in batches for trend analysis.
- Good experience with Hadoop tools such as MapReduce, Hive and HBase.
- Worked on both external and managed Hive tables for optimized performance.
- Developed Hive scripts to meet analysts' requirements.
- Hands-on experience in using Hive partitioning and bucketing, executing different types of joins on Hive tables, and implementing Hive SerDes such as JSON and Avro.
- Developed custom MapReduce programs and user-defined functions (UDFs) in Hive to transform large volumes of data according to business requirements.
- Maintained data import scripts using Hive and MapReduce jobs.
- Performed data design and analysis to handle large amounts of data.
- Cross-checked data loaded in Hive tables against the source data in Oracle.
- Worked closely with QA and Operations teams to understand, design and develop end-to-end data flow requirements.
- Used Oozie to schedule workflows.
- Developed structured, efficient and error-free code for big data requirements using knowledge of Hadoop and its ecosystem.
- Stored, processed and analyzed large datasets to extract valuable insights.
Environment: Hadoop, HDFS, Pig, Hive, HBase, MapReduce, Sqoop, Oozie, Linux, Cloudera, Big Data, Java, SQL.
Java Developer
Confidential, Columbus, OH
Responsibilities:
- Designed, configured and developed the web application using JSP, JasperReports, Barbecue barcode scanner, JavaScript and HTML.
- Developed Session Beans for JSP clients.
- Configured and deployed EAR and WAR files on WebSphere Application Server.
- Defined and designed the layers and modules of the project using OOAD methodologies and standard J2EE design patterns and guidelines.
- Designed and developed all the user interfaces using JSP, Servlets and the Spring framework.
- Maintained the existing codebase, developed with the Spring and Hibernate frameworks, by incorporating new features and fixing bugs.
- Involved in fixing bugs and unit testing with test cases using the JUnit framework.
- Developed build and deployment scripts using Apache Ant to customize WAR and EAR files.
- Developed stored procedures and triggers in PL/SQL on an Oracle database to calculate values and update tables, implementing business logic.
- Involved in writing Hibernate Query Language (HQL) queries for the persistence layer.
- Used Log4j for application logging and debugging.
- Coordinated with the offshore team on requirement transition and provided the inputs needed for successful execution of the project.
Environment: Java SE 7, Java EE 6, JSP 2.1, Servlets 3.0, HTML, JDBC 4.0, IBM WebSphere 8.0, PL/SQL, XML, Spring 3.0, Hibernate 4.0, Oracle 12c, Ant, JavaScript and jQuery, JUnit, Windows 7 and Eclipse 3.7.
Java Developer
Confidential
Responsibilities:
- Designed and implemented the training and reports modules of the application using Servlets, JSP and Ajax.
- Developed custom JSP tags for the application.
- Wrote queries for fetching and manipulating data using the iBATIS ORM framework.
- Used Quartz schedulers to run jobs sequentially at scheduled times.
- Implemented design patterns like Filter, Cache Manager and Singleton to improve the performance of the application.
- Implemented the reports module of the application using Jasper Reports to display dynamically generated reports for business intelligence.
- Deployed the application at the client's location on a Tomcat server.
Environment: HTML.
Software Developer
Confidential
Responsibilities:
- Developed view, controller and model components implementing the Struts MVC framework.
- Developed the presentation tier as JavaServer Pages using the Struts MVC framework, implementing Struts Validator, Tiles and Struts internationalization.
- Developed the web GUI using JSP, JavaScript, CSS, XML and beans under the MVC architecture.
- Developed Struts MVC compliant components for the web tier.
- Created Action classes for the controller in the Struts MVC framework.
- Implemented the Struts framework for configuration of action mappings and presentation logic in JSPs and Servlets.
- Developed Servlets and designed web.xml for the Servlets.
- Involved in Java application testing and maintenance in the development and production phases.
- Involved in developing JSPs for client data presentation and client-side data validation within the forms.
Environment: Java, HTML, XML, CSS, JSP, Servlets, EJB, Java Beans, JavaScript, JDBC, Eclipse, Oracle and Windows 2003.
