Sr. Big Data/Hadoop Developer Resume

Juno Beach, FL

SUMMARY

  • More than 8 years of experience as a Big Data/Hadoop and Java Developer, with skills in analysis, design, development, testing, and deployment of various software applications.
  • Experience in developing custom UDFs for Pig and Apache Hive to incorporate Java methods and functionality into Pig Latin and HiveQL.
  • Good experience in developing MapReduce jobs in J2EE/Java for data cleansing, transformation, pre-processing, and analysis.
  • Good knowledge of Amazon Web Services (AWS) concepts such as EMR and EC2, which provide fast and efficient processing for Teradata Big Data Analytics.
  • Experience collecting log data and JSON data into HDFS using Flume and processing the data using Hive/Pig.
  • Strong exposure to Web 2.0 client technologies using JSP, JSTL, XHTML, HTML5, DOM, CSS3, JavaScript and AJAX.
  • Experience working with cloud platforms, setting up environments and applications on AWS, and automating code and infrastructure (DevOps) using Chef and Jenkins.
  • Extensive experience developing Spark Streaming jobs built on RDDs (Resilient Distributed Datasets), using Spark SQL as required.
  • Experience developing Java MapReduce jobs for data cleaning and data manipulation as required by the business.
  • Strong knowledge of the Hadoop ecosystem, including HDFS, Hive, Oozie, HBase, Pig, Sqoop, Zookeeper, etc.
  • Extensive experience with advanced J2EE Frameworks such as Spring, Struts, JSF and Hibernate.
  • Expertise in JavaScript, JavaScript MVC patterns, object-oriented JavaScript design patterns, and AJAX calls.
  • Installation, configuration, and administration experience with Big Data platform tools: Cloudera Manager (Cloudera) and MCS (MapR).
  • Extensive experience in working with Oracle, MS SQL Server, DB2, MySQL.
  • Experience working with Hortonworks and Cloudera environments.
  • Good knowledge in implementing various data processing techniques using Apache HBase for handling the data and formatting it as required.
  • Excellent experience in installing and running various Oozie workflows and automating parallel job executions.
  • Experience with Spark, Spark SQL, Spark Streaming, Spark GraphX, and Spark MLlib.
  • Extensive development experience in different IDEs such as Eclipse, NetBeans, IntelliJ, and STS.
  • Strong experience in core SQL and RESTful web services (RWS).
  • Strong knowledge of NoSQL column-oriented databases such as HBase and their integration with Hadoop clusters.
  • Good experience in Tableau for Data Visualization and analysis on large data sets, drawing various conclusions.
  • Experience in using Python and R for statistical analysis.
  • Good knowledge of coding using SQL, SQL Plus, T-SQL, PL/SQL, Stored Procedures/Functions.
  • Worked with Bootstrap, AngularJS, Node.js, Knockout, Ember, and the Java Persistence API (JPA).
  • Experienced in developing applications using Java/J2EE technologies such as Servlets, JSP, EJB, JDBC, JNDI, JMS, SOAP, REST, Grails, etc.
  • Well versed in working with relational database management systems such as Oracle 12c, MS SQL Server, and MySQL.
  • Experience with all stages of the SDLC and the Agile development model, from requirements gathering through deployment and production support.
  • Experience in using PL/SQL to write Stored Procedures, Functions and Triggers.

TECHNICAL SKILLS

Hadoop/Big Data Technologies: Hadoop 2.7/2.5, HDFS, MapReduce, HBase 1.2.4, Pig, Hive 2.0, Hue, Sqoop, Spark 2.0/2.0.2, Impala, Oozie, YARN, Flume 1.7, Kafka, Zookeeper

Hadoop Distributions: Cloudera 5.9, Hortonworks, MapR

Programming Language: Java, Scala, Python 3.5, SQL, PL/SQL, Shell Scripting, Storm, JSP, Servlets

Frameworks: Spring 4.3, Hibernate, Struts, JSF, EJB, JMS

Web Technologies: HTML, CSS, JavaScript, JQuery, Bootstrap, XML, JSON, AJAX

Databases: Oracle 12c/11g, SQL Server 2016/2014, MySQL 5.7/5.4.16

Database Tools: TOAD, SQL PLUS, SQLite 3.15/3.15.2

Operating Systems: Linux, Unix, Windows 8/7

IDE and Tools: Eclipse 4.6, NetBeans 8.2, IntelliJ, Maven

NoSQL Databases: HBase, Cassandra, MongoDB, Accumulo

Web/Application Server: Apache Tomcat, JBoss, WebLogic, WebSphere

SDLC Methodologies: Agile, Waterfall

Version Control: GIT, SVN, CVS

PROFESSIONAL EXPERIENCE

Confidential, Juno Beach, FL

Sr. Big Data/Hadoop Developer

Responsibilities:

  • Involved in Analysis, Design, System architectural design, Process interfaces design, design documentation.
  • Responsible for developing prototypes of the selected solutions and implementing complex big data projects with a focus on collecting, parsing, managing, analyzing, and visualizing large sets of data using multiple platforms.
  • Applied technologies to solve big data problems and develop innovative big data solutions.
  • Developed Spark applications using Scala and Java, and implemented an Apache Spark data processing project to handle data from various RDBMS and streaming sources.
  • Used Spark Streaming APIs to perform transformations and actions on the fly to build a common learner data model that gets data from Kafka in near real time and persists it to Cassandra.
  • Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on data.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Imported data from various sources into the Cassandra cluster using Sqoop. Created data models for Cassandra from the existing Oracle data model.
  • Used the Spark-Cassandra connector to load data to and from Cassandra. Worked in Spark and Scala for data analytics. Handled the ETL framework in Spark for writing data from HDFS to Hive.
  • Used a framework written in Scala for ETL.
  • Developed multiple Spark Streaming and Spark Core jobs with Kafka as the data pipeline system.
  • Worked with and learned a great deal from AWS cloud services such as EC2, S3, and EBS.
  • Migrated an existing on-premises application to AWS. Used AWS services such as EC2 and S3 for processing and storing small data sets.
  • Imported data from AWS S3 into Spark RDDs and performed transformations and actions on the RDDs.
  • Extensively used Zookeeper as a job scheduler for Spark jobs.
  • Worked on Talend with Hadoop. Worked on migrating jobs from Informatica to Talend.
  • Implemented a distributed messaging queue to integrate with Cassandra using Apache Kafka and Zookeeper.
  • Developed Kafka producer and consumer components for real time data processing.
  • Worked on physical transformations of the data model, which involved creating tables, indexes, joins, views, and partitions.
  • Involved in Cassandra data modeling to create keyspaces and tables in a multi-data-center DSE Cassandra database.

Environment: Spark, HDFS, Kafka, MapReduce (MR1), Pig, Hive, Sqoop, Cassandra, AWS, Talend, Java, Linux Shell Scripting.

Confidential - Long Beach, CA

Sr. Big Data/Hadoop Developer

Responsibilities:

  • Performed data transformations such as filtering, sorting, and aggregation using Pig.
  • Created Sqoop queries to import data from SQL, Oracle, and Teradata databases into HDFS.
  • Created Hive tables to push the data to MongoDB.
  • Wrote complex aggregate queries in MongoDB for report generation.
  • Developed scripts to run scheduled batch cycles using Oozie and present data for reports.
  • Gained extensive experience in data warehousing and business intelligence implementation, with expertise in AWS Redshift, AWS S3, AWS Athena, AWS Glue Data Catalog, and Informatica PowerCenter.
  • Worked on a POC for building a movie recommendation engine based on Fandango ticket sales data using Scala and Spark Machine Learning library.
  • Developed a big data ingestion framework to process multi-terabyte data, including data quality checks and transformation, storing the results in efficient formats such as Parquet and loading them into Amazon S3 using the Spark Scala API.
  • Implemented automation, traceability, and transparency for every step of the process to build trust in data and streamline data science efforts using Python, Java, Hadoop streaming, Apache Spark, Spark SQL, Scala, Hive, and Pig.
  • Designed a highly efficient data model for optimizing large-scale queries, utilizing Hive complex data types and the Parquet file format.
  • Performed data validation and transformation using Python and Hadoop streaming.
  • Developed highly efficient Pig Java UDFs utilizing advanced concepts such as the Algebraic and Accumulator interfaces to populate ADP Benchmarks cube metrics.
  • Loaded data from different data sources (such as Teradata and DB2) into HDFS using Sqoop and then into partitioned Hive tables.
  • Developed bash scripts to fetch the TLOG file from the FTP server and process it for loading into Hive tables.
  • Automated workflows using shell scripts and Control-M jobs to pull data from various databases into the Hadoop data lake.
  • Involved in story-driven Agile development methodology and actively participated in daily scrum meetings.
  • Overwrote the Hive data with HBase data daily to get fresh data every day, and used Sqoop to load data from DB2 into the HBase environment...
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs and Scala; gained good experience using Spark-Shell and Spark Streaming.
  • Designed, developed and maintained Big Data streaming and batch applications using Storm.
  • Created Hive, Phoenix, HBase tables and HBase integrated Hive tables as per the design using ORC file format and Snappy compression.
  • Developed Oozie workflows for daily incremental loads, which get data from Teradata and import it into Hive tables.
  • Created Sqoop jobs and Pig and Hive scripts for data ingestion from relational databases to compare with historical data.
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
  • Developed Pig scripts to transform the data into a structured format and automated them through Oozie coordinators.
  • Used Splunk to capture, index, and correlate real-time data in a searchable repository from which reports and alerts can be generated.

Environment: Hadoop, HDFS, Spark, Storm, Kafka, MapReduce, Hive, Pig, Sqoop, Oozie, DB2, Java, Python, Splunk, UNIX Shell Scripting.

Confidential - Pittsburgh, PA

Sr. Hadoop Developer

Responsibilities:

  • Worked on Spark SQL to handle structured data in Hive.
  • Involved in creating Hive tables, loading data, writing Hive queries, and generating partitions and buckets for optimization.
  • Involved in migrating tables from RDBMS into Hive tables using Sqoop and later generated visualizations using Tableau.
  • Worked on complex MapReduce programs to analyze data that exists on the cluster.
  • Exported and imported data to and from MongoDB and handled run-time configuration of MongoDB.
  • Monitored MongoDB at the server, database, and collection levels using various MongoDB monitoring tools.
  • Analyzed substantial data sets by running Hive queries and Pig scripts.
  • Wrote Hive UDFs to sort struct fields and return complex data types.
  • Worked in AWS environment for development and deployment of custom Hadoop applications.
  • Involved in creating Shell scripts to simplify the execution of all other scripts (Pig, Hive, Sqoop, Impala and MapReduce) and move the data inside and outside of HDFS.
  • Created files and tuned SQL queries in Hive using Hue.
  • Involved in collecting and aggregating large amounts of log data using Storm and staging data in HDFS for further analysis.
  • Created the Hive external tables using the Accumulo connector.
  • Managed real-time data processing and real-time data ingestion in MongoDB and Hive using Storm.
  • Created custom Solr query segments to optimize search matching.
  • Developed Spark scripts using Python shell commands as required.
  • Stored the processed results in the data warehouse and maintained the data using Hive.
  • Worked with the Spark ecosystem, using Spark SQL and Scala queries on different formats such as text and CSV files.
  • Created Oozie workflow and Coordinator jobs to kick off the jobs on time for data availability.
  • Worked with NoSQL databases such as MongoDB, creating MongoDB collections to load large sets of semi-structured data.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Worked with and learned a great deal from Amazon Web Services (AWS) cloud services such as EC2, S3, and EMR.

Environment: HDFS, MapReduce, Storm, Hive, Pig, Sqoop, MongoDB, Apache Spark, Python, Accumulo, Oozie Scheduler, Kerberos, AWS, Tableau, Java, UNIX Shell scripts, HUE, SOLR, GIT, Maven.

Confidential - Dallas, TX

Sr. Java Developer

Responsibilities:

  • Designed and developed technical specifications using design patterns, SOA methodology, and UML; performed unit, integration, and system testing.
  • Developed and tested the application in the RAD development environment and deployed it to WebSphere.
  • Migrated the Servlets to the Spring Controllers and developed Spring Interceptors, worked on JSPs, JSTL, and JSP Custom Tags.
  • Developed a flexible, scalable application utilizing open source technologies such as Hibernate ORM and the Spring Framework.
  • Responsible for design and implementation of various modules of the application using Struts-Spring-Hibernate architecture.
  • Responsible for writing Struts action classes, Hibernate POJO classes and integrating Struts and Hibernate with Spring for processing business needs.
  • Struts Tag Libraries and Struts Tiles Framework were used in addition to JSP, HTML, AJAX and CSS in developing the presentation layer.
  • Used Struts Validation Framework for dynamic validation of the user input forms.
  • Improved the Auto Quote application by designing and developing it using Eclipse, HTML, JSF, Servlets, and JavaScript.
  • Implemented Spring ORM wiring with Hibernate to provide access to the Oracle 10g RDBMS.
  • Developed the user interface with JQuery, JSP, HTML, HTML5, CSS, CSS3 and JavaScript.
  • Wrote JDBC statements, prepared statements, and callable statements for various database update, insert, and delete operations and for invoking functions, stored procedures, and triggers.
  • Implemented MVC architecture by using Spring to send and receive the data from front-end to business layer.
  • Used JSPs, HTML, JavaScript, and CSS for development of the web pages.
  • Developed Ajax and JavaScript validation functions for client-side validation.
  • Developed Stateless Session EJBs to make our functionality available to other applications.
  • Involved in configuring Hibernate to access the database and retrieve data from it.
  • Extensively worked on writing JUnit test cases for testing the business components developed in Spring.
  • Used Agile methodology in development.
  • Followed test-driven development when coding.
  • Used SoapUI to test the web services and to mock responses for unit testing them.

Environment: Java, Hibernate, JSP, JavaScript, WebLogic, Struts, EJB, Oracle 10g, Spring, JDBC, XML, HTML, CSS, JUnit, ANT, CVS, Eclipse, Agile, Test-driven development

Confidential

Java Developer

Responsibilities:

  • Involved in the review and analysis of functional specifications, requirements clarifications, defects, etc.
  • Created a user-friendly GUI and web pages using HTML, CSS, and JSP.
  • Developed web components using MVC pattern under Struts framework.
  • Wrote JSPs and Servlets and deployed them on the WebLogic application server.
  • Used JSPs and HTML on the front end, Servlets as front controllers, and JavaScript for client-side validation.
  • Wrote the Hibernate mapping XML files to define the mapping between Java classes and database tables.
  • Developed the UI using JSP, HTML, CSS, and AJAX, and learned how to implement jQuery, JSP, and client- and server-side validation using JavaScript.
  • Implemented MVC architecture by using Spring to send and receive the data from front-end to business layer.
  • Designed, developed, and maintained the data layer using JDBC and configured the Java application framework.
  • Extensively used Hibernate in data access layer to access and update information in the database.
  • Migrated the Servlets to the Spring Controllers and developed Spring Interceptors, worked on JSPs, JSTL, and JSP Custom Tags.
  • The front-end JSP pages were developed using the Struts framework and were hosted in a J2EE environment on an Apache Tomcat server.
  • Developed a flexible, scalable application utilizing open source technologies such as Hibernate ORM and the Spring Framework.
  • Assisted teammates in completing their assigned tasks.
  • Participated in bug fixing and QA review of the code before delivering it to the State.

Environment: HTML, JSP, JavaScript, CSS, Struts, Spring, Servlets, Design Patterns, XML, XSD, Hibernate, JUnit, Ant, jQuery, Web Services, Windows
