Big Data/Hadoop Developer Resume
Denver
SUMMARY:
- 6+ years of experience in software development using Big Data, Hadoop, Apache Spark, Scala, Java/J2EE, and Python technologies
- Experience with Big Data tools such as Spark, Hive, Pig, HDFS, Kafka, Apache NiFi, Sqoop, MapReduce, Storm, YARN, Oozie, Avro, HBase, and Zookeeper
- Hands-on experience with Spark, Spark SQL, and Spark Streaming
- Expertise in handling streaming data using Spark, Scala, and Kafka
- Proficiency in importing and exporting data using ingestion and stream processing platforms such as Flume, Kafka, Apache NiFi, and Sqoop
- Thorough experience with core Apache Spark abstractions such as RDDs, DataFrames, and paired RDDs, implementing transformations and actions in Scala (see the sketch following this summary)
- Hands-on experience in data analysis with Spark SQL using the DataFrame API
- Expertise in Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN, and the MapReduce programming paradigm
- Hands-on experience migrating data between RDBMS and HDFS using Sqoop
- Experienced in using Flume to load log data from multiple sources directly into HDFS
- Strong experience with Hadoop distributions such as Cloudera and Hortonworks, and with their administration tools, Cloudera Manager and Ambari, for managing Hadoop clusters
- Experience in manipulating and analyzing large datasets and finding patterns and insights within structured and unstructured data
- Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase, Cassandra, and DynamoDB
- Used Oozie for scheduling jobs and Zookeeper for maintaining and monitoring clusters
- Experienced in working with file formats such as CSV, JSON, XML, Parquet, and Avro
- Hands-on experience with creation of Dashboard Reports and business intelligence visualizations using Tableau
- Experienced in building cross-platform applications using Java/J2EE, with knowledge of core Java concepts such as OOP, multithreading, collections, and file I/O
- Experienced in building applications using Java, RDBMS, and Linux shell scripting
- Proficient in using Python libraries like NumPy and Pandas for data analysis
- Familiar with the AWS platform and services such as EC2, S3, Kinesis, DynamoDB, EMR, Redshift, and Elasticsearch Service
- Working knowledge of testing methodologies such as unit, regression, integration, and user acceptance testing
- Ability to lead and bring synergy among team members to achieve project goals
- Good interpersonal, communication, and problem-solving skills; a motivated team player
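A minimal, illustrative sketch of the Spark RDD and DataFrame usage described above, in Scala. All paths and field names are hypothetical placeholders, not drawn from any specific project:

```scala
// Illustrative only: minimal Spark RDD and DataFrame usage in Scala.
// All paths and field names below are hypothetical placeholders.
import org.apache.spark.sql.SparkSession

object TransformationsAndActions {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("rdd-vs-dataframe").getOrCreate()
    import spark.implicits._

    // RDD API: transformations (filter, map) are lazy; the action (count) triggers execution
    val lines  = spark.sparkContext.textFile("hdfs:///data/events.log") // placeholder path
    val errors = lines.filter(_.contains("ERROR")).map(_.toLowerCase)
    println(s"error lines: ${errors.count()}")

    // DataFrame API: the same style of pipeline expressed through Spark SQL
    val df = spark.read.json("hdfs:///data/events.json") // placeholder path
    df.filter($"status" === "ERROR")
      .groupBy($"service")
      .count()
      .show()

    spark.stop()
  }
}
```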
TECHNICAL SKILLS:
Big Data Technologies: Hadoop (Cloudera, Hortonworks), HDFS, MapReduce, Apache Spark, Sqoop, Hive, Oozie, Zookeeper, Kafka, NiFi, Flume
Java & J2EE Technologies: Core Java, JSP, JDBC, Eclipse
NoSQL Databases: HBase, Cassandra, DynamoDB
Relational Databases: Oracle 11g, Teradata, MySQL, MS SQL Server
Languages: Java, Scala, R, Python
Web Technologies: HTML5, CSS, JavaScript
Application Servers: Apache Tomcat, JBoss
Web Services: SOAP, REST
Version Control: Git
Reporting Tools: Tableau, Power BI
PROFESSIONAL EXPERIENCE:
Confidential, Denver
Big Data/ Hadoop Developer
Responsibilities:
- Responsible for collecting, cleansing, and storing data for analysis using Kafka, Sqoop, Spark, and HDFS
- Used Kafka and the Spark framework for real-time and batch data processing (see the streaming sketch at the end of this section)
- Ingested large amounts of data from different data sources into HDFS using Kafka
- Implemented Spark jobs in Scala and cleansed data by applying transformations and actions
- Used Scala case classes to convert RDDs into DataFrames in Spark (see the case-class sketch at the end of this section)
- Processed and analyzed data stored in HBase and HDFS
- Developed Sqoop scripts for importing and exporting data between HDFS and Hive
- Created Hive internal and external tables with partitioning and bucketing for further analysis in Hive
- Used Oozie workflows to automate and schedule jobs
- Used Zookeeper for maintaining and monitoring clusters
- Exported data to an RDBMS using Sqoop so the BI team could build visualizations and generate reports
- Continuously monitored and managed the Hadoop Cluster using Cloudera Manager
- Used JIRA for project tracking and participated in daily scrum meetings
Environment: Spark, Scala, Hive, Kafka, Teradata, HDFS, Oozie, Zookeeper, HBase, Tableau, Hadoop (Cloudera), JIRA
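A hedged sketch of the Kafka-to-HDFS ingestion pattern referenced in this section, using Spark Structured Streaming. Broker addresses, the topic name, and all paths are placeholders, not project details:

```scala
// Hypothetical sketch of Kafka-to-HDFS ingestion with Spark Structured Streaming
// (requires the spark-sql-kafka connector). Brokers, topic, and paths are placeholders.
import org.apache.spark.sql.SparkSession

object KafkaIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("kafka-ingest").getOrCreate()

    // Subscribe to the raw event topic and decode message values to strings
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092") // placeholder broker
      .option("subscribe", "events")                     // placeholder topic
      .load()
      .selectExpr("CAST(value AS STRING) AS json")

    // Land micro-batches on HDFS; the checkpoint directory makes the sink fault tolerant
    stream.writeStream
      .format("parquet")
      .option("path", "hdfs:///landing/events")           // placeholder path
      .option("checkpointLocation", "hdfs:///chk/events") // placeholder path
      .start()
      .awaitTermination()
  }
}
```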
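And a minimal sketch of the case-class pattern for converting an RDD into a DataFrame with simple cleansing transformations. The record fields and paths are invented for illustration:

```scala
// Sketch of the case-class pattern for converting an RDD into a DataFrame,
// with simple cleansing along the way. Fields and paths are invented.
import org.apache.spark.sql.SparkSession

case class Customer(id: Long, name: String, email: String)

object CleanseCustomers {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("cleanse").getOrCreate()
    import spark.implicits._

    // Parse raw CSV lines into the case class, dropping malformed records
    val customers = spark.sparkContext
      .textFile("hdfs:///raw/customers.csv") // placeholder path
      .map(_.split(","))
      .filter(_.length == 3)
      .map(f => Customer(f(0).trim.toLong, f(1).trim, f(2).trim.toLowerCase))

    // toDF() derives the schema from the case class; nothing runs until the write action
    customers.toDF()
      .filter($"email".contains("@"))
      .write.mode("overwrite")
      .parquet("hdfs:///clean/customers") // placeholder path
  }
}
```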
Confidential, Denver
Hadoop Developer
Responsibilities:
- Imported data from RDBMS into HDFS using the data ingestion tool Sqoop
- Ingested large amounts of data from different data sources into HDFS using Kafka
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend
- Created Hive schemas using performance techniques such as partitioning and bucketing (see the DDL sketch at the end of this section)
- Created ad hoc Hive queries to provide insight into customer behavior
- Used the Parquet format with Snappy compression when storing data
- Scheduled and executed workflows in Oozie to run Hive jobs
- Loaded and transformed large sets of structured, semi-structured, and unstructured data
- Used Zookeeper for maintaining and monitoring clusters
- Exported data to an RDBMS using Sqoop for visualization and report generation by the BI team
- Continuously monitored and managed the Hadoop Cluster using Cloudera Manager
- Good understanding of Tableau for data visualization and analysis of large datasets to gain insights into the data
Environment: HDFS, Hive, Sqoop, Oracle 11g, Oozie, HBase, Zookeeper, Java, Hadoop (Cloudera), Tableau
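A sketch of the kind of DDL behind a partitioned, bucketed, Parquet-backed Hive table with Snappy compression, submitted here over the HiveServer2 JDBC driver. The host, credentials, table, and columns are placeholders, not project details:

```scala
// Sketch of a partitioned, bucketed, Parquet-backed Hive table with Snappy
// compression, created via the HiveServer2 JDBC driver. All names are placeholders.
import java.sql.DriverManager

object CreateOrdersTable {
  def main(args: Array[String]): Unit = {
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection("jdbc:hive2://hive-host:10000/default", "user", "") // placeholders
    val stmt = conn.createStatement()

    // Partition by date for pruning; bucket by customer id for efficient joins and sampling
    stmt.execute(
      """CREATE EXTERNAL TABLE IF NOT EXISTS orders (
        |  order_id    BIGINT,
        |  customer_id BIGINT,
        |  amount      DOUBLE)
        |PARTITIONED BY (order_date STRING)
        |CLUSTERED BY (customer_id) INTO 32 BUCKETS
        |STORED AS PARQUET
        |LOCATION '/warehouse/orders'
        |TBLPROPERTIES ('parquet.compression'='SNAPPY')""".stripMargin)

    stmt.close()
    conn.close()
  }
}
```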
Confidential
J2EE/Java developer
Responsibilities:
- Worked in an Agile/Scrum development methodology and built the application with Test-Driven Development (TDD)
- Developed the web application using the Spring MVC architecture and implemented the business layer using the Spring Framework and Spring Validator
- Implemented RESTful web services using JAX-RS and the Jersey API to expose data as a service (see the resource sketch at the end of this section)
- Utilized Hibernate and the Java Persistence API (JPA) to persist data into an Oracle 10g database
- Configured and built asynchronous communication using JMS with MQ Series
- Involved in designing and developing cross-browser web pages using HTML5, CSS3, JavaScript, jQuery and AngularJS
- Developed JUnit test cases for unit testing
- Wrote Maven scripts to bundle and deploy the application, used Log4j components for logging, and used Git for version control
Environment: Spring MVC, Spring Validator, RESTful Web Services, Hibernate, Oracle 10g, PL/SQL, JMS, HTML5, CSS3, JavaScript, AngularJS, Bootstrap, JUnit, Log4j, Maven, Git
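An illustrative JAX-RS resource of the sort described above. The role used Java; Scala is used here only to keep the sketches in this document uniform, and the path and payload are invented:

```scala
// Illustrative JAX-RS resource. The role used Java; Scala is used here only to
// keep the sketches in this document uniform. Path and payload are invented.
import javax.ws.rs.{GET, Path, PathParam, Produces}
import javax.ws.rs.core.MediaType

@Path("/customers")
class CustomerResource {

  // GET /customers/{id} returns a JSON representation of one customer
  @GET
  @Path("/{id}")
  @Produces(Array(MediaType.APPLICATION_JSON))
  def getCustomer(@PathParam("id") id: Long): String =
    s"""{"id": $id, "name": "placeholder"}"""
}
```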
Confidential
Java Developer
Responsibilities:
- Designed the application using the Agile software development life cycle (SDLC) methodology
- Developed the application using J2EE technologies including JSP, Servlets, JSTL and JMS
- Designed GUI/User Interface using JSP, HTML, CSS, JavaScript
- Implemented the persistence layer using Hibernate with the Java Persistence API (JPA) as the object-relational mapping (ORM) solution to persist data to a MySQL database
- Developed unit test cases using the JUnit framework (see the test sketch at the end of this section)
- Implemented logging mechanisms for error debugging using Log4j
Environment: J2EE, JSP, Servlets, JSTL, Hibernate 4, MySQL 5, HTML5, CSS3, JavaScript, JUnit, Log4j
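A minimal JUnit 4 test sketch (the role used Java; Scala keeps these examples uniform). The discount rule under test is invented for illustration:

```scala
// Minimal JUnit 4 test sketch. The discount rule under test is invented.
import org.junit.Assert.assertEquals
import org.junit.Test

class PriceCalculatorTest {

  // Verifies a hypothetical rule: 10% off order totals over 100
  @Test
  def appliesDiscountOverThreshold(): Unit = {
    val total      = 200.0
    val discounted = if (total > 100) total * 0.9 else total
    assertEquals(180.0, discounted, 0.001)
  }
}
```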
Confidential
Jr. Java developer
Responsibilities:
- Worked in an Agile/Scrum software development life cycle (SDLC) methodology
- Designed use case diagrams, class diagrams, sequence diagrams, and object diagrams for various modules and components
- Developed business logic in Java using OOP concepts, multithreading, and design patterns
- Implemented the web application using J2EE technologies such as JSP, Servlets, and JDBC, integrated with a MySQL database
- Developed the user interface using HTML and CSS, and used JavaScript for client-side validation
- Wrote SQL queries executed over JDBC connections to the MySQL database (see the JDBC sketch at the end of this section)
Environment: J2EE, JSP, Servlets, JDBC, MySQL, PL/SQL, HTML, CSS, JavaScript, jQuery
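A sketch of a plain JDBC lookup against MySQL, as referenced above. The connection URL, credentials, table, and columns are placeholders:

```scala
// Sketch of a plain JDBC lookup against MySQL. URL, credentials, table,
// and columns are placeholders.
import java.sql.DriverManager

object CustomerLookup {
  def main(args: Array[String]): Unit = {
    val conn = DriverManager.getConnection(
      "jdbc:mysql://localhost:3306/shop", "user", "password") // placeholders

    // A parameterized query guards against SQL injection
    val stmt = conn.prepareStatement("SELECT name FROM customers WHERE id = ?")
    stmt.setLong(1, 42L)
    val rs = stmt.executeQuery()
    while (rs.next()) println(rs.getString("name"))

    rs.close(); stmt.close(); conn.close()
  }
}
```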