
Big Data Developer Resume


Phoenix, AZ

SUMMARY:

  • 7+ years of experience in Information Technology, with 4 years of Hadoop/Big Data development and 3 years of Java/J2EE technologies.
  • Comprehensive working experience in implementing Big Data projects using Apache Hadoop, Pig, Hive, HBase, Spark, Sqoop, Flume, Zookeeper, Oozie.
  • Extensive experience working with various Hadoop distributions, including enterprise versions of Cloudera (CDH4/CDH5) and MapR, with working knowledge of Amazon EMR.
  • Excellent working knowledge of the HDFS file system and Hadoop daemons such as the Resource Manager, Node Manager, Name Node, Data Node, Secondary Name Node, and containers.
  • In-depth understanding of Apache Spark job execution components such as the DAG, lineage graph, DAG scheduler, task scheduler, stages, and tasks.
  • Good experience importing data into HDFS using Sqoop and SFTP from various sources such as RDBMS, Teradata, mainframes, and Oracle, and performing transformations on it using Hive, Pig, and Spark.
  • Working knowledge of Amazon's Elastic Compute Cloud (EC2) infrastructure for computational tasks and Simple Storage Service (S3) as a storage mechanism.
  • Extensively worked on Spark using Scala on clusters for computational analytics, and performed advanced analytical operations using Spark with Hive and SQL/Oracle.
  • Experience in performing transformations and actions on Spark RDDs using Spark Core.
  • Good working experience with broadcast variables, accumulator variables, and RDD caching in Spark (see the sketch after this list).
  • Experience in troubleshooting cluster jobs using the Spark UI.
  • Good knowledge of Hadoop data management components like HDFS and YARN.
  • Experience in managing and reviewing Hadoop log files.
  • Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning, and buckets.
  • Experience in performing Extract-Transform-Load (ETL) operations on data pipelines using Pig.
  • Simplified complex tasks involving interrelated data transformations and encoded data flow sequences using Pig Latin.
  • Expertise in creating managed/external tables, views, partitions, buckets, and analytical functions in Hive using HQL.
  • Worked on GUI-based Hive interaction tools like Hue for querying data.
  • Regularly tuned the performance of Hive and Pig queries to improve data processing and retrieval.
  • Experience in importing and exporting data using Sqoop between HDFS and Relational Database Systems.
  • Hands-on experience building data pipelines using Hadoop components Sqoop, Hive, Pig, MapReduce, Spark, Spark SQL.
  • Strong knowledge of ingesting streaming data into HDFS using Flume and the Kafka messaging system.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data in various formats such as text, zip, XML, and JSON.
  • Experience in designing both time-driven and data-driven automated workflows using Oozie.
  • Good understanding of Zookeeper for monitoring and managing Hadoop jobs.
  • Strong Experience in NoSQL databases like HBase, Cassandra.
  • Proficient in shell scripting. Proficient with application servers such as WebSphere, WebLogic, JBoss, and Tomcat.
  • Experience with best practices of web services development and integration (both REST and SOAP).
  • Used project management tools such as JIRA for handling service requests and tracking issues.
  • Proficient in Java, J2EE, JDBC, Collection Framework, JSON, XML, REST, SOAP Web services.
  • Experience in complete Software Development Life Cycle (SDLC) in both Waterfall and Agile methodologies.
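
A minimal Scala sketch of the Spark work summarized above: RDD transformations and actions combined with a broadcast variable and RDD caching. The input path, record layout, and region lookup table are hypothetical placeholders rather than details from any project listed here.

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: RDD transformations and actions with a broadcast variable
// and RDD caching. Input path, record layout, and lookup table are hypothetical.
object TransactionRollup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("TransactionRollup").getOrCreate()
    val sc = spark.sparkContext

    // Small lookup table shipped to every executor once via a broadcast variable.
    val regionByState = sc.broadcast(Map("AZ" -> "West", "GA" -> "Southeast"))

    // Transformations: parse pipe-delimited records and enrich them with the lookup.
    val amountsByRegion = sc.textFile("hdfs:///data/txns/*.txt")   // hypothetical path
      .map(_.split('|'))
      .filter(_.length >= 3)
      .map(f => (regionByState.value.getOrElse(f(1), "Other"), f(2).toDouble))
      .cache()                                                     // reused below, so cache it

    // Actions: trigger the lineage and materialize results on the driver.
    val totals = amountsByRegion.reduceByKey(_ + _).collect()
    totals.foreach { case (region, total) => println(s"$region: $total") }
    println(s"records processed: ${amountsByRegion.count()}")      // second use of the cached RDD

    spark.stop()
  }
}
```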

TECHNICAL SKILLS:

Data Access Tools: HDFS, YARN, Hive, Pig, HBase, Solr, Impala, Spark Core, Spark SQL, Spark Streaming

Data Management: HDFS, YARN

Data Workflow: Sqoop, Flume, Kafka

Data Operation: Zookeeper, Oozie

Data Security: Ranger, Knox

Big Data Distributions: Hortonworks, Cloudera

Cloud Technologies: AWS (Amazon Web Services) EC2, S3, DynamoDB, SNS, SQS, EMR, Kinesis

Programming and Scripting Languages: Java, Scala, Pig Latin, HQL, SQL, Shell Scripting, HTML, CSS, JavaScript

IDE/Build Tools: Eclipse, IntelliJ

Java/J2EE Technologies: XML, JUnit, JDBC, AJAX, JSON, JSP

Operating Systems: Linux, Windows, Kali Linux

SDLC: Agile/SCRUM, Waterfall

PROFESSIONAL EXPERIENCE:

Confidential, Phoenix, AZ

Big Data Developer

Responsibilities:

  • Worked on analyzing the Hadoop cluster and different big data analytic tools including Hive, the HBase database, and Spark.

  • Used HiveQL to analyze partitioned and bucketed data, and executed Hive queries on Parquet tables stored in Hive to perform data analysis meeting the business specification logic.
  • Worked with the Log4j framework for logging debug, info, and error data.
  • Involved in creating Hive tables and loading and analyzing data using Hive scripts.
  • Implemented static partitions, dynamic partitions, and buckets in Hive (see the partitioning sketch after this list).
  • Developed Hive scripts in HiveQL to de-normalize and aggregate the data.
  • Created Hive generic UDFs to process business logic that varies based on policy.
  • Converted existing MapReduce jobs into Spark transformations and actions using Spark RDDs, DataFrames, and Spark SQL APIs.
  • Responsible for design & deployment of Spark SQL scripts and Scala shell commands based on functional specifications.
  • Managed and reviewed Hadoop log files to identify issues when a job failed and to find the root cause.
  • Wrote new Spark jobs in Scala to analyze customer account and payment history data.
  • Experienced in handling large datasets using Spark in-memory capabilities, broadcast variables, efficient joins, transformations, and other features.
  • Interacted with different system groups for analysis of systems.
  • Experienced in performance tuning of Spark applications: setting the right batch interval, choosing the correct level of parallelism, and tuning memory.
  • Performed unit testing using JUnit.
  • Involved in sprint planning, code reviews, and daily standup meetings to discuss the progress of the application.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries.
  • Development of Oozie workflow for orchestrating and scheduling the ETL process.
  • Created HBase tables and column families to store the user event data.
  • Developed interactive shell scripts for scheduling various data cleansing and data loading processes.
  • Worked to tight deadlines and provided regular progress updates against agreed milestones.
  • Documented and tracked operational problems by following standards and procedures, using JIRA.
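
A minimal sketch of the Hive partitioning work described above (static and dynamic partitions), issued through Spark SQL with Hive support. Table names, columns, and the warehouse location are hypothetical, and a bucketed variant is only noted in a comment.

```scala
import org.apache.spark.sql.SparkSession

// Sketch of partitioned Hive tables and a dynamic-partition load, run through
// Spark SQL with Hive support. All names and paths are hypothetical placeholders.
object HivePartitionDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HivePartitionDemo")
      .enableHiveSupport()
      .getOrCreate()

    // External, partitioned table stored as Parquet. A bucketed variant would add,
    // for example, CLUSTERED BY (account_id) INTO 32 BUCKETS to the DDL.
    spark.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS payments (
        account_id STRING,
        amount     DOUBLE
      )
      PARTITIONED BY (txn_date STRING)
      STORED AS PARQUET
      LOCATION '/warehouse/payments'
    """)

    // Dynamic-partition insert from a hypothetical staging table.
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql("""
      INSERT OVERWRITE TABLE payments PARTITION (txn_date)
      SELECT account_id, amount, txn_date FROM payments_staging
    """)

    spark.stop()
  }
}
```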

Environment: Hadoop, Hive, HBase, Sqoop, Oozie, Zookeeper, MapR, Spark, Scala, shell scripting, Apache Kafka

Confidential, Alpharetta, GA

Hadoop/Spark Developer

Responsibilities:

  • Worked on analyzing Hadoop cluster and different Big Data analytic tools including Pig, Hive, HBase database and Sqoop.
  • Developed Spark jobs using Scala on top of Yarn for interactive and Batch Analysis.
  • Worked on importing data from HDFS to a MySQL database and vice versa using Sqoop.
  • Developed Spark SQL jobs to load tables into HDFS and run select queries on top of them.
  • Used the Spark API over Hadoop YARN to perform analytics on data in Hive.
  • Responded to and resolved access and performance issues.
  • Implemented a publish/subscribe model using Apache Kafka for real-time transactions loaded into HBase (see the streaming sketch after this list).
  • Analyzed the data by performing Hive queries and Pig scripts to study customer behavior.
  • Created HBase tables to store variable data formats of input data coming from different portfolios.
  • Involved in loading huge volumes of row and column data into HBase.
  • Imported data from structured data source into HDFS using Sqoop incremental imports.
  • Exported the analyzed data to the relational databases using Sqoop, to further visualize and generate reports for the BI team.
  • Used Zookeeper to coordinate the servers in clusters and to maintain the data consistency.
  • Used Oozie workflow engine to create the workflows and automate the MapReduce, Hive and Pig jobs.
  • Implemented MapReduce jobs in Hive by querying the available data.
  • Experience with NoSQL databases such as HBase and MongoDB.
  • Collaborated with the network, database, and BI teams to ensure data quality and availability.
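
A minimal sketch of consuming the Kafka publish/subscribe feed described above with Spark Structured Streaming, assuming the spark-sql-kafka integration is on the classpath. Broker addresses and the topic name are hypothetical, and the HBase write used in the real pipeline is replaced here by a console sink.

```scala
import org.apache.spark.sql.SparkSession

// Sketch: consume a Kafka topic with Spark Structured Streaming. Brokers and the
// topic name are hypothetical; the real pipeline landed records in HBase, which
// would require an HBase connector and is omitted here.
object KafkaTransactionStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("KafkaTransactionStream")
      .getOrCreate()

    val transactions = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092") // hypothetical brokers
      .option("subscribe", "transactions")                            // hypothetical topic
      .load()
      .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")

    // For the sketch, print each micro-batch to the console instead of writing to HBase.
    val query = transactions.writeStream
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```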

Environment: Hadoop, MapReduce, Pig, Hive, HBase, Sqoop, Oozie, Spark, Solr, shell scripting, Apache Kafka

Confidential, Long Beach, CA

Data Engineer

Responsibilities:

  • Involved in gathering business requirements, design, development and testing.
  • Wrote extensive Pig (version 0.10) scripts to transform raw data from several data sources into baseline data.
  • Built complex data flows involving multiple inputs, transforms and outputs using Pig.
  • Worked on Hive for exposing data for further analysis and for generating transforming files from different analytical formats to text files.
  • Imported data from MySQL server and other relational databases into Apache Hadoop with the help of Apache Sqoop (a Spark-based variant is sketched after this list).
  • Wrote MapReduce jobs in Java to process the log data.
  • Monitored and managed the Hadoop cluster using Cloudera Manager.
  • Developed multiple POCs using Spark and deployed them on the YARN cluster, comparing the performance of Spark with Hive and SQL/Teradata.
  • Tested and reported defects in an Agile Methodology perspective.
  • Analyzed stored data using Impala.
  • Generated various marketing reports using Tableau with Hadoop as a source for data.
  • Developed Unix shell scripts to load a large number of files into HDFS from the Linux file system.
  • Provided daily code contributions and worked in a test-driven development environment.
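
A hedged, Spark-based analogue of the Sqoop import described above: read a MySQL table over JDBC and land it on HDFS as Parquet. The JDBC URL, credentials, table, and output path are hypothetical placeholders, not the actual job, and the MySQL JDBC driver is assumed to be on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Sketch: read a relational table over JDBC and write it to HDFS as Parquet,
// a Spark analogue of the Sqoop import described above. All connection details,
// table names, and paths are hypothetical.
object MySqlToHdfs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("MySqlToHdfs").getOrCreate()

    val orders = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://db-host:3306/sales")              // hypothetical URL
      .option("dbtable", "orders")                                   // hypothetical table
      .option("user", "etl_user")
      .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
      .load()

    // Write to HDFS as Parquet, partitioned by order date for downstream Hive/Impala queries.
    orders.write
      .mode("overwrite")
      .partitionBy("order_date")                                     // hypothetical column
      .parquet("hdfs:///data/sales/orders")

    spark.stop()
  }
}
```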

Environment: Hadoop, HDFS, Hive, MapReduce, Spark, Scala, Kafka, HBase, Impala, Oozie, Java, Linux, Cloudera.

Confidential

JAVA/J2EE Developer

Responsibilities:

  • Implemented the application using Agile methodology. Involved in daily scrum and sprint planning meetings.
  • Actively involved in analysis, detail design, development, bug fixing and enhancement.
  • Driving the technical design of the application by collecting requirements from the Functional Unit in the design phase of SDLC.
  • Developed microservices using RESTful services to provide all the CRUD capabilities.
  • Created requirement documents and designed new enhancements using UML diagrams, including class and use-case diagrams.
  • Developed the application module using several design patterns such as Singleton, DAO, DTO, and MVC.
  • Involved in writing JSPs, JavaScript, and servlets to generate dynamic web pages and web content.
  • Used the JBoss application server for deploying applications.
  • Developed communication among SOA services.
  • Involved in creation of both service and client code for JAX-WS and used SOAPUI to generate proxy code from the WSDL to consume the remote service.
  • Designed the user interface of the application using HTML5, CSS3, JavaScript, Angular JS, jQuery and AJAX.
  • Designed Node.js application components through Express.
  • Implemented AJAX functionality to speed up web application.

Environment: Java, J2EE, Spring MVC, Hibernate, SOAP, REST, JAXB, JAX-RPC, AngularJS, jQuery, AJAX, JSON, JavaScript, Bootstrap, XSL, XML, Struts, DB2, JUnit, Log4j, NetBeans IDE.

Confidential

Jr. Software Engineer

Responsibilities:

  • Individually worked on all the stages of a Software Development Life Cycle (SDLC).
  • Used JavaScript code, HTML and CSS style declarations to enrich websites.
  • Integrating Web services and working with data in different servers.
  • Wrote several Action classes and Action Forms to capture user input, and created different web pages using JSTL, JSP, HTML, custom tags, and Struts tags.
  • Developed the front end of the site based on the MVC design pattern using the Struts framework.
  • Used standard data access technologies like JDBC and ORM tools like Hibernate.
  • Developed complex PL/SQL queries to access data.
  • Understanding the requirements from business users and end users.
  • Experience creating UML class and sequence diagrams.
  • Converted XML into Java objects using the JAXB API.
  • Developed UI components using JSP and JavaScript.
  • Wrote test cases using the JUnit testing framework and configured applications on WebLogic Server.
  • Coordinated across multiple development teams for quick resolution to blocking issues.

Environment: Oracle, Java, Struts, Servlets, HTML, XML, SQL, J2EE, JUnit, Tomcat, PL/SQL, JIRA, SVN, MS Access, Microsoft Excel.
