
Big Data Developer Resume


Tampa, FL

SUMMARY

  • Around 7 years of development experience using Hadoop, Java and Oracle, including Big Data ecosystem design, development and administration.
  • Over 3 years of extensive experience in Big Data and an excellent understanding of Hadoop architecture and its components, such as Spark SQL, HDFS, Pig, Hive, HBase, Sqoop, Flume, YARN, Zookeeper, Kafka and Cassandra.
  • Experience in loading structured, semi-structured and unstructured data from sources such as CSV and XML files, Teradata, MS SQL Server and Oracle into Hadoop (see the sketch after this list).
  • Experience in importing and exporting data in various formats between HDFS, HBase and relational databases.
  • Experience in writing Scala programs.
  • Exposure to the Python programming language.
  • Expertise in working with distributed and global project teams.
  • Experience in using various Hadoop distributions like Cloudera, Hortonworks.
  • Good exposure to the YARN environment with Spark and Kafka, and to file formats such as Avro, JSON, XML and sequence files.
  • Experience writing custom UDFs in Pig and Hive based on user requirements.
  • Experience in storing and processing unstructured data using NoSQL databases such as HBase, Cassandra and MongoDB.
  • Experience in writing workflows and scheduling jobs using Oozie.
  • Involved in project planning, setting up standards for implementation and design of Hadoop based applications.
  • Experience working independently and end to end on projects.
  • Proficiency in creating business and technical project documentation.
  • Ability to lead a team and develop a project from scratch.
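
To illustrate the data-loading experience above: a minimal sketch, in Scala, of a Spark job that lands a CSV file in HDFS as Parquet. The paths, app name and Spark 2.x API choice are assumptions for illustration only.

```scala
import org.apache.spark.sql.SparkSession

object CsvToHdfs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CsvToHdfs")
      .getOrCreate()

    // Read a structured CSV file; header and schema inference keep the
    // sketch short (a fixed schema is preferable in production)
    val df = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs:///landing/customers.csv")   // hypothetical input path

    // Land the data in HDFS as Parquet for downstream Hive/Spark jobs
    df.write.mode("overwrite").parquet("hdfs:///warehouse/customers")
    spark.stop()
  }
}
```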

TECHNICAL SKILLS

Hadoop/Big Data: Apache Spark, HDFS, MapReduce, Hive, Pig, Oozie, Flume, ZooKeeper, Sqoop, HBase, Cassandra, Spark Streaming, Kerberos, Zeppelin

NoSQL Databases: HBase, Cassandra, MongoDB

Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripts, Perl, Python, Scala, R

ETL: IBM WebSphere / Oracle

Operating Systems: Sun Solaris, UNIX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP

Web/Application servers: Apache Tomcat, WebLogic, JBoss

Databases: Oracle, SQL Server, MySQL, Netezza

PROFESSIONAL EXPERIENCE

Confidential, Tampa FL

Big Data Developer

Responsibilities:

  • Involved in complete project life cycle starting from design discussion to production deployment.
  • Performed transformations, cleaning and filtering on imported data using Hive and MapReduce.
  • Developed Spark Core and Spark SQL scripts using Scala for faster data processing (see the sketch after this list).
  • Extensively used big data analytical and processing tools such as Hive, Spark Core and Spark SQL for batch processing of large data sets on the Hadoop cluster.
  • Improved the performance and optimized existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, RDDs and YARN.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Imported and exported data between environments such as MySQL and HDFS, and deployed into production.
  • Used Pig as an ETL tool to do transformations, event joins and some pre-aggregations before storing the data onto HDFS.
  • Worked on partitioning and bucketing in Hive tables and set tuning parameters to improve performance.
  • Involved in developing Impala scripts for ad-hoc queries.
  • Used Oozie workflow scheduler templates to manage various jobs such as Sqoop, MapReduce, Pig, Hive and shell scripts.
  • Extensively used SVN as a code repository for managing the day-to-day agile project development process and to keep track of issues and blockers.
  • Developed Oozie workflows with Sqoop actions to migrate data from relational databases like Oracle, Netezza and Teradata to HDFS.
  • Used Hadoop FS actions to move the data from upstream location to local data lake locations.
  • Created a common data lake for the migrated data to be used by other members of the team.
  • Wrote extensive Hive queries to do transformations on the data to be used by downstream models.
  • Developed MapReduce programs as part of predictive analytical model development.
  • Developed Hive queries to analyze the data and generate the end reports.
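
As a concrete illustration of the Spark SQL batch work above: a minimal sketch in Scala. The database, table and column names (sales.orders, load_date, customer_id, amount) are hypothetical; filtering on the Hive partition column is the partition-pruning technique the tuning bullet refers to.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DailySalesRollup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("DailySalesRollup")
      .enableHiveSupport()   // read and write Hive tables on the cluster
      .getOrCreate()

    // Hypothetical partitioned Hive table; filtering on the partition
    // column (load_date) lets Hive prune partitions instead of scanning all
    val orders = spark.table("sales.orders")
      .filter(col("load_date") === "2016-01-01")

    // Aggregate with the DataFrame API; Spark SQL plans and optimizes this
    val rollup = orders
      .groupBy(col("customer_id"))
      .agg(sum(col("amount")).as("total_amount"),
           count(lit(1)).as("order_count"))

    rollup.write.mode("overwrite").saveAsTable("sales.daily_rollup")
    spark.stop()
  }
}
```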

Confidential, Atlanta, GA

Senior Hadoop/Spark Developer

Responsibilities:

  • Migrated jobs from Sqoop and Pig to Spark SQL for faster processing.
  • Loaded data into Spark RDDs and performed in-memory computation to generate output responses.
  • Enhanced and optimized product Spark code to aggregate, group and run data mining tasks using the Spark framework.
  • Developed Spark code and Spark SQL for faster testing and processing of data.
  • Experienced in extending Hive and Pig core functionality by writing custom UDFs (see the sketch after this list).
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Experienced in collecting, aggregating, and moving large amounts of streaming data into HDFS using Flume.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Scheduled and managed cron jobs and wrote shell scripts to generate alerts.
  • Prepared design documents and functional documents.
  • Added extra nodes to the cluster, based on requirements, to keep it scalable.
  • Involved in running Hadoop jobs to process billions of records of text data.
  • Involved in loading data from the local file system (Linux) to HDFS.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Assisted in exporting analyzed data to relational databases using Sqoop.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing jobs.
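
The custom UDF work mentioned above generally follows this shape: a minimal sketch, in Scala, of a Hive UDF against the classic org.apache.hadoop.hive.ql.exec.UDF API (standard at the time, deprecated in newer Hive releases). The class name and behavior are hypothetical.

```scala
import org.apache.hadoop.hive.ql.exec.UDF

// Hypothetical UDF: strips everything but digits from a free-text
// phone-number column so downstream joins can match on it
class NormalizePhone extends UDF {
  def evaluate(raw: String): String =
    if (raw == null) null else raw.replaceAll("[^0-9]", "")
}
```

Once packaged into a jar, a UDF like this is registered in Hive with `ADD JAR /path/to/udfs.jar;` followed by `CREATE TEMPORARY FUNCTION normalize_phone AS 'NormalizePhone';`, after which it can be called from HiveQL like any built-in function.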

Confidential, San Francisco, CA

Hadoop Developer

Responsibilities:

  • Handled importing of data from various data sources, performed transformations using Hive and Spark, and loaded data into HDFS.
  • Expert in implementing advanced procedures like text analytics and processing, using the in-memory computing capabilities of Apache Spark written in Scala.
  • Involved in the analysis, design and testing phases and responsible for documenting technical specifications.
  • Experience with the Hadoop Distributed File System (HDFS) on the Cloudera distribution.
  • Developed and executed shell scripts to automate jobs.
  • Wrote complex Hive queries and UDFs.
  • Worked on reading multiple data formats on HDFS using Spark.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala (see the sketch after this list).
  • Developed multiple POCs using Spark, deployed them on the YARN cluster and compared the performance of Spark with Hive and SQL/Teradata.
  • Analyzed the SQL scripts and designed the solution for implementation in Spark.
  • Used the Autosys automation tool for scheduling Oozie jobs based on calendar and file-watcher triggers.
  • Facilitated the daily scrum meetings, sprint planning, sprint review and sprint retrospective.
  • Worked on the core and Spark SQL modules of Spark extensively.
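
The Hive-to-Spark conversions referenced above typically look like the following minimal sketch in Scala. The products table, its columns and the original query are hypothetical, and the same result could also be expressed with DataFrames alone.

```scala
import org.apache.spark.sql.SparkSession

object HiveToSpark {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HiveToSpark")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Original HiveQL (hypothetical):
    //   SELECT category, AVG(price) FROM products GROUP BY category
    // re-expressed as RDD transformations; assumes price is a DOUBLE column
    val pairs = spark.table("products")
      .select($"category", $"price")
      .rdd
      .map(row => (row.getString(0), (row.getDouble(1), 1L)))

    // Sum and count per key, then divide to get the average
    val avgByCategory = pairs
      .reduceByKey { case ((s1, n1), (s2, n2)) => (s1 + s2, n1 + n2) }
      .mapValues { case (sum, n) => sum / n }

    avgByCategory.toDF("category", "avg_price").show()
    spark.stop()
  }
}
```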

Software Engineer

Confidential

Responsibilities:

  • Responsible for understanding the scope of the project and requirement gathering.
  • Developed the web tier using JSP and Struts MVC to show account details and summary.
  • Created and maintained the configuration of the Spring Application Framework.
  • Implemented various design patterns: Singleton, Business Delegate, Value Object and Spring DAO.
  • Used Spring JDBC to write DAO classes that interact with the database to access account information (see the sketch after this list).
  • Mapped business objects to database using Hibernate.
  • Involved in writing Spring configuration XML files containing bean declarations and other dependent object declarations.
  • Used the Tomcat web server for development purposes.
  • Involved in creation of Test Cases for Unit Testing.
  • Used Oracle as the database and Toad for query execution; also involved in writing SQL scripts and PL/SQL code for procedures and functions.
  • Used CVS and Perforce as configuration management tools for code versioning and release.
  • Developed the application using Eclipse and used Maven as the build and deploy tool.
  • Used Log4J to print logging, debugging, warning and info messages on the server console.
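
A minimal sketch of the Spring JDBC DAO pattern described above, written here in Scala for consistency with the other examples (the original work was in Java). The Account shape, table and column names are hypothetical.

```scala
import java.sql.ResultSet
import org.springframework.jdbc.core.{JdbcTemplate, RowMapper}

// Hypothetical domain class mirroring an ACCOUNTS table
case class Account(id: Long, owner: String, balance: BigDecimal)

class AccountDao(jdbcTemplate: JdbcTemplate) {

  // Maps one ResultSet row to an Account
  private val rowMapper: RowMapper[Account] = new RowMapper[Account] {
    override def mapRow(rs: ResultSet, rowNum: Int): Account =
      Account(
        rs.getLong("id"),
        rs.getString("owner"),
        BigDecimal(rs.getBigDecimal("balance")))
  }

  // Fetches a single account; JdbcTemplate handles connection
  // acquisition, cleanup and exception translation
  def findById(id: Long): Account =
    jdbcTemplate.queryForObject(
      "SELECT id, owner, balance FROM accounts WHERE id = ?",
      rowMapper,
      Long.box(id))
}
```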

Software Developer

Confidential

Responsibilities:

  • Involved in the analysis, design, implementation, and testing of the project
  • Designed the functional specifications and architecture of the web-based module using Java Technologies.
  • Created Design specification using UML Class Diagrams, Sequence & Activity Diagrams
  • Developed the web application using MVC architecture with Java, JSP, Servlets and an Oracle database.
  • Developed various Java classes, SQL queries and procedures to retrieve and manipulate data from the backend Oracle database using JDBC.
  • Extensively worked with JavaScript for front-end validations.
  • Analyzed business requirements and developed the system architecture document for the enhancement project.
  • Provided Impact Analysis and Test cases.
  • Involved in writing the JDBC connectivity code to interact with the back-end database (see the sketch after this list).
  • Implemented the presentation layer with HTML, XHTML and JavaScript
  • Wrote complex SQL queries and stored procedures.
  • Designed tables and indexes and involved in writing the DAO interaction layer as per the requirements.
  • Designed, implemented, tested and deployed the enterprise application using WebLogic as the application server.
  • Involved in fixing bugs and unit testing with test cases using JUnit.
  • Actively involved in the system testing.
  • Involved in implementing the service layer using the Spring IoC module.
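
The JDBC connectivity code referenced above follows this general pattern: a minimal sketch in Scala with hypothetical connection details, credentials and table names (the original was Java).

```scala
import java.sql.DriverManager

object AccountLookup {
  def main(args: Array[String]): Unit = {
    // Hypothetical Oracle thin-driver URL and credentials
    val url = "jdbc:oracle:thin:@//dbhost:1521/ORCL"
    val conn = DriverManager.getConnection(url, "app_user", "app_password")
    try {
      // PreparedStatement binds parameters safely (no SQL injection)
      val stmt = conn.prepareStatement(
        "SELECT account_no, status FROM accounts WHERE customer_id = ?")
      stmt.setLong(1, 42L)
      val rs = stmt.executeQuery()
      while (rs.next())
        println(s"${rs.getString("account_no")} -> ${rs.getString("status")}")
    } finally conn.close()   // always release the connection
  }
}
```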
