Sr. Big Data Developer Resume

Nashville, TN

SUMMARY:

  • 8+ years of experience in the design and deployment of enterprise applications, web applications and client-server technologies using Big Data technologies.
  • 5 years of comprehensive experience as a Hadoop, Big Data & Analytics Developer.
  • Expertise in Hadoop architecture and ecosystem components such as HDFS, MapReduce, Pig, Hive, Sqoop, Flume and Oozie.
  • Complete understanding of Hadoop daemons such as JobTracker, TaskTracker, NameNode and DataNode, and of the MRv1 and YARN architectures.
  • Experience in installing, configuring, managing, supporting and monitoring Hadoop clusters using various distributions such as Apache, Cloudera and AWS.
  • Experience in installing and configuring the Hadoop stack elements MapReduce, HDFS, Hive, Pig, Sqoop, Flume, Oozie and ZooKeeper.
  • Experience in data processing and analysis using MapReduce, HiveQL and Pig Latin.
  • Extensive experience in writing User Defined Functions (UDFs) in Hive and Pig.
  • Worked with Apache Sqoop to import and export data between HDFS and RDBMS/NoSQL databases.
  • Worked with Python for statistical analytics, generating reports for data quality.
  • Worked with NoSQL databases such as HBase and MongoDB.
  • Exposure to search, cache, and analytics data solutions such as Cassandra and Hive.
  • Experience in job workflow scheduling and Job Designer with the help of Oozie.
  • Good knowledge of Amazon AWS concepts such as the EMR and EC2 web services, which provide fast and efficient processing of Big Data and Machine Learning concepts.
  • Good experience in Linux Bash scripting and in following PEP guidelines in Python.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs in Scala and Python.
  • Experienced with different scripting languages such as Python and shell scripts.
  • Hands-on experience in using Python scripts to handle data manipulation.
  • Worked extensively over semi-structured data (fixed length & delimited files), for data sanitation, report generation and standardization.
  • Experienced in monitoring Hadoop cluster using Cloudera Manager and Web UI.
  • Extensive experience working with web technologies such as HTML, CSS, XML, JSON and jQuery.
  • Extensive experience in documenting requirements, functional specifications and technical specifications.
  • Extensive experience with SQL, PL/SQL and database concepts.
  • Experience with version control tools such as SVN and Git (GitHub), with JIRA for issue tracking and with Crucible for code reviews.
  • Strong database background with Oracle, PL/SQL, stored procedures, triggers, SQL Server, MySQL and DB2.
  • Strong Problem Solving and Analytical skills and abilities to make Balanced & Independent Decisions.
  • Good Team Player, Strong Interpersonal, Organizational and Communication skills combined with Self-Motivation, Initiative and Project Management Attributes.
  • Strong ability to handle multiple priorities and a heavy workload, and to understand and adapt quickly to new technologies and environments.

TECHNICAL SKILLS:

Hadoop Core Services: HDFS, Map Reduce, Spark, YARN.

Hadoop Distribution: Hortonworks, Cloudera, Apache.

NoSQL Databases: HBase, Cassandra.

Hadoop Data Services: Hive, Pig, Impala, Sqoop, Flume, Kafka.

Hadoop Operational Services: Zookeeper, Oozie.

Monitoring Tools: Cloudera Manager.

Cloud Computing Tools: Amazon AWS.

Languages: C, Scala, Python, SQL, PL/SQL, Pig Latin, HiveQL, Unix, Shell Scripting.

Databases: Oracle, MySQL, PostgreSQL, Teradata.

Operating Systems: UNIX, Windows, LINUX.

WORK EXPERIENCE:

Confidential, Nashville, TN

Sr. Big Data Developer

Responsibilities:

  • Worked with Sqoop import and export functionality to handle large data set transfers between the Oracle database and HDFS.
  • Knowledge of running Hive queries through Spark SQL, which integrates with the Spark environment.
  • Involved in creating Oozie workflow and coordinator jobs to trigger jobs based on time and data availability.
  • Involved in implementing High Availability and automatic failover infrastructure to overcome the NameNode single point of failure, using ZooKeeper services.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager and the Web UI.
  • Developed MapReduce programs that filter out bad and unnecessary records and identify unique records based on different criteria.
  • Developed a secondary sorting implementation to get sorted values at the reduce side and improve MapReduce performance.
  • Implemented custom Data Types, Input Format, Record Reader, Output Format and Record Writer for MapReduce computations to handle custom business requirements.
  • Implemented MapReduce programs to classify data into different classifications based on record type.
  • Worked on Sequence files, RC files, map-side joins, bucketing and partitioning for Hive performance enhancement and storage improvement.
  • Implemented daily cron jobs that automate parallel tasks of loading data into HDFS and pre-processing it with Pig, using Oozie coordinator jobs.
  • Responsible for performing extensive data validation using Hive.
  • Worked on tuning Hive and Pig scripts to improve performance.
  • Involved in submitting and tracking MapReduce jobs using JobTracker.
  • Wrote Python scripts to find vulnerabilities with SQL queries.
  • Wrote Python UDFs to process and return valid names using streaming.
  • Wrote Python scripts for pattern matching in build logs to format warnings and errors.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
  • Involved in loading the created HFiles into HBase for faster access to a large customer base without a performance hit.
  • Implemented Hive Generic UDFs to implement business logic.
  • Coordinated with end users for designing and implementation of analytics solutions for User Based Recommendations using R as per project proposals.
  • Improved stability and performance of the Scala plug-in for Eclipse, using product feedback from customers and internal users.
  • Redesigned and implemented Scala REPL (read-evaluate-print-loop) to tightly integrate with other IDE features in Eclipse.
  • Assisted monitoring Hadoop cluster using Ganglia.
  • Implemented test scripts to support test driven development and continuous integration.
  • Used the JUnit framework to perform unit and integration testing.
  • Configured build scripts for multi-module projects with Maven and Jenkins CI.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.

Environment: Hadoop, CDH4, Map Reduce, HDFS, Pig, Hive, Impala, Oozie, Java, Kafka, Linux, Scala, Maven, Java Scripting, Oracle 11g/10g, SVN, Ganglia.

Confidential, Deerfield, IL

Sr. Hadoop Developer

Responsibilities:

  • Installed and set up Hadoop CDH clusters for development and production environments.
  • Installed and configured Hive, Pig, Sqoop, Flume, Cloudera Manager and Oozie on the Hadoop cluster.
  • Planned hardware and software installation for the production cluster and coordinated with multiple teams to complete it.
  • Monitored multiple Hadoop cluster environments using Hortonworks. Monitored workload and job performance, and collected metrics for the Hadoop cluster when required.
  • Installed Hadoop patches, updates and version upgrades when required.
  • Installed and configured Cloudera Manager, Hive, Pig, Sqoop and Oozie on the CDH4 cluster.
  • Performed an upgrade in the development environment from CDH 4.2 to CDH 4.6.
  • Worked with big data developers, designers and scientists on troubleshooting MapReduce and Hive jobs and tuning them for high performance.
  • Automated the end-to-end workflow from data preparation to presentation layer for the Artist Dashboard project using shell scripting.
  • Provided input to Product Management to influence feature requirements for compute and networking in the VMware cloud offering.
  • Developed MapReduce programs to extract and transform data sets; the resulting data sets were loaded into Cassandra.
  • Orchestrated Sqoop scripts, Pig scripts and Hive queries using Oozie workflows and sub-workflows.
  • Conducted root cause analysis (RCA) to identify data issues and resolve production problems.
  • Involved in loading the created files into MongoDB for faster access to a large customer base without a performance hit.
  • Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
  • Performed data analytics in Hive and then exported the resulting metrics back to the Oracle database using Sqoop.
  • Involved in Minor and Major Release work activities.
  • Collaborated with business users, product owners and developers to contribute to the analysis of functional requirements.

Environment: Cloudera Hadoop, MapReduce, HDFS, Hortonworks, Cloudera Manager, Hive, Pig, Sqoop, Oozie, Flume, Linux, Zookeeper, LDAP.

Confidential, Middletown, New Jersey

Big Data Developer

Responsibilities:

  • Installed, configured, and maintained Apache Hadoop clusters for application development and major components of Hadoop Ecosystem: Hive, Pig, HBase, Sqoop, Flume, Oozie and Zookeeper.
  • Implemented a six-node CDH4 Hadoop cluster on CentOS.
  • Imported and exported data between HDFS/Hive and different RDBMS using Sqoop.
  • Defined job flows to run multiple MapReduce and Pig jobs using Oozie.
  • Imported log files into HDFS using Flume and loaded them into Hive tables to query the data.
  • Monitored the running MapReduce programs on the cluster.
  • Responsible for loading data from UNIX file systems to HDFS.
  • Used HBase-Hive integration and wrote multiple Hive UDFs for complex queries.
  • Involved in writing APIs to read HBase tables, cleanse data and write to another HBase table.
  • Created multiple Hive tables and implemented partitioning, dynamic partitioning and buckets in Hive for efficient data access.
  • Wrote multiple MapReduce programs in Java for data extraction, transformation and aggregation from multiple file formats including XML, JSON, CSV and other compressed file formats.
  • Ran batch processes using Pig scripts and developed Pig UDFs for data manipulation according to business requirements.
  • Wrote programs using the HBase Client API.
  • Involved in loading data into HBase using HBase Shell, HBase Client API, Pig and Sqoop.
  • Experienced in design, development, tuning and maintenance of NoSQL databases.
  • Wrote a MapReduce program in Python with the Hadoop Streaming API.
  • Developed unit test cases for Hadoop MapReduce jobs with MRUnit.
  • Excellent experience in ETL analysis, and in designing, developing, testing and implementing ETL processes, including performance tuning and database query optimization.
  • Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
  • Used Maven as the build tool and SVN for code management.
  • Worked on writing RESTful web services for the application.
  • Implemented testing scripts to support test driven development and continuous integration.

Environment: Hadoop, Map Reduce, HDFS, HBase, Hive, Impala, Pig, Java, SQL, Ganglia, Sqoop, Flume, Oozie, Unix, JavaScript, Maven, Eclipse.

Confidential

Java Developer

Responsibilities:

  • Designed, developed, maintained, tested and troubleshot Java and PL/SQL programs in support of Payroll employees.
  • Developed documentation for new and existing programs and designed specific enhancements to the application.
  • Implemented the web layer using JSF and ICEfaces.
  • Implemented the business layer using Spring MVC.
  • Implemented report retrieval based on start date using HQL.
  • Implemented session management using SessionFactory in Hibernate.
  • Developed the DOs and DAOs using Hibernate.
  • Implemented a SOAP web service to validate zip codes using Apache Axis.
  • Wrote complex queries, PL/SQL stored procedures, functions and packages to implement business rules.
  • Wrote a PL/SQL program to send email to a group from the backend.
  • Developed scripts to be triggered monthly to give current monthly analysis.
  • Scheduled jobs to be triggered on a specific day and time.
  • Modified SQL statements to increase overall performance as part of basic performance tuning and exception handling.
  • Used cursors, arrays, tables and bulk collect concepts.
  • Extensively used Log4j for logging.
  • Performed unit testing in all the environments.
  • Used Subversion as the version control system.

Environment: Java (JDK1.5), J2EE, Eclipse, JSP, JavaScript, JSTL, Ajax, GWT, Log4j, CSS, XML, Spring, EJB, MDB, Hibernate, WebLogic, REST, Rational Rose, JUnit, Maven, JIRA, SVN.

Confidential

Java Developer

Responsibilities:

  • Involved in all the phases of the life cycle of the project from requirements gathering to quality assurance testing.
  • Developed Class diagrams, Sequence diagrams using Rational Rose.
  • Responsible for developing rich web interface modules with Struts tags, JSP, JSTL, CSS, JavaScript, Ajax and GWT.
  • Developed presentation layer using Struts framework, and performed validations using Struts Validator plugin.
  • Created SQL scripts for the Oracle database.
  • Implemented the business logic using Spring Transaction and Spring AOP.
  • Implemented persistence layer using Spring JDBC to store and update data in database.
  • Produced a web service using the WSDL/SOAP standard.
  • Implemented J2EE design patterns such as Singleton and Factory.
  • Extensively involved in the creation of Session Beans and MDBs using EJB 3.0.
  • Used the Hibernate framework for the persistence layer.
  • Extensively involved in writing stored procedures for data retrieval, storage and updates in the Oracle database using Hibernate.
  • Deployed and built the application using Maven.
  • Performed testing using JUnit.
  • Used JIRA to track bugs.
  • Extensively used Log4j for logging throughout the application.
  • Produced a Web service using REST with Jersey implementation for providing customer information.
  • Used SVN for source code versioning and code repository.

Environment: Java (JDK1.5), J2EE, Eclipse, JSP, JavaScript, JSTL, Ajax, GWT, Log4j, CSS, XML, Spring, EJB, MDB, Hibernate, WebLogic, REST, Rational Rose, JUnit, Maven, JIRA, SVN.
