
Hadoop & Big Data Developer Resume


Harrisburg, PA

SUMMARY

  • 9+ years of experience working across the SDLC, including requirements analysis, design specification, code development, code integration, and maintenance of applications.
  • Experience in Hadoop and Core Java development, with end-to-end experience developing applications using most of the Hadoop ecosystem tools.
  • Good experience in Object-Oriented Programming using Java & J2EE (Servlets, JSP, Java Beans, EJB, JDBC, RMI, XML, JMS, Web Services, AJAX).
  • Experience in installing and configuring all the ecosystem components, including Hadoop MapReduce, Hive, Sqoop, Pig, HDFS, HBase, Kafka, Spark Streaming, Spark SQL, Cassandra, ZooKeeper, Oozie, and Flume.
  • Strong hands-on knowledge of DW platforms and databases such as MS SQL Server 2012/2008, Oracle 11g/10g/9i, Talend ETL, Informatica, DB2, and Teradata.
  • Good experience in writing UDFs, Pig scripts, and Hive queries for processing and analyzing large volumes of data (see the UDF sketch after this list).
  • Extensive knowledge of writing MapReduce code in Java using the MapReduce framework per business requirements and analyzing the data in HDFS.
  • Good knowledge of writing, testing, and running MapReduce pipelines composed of many UDFs using the Apache Crunch framework.
  • Experience in working with Python and Hadoop Streaming Command options.
  • Experience importing and exporting data between HDFS and relational database systems using Sqoop.
  • Extensive knowledge of and experience with installing, configuring, supporting, and managing Cloudera distributions (CDH3/4/5), with good knowledge of Hortonworks distributions.
  • Used HBase along with Pig/Hive for real-time, low-latency queries.
  • Good experience in Tableau Desktop, Tableau Server, and Tableau Reader across Tableau versions 6, 7, 8.x, and 9.x.
  • Experience in building dashboards and creating visualizations using bars, lines, pies, maps, scatter plots, Gantt charts, bubbles, histograms, bullets, heat maps, and highlight tables.
  • Good understanding of HDFS architecture and components such as JobTracker, TaskTracker, NameNode, DataNode, and HDFS high availability (HA), as well as the MapReduce programming paradigm.
  • Good knowledge of NoSQL databases such as HBase, Cassandra, and MongoDB.
  • Experience with Kafka, performing thousands of megabytes of reads and writes per second on streaming data.
  • Very good knowledge of data warehouse/data mart concepts and expertise in data modeling for OLAP & OLTP systems, from design and analysis to implementation, including conceptual, logical, and physical data models.
  • Experience working on Windows and UNIX/Linux platforms with technologies such as Big Data, SQL, XML, HTML, C#, ASP.NET, and shell scripting.
  • Experience in designing and coding stored procedures, functions, triggers, and packages using PL/SQL.
  • Experience includes system analysis, system configuration, software development, integration testing, user training, and post-go-live support.
  • Proven ability to work with senior-level business managers and understand the key business drivers that impact their satisfaction.
  • Strong leadership, communication, interpersonal, and analytical skills with problem solving aptitude.
  • Resourceful, self-motivated self-starter with an aptitude to self-train and adapt to new market trends, requirements, and ideas.
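
As an illustration of the Hive UDF work noted above, the following is a minimal sketch of a Java UDF that normalizes a text column before it is used as a grouping or join key. It uses the classic org.apache.hadoop.hive.ql.exec.UDF base class; the package, class, and function names are hypothetical.

    package com.example.hive.udf;

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Trims and lower-cases a string column so downstream GROUP BY / JOIN keys compare consistently.
    public class NormalizeText extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null; // pass nulls through unchanged
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }

Once packaged into a JAR, such a UDF would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION normalize_text AS 'com.example.hive.udf.NormalizeText' before being called from a query.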

TECHNICAL SKILLS

Languages: C, C++, Core Java, SQL, Shell Scripting, Python, SAS

Web Technologies: J2EE, JMS, Web Services, JDBC.

Big Data Ecosystem: HDFS, MapReduce, Apache Crunch, Hive, Pig, Impala, HBase, Sqoop, NoSQL (HBase, Cassandra), Hadoop Streaming, ZooKeeper, Oozie, Kafka and Flume.

Scripting Languages: HTML, JavaScript, CSS, XML and Ajax

Distributed Technologies: RMI, EJB

Operating system: Windows, Linux and Unix

Methodologies: UML diagrams, Design Patterns, Rational Rose

DBMS / RDBMS: Oracle 11g/10g/9i, Informatica, Talend ETL, SQL Server 2012/2008, MySQL, DB2 and NoSQL

IDE: Eclipse, Microsoft Visual Studio (2008, 2012), Flex Builder

Version Control: SVN, CVS and Rational Clear Case Remote Client V7.0.1

Tools: PuTTY, SQuirreL SQL Client, PL/SQL Developer, JUnit, Oracle SQL Developer.

PROFESSIONAL EXPERIENCE

Confidential, Harrisburg, PA

Hadoop & Big Data Developer

Responsibilities:

  • Validated and ingested data into the Hadoop cluster from MySQL and Oracle databases at different plants across the world using Sqoop.
  • Built Parquet tables in Impala from Avro tables in Hive.
  • Worked on different POC projects for different plant requirements.
  • Scheduled Hadoop jobs using the Control-M tool.
  • Used Spark Streaming with Kafka to stream real-time vision data (see the sketch after this list).
  • Leveraged Spark SQL and built RDDs from HDFS for ad hoc analysis and POCs.
  • Created visualizations and dashboards for pilot projects using Tableau 8.x/9.x built on Impala Parquet tables, and published them to the server for clients.
  • Worked on Hive to expose data for further analysis and to transform files from different analytical formats to text files.
  • Developed Hive queries for the analysts.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Exported result sets from Hive to MySQL using shell scripts.
  • Used ZooKeeper for various types of centralized configuration.
  • Involved in maintaining various UNIX shell scripts.
  • Monitored system status and logs and responded to any warning or failure conditions.
  • Deep understanding of schedulers, workload management, availability, scalability and distributed data platforms.
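
As a sketch of the Kafka ingestion mentioned above, the following shows how real-time data could be consumed with the Spark 1.x receiver-based Java API. The topic name, ZooKeeper quorum, consumer group, and batch interval are illustrative assumptions, and the per-batch count stands in for the real transformation logic.

    import java.util.Collections;
    import java.util.Map;

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    public class VisionStreamJob {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("VisionStreamJob");
            // 10-second micro-batches; the real interval depends on the plant SLAs
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            // One receiver thread reading the hypothetical "vision-events" topic
            Map<String, Integer> topics = Collections.singletonMap("vision-events", 1);
            JavaPairReceiverInputDStream<String, String> stream =
                    KafkaUtils.createStream(jssc, "zkhost:2181", "vision-consumers", topics);

            // Count records per batch as a placeholder for the downstream processing
            stream.count().print();

            jssc.start();
            jssc.awaitTermination();
        }
    }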

Environment: Hadoop, HDFS, MapReduce, Impala, Spark Streaming, Flume, Hive, Kafka, Sqoop, Java 1.6, UNIX Shell Scripting, Tableau.

Confidential, Thousand Oaks, CA

Senior Hadoop Developer

Responsibilities:

  • Good understanding of and experience with Hadoop stack internals, Hive, Pig, and the MapReduce & YARN frameworks in Core Java.
  • Involved in requirements gathering for the project's enhancements.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing (see the mapper sketch after this list).
  • Involved in loading data from UNIX file system to HDFS.
  • Created standard and best practices for Talend ETL components and jobs.
  • Wrote MapReduce jobs to discover trends in data usage by users.
  • Re-architected data flows for efficiency, revised data models, maintained mappings as needed, and performed data warehouse and data mart design and data modeling in Informatica.
  • Performed business analysis and data model design for new tables related to ETL.
  • Worked with Hadoop Streaming using a Python framework for text processing, splitting each line into key/value pairs.
  • Automated daily Hadoop jobs using Oozie and cron jobs.
  • Involved in managing and reviewing Hadoop log files.
  • Involved in running Hadoop streaming jobs to process terabytes of text data.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Wrote Hive and Pig UDFs.
  • Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
  • Automated all jobs using Sqoop, from pulling data out of data sources such as MySQL to pushing result sets into the Hadoop Distributed File System.
  • Used SVN for version control.
  • Helped the team grow the cluster from 25 nodes to 40 nodes.
  • Maintained system integrity of all sub-components (primarily HDFS, MapReduce, HBase, and Flume).
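
A minimal sketch of the kind of data-cleaning mapper referenced above: malformed records are counted and dropped, and the surviving fields are trimmed. The delimiter and expected field count are assumptions for illustration.

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Drops malformed pipe-delimited records and trims whitespace from the rest.
    public class CleanRecordsMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 5; // assumed record width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length != EXPECTED_FIELDS) {
                context.getCounter("clean", "malformed").increment(1); // track rejects
                return;
            }
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    out.append('|');
                }
                out.append(fields[i].trim());
            }
            context.write(NullWritable.get(), new Text(out.toString()));
        }
    }

A job using this mapper could run map-only (zero reducers) so the cleaned records land directly in HDFS for downstream Hive or Pig processing.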

Environment: Hadoop, HDFS, MapReduce, HBase, Hadoop Streaming, Oozie, Hive, Pig, Sqoop, Java 1.6, UNIX Shell Scripting.

Confidential, St. Louis, Missouri

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Worked on analyzing the Hadoop stack and different big data analytic tools, including Pig, Hive, the HBase and Cassandra databases, and Sqoop.
  • Involved in writing shell scripts for loading data from the Linux file system to HDFS.
  • Performed column sorting and designed flexible, complex schemas using the super column concept in Cassandra.
  • Created dashboards using Tableau Desktop and prepared user stories to deliver actionable insights through compelling dashboards.
  • Created visualizations for logistics calculations and departmental spend analysis using Tableau.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Worked as the test engineer, preparing the test results documents for the test cases.
  • Designed a data warehouse using Hive.
  • Worked extensively with Sqoop for importing metadata from DB2 and Teradata.
  • Extensively used Pig for data cleansing.
  • Created partitioned tables in Hive and worked with business teams on Hive queries for ad hoc access (see the sketch after this list).
  • Evaluated the use of Oozie for workflow orchestration; mentored the analyst and test teams in writing Hive queries.
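
As a sketch of the partitioned Hive tables and ad hoc queries mentioned above, the following Java snippet issues HiveQL over the HiveServer2 JDBC driver. The host, table, columns, and partition value are illustrative assumptions.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class PartitionedTableExample {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://hiveserver:10000/default", "", "");
            Statement stmt = conn.createStatement();

            // Partition by load date so ad hoc queries can prune old data
            stmt.execute("CREATE TABLE IF NOT EXISTS shipments ("
                    + " shipment_id STRING, origin STRING, cost DOUBLE)"
                    + " PARTITIONED BY (load_dt STRING)"
                    + " STORED AS ORC");

            // Ad hoc query restricted to a single partition
            ResultSet rs = stmt.executeQuery(
                    "SELECT origin, SUM(cost) FROM shipments"
                    + " WHERE load_dt = '2015-06-01' GROUP BY origin");
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getDouble(2));
            }
            rs.close();
            stmt.close();
            conn.close();
        }
    }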

Environment: Hadoop, HDFS, Hive, Pig, Sqoop, HBase, Cassandra, Java (JDK 1.4), Linux, MapReduce, Oozie, Oracle 11g/10g, Teradata, QlikView, Quality Control.

Confidential, Dover, NH

Hadoop Developer

Responsibilities:

  • Worked on a Hadoop cluster that ranged from 4-8 nodes during the pre-production stage and was sometimes extended to 24 nodes during production.
  • Imported data from RDBMS into the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components.
  • Developed custom MapReduce programs using the core Java framework to analyze data and used Pig Latin to clean unwanted data.
  • Applied various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
  • Involved in creating Hive tables and applying HiveQL on those tables for data validation.
  • Moved data from Hive tables into MongoDB collections (see the sketch after this list).
  • Involved in loading and transforming large sets of structured, semi-structured, and unstructured data and analyzing them by running Hive queries and Pig scripts.
  • Participated in requirements gathering from the experts and business partners and converted the requirements into technical specifications.
  • Used ZooKeeper to manage coordination among the clusters.
  • Analyzed MongoDB and compared it with other open-source NoSQL databases to find which one better suits the current requirements.
  • Assisted in exporting the analyzed data to RDBMS.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Assisted application teams in installing Hadoop updates, operating system patches, and version upgrades when required.
  • Assisted in cluster maintenance, monitoring, and troubleshooting, and managed and reviewed data backups and log files.
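
A minimal sketch of the Hive-to-MongoDB movement described above, using the MongoDB Java driver to insert one document per exported Hive row. The host, database, collection, and field names are assumptions, and the rows are shown inline for brevity.

    import java.util.Arrays;
    import java.util.List;

    import org.bson.Document;

    import com.mongodb.MongoClient;
    import com.mongodb.client.MongoCollection;

    public class LoadHiveExportToMongo {
        public static void main(String[] args) {
            // Rows exported from the Hive table, inlined here instead of read from HDFS
            List<String[]> rows = Arrays.asList(
                    new String[] {"1001", "SHIPPED", "250.00"},
                    new String[] {"1002", "PENDING", "75.50"});

            MongoClient mongo = new MongoClient("mongohost", 27017);
            MongoCollection<Document> orders =
                    mongo.getDatabase("analytics").getCollection("orders");
            for (String[] row : rows) {
                // One Hive row becomes one MongoDB document
                orders.insertOne(new Document("orderId", row[0])
                        .append("status", row[1])
                        .append("amount", Double.parseDouble(row[2])));
            }
            mongo.close();
        }
    }
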
Environment: Hadoop, Pig, Hive, Flume, MapReduce, HDFS, Linux, MongoDB

Confidential, Detroit, MI

Senior Java/J2EE Developer

Responsibilities:

  • Actively participated in JAD (Joint Application Development) sessions for requirements gathering and documenting business processes.
  • Used JSP, Struts, JSTL tags, and JavaScript for building dynamic web pages; added tag libraries such as Display Tag, Tiles, and Validator for more flexible page design.
  • Incorporated J2EE design patterns (Business Delegate, Singleton, Data Access Object, Data Transfer Object, MVC) for middle-tier development.
  • Used the Spring data access framework to automatically acquire and release database resources, with the Spring data access exception hierarchy providing better handling of JDBC database connections (see the DAO sketch after this list).
  • Established communication among external systems using Web Services (SOAP).
  • Implemented several JUnit test cases.
  • Implemented web logging using Log4j to better trace data flow on the application server.
  • Used ClearCase for version control of the application, with development streams.
  • Worked with a team of developers and testers to resolve server timeout and database connection pooling issues; initiated profiling using RAD to find object memory leaks.
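
As a sketch of the Spring JDBC data-access approach above, the DAO below lets JdbcTemplate acquire and release connections and translate SQLExceptions into Spring's data-access exception hierarchy. The table, columns, and bean are illustrative, and the generic RowMapper shown here belongs to a later Spring JDBC API than the Spring 1.2 listed in the environment.

    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.List;

    import javax.sql.DataSource;

    import org.springframework.jdbc.core.JdbcTemplate;
    import org.springframework.jdbc.core.RowMapper;

    // DAO sketch: JdbcTemplate handles connection acquisition, release, and exception translation.
    public class AccountDao {

        private final JdbcTemplate jdbcTemplate;

        public AccountDao(DataSource dataSource) {
            this.jdbcTemplate = new JdbcTemplate(dataSource);
        }

        public List<Account> findByBranch(String branchCode) {
            return jdbcTemplate.query(
                    "SELECT account_id, balance FROM accounts WHERE branch_code = ?",
                    new Object[] {branchCode},
                    new RowMapper<Account>() {
                        public Account mapRow(ResultSet rs, int rowNum) throws SQLException {
                            return new Account(rs.getString("account_id"), rs.getDouble("balance"));
                        }
                    });
        }

        // Simple transfer object for the query result.
        public static class Account {
            public final String accountId;
            public final double balance;

            public Account(String accountId, double balance) {
                this.accountId = accountId;
                this.balance = balance;
            }
        }
    }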

Environment: Java 1.4, J2EE 1.3, Struts 1.1, HTML, JavaScript, JSP 1.2, Servlets 2.3, Spring 1.2, ANT, Log4j 1.2.9, PL/SQL, Oracle 8i/9i, SQL Navigator 5.5, WebSphere Application Server 5.1/6.0, RAD 6.0, IBM ClearCase.

Confidential

Java Developer

Responsibilities:

  • Responsible for coordinating on-site and off-shore development teams in various phases of the project.
  • Involved in developing dynamic JSPs and performing page validations using JavaScript.
  • Involved in database schema design and review meetings.
  • Designed a nightly build process for updating the catalogue and notifying users of pending authorizations.
  • Used automated test scripts and tools to test the application in various phases and coordinated with Quality Control teams to fix identified issues.
  • Involved in writing stored procedures in Oracle (see the sketch after this list).
  • Responsible for building projects into deployable files (WAR and JAR files).
  • Designed and developed base classes, framework classes and common re-usable components.
  • Involved in performance tuning and debugging production problems during the testing and deployment phases of the project.
  • Involved in refactoring existing components to meet current application requirements.
  • Used various Java and J2EE APIs including JDBC, XML, Servlet, JSP, and JavaBean.
  • Supported production team members in developing and testing production implementation plans, and supported the Midrange group during migrations.
  • Involved in testing, maintenance and production support of the application.
  • Responded to requests from technical team members to prepare TARs and configured files for production migration.
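
As a sketch of calling one of the Oracle stored procedures mentioned above from Java, the snippet below uses a plain JDBC CallableStatement. The connection string, procedure name, and parameters are hypothetical.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Types;

    public class AuthorizationCall {
        public static void main(String[] args) throws Exception {
            Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@dbhost:1521:ORCL", "app_user", "app_password");

            // Hypothetical procedure that flags a catalogue item as pending authorization
            CallableStatement call = conn.prepareCall("{call flag_pending_authorization(?, ?)}");
            call.setLong(1, 42L);                        // IN: catalogue item id
            call.registerOutParameter(2, Types.VARCHAR); // OUT: resulting status
            call.execute();

            System.out.println("Status: " + call.getString(2));
            call.close();
            conn.close();
        }
    }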

Environment: J2EE, Hibernate, JSP, Servlets, JavaBeans, JavaScript, Oracle Application Server OC4J, JDeveloper, Apache ANT 1.6.1, Windows 2000.
