Hadoop Developer/ Admin Resume

Columbus, OH

SUMMARY

  • Over 8 years of experience across Hadoop, Java and ETL, including extensive experience with Big Data technologies and in the development of standalone and web applications in multi-tiered environments using Java, Hadoop, Hive, HBase and Pig.
  • Good work experience on large-scale systems development projects, especially enterprise distributed systems.
  • Very good understanding of Hadoop ecosystem components such as Sqoop2, Spark and YARN.
  • Strong working experience in rule-based decision making, information parsing and complex data processing using Schematron and Drools.
  • Experience in Data Analysis, Data Validation, Data Verification, Data Cleansing, Data Completeness and identifying data mismatch.
  • Experience working with MapReduce, Pig scripts and Hive Query Language (HiveQL).
  • Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
  • In-depth understanding and knowledge of Hadoop architecture and its various components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce concepts.
  • Extended Hive and Pig core functionality by writing custom UDFs (a brief sketch follows this list).
  • Experience in analyzing data using Hive QL, Pig Latin, and custom MapReduce programs in Java.
  • Extensive experience with SQL, PL/SQL, PostgreSQL and database concepts
  • Knowledge of NoSQL databases such as HBase and Cassandra
  • Knowledge of job workflow scheduling and monitoring tools like Oozie and Zookeeper
  • Exposure to administrative tasks such as installing Hadoop and its ecosystem components such as Hive and Pig
  • Handled several techno-functional responsibilities including estimates, identifying functional and technical gaps, requirements gathering, designing solutions, development, developing documentation, and production support
  • An individual with excellent interpersonal and communication skills, strong business acumen, creative problem solving skills, technical competency, team-player spirit, and leadership skills
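
For illustration, a minimal sketch of a custom Hive UDF of the kind mentioned above; the package, class name and the normalization it performs are hypothetical, not taken from any particular project:

    package com.example.hive.udf;   // hypothetical package name

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Illustrative UDF: trims and lower-cases a string column.
    public final class NormalizeString extends UDF {
        public Text evaluate(final Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }

Once built into a jar, a function like this is typically registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries.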

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, MapReduce, Pig, Hive, Impala, HBase, Cassandra, Sqoop, Oozie, Zookeeper, Flume

Java & J2EE Technologies: Core Java

IDE Tools: Eclipse, NetBeans

Programming Languages: COBOL, Java, KSH, and markup languages

Databases: Oracle, MySQL, DB2, IMS, PostgreSQL

Operating Systems: Windows 95/98/2000/XP/Vista/7, Unix

Reporting Tools: Tableau

Other Tools: PuTTY, WinSCP, EDI (Gentran), Streamweaver, Compuset

PROFESSIONAL EXPERIENCE

Confidential, Columbus, OH

Hadoop Developer/ Admin

Responsibilities:

  • Worked on a Hadoop cluster scaling from 4 nodes in the development environment to 8 nodes in pre-production and up to 24 nodes in production.
  • Involved in the complete implementation lifecycle, specializing in writing custom MapReduce, Pig and Hive programs.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used HiveQL queries to search for particular strings in Hive tables stored in HDFS.
  • Performed various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
  • Developed custom UDFs in Java to extend Hive and Pig Latin functionality.
  • Created HBase tables to store data in various formats coming from different portfolios.
  • Managed and scheduled Oozie jobs to remove duplicate log data files in HDFS (a deduplication sketch follows this list).
  • Used Flume extensively to gather and move log data files from application servers to a central location in the Hadoop Distributed File System (HDFS).
  • Implemented test scripts to support test-driven development and continuous integration.
  • Responsible for managing data coming from different sources.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which best suited the current requirements.
  • Used the HDFS file system check (fsck) to check the health of files in HDFS.
  • Developed UNIX shell scripts for creating reports from Hive data.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
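
As referenced above, a minimal sketch of a MapReduce job that removes duplicate log records; the class names, the input/output paths passed on the command line, and the assumption that the whole log line serves as the deduplication key are illustrative only:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Emits each distinct log line exactly once.
    public class DedupLogLines {

        public static class LineMapper extends Mapper<Object, Text, Text, NullWritable> {
            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                // The whole record becomes the key so identical lines group together.
                context.write(value, NullWritable.get());
            }
        }

        public static class DedupReducer extends Reducer<Text, NullWritable, Text, NullWritable> {
            @Override
            protected void reduce(Text key, Iterable<NullWritable> values, Context context)
                    throws IOException, InterruptedException {
                // Writing the key once per group drops the duplicates.
                context.write(key, NullWritable.get());
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "dedup-log-lines");
            job.setJarByClass(DedupLogLines.class);
            job.setMapperClass(LineMapper.class);
            job.setReducerClass(DedupReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(NullWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

In practice a job like this would be packaged in a jar and scheduled from an Oozie workflow action rather than run by hand.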

Environment: Hadoop, Java, UNIX, HDFS, Pig, Hive, Map Reduce, Sqoop, NoSQL DB’s, Cassandra, HBase, LINUX, Flume, Oozie

Confidential, Piscataway NJ

Hadoop Developer

Responsibilities:

  • Worked on a Hadoop cluster scaling from 4 nodes in the development environment to 8 nodes in pre-production and up to 24 nodes in production.
  • Involved in the complete implementation lifecycle, specializing in writing custom MapReduce, Pig and Hive programs.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used HiveQL queries to search for particular strings in Hive tables stored in HDFS.
  • Performed various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins (a map-side join sketch follows this list).
  • Developed custom UDFs in Java to extend Hive and Pig Latin functionality.
  • Created HBase tables to store data in various formats coming from different portfolios.
  • Managed and scheduled Oozie jobs to remove duplicate log data files in HDFS.
  • Used Flume extensively to gather and move log data files from application servers to a central location in the Hadoop Distributed File System (HDFS).
  • Implemented test scripts to support test-driven development and continuous integration.
  • Responsible for managing data coming from different sources.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which best suited the current requirements.
  • Used the HDFS file system check (fsck) to check the health of files in HDFS.
  • Developed UNIX shell scripts for creating reports from Hive data.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
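
As noted above, a minimal sketch of a map-side join using the distributed cache; the lookup file name, its comma-separated layout, and the join on the first column are assumptions made for illustration:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-side join: a small lookup table is shipped to every mapper via the distributed cache.
    public class MapSideJoinMapper extends Mapper<Object, Text, Text, Text> {

        private final Map<String, String> lookup = new HashMap<String, String>();

        @Override
        protected void setup(Context context) throws IOException {
            // "lookup.csv" is the symlink name given when the file was cached in the driver,
            // e.g. job.addCacheFile(new URI("/data/lookup.csv#lookup.csv"));
            BufferedReader reader = new BufferedReader(new FileReader("lookup.csv"));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split(",", 2);
                    lookup.put(parts[0], parts[1]);
                }
            } finally {
                reader.close();
            }
        }

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",", 2);
            String joined = lookup.get(fields[0]);   // join on the first column
            if (joined != null) {
                context.write(new Text(fields[0]), new Text(fields[1] + "," + joined));
            }
        }
    }

Because the small side of the join sits in memory in each mapper, no reduce phase is needed for the join itself.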

Environment: Hadoop, Java, UNIX, HDFS, Pig, Hive, MapReduce, Sqoop, NoSQL DBs, Cassandra, HBase, LINUX, Flume, Oozie

Confidential, Schaumburg, IL

Hadoop Admin and Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and pre-processing
  • Imported and exported data into HDFS and Hive using Sqoop
  • Proactively monitored systems and services; handled architecture design and implementation of the Hadoop deployment, configuration management, and backup and disaster recovery systems and procedures
  • Extracted files from CouchDB and MongoDB through Sqoop and placed them in HDFS for processing
  • Used Flume to collect, aggregate, and store web log data from sources such as web servers and mobile and network devices, and pushed it to HDFS
  • Developed Puppet scripts to install Hive, Sqoop, etc. on the nodes
  • Performed data backup and synchronization using Amazon Web Services
  • Worked on Amazon Web Services as the primary cloud platform
  • Loaded and transformed large sets of structured, semi-structured and unstructured data
  • Supported MapReduce programs running on the cluster
  • Loaded log data into HDFS using Flume and Kafka and performed ETL integrations
  • Designed and implemented DR and OR procedures
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions
  • Involved in loading data from the UNIX file system to HDFS, configuring Hive and writing Hive UDFs
  • Utilized Java and MySQL day to day to debug and fix issues with client processes
  • Used Java/J2EE application development skills with object-oriented analysis and was extensively involved throughout the Software Development Life Cycle (SDLC)
  • Hands-on experience with Sun ONE Application Server, WebLogic Application Server, WebSphere Application Server, WebSphere Portal Server, and J2EE application deployment technology
  • Monitored the Hadoop cluster using tools such as Nagios, Ganglia, Ambari and Cloudera Manager
  • Wrote automation scripts to monitor HDFS and HBase through cron jobs
  • Used MRUnit for debugging MapReduce jobs that use sequence files containing key-value pairs (a test sketch follows this list)
  • Developed a high-performance cache, making the site stable and improving its performance
  • Created a complete processing engine based on Cloudera's distribution
  • Proficient with SQL, with a good understanding of Informatica and Talend
  • Provided administrative support for parallel computation research on a 24-node Fedora Linux cluster.
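
A minimal sketch of the kind of MRUnit test referenced above, paired with a simple data-cleansing mapper; the expected field count, the comma delimiter and the sample records are assumptions for illustration:

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Before;
    import org.junit.Test;

    // A cleansing mapper that drops malformed records, plus MRUnit tests for it.
    public class CleansingMapperTest {

        // Keeps only records with the expected number of comma-separated fields.
        public static class CleansingMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
            private static final int EXPECTED_FIELDS = 4;   // assumption for the sketch

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                if (value.toString().split(",", -1).length == EXPECTED_FIELDS) {
                    context.write(value, NullWritable.get());
                }
            }
        }

        private MapDriver<LongWritable, Text, Text, NullWritable> mapDriver;

        @Before
        public void setUp() {
            mapDriver = MapDriver.newMapDriver(new CleansingMapper());
        }

        @Test
        public void wellFormedRecordIsKept() throws Exception {
            mapDriver.withInput(new LongWritable(1), new Text("1,John,2014-01-01,100"))
                     .withOutput(new Text("1,John,2014-01-01,100"), NullWritable.get())
                     .runTest();
        }

        @Test
        public void malformedRecordIsDropped() throws Exception {
            // No expected output: the record should be filtered out by the mapper.
            mapDriver.withInput(new LongWritable(2), new Text("2,missing-fields"))
                     .runTest();
        }
    }

MRUnit drives the mapper in isolation, so tests like these run without a cluster as part of continuous integration.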

Environment: Hadoop, MapReduce, HDFS, Hive, Apache Spark, Kafka, CouchDB, Flume, AWS, Cassandra, Oracle 11g, Java, Struts, Servlets, HTML, XML, SQL, J2EE, MRUnit, Informatica, JUnit, Tomcat 6, JDBC, JNDI, Maven, Eclipse.

Confidential, New York, NY

Hadoop Admin

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Good understanding of and experience with the Hadoop stack: internals, Hive, Pig and MapReduce.
  • Wrote MapReduce jobs to discover trends in data usage by users.
  • Involved in defining job flows.
  • Involved in managing and reviewing Hadoop log files.
  • Involved in running Hadoop streaming jobs to process terabytes of text data.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from the UNIX file system to HDFS.
  • Installed and configured Hive and wrote HiveQL scripts.
  • Responsible for managing data coming from different sources.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs (a JDBC query sketch follows this list).
  • Implemented partitioning, dynamic partitions and buckets in Hive.
  • Extensive use of Struts, HTML, CSS, JSP, jQuery, AJAX and JavaScript for interactive pages.
  • Assisted the team in development and deployment activities.
  • Instrumental in preparing TDD and developing Java web services for WU applications for many of the money transfer functionalities.
  • Used web services concepts such as SOAP, WSDL, JAXB, and JAXP to interact with other projects within the Supreme Court for sharing information.
  • Involved in developing database access components using Spring DAO integrated with Hibernate for accessing the data.
  • Involved in writing HQL queries, Criteria queries and SQL queries for the data access layer.
  • Involved in managing deployments using XML scripts.
  • Testing: unit testing with JUnit and integration testing in the staging environment.
  • Followed Agile Scrum principles in developing the project.
  • Involved in development of SQL Server stored procedures and SSIS DTSX packages to automate regular mundane tasks as per business needs.
  • Coordinated with offshore/onshore teams and arranged weekly meetings to discuss and track development progress.
  • Involved in coordinating unit testing, quality assurance, user acceptance testing and bug fixing.
  • Coordinated with the team on peer reviews and collaborative system-level testing.
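
As referenced above, a minimal sketch of querying a partitioned Hive table from Java through the HiveServer2 JDBC driver; the connection URL, credentials, and the table, column and partition names are placeholders, not details from the engagement:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Runs a HiveQL aggregation over a partitioned table via the HiveServer2 JDBC driver.
    public class HiveTrendReport {

        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            // Host, port, database, table and partition column are hypothetical.
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://hiveserver:10000/default", "user", "");
                 Statement stmt = conn.createStatement()) {
                ResultSet rs = stmt.executeQuery(
                    "SELECT user_id, COUNT(*) AS requests " +
                    "FROM usage_logs WHERE dt = '2014-01-01' " +   // dt is the partition column
                    "GROUP BY user_id ORDER BY requests DESC LIMIT 20");
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
                }
            }
        }
    }

Restricting the query to a single partition (the dt predicate) keeps the underlying MapReduce job from scanning the whole table.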

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Java (JDK 1.6), Cloudera Distribution, HTML, JavaScript, XML, XSLT, jQuery, AJAX, Web Services, JNDI, SQL Server, Struts 2.0, Hibernate.

Confidential

Java Developer

Responsibilities:

  • Involved in the elaboration, construction and transition phases of the Rational Unified Process.
  • Designed and developed necessary UML diagrams such as Use Case, Class, Sequence, State and Activity diagrams using IBM Rational Rose.
  • Used IBM Rational Application Developer (RAD) for development.
  • Extensively applied various design patterns such as MVC-2, Front Controller, Factory, Singleton, Business Delegate, Session Façade, Service Locator, DAO, etc. throughout the application for a clear and manageable distribution of roles.
  • Implemented the project as a multi-tier application using the Jakarta Struts Framework along with JSP for the presentation tier.
  • Used the Struts Validation Framework for validation and the Struts Tiles Framework for reusable presentation components at the presentation tier.
  • Developed various Action classes that route requests to appropriate handlers.
  • Developed Session Beans to process user requests and Entity Beans to load and store information from the database.
  • Used JMS (MQSeries) for reliable, asynchronous messaging between the different components (a sketch follows this list).
  • Wrote stored procedures and complex queries for IBM DB2.
  • Designed and used JUnit test cases during the development phase.
  • Extensively used log4j for logging throughout the application.
  • Used CVS for efficiently managing source code versions with the development team.
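
A minimal sketch of asynchronous messaging over JMS as mentioned above; the JNDI names and queue are placeholders, and the MQSeries-specific configuration behind the JNDI lookup is omitted:

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    // Sends an asynchronous notification through a JNDI-registered JMS queue.
    public class OrderEventSender {

        public void send(String payload) throws Exception {
            InitialContext ctx = new InitialContext();
            // JNDI names are hypothetical for the sketch.
            ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
            Queue queue = (Queue) ctx.lookup("jms/OrderEventsQueue");

            Connection connection = factory.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(queue);
                TextMessage message = session.createTextMessage(payload);
                producer.send(message);   // fire-and-forget; the consumer processes it asynchronously
            } finally {
                connection.close();
            }
        }
    }

Because the producer does not wait for the consumer, the components stay loosely coupled and the broker provides the reliability guarantees.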

Environment: JDK, J2EE, Web Services (SOAP, WSDL, JAX-WS), Hibernate, Spring, Servlets, JSP, Java Beans, NetBeans, Oracle SQL Developer, JUnit, Clover, CVS, Log4j, PL/SQL, Oracle, WebSphere Application Server, Tomcat Web Server

Confidential

JAVA Developer

Responsibilities:

  • Involved in Design, Development and Support phases of Software Development Life Cycle (SDLC)
  • Reviewed the functional, design, source code and test specifications
  • Developed the complete front end using JavaScript and CSS
  • Authored functional, design and test specifications
  • Implemented the backend, configuration DAO and XML generation modules of DIS
  • Analyzed, designed and developed the component
  • Used JDBC for database access
  • Used the Data Transfer Object (DTO) design pattern (a sketch follows this list)
  • Performed unit testing and rigorous integration testing of the whole application
  • Wrote and executed test scripts using JUnit
  • Actively involved in system testing
  • Developed an XML parsing tool for regression testing
  • Prepared the installation, customer guide and configuration documents, which were delivered to the customer along with the product.
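
A minimal sketch of the DTO pattern over JDBC referenced above; the connection URL, credentials and the customers table are placeholders, and try-with-resources is used for brevity:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // Minimal DTO plus a JDBC lookup that maps a row into it.
    public class CustomerDAO {

        // Data Transfer Object carrying one row between the data access and presentation tiers.
        public static class CustomerDTO {
            private final int id;
            private final String name;

            public CustomerDTO(int id, String name) {
                this.id = id;
                this.name = name;
            }

            public int getId() { return id; }
            public String getName() { return name; }
        }

        public CustomerDTO findById(int customerId) throws Exception {
            // Connection URL, credentials and table name are hypothetical.
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:oracle:thin:@dbhost:1521:orcl", "app_user", "app_password");
                 PreparedStatement ps = conn.prepareStatement(
                     "SELECT id, name FROM customers WHERE id = ?")) {
                ps.setInt(1, customerId);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next()
                            ? new CustomerDTO(rs.getInt("id"), rs.getString("name"))
                            : null;
                }
            }
        }
    }

The DTO keeps the upper tiers free of JDBC types: only this class touches the ResultSet, and callers receive a plain immutable object.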

Environment: Java, JavaScript, HTML, CSS, JDK 1.5.1, JDBC, JUnit, Oracle10g, XML, XSL and UML
