Senior Hadoop Developer/Admin Resume

Miami, Florida

SUMMARY

  • 7+ years of extensive IT experience with multinational clients, including 4+ years of Hadoop architecture experience developing Big Data / Hadoop applications.
  • Hands on experience with the Hadoop stack (MapReduce, HDFS, Sqoop, Pig, Hive, HBase, Flume, Oozie and Zookeeper)
  • 6+ years of experience in design and development of data warehouse and business intelligence solutions using Ab Initio ETL tools
  • Well versed in configuring and administering Hadoop clusters using major Hadoop distributions like Apache Hadoop and Cloudera.
  • Proven expertise in performing analytics on Big Data using MapReduce, Hive and Pig.
  • Experienced in performing real-time analytics on NoSQL databases like HBase, MongoDB and Cassandra.
  • Developed databases and projects using Python, Java, NoSQL/MySQL.
  • Good knowledge of creating event processing data pipelines using Kafka, Storm and HBase.
  • Experience with Talend DI installation, administration and development for data warehouse and application integration.
  • Experienced with ETL to load data into Hadoop/NoSQL
  • Experienced with Dimensional modeling, Data migration, Data cleansing, Data profiling, and ETL Processes features for data warehouses.
  • Excellent knowledge of Hadoop architecture, including HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm, as well as the Hortonworks distribution.
  • Extensively worked on ETL processing tools like Pentaho and Talend.
  • Configured Ab Initio environment to connect to database using DB configuration file, input table, output table, and update table components.
  • Involved in developing solutions to analyze large data sets efficiently.
  • Experience in creating complex SQL queries, SQL tuning, writing PL/SQL blocks such as stored procedures, functions, cursors, indexes, triggers and packages, and using the ANT and Maven build tools.
  • Worked with the Oozie workflow engine to schedule time-based jobs that perform multiple actions.
  • Experienced in importing and exporting data between RDBMS/Teradata and HDFS using Sqoop.
  • Analyzed large data sets by writing Pig scripts and Hive queries.
  • Logical Implementation and interaction with HBase, MongoDB.
  • Experienced in writing MapReduce programs and UDFs for both Hive and Pig in Java.
  • Used Flume to channel data from different sources to HDFS.
  • Experience with configuration of Hadoop Ecosystem components: Hive, HBase, Pig, Sqoop, Mahout, Zookeeper and Flume.
  • Good experience in Hive partitioning and bucketing, performing different types of joins on Hive tables, and implementing Hive SerDes such as JSON and Avro.
  • Expertise with NoSQL databases like Cassandra and HBase.
  • Supported MapReduce Programs running on the cluster and wrote custom MapReduce Scripts for Data Processing in Java.
  • Experience in developing test cases, performing Unit Testing, Integration Testing, experience in QA with test methodologies and skills for manual/automated testing using tools like WinRunner, JUnit.
  • Developed Spark jobs using Scala in test environment for faster data processing and used Spark SQL for querying
  • Good knowledge in Apache Crunch and Hadoop HDFS Admin Shell commands.
  • Experience with Testing Map Reduce programs using MRUnit, JUnit and Easy Mock.
  • Experienced with implementing Web based, Enterprise level applications using J2EE frameworks like Spring, Hibernate, EJB, JMS, JSF and Java.
  • Experience with web-based UI development using jQuery, CSS, HTML5, XHTML.
  • Experienced in implementing and consuming SOAP web services using Spring CXF, and consuming REST web services using HTTP clients.
  • Worked with developers, DBAs, and systems support personnel in elevating and automating successful code to production.
  • Experienced in writing functions, stored procedures, and triggers using PL/SQL.
  • Experienced with the build tools ANT and Maven and continuous integration tools like Jenkins.
  • Experienced in all facets of Software Development Life Cycle (Analysis, Design, Development, Testing and maintenance) using Waterfall and Agile methodologies
  • Motivated team player with excellent communication, interpersonal, analytical, problem solving skills and zeal to learn new technologies.
  • Highly adept at promptly and thoroughly mastering new technologies with a keen awareness of new industry developments and the evolution of next generation programming solutions.

TECHNICAL SKILLS

Programming Languages: C, C++, Java, Python, PHP, SQL, PL/SQL, Pig Latin, HiveQL, Unix shell scripting

Big Data Technologies: Hadoop, MapReduce, Spark, Sqoop, Teradata, Hive, Oozie, Pig, HDFS, Zookeeper, Flume, Talend

J2EE Technologies: Core Java, Servlets, JSP, JDBC, JNDI, Java Beans

Web Technologies: AJAX, HTML5, JavaScript, CSS3

Frameworks: Spring 3.5 - Spring MVC, Spring ORM, Spring Security, Spring ROO, Hibernate, Struts

Application Servers: IBM WebSphere, JBoss, WebLogic

Web Services: WSDL, SOAP, Apache CXF, Apache Axis, REST, Jersey

Relational Databases: Oracle 10g/11g, MS SQL Server, MySQL

NoSQL Databases: MongoDB, HBase

Designing Tools: UML, Visio, Visual Paradigm

IDEs: Eclipse, NetBeans

Operating System: Unix, Windows

PROFESSIONAL EXPERIENCE

Confidential, Miami, Florida

Senior Hadoop Developer/Admin

Responsibilities:

  • Handled importing of data from RDBMS into HDFS using Sqoop.
  • Managing data flow into Pivotal HAWQ (Internal / External tables).
  • Experienced in data cleansing using Pig Latin operations and UDFs.
  • Experienced in writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
  • Install, upgrade and maintain the Hadoop Clusters using Cloudera manager.
  • Involved in creating Hive tables, loading with data and writing hive queries to process the data.
  • Created scripts to automate the process of Data Ingestion.
  • Conducted predictive analysis using R Language and plotted graphs for predictive results.
  • Talend administrator with hands-on Big Data (Hadoop) experience with the Cloudera framework.
  • Gathered requirements from senior management and performed code enhancement and performance improvement for Ab Initio graphs.
  • Writing R language code for statistical computing and visualization of the Hive data for generating reports.
  • Document the installation, deployment, administration and operational processes of Talend MDM platform environments for ETL projects.
  • Analyzed FACETS for group information, enrolling subscribers, adding members, related entities, class/plan definition and premium rate tables.
  • Developed a proof of concept (POC) for real-time data ingestion using Kafka, Storm, Zookeeper and HBase.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Used Maven to clean, compile, build and ANT to deploy the jars in HDFS.
  • Developed simple to complex MapReduce jobs.
  • Logical Implementation and interaction with HBase, MongoDB.
  • Successfully loaded files to Hive and HDFS from Oracle, SQL Server, MySQL, and Teradata using Sqoop.
  • Worked hands on with ETL process. Handled importing data from various sources, performed transformations.
  • Involved in testing FACETS for Group information, enrolling subscribers, Adding members, Related Entities, Class/Plan definition and premium rate tables.
  • Experienced with ETL to load data into Hadoop/NoSQL
  • Analyzed the data by performing Hive queries and running Pig scripts to know user behavior.
  • Developed simple to complex MapReduce jobs using Hive.
  • Created partitioned tables in Hive.
  • Installed and configured Hadoop 0.22.0 MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Solution planning with Java EE and Cassandra.
  • Developed MapReduce-style code for Apache Spark in Python and Scala.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experience with Maven for structured builds and with Unix shell scripting.
  • Responsible for managing data coming from different sources.
  • Monitoring the running Map Reduce programs on the cluster.
  • Responsible for loading data from UNIX file systems to HDFS.
  • Experienced in scheduling jobs using TIDAL job scheduler.
  • Developed PIG scripts for source data validation and transformation.
  • Designed solution architecture including TIDAL jobs, SSIS packages and database objects.
  • Experienced in handling different file formats like text files, Avro data files, sequence files, XML and JSON files.
  • Experienced in configuring workflows, submitting jobs and implementing schedulers using Cisco Tidal.
  • Wrote Java programs to generate reports to meet business requirements using Java APIs and data from Pivotal HAWQ tables.

Environment: Pivotal HD 2.0, HDFS, HAWQ, ANT, Sqoop, Talend, Ab Initio, HBase, Cloudera, Teradata, ETL, Hive, Pig, SQL, PostgreSQL, R Language, pgAdmin, NoSQL, Storm, Cisco Tidal, shell scripting, Python, Java, Cassandra.

Confidential, Greenwood Village

Hadoop Developer/Admin

Responsibilities:

  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH4 distribution.
  • Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
  • Imported data from Teradata using Sqoop with the Teradata connector.
  • Good Linux and shell scripting skills, and familiarity with open source configuration management and deployment tools such as Puppet or Chef.
  • Involved in designing and developing a Storm and Kafka based pipeline.
  • Tested and validated Identity Management CITT implementation as it pertains to the upgrade to CARE and FACETS enrollment system.
  • Installed the OS and administered the Hadoop stack with the CDH5 (with YARN) Cloudera distribution, including configuration management, monitoring, debugging and performance management.
  • Integrated the Quartz scheduler with Oozie workflows to get data from multiple data sources in parallel using fork.
  • Responsible for architecting integrated HIPAA, Medicare solutions, FACETS.
  • Responsible for performing Predictive analysis on top of the customer usage data using R Language.
  • Responsible for developing a metadata-driven, configurable report query engine to execute Druid and Hive queries against the data warehouse (Hadoop).
  • Troubleshooting, debugging and fixing Talend specific issues, while maintaining the health and performance of the ETL environment.
  • Developed Complex generic graphs in Ab Initio
  • Performed system testing using Informatica and Tidal jobs for validation.
  • Created build scripts using Maven and ANT
  • Processed input from multiple data sources in the same reducer using GenericWritable and MultipleInputs.
  • Job scheduling through crontab and TIDAL.
  • Created Data Pipeline of Map Reduce programs using Chained Mappers.
  • Exported the patterns analyzed back to Teradata using Sqoop.
  • Developed Ab Initio graphs using databases, dataset, repartition, transform, sort and partition components for extracting, loading and transforming external data by creating DML’s, XFR’s, SQL’s.
  • Visualized HDFS data for customers using a BI tool with the help of the Hive ODBC driver.
  • Implemented an optimized join across different data sets to get top claims by state using MapReduce.
  • Worked on Big Data processing of clinical and non-clinical data using MapReduce.
  • Implemented complex map reduce programs to perform joins on the Map side using Distributed Cache in Java.
  • Migrated ETL jobs to Pig scripts to perform transformations, joins and some pre-aggregations before storing the data in HDFS.
  • Worked on implementing Spark with Scala.
  • Extracted data from Teradata into HDFS using Sqoop.
  • Configured data protection and privacy settings on sensitive databases like MySQL and MongoDB.
  • Responsible for importing log files from various sources into HDFS using Flume
  • Created a customized BI tool for the management team that performs query analytics using HiveQL.
  • Used Hive and Pig to generate BI reports.
  • Used Sqoop to import data from MySQL into HDFS on a regular basis.
  • Worked hands on with ETL process using Python and Java.
  • Modified and wrote scripts in Bash and Korn shell to optimize day-to-day administration.
  • Created Partitions, Buckets based on State to further process using Bucket based Hive joins.
  • Created Hive generic UDFs to process business logic that varies based on policy.
  • Moved relational database data using Sqoop into Hive dynamic partition tables using staging tables.
  • Optimizing the Hive queries using Partitioning and Bucketing techniques, for controlling the data distribution
  • Worked on custom Pig Loaders and storage classes to work with variety of data formats such as JSON and XML file formats
  • Logical implementation with HBase.
  • Experienced with different kind of compression techniques like LZO, GZip, Snappy.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.
  • Developed unit test cases using the JUnit, EasyMock and MRUnit testing frameworks.
  • Experienced in monitoring clusters using Cloudera Manager.

Environment: Hadoop, HDFS, HBase, MongoDB, ANT, Talend, Druid, Storm, Cloudera, Teradata, ETL, Deployment tools, Bash, Korn, R Language, Spark, MapReduce, Python, Maven, Java, Hive, Pig, Sqoop, Ab Initio, Flume, Oozie, Hue, SQL, Cloudera Manager, MySQL.

Confidential - Austin, TX

Hadoop Developer/Admin

Responsibilities:

  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Developed simple to complex MapReduce jobs.
  • Streamed data in real time using Spark and Kafka.
  • Analyzed the data by performing Hive queries and running Pig scripts to know user behavior.
  • Developed simple to complex MapReduce jobs using Hive.
  • Created partitioned tables in Hive.
  • Administered and supported the Hortonworks distribution.
  • Wrote Korn shell, Bash shell and Perl scripts to automate most DB maintenance tasks.
  • Installed and configured Hadoop 0.22.0 MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Imported data into HDFS using Spark and Kafka.
  • Used Maven for continuous build integration and deployment.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Developed and tested scripts in Python
  • Responsible for managing data coming from different sources.
  • Monitoring the running Map Reduce programs on the cluster.
  • Responsible for loading data from UNIX file systems to HDFS.
  • Installed and configured Hive and also wrote Hive UDFs.
  • Involved in creating Hive Tables, loading with data and writing Hive queries which will invoke and run Map Reduce jobs in the backend.
  • Implemented the workflows using Apache Oozie framework to automate tasks.
  • Developed scripts to automate end-to-end data management and synchronization between all the clusters.

Environment: Apache Hadoop, Java (jdk1.6), Bash, Spark, Kafka, Korn, Hortonworks, Deployment tools, Python, DataStax, Flat files, Oracle 11g/10g, MySQL, Toad 9.6, Windows NT, UNIX, Sqoop, Hive, Oozie.

Confidential - San Ramon, CA

Hadoop Developer

Responsibilities:

  • Analyzed large data sets by running Hive queries and Pig scripts
  • Involved in creating Hive tables, and loading and analyzing data using Hive queries.
  • Developed Simple to complex Map Reduce Jobs using Hive and Pig
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Load and transform large sets of structured, semi structured and unstructured data
  • Responsible for managing data coming from different sources.
  • Implemented partitioning, dynamic partitions and buckets in Hive.
  • Familiarity with NoSQL databases such as MongoDB.
  • Performed Hadoop installation, configuration of multiple nodes in AWS-EC2 using Hortonworks platform.
  • Created Oracle Schedules and Control-M jobs for execution of some Oracle stored procedures on a scheduled basis.
  • Designed the ETL process from various sources into the Hadoop/HDFS for analysis and further processing.
  • Monitor System health and logs and respond accordingly to any warning or failure conditions.
  • Implemented the workflows using Apache Oozie framework to automate tasks.
  • Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
  • Developed unit test cases for Hadoop MapReduce jobs with MRUnit
  • Developed multiple Map Reduce jobs in java for data cleaning and preprocessing
  • Involved in loading data from LINUX file system to HDFS
  • Worked with Informatica 8.6x and above (Source Analyzer, Mapping Designer, Mapplet Designer, Transformations Designer, Warehouse Designer, Repository Manager, and Workflow Manager/Server Manager). Learned Talend out of personal interest and used it to simplify project work.
  • Responsible for analyzing, designing and coding of applications using Perl.
  • Worked on data conversion by extracting data from databases, reformatting it and loading it into Cassandra nodes.
  • Implemented data ingestion and cluster handling for real-time processing using Kafka.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Assisted in exporting analyzed data to relational databases using Sqoop
  • Supported MapReduce programs running on the cluster.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Hortonworks, Talend, ETL, Perl, MongoDB, Sqoop, Oozie, Kafka, Big Data, Python, Java (jdk1.6), DataStax, Flat files, Oracle 11g/10g, MySQL, Toad, Windows NT, LINUX, Cassandra.

Confidential - Phoenix, AZ

Java Programmer

Responsibilities:

  • Used Rational Rose for Use Case Diagram, Class Diagrams, Sequence diagrams and Object diagrams in design phase.
  • Involved in creation of UML diagrams like Class, Activity, and Sequence Diagrams using modeling tools of IBM Rational Rose
  • Involved in the full life cycle development of the modules for the project.
  • Used Eclipse IDE for application development, along with UNIX shell scripting and Python.
  • Used Spring framework for dependency injection and hands on experience with Lambda Expressions.
  • Worked with Spring AOP for transaction and logging.
  • Designed, built and maintained automated load test scripts using Neustar or JMeter load test tools.
  • Used JBoss application server for deploying applications.
  • Used SOAP XML Web services for transferring data between different applications.
  • Developed web services using top down approach from WSDL to Java.
  • Used MVC design pattern for designing application, JSP as the view component.
  • Implemented the persistence layer using the Hibernate framework and integrated Hibernate with the Spring framework.
  • Worked with complex SQL queries, SQL Joins and Stored Procedures using TOAD for data retrieval and update.
  • Used JUnit for performing unit testing.
  • Used Log4J to capture the logs that included runtime exceptions.

Environment: Eclipse, Web Services, UML, Struts (MVC), Lambda Expressions, Shell Scripting, Hibernate, Python, Spring, JSP, WSDL, JMS, Rational Rose, JavaScript, JUnit, PL/SQL, Oracle 10g, SVN

Confidential

JAVA/J2EE Developer

Responsibilities:

  • Developed lightweight business components and integrated applications using Struts.
  • Designing and developing front-end, middleware and back-end applications.
  • Optimizing server/client side validation.
  • Ported old Perl scripts to Python, adding new functions and features; developed automated test methods and documentation for these scripts.
  • Worked together with the team in helping transition from Oracle to DB2.
  • Developed the global logging module, used across all modules, with Log4J components.
  • Developed the presentation layer for the credit enhancement module in JSP.
  • Used Struts to implement the Model-View-Controller (MVC) architecture. Validations were done on the client side as well as the server side.
  • Involved in the configuration management using ClearCase.
  • Extensive experience in working with LINQ to Objects, LINQ to XML and Lambda Expressions.
  • Detecting and resolving errors/defects in the quality control environment.
  • Used iBatis for mapping Java classes to the database.
  • Involved in Code review and integration testing.
  • Used static analysis tools such as PMD, FindBugs and Checkstyle.

Environment: Java v1.6, J2EE 6, Struts 1.2, iBatis, XML, Lambda Expressions, Perl, JSP, CSS, Python, HTML, JavaScript, jQuery, Oracle 10g, DB2, Unix, RAD, NetBeans, ClearCase, WebSphere V8.0 (beta)
