Hadoop/Big Data Developer Resume Fort Collins, CO. - Hire IT People

SUMMARY:

Overall 8+ years of IT experience in analysis, design, development, implementation, and testing of software applications, including 4+ years of experience in Big Data and in the development and design of Java-based enterprise applications.
Leveraged strong skills in developing applications involving Bigdata technologies like Hadoop, Yarn, Flume, Hive, Pig, Sqoop, HBase, Cloudera, MapR, Avro, Spark and Scala.
Extensively worked on major components of Hadoop Ecosystem like HDFS, HBase, Hive, Sqoop, PIG.
Developed various scripts, numerous batch jobs to schedule various Hadoop programs.
Experience in analysing data using HiveQL, and custom programs in Java.
Hands on experience in importing and exporting data from different databases like Oracle, MySQL, into HDFS and Hive using Sqoop.
Implemented Flume for collecting, aggregating and moving a large number of server logs and streaming data to HDFS.
Hands-on experience in Spark, Scala, and MarkLogic.
Extensively used design patterns to solve complex problems.
Developed Hive queries for data analysis to meet the business requirements.
Experience in extending Hive and Pig core functionality by writing custom UDFs, UDAFs, and UDTFs.
Experienced in implementing security mechanisms for Hive data.
Extensively used ETL methodology for performing Data Profiling, Data Migration, Extraction, Transformation and loading using Talend/SSIS and designed data conversions from a wide variety of source systems.
Extensively created mappings in Talend using tMap, tJoin, tReplicate, tParallelize, tJava, tJavarow, tDie, tAggregateRow, tWarn, tLogCatcher, tMysqlScd, tFilter, tGlobalmap etc.
Experience with Hive Queries Performance Tuning.
Experienced in improving data cleansing process using Pig Latin operations, transformations and join operations.
Extensive knowledge of NoSQL database like HBase.
Experienced in performing CRUD operations using the HBase Java Client API and REST API.
Experience in designing both time-driven and data-driven automated workflows using Bedrock and Talend.
Good knowledge in creating PL/SQL Stored Procedures, Creating indexes, Packages, Functions, Triggers, Cursors with Oracle (9i, 10g, 11g), and MySQL server.
Expert in designing and writing on-demand UNIX shell scripts.
Extensively worked with Object Oriented Analysis, Design (OOAD) and development of software using UML Methodology, good knowledge of J2EE design patterns and Core Java Design Patterns.
Excellent Java development skills using J2EE frameworks like Struts, EJBs and Web Services.
Proficient in development methodologies such as Scrum, Agile, and Waterfall.
Passion to excel in any assignment and have good debugging and problem-solving skills.
Ability to work under high pressure and close deadlines.
Excellent adaptability, ability to learn, good analytical and programmatic skills.

TECHNICAL SKILLS:

Hadoop/Big Data Technologies: HDFS, HBase, Hive, Pig, Impala, Sqoop, Flume, Oozie, Spark, Spark SQL, ZooKeeper, AWS, Cloudera, Hortonworks, Kafka, Avro and BigQuery.

Languages: Core Java, XML, HTML and HiveQL.

J2EE Technologies: Servlets, JSP, JMS, JSTL, AJAX, DOJO, JSON and Blaze DS.

Frameworks: Spring 2, Struts 2 and Hibernate 3.

Application & Web Services: WebSphere 6.0, JBoss 4.X and Tomcat 5.

Scripting Languages: JavaScript, AngularJS, Pig Latin, Python 2.7 and Scala.

Database (SQL/NoSQL): Oracle 9i, SQL Server 2005, MySQL, HBase and MongoDB 2.2.

IDE: Eclipse and EditPlus.

PM Tools: MS MPP, Risk Management, ESA.

Other Tools: SVN, Apache Ant, JUnit, StarUML, TOAD, PL/SQL Developer, Perforce, JIRA, Bugzilla, Visual Source, QC, Agile Methodology.

EAI Tools: TIBCO 5.6.

Bug Tracking/Ticketing: Mercury Quality Center and ServiceNow.

Operating System: Windows 98/2000, Linux /Unix and Mac.

PROFESSIONAL EXPERIENCE:

Confidential, Fort Collins, CO.

Hadoop/Big Data Developer

Responsibilities:

Translation of functional and technical requirements into detailed architecture and design.
Developed automated scripts for all jobs starting from pulling the data from Mainframes to HDFS system.
Developed and Designed ETL Applications and Automated using Oozie workflows and Shell scripts with error handling and mailing System.
Implemented a nine-node CDH4 Hadoop cluster on Ubuntu Linux.
Implemented programs that join data sets from different sources.
Optimized programs by tuning configuration parameters and implementing optimized joins.
Implemented map reduce solutions like Top-K, summarizations, data partitions using design patterns.
Implemented map reduce programs to handle different file formats like XML, Avro, sequence files and implemented compression techniques.
Developed hive queries according to business requirement.
Designed/created Hive Internal tables, partitions to store structured data.
Developed custom Hive UDFs to incorporate business logic into Hive queries.
Used Hive-optimized file formats such as ORC and Parquet.
Implemented Hive serializers/deserializers (SerDes) to handle Avro files and used XPath expressions to handle XML files.
Importing & exporting data from RDBMS through Sqoop.
Designed Cassandra data models for near-real-time analysis.
Configured Cassandra clusters, vnodes, and replication strategies, and configured the data model using DataStax Community.
Ensured NFS was configured for the NameNode.
Designed/Implemented time series data analysis using Cassandra file system.
Implemented CRUD operations on top of Cassandra data using CQL and the REST API.
Implemented import/export of structured data using Sqoop import/export options.
Implemented Sqoop saved jobs and incremental imports to import data.
Used compression techniques (Snappy) with file formats to make efficient use of storage in HDFS.
Used Cloudera Manager to monitor the cluster, debug jobs, and handle job submission on the cluster.
Successfully migrated Legacy application to Bigdata application using Hive/Pig in Production level.

Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, Oozie, Cloudera, Pig, Java (JDK 1.6), Eclipse, MySQL, Ubuntu and ZooKeeper.

Confidential, Boston, MA.

Hadoop/Big Data Developer

Responsibilities:

Involved in all phases of development activities from requirements collection to production support.
Developed a detailed understanding of the current system and identified the different sources of data for EMR.
Involved in a Cluster setup.
Performed Batch processing of logs from various data sources using
Automated Cloudera job submission via Jenkins scripts and Chef.
Predictive analytics (which can monitor inventory levels and ensure product availability)
Analysis of customers' purchasing behaviours in JMS.
Response to value-added services based on clients' profiles and purchasing habits
Worked on gathering and refining requirements, interviewing business users to understand and document data requirements including elements, entities, and relationships, in addition to visualization and report specifications.
Defined UDFs using Pig and Hive in order to capture customer behaviour.
Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, Spark SQL, Apache Pig and Oozie.
Integrated Apache Kafka for data ingestion.
Using Scala and Hive, created external tables on the output, then applied partitioning and bucketing on top of them.
Provided graphs, generated with shell scripts, to show trends.
Maintained data import scripts and jobs using HBase and Hive.
Developed and maintained several batch jobs that run automatically depending on business requirements.
Imported and exported data between environments such as MySQL and HDFS; performed unit testing, deployed the solution for internal use, and monitored its performance.

Environment: Apache Hadoop, Cloudera, RHEL, Hive, HBase, Pig, HDFS, Java MapReduce, Core Java, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Oracle and Teradata.

Confidential, Provo, Utah

Hadoop Developer

Responsibilities:

Developed a detailed understanding of the existing build system and the related tools that provide information on products, releases, and test results.
Designed and implemented jobs to support distributed processing using Java, Hive, and Apache Pig.
Developed UDFs to provide custom Hive and Pig capabilities using SOAP/RESTful services.
Built a mechanism, using Talend, for automatically moving the existing proprietary binary-format data files to HDFS through a service called the Ingestion service.
Implemented a prototype to integrate PDF documents into a web application using GitHub.
Comprehensive knowledge of and experience in process improvement, normalization/de-normalization, data extraction, data cleansing, Scrum, and data manipulation.
Performed data transformations in Scala and Hive and used partitions and buckets for performance improvements.
Wrote custom InputFormat and RecordReader classes for reading and processing the binary format.
Wrote custom Writable classes for Hadoop serialization and deserialization of time-series tuples.
Implemented a custom file loader for Pig to query large data files, such as build logs, directly.
Used Python for pattern matching in build logs to format errors and warnings.
Developed Pig Latin and shell scripts for validating the different query modes in Historian.
Created Hive external tables on the output, then applied partitioning and bucketing on top of them.
Improved performance by tuning Hive and MapReduce, using Scala, Talend, ActiveMQ, and JBoss.
Developed a daily test engine using Python for continuous testing.
Used shell scripting for Jenkins job automation with Talend.
Built a custom calculation engine which can be programmed according to user needs.
Ingested data into Hadoop using shell scripting and Sqoop, and applied data transformations using Pig and Hive.
Handled the performance improvement changes to Pre-Ingestion service which is responsible for generating the Bigdata Format binary files from an older version of Historian.
Worked with support teams and resolved operational and performance issues.
Researched, evaluated, and utilized new technologies, tools, and frameworks in the Hadoop ecosystem.
Prepared graphs from test results posted to MIA.

Environment: Apache Hadoop, Hive, Scala, Pig, HDFS, Cloudera, Java MapReduce, Core Java, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Oracle and CDH4.X.

Confidential, North Bergen, New Jersey.

Hadoop Developer.

Responsibilities:

Analysed the Hadoop cluster using Big Data analytic tools including Hive and Pig.
Installed and configured the Hadoop cluster using Cloudera's CDH distribution and monitored the cluster performance using Cloudera Manager.
Monitored workload, job performance and capacity planning using Cloudera Manager.
Implemented schedulers on the JobTracker to share cluster resources among the jobs submitted to the cluster.
Involved in designing and developing the Hive data model, loading it with data, and writing Java UDFs for Hive.
Handled importing and exporting data into HDFS by developing solutions, analysed the data using Hive, and produced summary results from Hadoop for downstream systems.
Used Sqoop to import and export data between the Hadoop Distributed File System (HDFS) and RDBMS.
Created Hive tables and loaded data from HDFS into Hive tables as per the requirement.
Developed custom programs to analyse data and used HQL queries for data cleansing.
Created components like Hive UDFs for missing functionality in Hive to analyse and process large volumes of data extracted from the No-SQL database Cassandra.
Collected and aggregated substantial amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
Developed workflows in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
Worked on algorithm optimization using combiners and partitions to deliver the best results, and on application performance optimization for an HDFS cluster.
Comprehensive knowledge of and experience in process improvement, normalization/de-normalization, data extraction, data cleansing, Scrum, and data manipulation.

Environment: Cloudera Distribution (CDH), HDFS, Pig, Hive, MapReduce, Sqoop, HBase, Impala, Java, SQL, Cassandra.

Confidential

Java Developer.

Responsibilities:

Involved in the analysis, design, development, and testing phases of the Software Development Life Cycle (SDLC).
Analysed, designed, and developed a J2EE-based application using Struts and Tiles, Spring 2.0 and Hibernate 3.0.
Involved in interacting with the Business Analyst and Architect during the Sprint Planning Sessions.
Used XML Web Services for transferring data between different applications.
Used the Apache CXF web service stack for developing web services, and SoapUI and XMLSpy for testing web services.
Used JAXB for binding XML to Java.
Used SAX and DOM parsers to parse XML data.
Hibernate was used for Object-Relational mapping with Oracle database.
Worked with Spring IOC for injecting the beans and reduced the coupling between the classes.
Implemented Spring IOC (Inversion of Control)/DI (Dependency Injection) for wiring the object dependencies across the application.
Implemented Spring transaction management for the application.
Implemented the Service Locator design pattern.
Performed unit testing using JUnit 3 and the EasyMock testing framework.
Worked on PL/SQL stored procedures using PL/SQL Developer.
Involved in fixing production defects for the application.
Used Ant as the build tool for building J2EE applications.

Environment: HDFS, HBase, HDP Horton, Sqoop, Data Processing Layer, HUE, AZURE, Erwin, MS Visio, Tableau, SQL, MongoDB, Oozie, UNIX, MySQL, RDBMS, Ambari, Solr Cloud, PL/SQL, TOAD, Java.

Confidential

Data Analyst.

Responsibilities:

Involved in design and implementation of server-side programming.
Gathered and analysed requirements and prepared high-level documents.
Participated in all client meetings to understand the requirements.
Actively involved in design and data modelling using the Rational Rose tool (UML).
Involved in the design of the SPACE database.
Designed and developed user interfaces and menus using HTML, JSP, JSP custom tags, and JavaScript.
Implemented the user interface using the Spring Tiles framework.
Fetched case details from the Tuxedo server using web services technology (i.e. binding, finding a service, and using the XML message format).
Involved in integrating the system with BT's systems such as GTC and CSS, or through the e Link hub and IBM MQSeries.
Developed, deployed and tested JSPs and Servlets in WebLogic.
Used Eclipse as the IDE and integrated WebLogic with Eclipse to develop and deploy the applications, and used JDBC to connect to the database.

Environment: Struts Framework, Java 1.3, XML, Data Modelling, JDBC, SQL, PL/SQL, JMS, Web Services, SOAP, Solaris 9, ANT tool, Toad, Eclipse.
