Hadoop/Big Data Developer Resume
Atlanta, Georgia
SUMMARY:
- Over 7 years of proactive IT experience in analysis, design, development, implementation, and testing of software applications, including 4+ years of experience in Big Data and in the development and design of Java-based enterprise applications.
- Strong skills in developing applications involving Big Data technologies such as Hadoop, MapReduce, YARN, Flume, Hive, Pig, Sqoop, HBase, Pivotal, Cloudera, MapR, Avro, Spark, and Scala.
- Extensively worked on major components of the Hadoop ecosystem such as HDFS, HBase, Hive, Sqoop, Pig, and MapReduce.
- Developed various scripts and numerous batch jobs to schedule Hadoop programs.
- Experience in analyzing data using HiveQL and custom MapReduce programs in Java.
- Hands-on experience importing and exporting data between databases like Oracle and MySQL and HDFS/Hive using Sqoop.
- Implemented Flume for collecting, aggregating, and moving large volumes of server logs and streaming data into HDFS.
- Hands-on experience in Spark, Scala, and MarkLogic.
- Extensively used MapReduce design patterns to solve complex data processing problems.
- Developed Hive queries for data analysis to meet the business requirements.
- Experience in extending Hive and Pig core functionality by writing custom UDFs, UDAFs, and UDTFs (see the sketch after this summary).
- Experienced in implementing security mechanisms for Hive data.
- Extensively used ETL methodology for data profiling, data migration, extraction, transformation, and loading using Talend/SSIS, and designed data conversions from a wide variety of source systems.
- Extensively created mappings in Talend using tMap, tJoin, tReplicate, tParallelize, tJava, tJavaRow, tDie, tAggregateRow, tWarn, tLogCatcher, tMysqlSCD, tFilter, tGlobalmap, etc.
- Experience with Hive query performance tuning.
- Experienced in improving data cleansing processes using Pig Latin operators, transformations, and join operations.
- Extensive knowledge of NoSQL databases like HBase.
- Experienced in performing CRUD operations using the HBase Java client API and REST API.
- Experience in designing both time-driven and data-driven automated workflows using Bedrock and Talend.
- Good knowledge of creating PL/SQL stored procedures, indexes, packages, functions, triggers, and cursors with Oracle (9i, 10g, 11g) and MySQL Server.
- Expert in designing and writing on-demand UNIX shell scripts.
- Extensively worked with Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE design patterns and core Java design patterns.
- Excellent Java development skills using J2EE Frameworks like Struts, EJBs and Web Services.
- Proficient in development methodologies such as Scrum, Agile, and Waterfall.
- Passion to excel in any assignment, with good debugging and problem-solving skills. Ability to work under high pressure and tight deadlines.
- Excellent adaptability, ability to learn, and good analytical and programming skills.
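Below is a minimal, illustrative sketch of the kind of custom Hive UDF mentioned above; the class name, masking rule, and column semantics are assumptions for demonstration, not code from a specific engagement.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: replaces everything except the last four characters
// of a string with "****", e.g. for masking identifiers in query output.
public class MaskValue extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String value = input.toString();
        if (value.length() <= 4) {
            return new Text(value);
        }
        return new Text("****" + value.substring(value.length() - 4));
    }
}
```

Once compiled into a JAR, a UDF like this is added to the Hive session and registered with CREATE TEMPORARY FUNCTION, after which it can be called inline like any built-in function.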
TECHNICAL SKILLS:
Hadoop/Big Data Technologies: HDFS, MapReduce, HBase, Hive, Pig, Impala, Sqoop, Flume, Oozie, Spark, Spark SQL, ZooKeeper, AWS, Cloudera, Hortonworks, Kafka, Avro, BigQuery.
Languages: Core Java, XML, HTML and HiveQL.
J2EE Technologies: Servlets, JSP, JMS, JSTL, AJAX, DOJO, JSON and BlazeDS.
Frameworks: Spring 2, Struts 2 and Hibernate 3.
Application & Web Services: WebSphere 6.0, JBoss 4.X and Tomcat 5.
Scripting Languages: JavaScript, AngularJS, Pig Latin, Python 2.7 and Scala.
Databases (SQL/NoSQL): Oracle 9i, SQL Server 2005, MySQL, HBase and MongoDB 2.2.
IDE: Eclipse and EditPlus.
PM Tools: MS MPP, Risk Management, ESA.
Other Tools: SVN, Apache Ant, JUnit, StarUML, TOAD, PL/SQL Developer, Perforce, JIRA, Bugzilla, Visual Source, QC, Agile Methodology.
EAI Tools: TIBCO 5.6.
Bug Tracking/Ticketing: Mercury Quality Center and ServiceNow.
Operating System: Windows 98/2000, Linux /Unix and Mac.
PROFESSIONAL EXPERIENCE:
Confidential, Atlanta, Georgia
Hadoop/Big Data Developer
Responsibilities:
- Translation of functional and technical requirements into detailed architecture and design.
- Developed automated scripts for all jobs, starting from pulling data from mainframes into HDFS.
- Designed and developed ETL applications and automated them using Oozie workflows and shell scripts with error handling and a mailing system.
- Implemented a nine-node CDH4 Hadoop cluster on Ubuntu Linux.
- Implemented MapReduce programs joining data sets from different sources.
- Optimized MapReduce programs by tuning MapReduce configuration parameters and implementing optimized joins.
- Implemented MapReduce solutions such as top-K, summarization, and data partitioning using MapReduce design patterns (see the sketch after this project).
- Implemented MapReduce programs to handle different file formats like XML, Avro, and sequence files, and applied compression techniques.
- Developed Hive queries according to business requirements.
- Designed and created Hive internal tables and partitions to store structured data.
- Developed custom Hive UDFs to incorporate business logic into Hive queries.
- Used Hive-optimized file formats like ORC and Parquet.
- Implemented Hive serializers/deserializers (SerDes) to handle Avro files and used XPath expressions to handle XML files.
- Imported and exported data from RDBMS through Sqoop.
- Designed Cassandra data models for near-real-time analysis.
- Configured Cassandra clusters, vnodes, and replication strategies, and built the data model using DataStax Community edition.
- Ensured NFS was configured for the NameNode.
- Designed and implemented time-series data analysis using the Cassandra file system.
- Implemented CRUD operations on Cassandra data using CQL and the REST API.
- Implemented import/export of structured data using Sqoop import/export options.
- Implemented Sqoop saved jobs and incremental imports to load data.
- Used compression techniques (Snappy) with file formats to make efficient use of storage in HDFS.
- Used Cloudera Manager to monitor the cluster, debug MapReduce jobs, and handle job submission on the cluster.
- Successfully migrated a legacy application to a Big Data application using Hive/Pig at the production level.
Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, Oozie, Cloudera, Pig, Java (JDK 1.6), Eclipse, MySQL, Ubuntu, ZooKeeper.
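As a companion to the top-K bullet above, here is a minimal sketch of the mapper side of the classic top-K MapReduce design pattern; the record layout (tab-separated id and score) and the choice of K = 10 are assumptions for illustration.

```java
import java.io.IOException;
import java.util.TreeMap;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical top-K mapper: each map task keeps only its local top 10 records
// (by the numeric score in the second field) and emits them in cleanup(); a single
// reducer then merges the small per-mapper lists into the global top 10.
public class TopKMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
    private final TreeMap<Long, Text> topRecords = new TreeMap<>();

    @Override
    protected void map(LongWritable key, Text value, Context context) {
        String[] fields = value.toString().split("\t");    // assumed layout: id <TAB> score
        long score = Long.parseLong(fields[1]);
        topRecords.put(score, new Text(value));             // ties on score overwrite in this sketch
        if (topRecords.size() > 10) {
            topRecords.remove(topRecords.firstKey());        // drop the current smallest score
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        for (Text record : topRecords.values()) {
            context.write(NullWritable.get(), record);
        }
    }
}
```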
Confidential, New York
Hadoop/Big Data Developer
Responsibilities:
- Involved in all phases of development activities from requirements collection to production support.
- Developed a detailed understanding of the current system and identified the different sources of data for EMR.
- Involved in cluster setup.
- Performed batch processing of logs from various data sources using MapReduce.
- Automated Cloudera job submission via Jenkins scripts and Chef.
- Built predictive analytics to monitor inventory levels and ensure product availability.
- Analyzed customers' purchasing behaviors in JMS.
- Delivered value-added services based on clients' profiles and purchasing habits.
- Worked on gathering and refining requirements, interviewing business users to understand and document data requirements including elements, entities, and relationships, in addition to visualization and report specifications.
- Defined UDFs using Pig and Hive in order to capture customer behavior.
- Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, Spark SQL, Apache Pig, and Oozie.
- Integrated Apache Kafka for data ingestion (see the consumer sketch after this project).
- Created Hive external tables on the MapReduce output, with partitioning and bucketing applied on top of them.
- Provided shell-scripted Pivotal graphs in order to show trends.
- Maintained data import scripts using HBase, Hive, and MapReduce jobs.
- Developed and maintained several batch jobs to run automatically depending on business requirements.
- Imported and exported data between environments such as MySQL and HDFS; performed unit testing and deployment for internal use, monitoring the performance of the solution.
Environment: Apache Hadoop, Cloudera, RHEL, Hive, HBase, Pig, HDFS, Java MapReduce, Core Java, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Cloudera Distribution, Oracle, and Teradata.
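A minimal sketch of the kind of Kafka-based ingestion mentioned in this project, written against the current Kafka Java consumer API; the broker address, topic, and group id are placeholders, and the real pipeline would write the records to HDFS rather than stdout.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Hypothetical consumer: reads purchase events from a Kafka topic so they can be
// landed in HDFS for downstream Hive/Pig jobs. All names below are placeholders.
public class PurchaseEventConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("group.id", "purchase-ingestion");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("purchase-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // In the real pipeline this would be written to HDFS (e.g. via Flume or a file writer).
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```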
Confidential, Provo, Utah
Hadoop Developer
Responsibilities:
- Developed a detailed understanding of the existing build system and related tools, and of the information about various products, releases, and test results.
- Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
- Developed UDFs to provide custom Hive and Pig capabilities using SOAP/RESTful services.
- Built a mechanism in Talend for automatically moving the existing proprietary binary-format data files to HDFS using a service called the Ingestion service.
- Implemented a prototype to integrate PDF documents into a web application using GitHub.
- Comprehensive knowledge and experience in process improvement, normalization/de-normalization, data extraction, data cleansing, and data manipulation.
- Performed data transformations in Scala and Hive and used partitions and buckets for performance improvements.
- Wrote custom InputFormat and RecordReader classes for reading and processing the binary format in MapReduce.
- Wrote custom Writable classes for Hadoop serialization and deserialization of time-series tuples (see the sketch after this project).
- Implemented a custom file loader for Pig so we could query directly against large data files such as build logs.
- Used Python for pattern matching in build logs to format errors and warnings.
- Developed Pig Latin scripts and shell scripts for validating the different query modes in Historian.
- Created Hive external tables on the MapReduce output, with partitioning and bucketing applied on top of them.
- Improved performance by tuning Scala, Hive, and MapReduce jobs using Talend, ActiveMQ, and JBoss.
- Developed a daily test engine using Python for continuous tests.
- Used shell scripting for Jenkins job automation with Talend.
- Built a custom calculation engine that can be programmed according to user needs.
- Ingested data into Hadoop using shell scripting and Sqoop and applied data transformations using Pig and Hive.
- Handled performance-improvement changes to the Pre-Ingestion service, which is responsible for generating the Big Data-format binary files from an older version of Historian.
- Worked with support teams and resolved operational and performance issues.
- Researched, evaluated, and utilized new technologies, tools, and frameworks around the Hadoop ecosystem.
- Prepared graphs from test results posted to MIA.
Environment: Apache Hadoop, Hive, Scala, Pig, HDFS, Cloudera, Java MapReduce, Core Java, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Oracle, and CDH4.x.
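The custom Writable bullet above refers to Hadoop's serialization interface; the following sketch shows a hypothetical time-series tuple (timestamp, tag, value), with field names assumed purely for illustration.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

// Hypothetical custom Writable for a time-series tuple, illustrating Hadoop
// serialization/deserialization of binary records such as Historian samples.
public class TimeSeriesTupleWritable implements Writable {
    private long timestamp;
    private String tagName;
    private double value;

    public TimeSeriesTupleWritable() { }   // no-arg constructor required by Hadoop

    public TimeSeriesTupleWritable(long timestamp, String tagName, double value) {
        this.timestamp = timestamp;
        this.tagName = tagName;
        this.value = value;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(timestamp);
        out.writeUTF(tagName);
        out.writeDouble(value);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        timestamp = in.readLong();
        tagName = in.readUTF();
        value = in.readDouble();
    }

    @Override
    public String toString() {
        return timestamp + "\t" + tagName + "\t" + value;
    }
}
```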
Confidential
Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster using different Big Data analytic tools including Hive, Pig, and MapReduce.
- Installed and configured the Hadoop cluster using Cloudera's CDH distribution and monitored cluster performance using Cloudera Manager.
- Monitored workload, job performance, and capacity planning using Cloudera Manager.
- Implemented schedulers on the JobTracker to share the cluster's resources among the MapReduce jobs submitted to the cluster.
- Involved in designing and developing the Hive data model, loading it with data, and writing Java UDFs for Hive.
- Handled importing and exporting data into HDFS by developing solutions, analyzed the data using MapReduce and Hive, and produced summary results from Hadoop for downstream systems.
- Used Sqoop to import and export the data from Hadoop Distributed File System (HDFS) to RDBMS.
- Created Hive tables and loaded data from HDFS into Hive tables as per the requirement.
- Built custom MapReduce programs to analyze data and used HiveQL queries for data cleansing.
- Created components such as Hive UDFs for functionality missing in Hive to analyze and process large volumes of data extracted from the NoSQL database Cassandra.
- Collected and aggregated substantial amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Worked on optimization of MapReduce algorithms using combiners and partitioners to deliver the best results, and worked on application performance optimization for the HDFS cluster (see the partitioner sketch after this project).
- Comprehensive knowledge and experience in process improvement, normalization/de-normalization, data extraction, data cleansing, and data manipulation.
Environment: Cloudera Distribution (CDH), HDFS, Pig, Hive, MapReduce, Sqoop, HBase, Impala, Java, SQL, Cassandra.
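For the combiner/partitioner bullet above, the combiner is typically just the sum reducer reused via job.setCombinerClass(...); the sketch below shows a hypothetical custom partitioner, assuming keys of the form "region|eventId", so that all records for one region land on the same reducer.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Hypothetical partitioner: routes records to reducers by the region prefix of the
// key so that each region's counts are aggregated in a single output partition.
public class RegionPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        String region = key.toString().split("\\|")[0];          // assumed key layout: region|eventId
        return (region.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}
```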
Confidential
Big Data/Hadoop Developer
Responsibilities:
- Involved in the analysis, design, development, and testing phases of the Software Development Life Cycle (SDLC).
- Analysis, design, and development of an application based on J2EE using Struts with Tiles, Spring 2.0, and Hibernate 3.0.
- Developed the services to run MapReduce jobs as per the daily requirement.
- Involved in creating Hive tables, loading them with data, and writing Hive queries.
- Involved in optimizing Hive queries and joins to get better results for ad hoc Hive queries.
- Used Pig to perform data transformations, event joins, filter and some pre-aggregations before storing the data into HDFS.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Hands-on experience with NoSQL databases like HBase for a proof of concept (POC) storing URLs, images, and product and supplement information in real time.
- Developed an integrated dashboard to perform CRUD operations on HBase data using the Thrift API (a Java-client CRUD sketch follows this project).
- Implemented an error notification module for the support team using HBase coprocessors (observers).
- Configured and integrated Flume sources, channels, and sinks to analyze log data in HDFS.
- Implemented custom Flume interceptors to perform cleansing operations before moving data into HDFS.
- Involved in troubleshooting errors in Shell, Hive, and MapReduce.
- Worked on debugging, performance tuning of Hive & Pig Jobs.
- Developed Oozie workflows which are scheduled monthly.
Environment: MapReduce, HDFS, HBase, Hortonworks HDP, Sqoop, Data Processing Layer, HUE, Azure, Erwin, MS Visio, Tableau, SQL, MongoDB, Oozie, UNIX, MySQL, RDBMS, Ambari, SolrCloud, PL/SQL, TOAD, Java.
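The dashboard above used the Thrift API; as a reference point, here is a minimal CRUD round-trip using the HBase Java client API instead, with table, column family, and row key names chosen purely for illustration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical CRUD round-trip against a "products" table; names are placeholders.
public class ProductCrudExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("products"))) {

            // Create/update: write the product URL into the "info" column family.
            Put put = new Put(Bytes.toBytes("product-123"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("url"),
                    Bytes.toBytes("http://example.com/p/123"));
            table.put(put);

            // Read: fetch the same cell back.
            Result result = table.get(new Get(Bytes.toBytes("product-123")));
            byte[] url = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("url"));
            System.out.println("url = " + Bytes.toString(url));

            // Delete: remove the row.
            table.delete(new Delete(Bytes.toBytes("product-123")));
        }
    }
}
```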
Confidential
Java Developer
Responsibilities:
- Involved in coding, designing, documenting, debugging and maintenance of several applications.
- Involved in creating SQL tables and indexes and in writing queries to read and manipulate data.
- Used JDBC to establish connections between the database and the application (see the sketch after this project).
- Created the user interface using HTML, CSS and JavaScript.
- Maintenance and support of the existing applications.
- Responsible for the development of database SQL queries.
- Created/modified shell scripts for scheduling and automating tasks.
- Wrote unit test cases using the JUnit framework.
Environment: Java, J2EE, Servlets, JSP, SQL, PL/SQL, HTML, JavaScript, CSS, Eclipse, Oracle, MYSQL, IBM WebSphere, JIRA.
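A minimal sketch of the JDBC usage described in this role; the JDBC URL, credentials, and table/column names are placeholders, not details from the actual application.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Hypothetical JDBC lookup: opens a connection, runs a parameterized query,
// and prints the results, illustrating the database connectivity described above.
public class CustomerLookup {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";   // placeholder connection string
        try (Connection conn = DriverManager.getConnection(url, "app_user", "secret");
             PreparedStatement stmt = conn.prepareStatement(
                     "SELECT name FROM customers WHERE customer_id = ?")) {
            stmt.setLong(1, 42L);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("name"));
                }
            }
        }
    }
}
```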