Hadoop (Big Data) Developer/Admin Resume
Minneapolis, MN
SUMMARY
- 8+ years of professional experience in Software Development and Requirement Analysis in an Agile work environment, with 4+ years of Big Data ecosystem experience in ingestion, storage, querying, processing and analysis of Big Data.
- Experience working with Apache Hadoop ecosystem components such as HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Oozie, Mahout, Spark, Storm, Cassandra and MongoDB, as well as Python and Big Data analytics.
- Good understanding/knowledge of Hadoop architecture and various components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, and MapReduce concepts.
- Experienced in managing NoSQL databases on large Hadoop distributions such as Cloudera, Hortonworks HDP and MapR M-series.
- Experienced in developing Hadoop integrations for data ingestion, data mapping and data processing capabilities.
- Experienced in building analytics for structured and unstructured data and managing large data ingestion using technologies like Kafka/Avro/Thrift.
- Worked with various data sources such as flat files and RDBMSs (Teradata, SQL Server 2005, Netezza and Oracle). Extensive work on ETL processes consisting of data sourcing, transformation, mapping and conversion.
- Exceptional ability to quickly master new concepts; capable of working in groups as well as independently.
- Has good knowledge of virtualization and worked on VMware Virtual Center.
- Excellent working knowledge of different statistical analysis tools like SPSS and Microsoft Excel.
- Hands on experience in installing, configuring, and using Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Hive, Sqoop, Pig, Zookeeper, Storm, Spark, Kafka and Flume.
- Strong understanding of Data Modeling and experience with Data Cleansing, Data Profiling and Data analysis.
- Experience in ETL (DataStage) analysis, design, development, testing and implementation of ETL processes, including performance tuning and query optimization of databases.
- Experience in extracting source data from Sequential files, XML files, Excel files, transforming and loading it into the Confidential data warehouse.
- Strong experience with Java/J2EE technologies such as Core Java, JDBC, JSP, JSTL, HTML, JavaScript, JSON
- Proficiency in programming with different IDEs such as Eclipse and NetBeans.
- Involved in database design, creating Tables, Views, Stored Procedures, Functions, Triggers and Indexes.
- Good understanding of service-oriented architecture (SOA) and web services technologies such as XML, XSD, WSDL and SOAP.
- Good knowledge of scalable, secure cloud architecture based on Amazon Web Services (leveraging AWS cloud services: EC2, CloudFormation, VPC, S3, etc.).
- Good Knowledge on Hadoop Cluster architecture and monitoring the cluster.
- In-depth understanding of data structures and algorithms.
- Experience in managing and troubleshooting Hadoop related issues.
- Expertise in setting up standards and processes for Hadoop based application design and implementation.
- Experience in importing and exporting data using Sqoop from Relational Database Systems to HDFS and vice-versa.
- Experience in Object Oriented Analysis, Design (OOAD) and development of software using UML Methodology, good knowledge of J2EE design patterns and Core Java design patterns.
- Experience in managing Hadoop clusters using Cloudera Manager.
- Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
- Extensive experience working with Oracle, Netezza, DB2, SQL Server and MySQL databases.
- Hands on experience with VPN, PuTTY, WinSCP, VNC Viewer, etc.
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, MapReduce, Oozie, Hive, Pig, Sqoop, Flume, ZooKeeper, HBase, Cassandra
NoSQL: HBase, Cassandra, MongoDB
Databases: MS SQL Server 2000/2005/2008/2012, MySQL, Oracle 9i/10g
Languages: Java (JDK 1.4/1.5/1.6), C/C++, SQL, PL/SQL
Operating Systems: Windows Server 2000/2003/2008, Windows XP/Vista, Mac OS, UNIX, Linux
Java Technologies: Servlets, JavaBeans, JDBC, JNDI
Frameworks: JUnit and JTest
IDE’s & Utilities: Eclipse, Maven, NetBeans.
SQL Server Tools: SQL Server Management Studio, Enterprise Manager, Query Analyzer, Profiler, Export & Import (DTS)
Web Dev. Technologies: ASP.NET, HTML, XML
PROFESSIONAL EXPERIENCE
Confidential - Minneapolis, MN
Hadoop (Big Data) Developer/Admin
Responsibilities:
- Installed, configured and maintained Apache Hadoop clusters for application development, along with major Hadoop ecosystem components: Hive, Pig, HBase, Sqoop, Flume, Oozie and ZooKeeper.
- Used Sqoop to transfer data between RDBMS and HDFS.
- Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
- Implemented complex MapReduce programs to perform map-side joins using the distributed cache.
- Designed and implemented custom Writables, custom InputFormats, custom partitioners and custom comparators in MapReduce.
- Thoroughly tested MapReduce programs using the MRUnit and JUnit testing frameworks.
- Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
- Converted existing SQL queries into HiveQL queries.
- Implemented UDFs, UDAFs and UDTFs in Java for Hive to handle processing that could not be performed with Hive's built-in functions (a minimal illustrative sketch follows this list).
- Effectively used Oozie to develop automated workflows of Sqoop, MapReduce and Hive jobs.
- Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports for the BI team.
- Gathered the business requirements from the Business Partners and Subject Matter Experts.
- Utilized Agile Scrum Methodology to help manage and organize a team of 4 developers with regular code review sessions.
- Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
- Loaded and analyzed Omniture logs generated by different web applications.
- Loaded and transformed large sets of structured, semi structured and unstructured data in various formats like text, zip, XML and JSON.
- Refined the Website clickstream data from Omniture logs and moved it into Hive.
- Wrote multiple MapReduce programs to power data for extraction, transformation and aggregation from multiple file formats including XML, JSON, CSV & other compressed file formats.
- Defined job flows and developed simple to complex Map Reduce jobs as per the requirement.
- Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms.
- Developed Pig UDFs for manipulating data according to business requirements and also worked on developing custom Pig loaders.
- Worked on developing ETL processes (DataStage Open Studio) to load data from multiple data sources into HDFS using Flume and Sqoop, and performed structural modifications using MapReduce and Hive.
- Responsible for creating Hive tables based on business requirements.
- Developed Scala and SQL code to extract data from various databases.
- Worked on regular-expression-based text processing using the in-memory computing capabilities of Spark with Scala.
- Implemented Partitioning, Dynamic Partitions and Buckets in Hive for efficient data access.
- Involved in NoSQL database design, integration and implementation.
- Loaded data into the NoSQL database HBase.
- Knowledge of handling Hive queries using Spark SQL integrated with the Spark environment.
- Also explored the Spark MLlib library for a POC on recommendation engines.
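The following is a minimal, illustrative sketch of the kind of Java Hive UDF referenced above, not code from the project: the package, class name MaskAccountId and masking logic are assumptions for the example, written against the classic org.apache.hadoop.hive.ql.exec.UDF API.

```java
package com.example.hive.udf; // hypothetical package, for illustration only

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Illustrative Hive UDF: masks all but the last four characters of an ID.
 * Registered in Hive with:
 *   ADD JAR hive-udfs.jar;
 *   CREATE TEMPORARY FUNCTION mask_id AS 'com.example.hive.udf.MaskAccountId';
 */
public final class MaskAccountId extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                        // let NULLs pass through
        }
        String s = input.toString();
        if (s.length() <= 4) {
            return new Text(s);                 // too short to mask
        }
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < s.length() - 4; i++) {
            masked.append('*');                 // mask the leading characters
        }
        masked.append(s.substring(s.length() - 4));
        return new Text(masked.toString());
    }
}
```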
Environment: Hadoop, CDH4, MapReduce, HDFS, Pig, Hive, Impala, Oozie, Java, Spark, Kafka, Flume, Storm, Knox, Linux, Scala, Maven, JavaScript, Oracle 11g/10g, SVN
Confidential, Calabasas, CA
Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster and different big data analytics tools, including Pig, the HBase database and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Implemented a nine-node CDH3 Hadoop cluster on CentOS.
- Implemented the Apache Crunch library on top of MapReduce and Spark for data aggregation.
- Involved in loading data from the Linux file system to HDFS.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning and slot configuration.
- Created HBase tables to store variable data formats of PII data coming from different portfolios.
- Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
- Implemented best-income logic using Pig scripts and UDFs (a minimal illustrative sketch follows this list).
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Responsible for managing data coming from different sources.
- Involved in loading data from file system to HDFS.
- Used Impala for high-throughput SQL queries.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Provided cluster coordination services through ZooKeeper.
- Experience in managing and reviewing Hadoop log files.
- Managed jobs using the Fair Scheduler.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
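A minimal sketch of a Java Pig UDF in the spirit of the best-income logic mentioned above; the package, class name BestIncome, its two-argument contract and the higher-of-two rule are illustrative assumptions, not the project's actual logic.

```java
package com.example.pig.udf; // hypothetical package, for illustration only

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

/**
 * Illustrative Pig UDF: picks the higher of two income figures.
 * Used from a Pig script as:
 *   REGISTER pig-udfs.jar;
 *   result = FOREACH records GENERATE acct_id,
 *            com.example.pig.udf.BestIncome(stated_income, modeled_income);
 */
public class BestIncome extends EvalFunc<Double> {
    @Override
    public Double exec(Tuple input) throws IOException {
        if (input == null || input.size() < 2) {
            return null;                          // nothing to compare
        }
        Double stated = toDouble(input.get(0));
        Double modeled = toDouble(input.get(1));
        if (stated == null) {
            return modeled;
        }
        if (modeled == null) {
            return stated;
        }
        return Math.max(stated, modeled);         // "best" = higher of the two figures
    }

    private Double toDouble(Object value) {
        if (value == null) {
            return null;
        }
        return Double.parseDouble(value.toString()); // assumes numeric fields
    }
}
```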
Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, CDH3, CentOS
Confidential, San Mateo, CA
Hadoop Developer
Responsibilities:
- Involved in review of functional and non-functional requirements.
- Facilitated knowledge transfer sessions.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing (a minimal illustrative sketch follows this list).
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Experienced in managing and reviewing Hadoop log files.
- Connected to external servers through VPN, PuTTY and VNC Viewer.
- Extracted files from CouchDB through Sqoop, placed them in HDFS and processed them.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Responsible for managing data coming from different sources.
- Gained good experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from the UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Gained very good business knowledge of health insurance, claim processing, fraud suspect identification, the appeals process, etc.
- Developed a custom file system plug-in for Hadoop so it can access files on the Data Platform.
- This plug-in allows Hadoop MapReduce programs, HBase, Pig and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
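A minimal sketch of the kind of Java data-cleaning mapper described above; the pipe-delimited input, expected field count and counter names are illustrative assumptions, not the project's actual code.

```java
package com.example.mr; // hypothetical package, for illustration only

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/**
 * Illustrative map-only cleaning step: trims fields, drops malformed rows,
 * and counts rejections so the numbers can be verified after the run.
 */
public class CleanRecordsMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    private static final int EXPECTED_FIELDS = 5;    // assumed record width

    enum Quality { GOOD, MALFORMED }                 // counters shown in job output

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\\|", -1);    // assumed pipe-delimited input
        if (fields.length != EXPECTED_FIELDS) {
            context.getCounter(Quality.MALFORMED).increment(1);  // reject bad rows
            return;
        }
        StringBuilder cleaned = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                cleaned.append('|');
            }
            cleaned.append(fields[i].trim());                    // normalize whitespace
        }
        context.getCounter(Quality.GOOD).increment(1);
        context.write(NullWritable.get(), new Text(cleaned.toString()));
    }
}
```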
Environment: Hadoop, MapReduce, HDFS, Hive, Sqoop, HBase, UNIX Shell Scripting.
Confidential
Java Testing & SVM admin
Responsibilities:
- Developed MapReduce programs in Java for parsing the raw data and populating staging tables.
- Worked on both WebLogic Portal 9.2 for Portal development and WebLogic 8.1 for Data Services Programming
- Used Eclipse 6.0 as IDE for application development.
- Involved in writing test cases using sets of conditions to test the application.
- Configured the Struts framework to implement the MVC design pattern.
- Built SQL queries for fetching the required columns and data from the database.
- Used Subversion as the version control system
- Managed SVN-related responsibilities and maintained the versions accordingly.
- Performed SVN check-ins and check-outs.
- Used Hibernate for handling database transactions and persisting objects
- Used AJAX for interactive user operations and client side validations
- Developed Ant scripts for compilation and deployment.
- Performed unit testing using JUnit (a minimal illustrative sketch follows this list).
- Extensively used Log4j for application logging.
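A minimal sketch of the JUnit testing pattern referenced above, in JUnit 4 style; OrderIdValidator is a hypothetical helper defined inline so the example compiles on its own, and only the pattern reflects the work described.

```java
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

/**
 * Illustrative JUnit 4 test class exercising a small set of conditions
 * against a hypothetical validator.
 */
public class OrderIdValidatorTest {

    /** Hypothetical helper: an order id looks like "ORD-" followed by digits. */
    static class OrderIdValidator {
        boolean isValid(String orderId) {
            return orderId != null && orderId.matches("ORD-\\d+");
        }
    }

    private final OrderIdValidator validator = new OrderIdValidator();

    @Test
    public void acceptsWellFormedOrderId() {
        assertTrue(validator.isValid("ORD-10042"));
    }

    @Test
    public void rejectsMissingPrefix() {
        assertFalse(validator.isValid("10042"));
    }

    @Test
    public void rejectsNull() {
        assertFalse(validator.isValid(null));
    }
}
```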
Environment: Java/J2EE, SQL, PL/SQL, JSP, EJB, Struts, SVN, JDBC, XML, XSLT, UML, JUnit, Log4j
Confidential
Java Developer
Responsibilities:
- Involved in Requirement Analysis, Development and Documentation.
- Used MVC architecture (Jakarta Struts framework) for Web tier.
- Participated in developing the form beans and action mappings required for the Struts implementation, and in the validation framework using Struts.
- Developed front-end screens with JSP using Eclipse.
- Involved in Development of Medical Records module.
- Responsible for development of the functionality using Struts and EJB components.
- Coded DAO objects using JDBC (DAO pattern); a minimal illustrative sketch follows this list.
- Used XML and XSDs to define data formats.
- Implemented J2EE design patterns (Value Object, Singleton, DAO) across the presentation, business and integration tiers of the project.
- Involved in Bug fixing and functionality enhancements.
- Designed and developed a logging mechanism for each order process using Log4J.
- Involved in writing Oracle SQL Queries.
- Involved in Check-in and Checkout process using CVS.
- Developed additional functionality in the software as per business requirements.
- Involved in requirement analysis and complete development of client side code.
- Followed Sun coding and documentation standards.
- Participated in project planning with business analysts and team members to analyze business requirements and translate them into working software.
- Developed software application modules using a disciplined software development process.
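A minimal sketch of the JDBC-based DAO pattern referenced above; MedicalRecordDao, the table and column names are illustrative assumptions tied loosely to the Medical Records module, and try-with-resources is used for brevity even though the original work targeted older JDKs (where resources would be closed in finally blocks).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import javax.sql.DataSource;

/**
 * Illustrative JDBC DAO: a DataSource (typically obtained via JNDI in J2EE)
 * goes in, and SQL stays hidden behind a narrow data-access method.
 */
public class MedicalRecordDao {

    private final DataSource dataSource;

    public MedicalRecordDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Minimal value object carried back to the business tier. */
    public static class MedicalRecord {
        public long id;
        public String patientName;
        public String diagnosis;
    }

    public MedicalRecord findById(long id) throws SQLException {
        String sql = "SELECT id, patient_name, diagnosis FROM medical_record WHERE id = ?";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return null;                        // no matching row
                }
                MedicalRecord record = new MedicalRecord();
                record.id = rs.getLong("id");
                record.patientName = rs.getString("patient_name");
                record.diagnosis = rs.getString("diagnosis");
                return record;
            }
        }
    }
}
```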
Environment: Java, J2EE, JSP, EJB, Ant, Struts 1.2, Log4j, WebLogic 7.0, JDBC, MyEclipse, Windows XP, CVS, Oracle.
