Big Data Engineer Resume Columbia, MD - Hire IT People

OBJECTIVE

Looking for a challenging position as HADOOP DEVELOPER/ENGINEER where I can use my knowledge, technical and analytical skills to contribute to projects that add value to the organization.

SUMMARY

12+ years of total work experience in Information Technology, with skills in analysis, design, development, testing and deployment of various software applications, including web and Windows applications with emphasis on Object Oriented Programming, and Mainframe applications.
3+ years of work experience in Big Data Analytics as Hadoop Developer/Engineer.
Experienced with major Hadoop ecosystem projects such as Pig, Hive and HBase, and in monitoring them with Cloudera Manager, AWS & Hortonworks.
Extensive experience in developing Pig Latin Scripts and using Hive Query Language for data analytics
Hands on experience working on NoSQL databases including HBase and its integration with Hadoop cluster
Have hands on experience in writing Map Reduce jobs on Hadoop Ecosystem including Hive and Pig.
Hands on experience in installing, configuring and using ecosystem components like Hadoop Map Reduce, HDFS, Pig, Hive, Sqoop, Python, Scala and Spark.
Handled and further processed schema-oriented and non-schema-oriented data using Pig.
Read, processed and stored disparate data in parallel using Pig.
Good knowledge of database connectivity (JDBC) for databases like Oracle, DB2, SQL Server, MySQL and Netezza.
Experienced in coding SQL, Procedures/Functions, Triggers and Packages on database (RDBMS) packages like Oracle.
Developed stored procedures and queries using SQL.
Worked on Agile methodology, SOA for many of the applications.
Excellent analytical, problem solving, communication and interpersonal skills, with the ability to interact with individuals at all levels and to work as part of a team as well as independently.
In-depth understanding of Data Structure and Algorithms.
Strong written, oral, interpersonal and presentation skills.
Ability to perform at a high level, meet deadlines and adapt to ever-changing priorities.
Good knowledge in using job scheduling and monitoring tools like Oozie and Zookeeper
Mentored the team in UNIX and open source tools/platforms revolving around Hadoop.

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, Map Reduce, Hive, Pig, Sqoop, Zookeeper, Python, Scala and Spark

NoSQL Databases: HBase

Programming Languages: Java, PL/SQL, Pig Latin, Hive QL, Unix shell scripts, Cobol, JCL, VSAM

Operating Systems: UNIX, Windows, LINUX, Z/OS

Web technologies: JSP, JDBC

Databases: Oracle 9i/10g, Netezza, Microsoft SQL Server and MySQL

Java IDE: Eclipse 3.x

Tools: TOAD, SQL Developer

PROFESSIONAL EXPERIENCE

Confidential, Columbia, MD

Big Data Engineer

Responsibilities:

Implemented Sqoop scripts to move data between Pig and the MySQL database.
Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
Created Hive tables to store the processed results in a tabular format.
Wrote script files for processing data and loading it into HDFS.
Wrote HDFS CLI commands.
Developed the UNIX shell/Python scripts for creating the reports from Hive data.
Fully involved in the requirement analysis phase.
Responsible for building scalable distributed data solutions using Hadoop
Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms
Responsible for writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
Exported the result set from Hive to Netezza using Shell scripts.
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs
Worked on Hadoop testing, which involved unit testing of Map-Reduce code and Hive and Pig UDFs.
Developed manual validation test cases covering data sampling, data completeness and data quality.
Developed Spark scripts by using Scala shell commands as per the requirement.
Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
Developed Scala scripts and UDFs using DataFrames/SQL/Datasets and RDD/MapReduce in Spark 1.6 for data aggregation, queries and writing data back into the OLTP system through Sqoop.
Experienced in performance tuning of Spark applications: setting the right batch interval time, the correct level of parallelism and memory tuning.
Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames and pair RDDs.

Environment: Amazon (AWS) EC2, Hadoop 2.4/2.6, Hive, Map Reduce, Sqoop, Pig, JDK 1.6/1.7, HDFS, Flume, Tidal, HBase, Zookeeper, Mahout, Spark, Scala, Unix Shell/Python script, RESTful Web services, PL/SQL and SQL.

Confidential, Columbia, MD

Hadoop Developer

Responsibilities:

Developed Map-Reduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
Implemented Sqoop scripts to move data between Pig and the MySQL database.
Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
Created Hive tables to store the processed results in a tabular format.
Wrote script files for processing data and loading it into HDFS.
Wrote HDFS CLI commands.
Developed the UNIX shell/Python scripts for creating the reports from Hive data.
Fully involved in the requirement analysis phase.
Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
Managed and reviewed Hadoop log files.
Tested raw data and executed performance scripts.
Involved in gathering the requirements, designing, development and testing
Responsible for building scalable distributed data solutions using Hadoop
Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms
Responsible for writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
Exported the result set from Hive to Netezza using Shell scripts.
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs
Worked on Hadoop testing, which involved unit testing of Map-Reduce code and Hive and Pig UDFs.
Developed manual validation test cases covering data sampling, data completeness and data quality.

Environment: Hadoop 2.4/2.6, Web Services (SOAP and REST), JMS, JavaScript, AngularJS, JSP, AWS, XML, XSD, Oracle PL/SQL, IBM WebSphere Portal, Hive, Map Reduce, Sqoop, Pig, JDK 1.6/1.7, HDFS, Flume, Tidal, HBase, Zookeeper, Mahout, Spark, Scala, Unix Shell/Python script.

Confidential, Fremont, CA

Hadoop Developer

Responsibilities:

Installed and configured Hadoop Map Reduce and HDFS; developed multiple Map Reduce jobs in Java for data cleaning and preprocessing.
Loaded customer profile data, customer spending data and credit data from legacy warehouses onto HDFS using Sqoop.
Built data pipeline using Pig and Java Map Reduce to store onto HDFS.
Used Oozie to orchestrate the Map Reduce jobs that extract the data in a timely manner.
Applied transformations and filtered both traffic using Pig.
Used Pattern matching algorithms to recognize the customer across different sources and built risk profiles for each customer using Hive and stored the results in HBase.
Performed unit testing using MRUnit.
Responsible for building scalable distributed data solutions using Hadoop
Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster
Setup and benchmarked Hadoop/HBase clusters for internal use
Developed simple to complex Map/Reduce jobs using Hive and Pig
Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms
Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop
Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior
Installed Oozie workflow engine to run multiple Hive and Pig jobs
Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team
Responsible for writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
Provided support to data analysts in running Pig and Hive queries.
Imported and exported data between MySQL/Oracle and Hive using Sqoop.
Imported and exported data between MySQL/Oracle and HDFS using Sqoop.
Responsible for defining the data flow within the Hadoop ecosystem and directing the team in implementing it.
Exported the result set from Hive to MySQL using Shell scripts.
Developed Hive queries for the analysts.

Environment: Hadoop, Hive, Zookeeper, Map Reduce, Sqoop, Pig 0.10 and 0.11, JDK 1.6, HDFS, Flume, Oozie, DB2, HBase, Mahout, PL/SQL and SQL.

Confidential, San Ramon, CA

Java/PLSQL Developer

Responsibilities:

Responsible for gathering and analyzing requirements and converting them into technical specifications
Used Rational Rose for creating sequence and class diagrams
Developed presentation layer using JSP, Java, HTML and JavaScript
Designed and developed Hibernate configuration and session-per-request design pattern for making database connectivity and accessing the session for database transactions respectively. Used HQL and SQL for fetching and storing data in databases
Participated in the design and development of database schema and Entity-Relationship diagrams of the backend Oracle database tables for the application
Implemented web services with Apache Axis
Designed and developed stored procedures and triggers in Oracle to meet the needs of the entire application. Developed complex SQL queries for extracting data from the database
Designed and built SOAP web service interfaces implemented in Java
Used Apache Ant for the build process
Used ClearCase for version control and ClearQuest for bug tracking

Environment: Java, JDK 1.5, Servlets, Hibernate, Oracle 10g, Eclipse, Web Services (SOAP), JavaScript, HTML, XML

Confidential

PL/SQL Developer

Responsibilities:

Experience on Oracle SQL, PL/SQL and Perl development
Lead experience giving recommendations and direction on development strategies, conducting code reviews, mentoring junior developers, setting standards and establishing best practices.
Good RDBMS understanding
Expertise in tuning Oracle SQLs
Good Understanding of Data warehouse concepts
Knowledge of SDLC cycles
Strong experience in Technical documentation, coding and testing
Expertise in problem solving through debugging, research and investigation
Familiar with Best practices, Standard Concepts.
Good Communication skills
Requirements analysis of the inputs.
Execution of the required deliverables.
Defect prevention activities.
Creation of UTP, UTR.

Environment: SQL*Plus, PL/SQL Developer 9.0, SQL*Loader, SQL Navigator, SSH, DB2, Perl, Windows 2000/2003/NT/XP, UNIX, C, C++, Java, TOAD, Eclipse, Notepad++ 5.7, HP OpenView, WinZip, Oracle 10g/9i/8i, MS Office (Visio, Excel, Word, Access), MS SQL Server 2005/2000, IBM Mainframes, AS400, JCL, Cobol
