
Hadoop Developer Resume


Pawtucket, RI

SUMMARY:

  • 5+ years of experience in the IT industry across the complete software development life cycle (SDLC), including business requirements gathering, system analysis and design, data modeling, development, testing, and implementation of projects.
  • Experience in configuring, deploying, and managing different Hadoop distributions such as Cloudera (CDH4 & CDH5) and Hortonworks (HDP).
  • Experience importing and exporting data with Sqoop between the Hadoop Distributed File System (HDFS) and relational database systems. Good understanding of MapReduce programs.
  • Experience with the Hadoop ecosystem for ingestion, storage, querying, processing, and analysis of big data.
  • Experience with optimization techniques for the sorting and shuffling phases of MapReduce programs; implemented optimized joins across data from different sources.
  • Experience in defining job flows and in managing and reviewing Hadoop log files.
  • Created and maintained tables, views, procedures, functions, packages, DB triggers, and indexes.
  • Used Sqoop to import data from RDBMS sources into Hive tables. Developed MapReduce jobs in Java to preprocess data (see the MapReduce preprocessing sketch after this list).
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Created Hive internal/external tables and worked with them using HiveQL. Responsible for managing data coming from different data sources.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data coming from different sources.
  • Experience handling various file formats such as Avro, SequenceFile, text, XML, JSON, and Parquet with compression codecs such as gzip, LZO, and Snappy.
  • Imported data from HDFS into Spark DataFrames for in-memory computation to generate optimized output and better visualizations.
  • Experience collecting real-time streaming data and building pipelines for raw data from different sources using Kafka, storing the data into HDFS and NoSQL stores using Spark (see the Kafka-to-HDFS pipeline sketch after this list).
  • Implemented a POC using Impala for data processing on top of Hive for better resource utilization.
  • Knowledge of NoSQL databases such as HBase and Cassandra and their integration with Hadoop clusters.
  • Experienced with Oozie for automating data movement between different Hadoop systems.
  • Good understanding of security requirements for Hadoop and its integration with Kerberos authentication and authorization infrastructure. Mentored analysts and the test team in writing Hive queries.
  • Experience in writing Hive queries for processing and analyzing large volumes of data.
  • Interacted effectively with members of the Business Engineering, Quality Assurance, and other teams involved in the system development life cycle.
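
A minimal sketch of a map-only Java MapReduce preprocessing job of the kind described above, dropping malformed records before they are loaded into Hive. The class names, delimiter, and expected field count are illustrative assumptions, not details from the original work:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Map-only preprocessing job: keeps only well-formed pipe-delimited records
    // and counts the malformed ones. Field count and delimiter are assumptions.
    public class RecordCleanser {

        public static class CleanseMapper
                extends Mapper<Object, Text, NullWritable, Text> {

            private static final int EXPECTED_FIELDS = 12; // hypothetical layout

            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\\|", -1);
                if (fields.length == EXPECTED_FIELDS) {
                    context.write(NullWritable.get(), value);   // pass record through
                } else {
                    context.getCounter("cleanse", "malformed").increment(1);
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "record-cleanse");
            job.setJarByClass(RecordCleanser.class);
            job.setMapperClass(CleanseMapper.class);
            job.setNumReduceTasks(0);                    // map-only job
            job.setOutputKeyClass(NullWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }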
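
A minimal sketch of a Kafka-to-HDFS ingestion pipeline in Java using Spark Structured Streaming, one way to implement the Kafka/Spark pipeline described above; whether the original pipeline used Structured Streaming is an assumption, and the broker address, topic name, and HDFS paths are placeholders:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.streaming.StreamingQuery;

    // Reads raw events from a Kafka topic and lands them on HDFS as Parquet.
    public class KafkaToHdfsPipeline {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder()
                    .appName("kafka-to-hdfs")
                    .getOrCreate();

            Dataset<Row> raw = spark.readStream()
                    .format("kafka")
                    .option("kafka.bootstrap.servers", "broker1:9092")  // placeholder
                    .option("subscribe", "raw-events")                  // placeholder
                    .load();

            // Keep the message payload as a string column for downstream parsing.
            Dataset<Row> events = raw.selectExpr("CAST(value AS STRING) AS payload");

            StreamingQuery query = events.writeStream()
                    .format("parquet")
                    .option("path", "hdfs:///data/raw/events")              // placeholder
                    .option("checkpointLocation", "hdfs:///checkpoints/events")
                    .start();

            query.awaitTermination();
        }
    }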

TECHNICAL SKILLS:

Big Data Ecosystem Components: HDFS, Hadoop MapReduce, ZooKeeper, Hive, Sqoop, Spark, Kafka, Oozie, HiveQL.

GUI Tools: Hue, GitHub, GitLab, Splunk.

Database Tools: TOAD, Toad Data Point, PL/SQL Developer, SQL Developer, and SQL*Plus.

Databases: Oracle (SQL), Teradata, SQL Server.

Web Technologies: HTML, CSS, JavaScript

Operating Systems: Linux 5, UNIX, Windows XP, 7, 8, and 10.

PROFESSIONAL EXPERIENCE:

Confidential, Pawtucket, RI.

Hadoop Developer

Responsibilities:

  • Involved in the complete big data flow of the application: data ingestion from upstream sources into HDFS, processing the data in HDFS, and analyzing the data using several tools.
  • Imported data in various formats such as text, CSV, Avro, and Parquet into the HDFS cluster, with compression for optimization.
  • Ingested data from RDBMS sources such as Oracle, SQL Server, and Teradata into HDFS using Sqoop.
  • Configured Hive and participated in writing Hive UDFs and UDAFs. Also created static and dynamic partitions with bucketing (see the Hive UDF sketch after this list).
  • Imported and exported data into HDFS and Hive using Sqoop (batch) and Kafka (streaming).
  • Used Hive join queries to combine multiple tables of a source system and load them into the data lake.
  • Experience in managing and reviewing huge Hadoop log files.
  • Involved in HDFS maintenance and loading of structured and unstructured data. Implemented Data Integrity and Data Quality checks in Hadoop using Hive and Linux scripts.
  • Involved in migrating data from Oracle to the Hadoop data lake using Sqoop import. Implemented schema extraction for Parquet and Avro file formats in Hive.
  • Created Apache Oozie workflows and coordinators to schedule and monitor various jobs, including Sqoop, Hive, and shell script actions (see the workflow-submission sketch after this list).
  • Created Data Pipelines as per the business requirements and scheduled it using Oozie Coordinators.
  • Maintained technical documentation for each step of the development process, including high-level design (HLD) and low-level design (LLD) documents.
  • Involved in developing, building, testing, and deploying to the Hadoop cluster in distributed mode.
  • Gathered the business requirements from the Business Partners and Subject Matter Experts.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Extensively used ESP workstation to schedule the Oozie jobs.
  • Understood the security requirements for Hadoop and integrated with the Kerberos authentication and authorization infrastructure.
  • Built an automated build and deployment framework using GitHub and Maven.
  • Worked with BI tools such as Tableau to create weekly, monthly, and daily dashboards and reports using Tableau Desktop and published them against the HDFS cluster.
  • Created reports using Tableau for business data visualization.
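
A minimal sketch of a Java Hive UDF of the kind referred to above; the function's behavior (trimming and upper-casing a code value) and its name are illustrative assumptions:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Simple Hive UDF that normalizes free-text codes: trims whitespace and
    // upper-cases the value. Null input yields null output.
    public final class NormalizeCode extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Once packaged into a jar, a UDF like this is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL.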
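
A minimal sketch of submitting and polling one of the Oozie workflows mentioned above from Java via the Oozie client API; the Oozie URL, HDFS application path, and cluster addresses are placeholders. The workflow itself (Sqoop, Hive, and shell actions) is defined separately in workflow.xml, so this only shows programmatic submission and status polling:

    import java.util.Properties;

    import org.apache.oozie.client.OozieClient;
    import org.apache.oozie.client.WorkflowJob;

    // Submits a workflow application stored on HDFS and waits for it to finish.
    public class WorkflowSubmitter {
        public static void main(String[] args) throws Exception {
            OozieClient oozie = new OozieClient("http://oozie-host:11000/oozie"); // placeholder

            Properties conf = oozie.createConfiguration();
            conf.setProperty(OozieClient.APP_PATH, "hdfs:///apps/ingest/workflow.xml"); // placeholder
            conf.setProperty("nameNode", "hdfs://nn-host:8020");   // placeholder
            conf.setProperty("jobTracker", "rm-host:8032");        // placeholder

            String jobId = oozie.run(conf);                // submit and start the workflow
            WorkflowJob job = oozie.getJobInfo(jobId);
            while (job.getStatus() == WorkflowJob.Status.PREP
                    || job.getStatus() == WorkflowJob.Status.RUNNING) {
                Thread.sleep(10_000);                      // poll every 10 seconds
                job = oozie.getJobInfo(jobId);
            }
            System.out.println(jobId + " finished with status " + job.getStatus());
        }
    }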

Environment: Hadoop, HDFS, Hive, Oozie, Sqoop, ESP Workstation, Shell Scripting, HBase, GitHub, Tableau, Oracle, MySQL

Client: JP Morgan Chase

Confidential, Columbus, Ohio

Hadoop Developer

Responsibilities:

  • Worked on a Hadoop cluster scaling from 4 nodes in the development environment to 8 nodes in pre-production and up to 24 nodes in production.
  • Involved in the complete implementation life cycle.
  • Extensively used Hive queries (HQL) to query and search for data in Hive tables stored in HDFS.
  • Created Hive managed and external tables and loaded the transformed data into them. Used Avro, JSON, and XML file formats.
  • Good Linux and Hadoop system administration skills, networking, shell scripting, and familiarity with open-source configuration management and deployment tools such as Chef.
  • Managed and scheduled jobs to remove duplicate log data files in HDFS using Oozie.
  • Utilized Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive and Sqoop as well as system specific jobs.
  • Used Apache Oozie for scheduling and managing Hadoop jobs; knowledgeable about HCatalog.
  • Designed and created data ingest pipelines using technologies such as Spring Integration and Apache Storm with Kafka.
  • Implemented test scripts to support test driven development and continuous integration.
  • Moved data from HDFS to the Oracle database and vice versa using Sqoop.
  • Documented the procedures performed during project development.
  • Exported data from HDFS into an RDBMS using Sqoop for report generation and visualization.
  • Responsible for writing Hive queries to analyze data in the Hive warehouse using Hive Query Language (HQL); see the Hive JDBC sketch after this list.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which better suited the current requirements.
  • Involved in moving all log files generated from various sources to HDFS for further processing.
  • Extracted data from Teradata into HDFS using Sqoop. Supported data analysts in running MapReduce programs.
  • Developed Hive queries to process the data and generate data cubes for visualization.
  • Developed the UNIX shell scripts for creating the reports from Hive data.
  • Documented the various objects and the changes I made and transferred that knowledge to the production support team.
  • Extensively used Sqoop to get data from RDBMS sources like Teradata and Oracle.
  • Collected metrics for all ingested data on a weekly basis and provided reports for the business.
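
A minimal sketch of running an HQL aggregation over a Hive warehouse table through HiveServer2 JDBC, of the sort used above to build data cubes for visualization; the connection URL, credentials, table, and column names are illustrative assumptions:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Runs a cube-style aggregation on a Hive table via HiveServer2 JDBC.
    public class HiveCubeReport {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:hive2://hive-host:10000/analytics", "etl_user", ""); // placeholder
                 Statement stmt = conn.createStatement()) {

                // GROUP BY ... WITH CUBE produces the roll-up combinations that
                // feed the visualization layer.
                ResultSet rs = stmt.executeQuery(
                        "SELECT region, product, SUM(amount) AS total "
                      + "FROM sales GROUP BY region, product WITH CUBE");  // hypothetical table

                while (rs.next()) {
                    System.out.printf("%s | %s | %s%n",
                            rs.getString("region"), rs.getString("product"),
                            rs.getString("total"));
                }
            }
        }
    }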

Environment: Linux (Red Hat), UNIX shell, Oracle, Hive, MapReduce, Core Java, JDK 1.7, Oozie workflows, Cloudera, HBase, Sqoop, Cloudera Manager.

Confidential

SQL Developer

Responsibilities:

  • Interacted with the users for understanding and gathering business requirements.
  • Designed a complex SSIS package for data transfer from three different firm sources to a single destination in SQL Server 2005.
  • Developed and optimized database designs for new applications. Data residing in the source tables was migrated into staging tables and then into final tables.
  • Implemented data views and control tools to guarantee data transformation using SSIS. Successfully deployed SSIS packages with defined security.
  • Developed the logical database design and converted it into a physical database using Erwin.
  • Wrote complex T-SQL queries and stored procedures for generating reports (see the stored-procedure call sketch after this list). Successfully worked with Report Server and configured it with SQL Server 2005.
  • Responsible for monitoring performance and optimizing SQL queries for maximum efficiency.
  • Scheduled subscription reports with the Subscription Report Wizard.
  • Involved in the analysis, design, development, testing, deployment and user training of analytical and transactional reporting system.
  • Used existing stored procedures, wrote new stored procedures and triggers, modified existing ones, and tuned them for good performance.
  • Tuned SQL queries using execution plans for better performance.
  • Optimized query performance by assigning relative weights to the tables in the catalog. Analyzed reports and fixed bugs in stored procedures.
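
For consistency with the other code sketches in this resume, a minimal JDBC example of invoking a reporting stored procedure of the kind described above; the procedure name, parameter, and connection details are illustrative assumptions (the original work itself was done in T-SQL, SSIS, and Report Server rather than Java):

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;

    // Calls a hypothetical reporting stored procedure on SQL Server and prints rows.
    public class MonthlyReportRunner {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:sqlserver://db-host:1433;databaseName=Reporting;encrypt=false"; // placeholder
            try (Connection conn = DriverManager.getConnection(url, "report_user", "secret");
                 CallableStatement call = conn.prepareCall("{call dbo.usp_MonthlyReport(?)}")) { // hypothetical proc

                call.setInt(1, 2012);                    // report year parameter (assumption)
                try (ResultSet rs = call.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1) + ": " + rs.getBigDecimal(2));
                    }
                }
            }
        }
    }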

Environment: MS SQL Server 2005/2008, SSDT, T-SQL, SQL Profiler, Execution Plan, WinMerge, Notepad++

Confidential

Associate Client Analyst

Responsibilities:

  • Perform financial analyses and rent roll reviews for assigned portfolios in accordance with CMSA guidelines, Agency requirements and internal policies and procedures
  • Research and comment on period-to-period variances, contact borrowers for additional information, and interact with other areas of servicing to ensure complete and accurate analyses are reported
  • Ensure trigger events and other loan covenants are addressed upon completion of financial analysis
  • Perform quality control reviews of financial analyses and trigger analyses
  • Work in conjunction with the Client Relations group to represent the Company to investors, trustees, rating agencies and borrowers, etc. with respect to property financial statement matters
  • Ensure all systems are updated with the results of the financial statement analysis; these systems include, but are not limited to, Asset Surveillance, Investor Query, CAG Workbench, and the Freddie Mac PRS system
  • Handle client requests relating to assigned portfolio(s) in an accurate and expedient manner
  • Monitor compliance for Financial Statement collection, analysis, and distribution and follow up with external parties
  • Manage third party vendor & client relationships
  • Domestic and international travel may be required

Environment: Microsoft Office (advanced), including Outlook, Word, PowerPoint, and Excel.
