
Hadoop Developer Resume


RI

SUMMARY:

  • Around 5+ years of experience in the IT industry across the complete software development life cycle (SDLC), including business requirements gathering, system analysis & design, data modeling, development, testing, and implementation of projects.
  • Experience in configuring, deploying, and managing different Hadoop distributions such as Cloudera (CDH4 & CDH5) and Hortonworks (HDP).
  • Experience importing/exporting data with Sqoop between the Hadoop Distributed File System (HDFS) and relational database systems.
  • Good knowledge and understanding of MapReduce programs.
  • Experience with the Hadoop ecosystem for ingestion, storage, querying, processing, and analysis of big data.
  • Experience with optimization techniques in the sort and shuffle phases of MapReduce programs; implemented optimized joins across data from different data sources.
  • Experience in defining job flows and in managing and reviewing Hadoop log files.
  • Created and maintained tables, views, procedures, functions, packages, DB triggers, and indexes.
  • Used Sqoop to import data from RDBMS sources into Hive tables.
  • Developed MapReduce jobs in Java to preprocess data (see the sketch following this summary).
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Created Hive internal/external tables and worked on them using HiveQL.
  • Responsible for managing data coming from different data sources.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Experience handling various file formats such as Avro, SequenceFile, text, XML, JSON, and Parquet with compression codecs such as gzip, LZO, and Snappy.
  • Imported data from HDFS into Spark DataFrames for in-memory computation to generate optimized output and better visualizations.
  • Experience collecting real-time streaming data and building pipelines that ingest raw data from different sources using Kafka and store it into HDFS and NoSQL databases using Spark.
  • Implemented a POC using Impala for data processing on top of Hive for better resource utilization.
  • Knowledge of NoSQL databases such as HBase and Cassandra and their integration with Hadoop clusters.
  • Experienced with Oozie to automate data movement between different Hadoop systems.
  • Good understanding of security requirements for Hadoop and of integration with Kerberos authentication and authorization infrastructure. Mentored analysts and the test team in writing Hive queries.
  • Experience in writing Hive queries for processing and analyzing large volumes of data.
  • Interacted effectively with members of the Business Engineering, Quality Assurance, and other teams involved in the system development life cycle.
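
A minimal sketch of the kind of Java MapReduce preprocessing job referenced in the summary above; the tab-delimited, five-field record layout and the input/output paths are hypothetical assumptions, not details from the original projects.

```java
// Map-only preprocessing job: drops malformed lines from tab-delimited input.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CleanRecordsJob {

    public static class CleanMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t", -1);
            // keep only well-formed five-field records (hypothetical layout)
            if (fields.length == 5 && !fields[0].isEmpty()) {
                context.write(NullWritable.get(), value);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "clean-records");
        job.setJarByClass(CleanRecordsJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only: no sort/shuffle needed for cleaning
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // raw HDFS input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // cleaned output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Keeping the job map-only (zero reducers) avoids the sort and shuffle phases entirely, which is usually the cheapest way to run a pure record-cleaning pass.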

TECHNICAL SKILLS:

Big Data Ecosystem Components: HDFS, Hadoop MapReduce, ZooKeeper, Hive, Sqoop, Spark, Kafka, Oozie, HiveQL.

GUI Tools: Hue, GitHub, GitLab, Splunk.

Query Tools: TOAD, Toad Data Point, PL/SQL Developer, SQL Developer, and SQL*Plus.

Databases: Oracle (SQL), Teradata, SQL Server.

Web Technologies: HTML, CSS, JavaScript

Operating Systems: Linux 5, UNIX, Windows XP, 7, 8, and 10.

PROFESSIONAL EXPERIENCE:

Confidential, RI

Hadoop Developer

Responsibilities:

  • Involved in the complete big data flow of the application: data ingestion from upstream into HDFS, processing the data in HDFS, and analyzing the data using several tools.
  • Imported data in various formats such as text, CSV, Avro, and Parquet into the HDFS cluster with compression for optimization.
  • Ingested data from RDBMS sources such as Oracle, SQL Server, and Teradata into HDFS using Sqoop.
  • Configured Hive and participated in writing Hive UDFs and UDAFs (see the sketch after this list); also created static and dynamic partitions with bucketing.
  • Imported and exported data into HDFS and Hive using Sqoop (batch) and Kafka (streaming).
  • Used Hive join queries to join multiple tables of a source system and load them into the data lake.
  • Managed and reviewed large Hadoop log files.
  • Involved in production support by performing normal, bulk, initial, incremental, daily, and monthly loads, and developed reports on issues related to the data warehouse.
  • Involved in planning, building, and managing successful large-scale data warehouse and decision support systems.
  • Involved in building the ETL architecture and source-to-target mapping to load data into the data warehouse.
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Implemented data integrity and data quality checks in Hadoop using Hive and Linux scripts.
  • Involved in migrating data from Oracle to the Hadoop data lake using Sqoop import.
  • Implemented schema extraction for Parquet and Avro file formats in Hive.
  • Created Apache Oozie workflows and coordinators to schedule and monitor various jobs, including Sqoop, Hive, and shell script actions.
  • Managed the metadata associated with the ETL processes used to populate the data warehouse.
  • Conducted unit testing on data model changes, ETL, and data warehouse loads to ensure accurate results within the context of defined requirements.
  • Created data pipelines per business requirements and scheduled them using Oozie coordinators.
  • Maintained technical documentation for each step of the development environment, including HLD and LLD.
  • Involved in developing, building, testing, and deploying to the Hadoop cluster in distributed mode.
  • Gathered business requirements from business partners and subject matter experts.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Extensively used ESP Workstation to schedule Oozie jobs.
  • Understood the security requirements for Hadoop and integrated with Kerberos authentication and authorization infrastructure.
  • Built the automated build and deployment framework using GitHub, Maven, etc.
  • Worked with BI tools such as Tableau to create weekly, monthly, and daily report dashboards in Tableau Desktop and published them to the HDFS cluster.
  • Created reports using Tableau for business data visualization.
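
A minimal sketch of a Hive UDF of the kind referenced in the list above, using the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name, function logic, and example values are illustrative assumptions rather than details from the original project.

```java
// Hypothetical Hive UDF: trims and upper-cases a string column.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class NormalizeState extends UDF {
    // e.g. "  rhode island " -> "RHODE ISLAND"; null input stays null
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

Once packaged into a jar, a UDF like this would typically be registered in a Hive session with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries against the partitioned tables.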

Environment: Hadoop, HDFS, Hive, Oozie, Sqoop, ESP Workstation, Shell Scripting, HBase, GitHub, Tableau, Oracle, MySQL

Confidential, Columbus, Ohio

Hadoop Developer

Responsibilities:

  • Worked on Hadoop cluster scaling, from 4 nodes in the development environment to 8 nodes in pre-production and up to 24 nodes in production.
  • Involved in the complete implementation lifecycle.
  • Extensively used Hive/HQL queries to query and search for strings in Hive tables in HDFS.
  • Created Hive managed and external tables and loaded the transformed data into them; used Avro, JSON, and XML file formats.
  • Extracted data from flat files and other RDBMS databases into the staging area and populated the data warehouse.
  • Created sessions and configured workflows to extract data from various sources, transform it, and load it into the data warehouse.
  • Good Linux and Hadoop system administration skills, networking, shell scripting, and familiarity with open-source configuration management and deployment tools such as Chef.
  • Managed and scheduled jobs to remove duplicate log data files in HDFS using Oozie.
  • Utilized the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, and Sqoop, as well as system-specific jobs.
  • Used Apache Oozie for scheduling and managing Hadoop jobs; knowledgeable on HCatalog.
  • Designed and created data ingest pipelines using technologies such as Spring Integration and Apache Storm with Kafka.
  • Implemented test scripts to support test driven development and continuous integration.
  • Moved data from HDFS to the Oracle database and vice versa using Sqoop.
  • Documented the procedures performed during project development.
  • Exported data from the HDFS environment into RDBMS using Sqoop for report generation and visualization purposes.
  • Responsible for writing Hive queries to analyze data in the Hive warehouse using Hive Query Language (HQL); a JDBC sketch follows this list.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which better suited the current requirements.
  • Involved in moving all log files generated from various sources into HDFS for further processing.
  • Extracted data from Teradata into HDFS using Sqoop.
  • Supported data analysts in running MapReduce programs.
  • Developed Hive queries to process the data and generate data cubes for visualization.
  • Developed UNIX shell scripts for creating reports from Hive data.
  • Documented and transferred knowledge about the various objects and my changes to the production support team.
  • Extensively used Sqoop to pull data from RDBMS sources such as Teradata and Oracle.
  • Collected metrics for all ingested data on a weekly basis and provided reports to the business.
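
A minimal sketch of running an HQL report query of the kind described above through the HiveServer2 JDBC driver; the connection URL, user, table, and column names are hypothetical assumptions, not details from the original engagement.

```java
// Hypothetical HQL aggregation over a warehouse table via HiveServer2 JDBC.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveReportQuery {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        String url = "jdbc:hive2://hiveserver:10000/default"; // hypothetical host
        try (Connection conn = DriverManager.getConnection(url, "etl_user", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT event_date, COUNT(*) AS cnt "
                   + "FROM web_logs GROUP BY event_date ORDER BY event_date")) {
            while (rs.next()) {
                System.out.println(rs.getString("event_date") + "\t" + rs.getLong("cnt"));
            }
        }
    }
}
```

The same query could be run from the Hive CLI or Hue; the JDBC route is shown here because it is a common way to wire report generation into downstream Java tooling.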

Environment: Linux (Red Hat), UNIX shell, Oracle, Hive, MapReduce, Core Java, JDK 1.7, Oozie workflows, Cloudera, HBase, Sqoop, Cloudera Manager.

Confidential

SQL Developer

Responsibilities:

  • Interacted with users to understand and gather business requirements.
  • Designed a complex SSIS package to transfer data from three different firm sources to a single SQL Server 2005 destination.
  • Developed and optimized database design for new applications.
  • Migrated data residing in the source tables into staging tables and then into final tables.
  • Implemented data views and control tools to guarantee data transformation using SSIS.
  • Successfully deployed SSIS packages with defined security.
  • Developed the logical database design and converted it into a physical database using Erwin.
  • Wrote complex T-SQL queries and stored procedures for generating reports (see the sketch after this list).
  • Successfully worked with Report Server and configured it on SQL Server 2005.
  • Responsible for monitoring performance and optimizing SQL queries for maximum efficiency.
  • Scheduled subscription reports with the Subscription Report Wizard.
  • Involved in the analysis, design, development, testing, deployment, and use of analytical and transactional reporting systems.
  • Used stored procedures, wrote new stored procedures and triggers, modified existing ones, and tuned them so that they performed well.
  • Tuned SQL queries using execution plans for better performance.
  • Optimized catalog and query performance by assigning relative weights to the tables.
  • Analyzed reports and fixed bugs in stored procedures.
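
To keep the examples in one language, a minimal Java JDBC sketch of invoking one of these reporting stored procedures; the SQL Server host, database, credentials, and procedure name (dbo.usp_MonthlyReport) are hypothetical, and the T-SQL body of the procedure itself is not shown.

```java
// Hypothetical call to a T-SQL reporting stored procedure via the SQL Server JDBC driver.
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class RunMonthlyReport {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:sqlserver://dbhost:1433;databaseName=Reports"; // hypothetical
        try (Connection conn = DriverManager.getConnection(url, "report_user", "<password>");
             CallableStatement stmt = conn.prepareCall("{call dbo.usp_MonthlyReport(?)}")) {
            stmt.setInt(1, 2015); // hypothetical @ReportYear parameter
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getString(2));
                }
            }
        }
    }
}
```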

Environment: MS SQL Server 2005/2008, SSDT, T-SQL, SQL Profiler, Execution Plan, WinMerge, Notepad++

Confidential

Associate Client Analyst

Responsibilities:

  • Perform financial analyses and rent roll reviews for assigned portfolios in accordance with CMSA guidelines, Agency requirements and internal policies and procedures
  • Research and comment on period-to-period variances, contact borrowers for additional information, and interact with other areas of servicing to ensure complete and accurate analyses are reported
  • Ensure trigger events and other loan covenants are addressed upon completion of financial analysis
  • Perform quality control reviews of financial analyses and trigger analyses
  • Work in conjunction with the Client Relations group to represent the Company to investors, trustees, rating agencies, borrowers, etc. with respect to property financial statement matters
  • Ensure all systems are updated with the results of the financial statement analysis; these systems include, but are not limited to, Asset Surveillance, Investor Query, CAG Workbench, and the Freddie Mac PRS system
  • Handle client requests relating to assigned portfolio(s) in an accurate and expedient manner
  • Monitor compliance for Financial Statement collection, analysis, and distribution and follow up with external parties
  • Manage third party vendor & client relationships
  • Domestic and international travel may be required

Environment: Microsoft Office (advanced), including Outlook, Word, PowerPoint, and Excel.
