
Hadoop Developer Resume


RI

SUMMARY:

  • Around 5+ years of experience in the IT industry across the complete software development life cycle (SDLC), including business requirements gathering, system analysis & design, data modeling, development, testing, and implementation of projects.
  • Experience in configuring, deploying, and managing different Hadoop distributions such as Cloudera (CDH4 & CDH5) and Hortonworks (HDP).
  • Experience importing/exporting data with Sqoop between the Hadoop Distributed File System (HDFS) and relational database systems.
  • Good knowledge and understanding of MapReduce programs.
  • Experience with the Hadoop ecosystem for ingestion, storage, querying, processing, and analysis of big data.
  • Experience with optimization techniques in the sort and shuffle phases of MapReduce programs; implemented optimized joins across data from different data sources.
  • Experience in defining job flows and in managing and reviewing Hadoop log files.
  • Created and maintained tables, views, procedures, functions, packages, DB triggers, and indexes.
  • Used Sqoop to import data from RDBMS sources into Hive tables.
  • Developed MapReduce jobs in Java to preprocess data (see the sketch following this summary).
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Created Hive internal/external tables and worked on them using HiveQL.
  • Responsible for managing data coming from different data sources.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Experience handling various file formats such as Avro, SequenceFile, text, XML, JSON, and Parquet with compression codecs such as gzip, LZO, and Snappy.
  • Imported data from HDFS into Spark DataFrames for in-memory computation to generate optimized output and better visualizations.
  • Experience collecting real-time streaming data and building pipelines that ingest raw data from different sources using Kafka and store it into HDFS and NoSQL databases using Spark.
  • Implemented a POC using Impala for data processing on top of Hive for better resource utilization.
  • Knowledge of NoSQL databases such as HBase and Cassandra and their integration with Hadoop clusters.
  • Experienced with Oozie to automate data movement between different Hadoop systems.
  • Good understanding of security requirements for Hadoop and of integration with Kerberos authentication and authorization infrastructure. Mentored analysts and the test team in writing Hive queries.
  • Experience in writing Hive queries for processing and analyzing large volumes of data.
  • Interacted effectively with members of the Business Engineering, Quality Assurance, and other teams involved in the system development life cycle.
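
A minimal sketch of the kind of Java MapReduce preprocessing job referenced in the summary above; the tab-delimited, five-field record layout and the input/output paths are hypothetical assumptions, not details from the original projects.

```java
// Map-only preprocessing job: drops malformed lines from tab-delimited input.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CleanRecordsJob {

    public static class CleanMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t", -1);
            // keep only well-formed five-field records (hypothetical layout)
            if (fields.length == 5 && !fields[0].isEmpty()) {
                context.write(NullWritable.get(), value);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "clean-records");
        job.setJarByClass(CleanRecordsJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only: no sort/shuffle needed for cleaning
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // raw HDFS input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // cleaned output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Keeping the job map-only (zero reducers) avoids the sort and shuffle phases entirely, which is usually the cheapest way to run a pure record-cleaning pass.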

TECHNICAL SKILLS:

Big Data Ecosystem Components: HDFS, Hadoop MapReduce, ZooKeeper, Hive, Sqoop, Spark, Kafka, Oozie, HiveQL.

GUI Tools: Hue, GitHub, GitLab, Splunk.

Query Tools: TOAD, Toad Data Point, PL/SQL Developer, SQL Developer, and SQL*Plus.

Databases: Oracle (SQL), Teradata, SQL Server.

Web Technologies: HTML, CSS, JavaScript

Operating Systems: Linux 5, UNIX, Windows XP, 7, 8, and 10.

PROFESSIONAL EXPERIENCE:

Confidential, RI

Hadoop Developer

Responsibilities:

  • Involved in the complete big data flow of the application: data ingestion from upstream into HDFS, processing the data in HDFS, and analyzing the data using several tools.
  • Imported data in various formats such as text, CSV, Avro, and Parquet into the HDFS cluster with compression for optimization.
  • Ingested data from RDBMS sources such as Oracle, SQL Server, and Teradata into HDFS using Sqoop.
  • Configured Hive and participated in writing Hive UDFs and UDAFs (see the sketch after this list); also created static and dynamic partitions with bucketing.
  • Imported and exported data into HDFS and Hive using Sqoop (batch) and Kafka (streaming).
  • Used Hive join queries to join multiple tables of a source system and load them into the data lake.
  • Managed and reviewed large Hadoop log files.
  • Involved in production support by performing normal, bulk, initial, incremental, daily, and monthly loads, and developed reports on issues related to the data warehouse.
  • Involved in planning, building, and managing successful large-scale data warehouse and decision support systems.
  • Involved in building the ETL architecture and source-to-target mapping to load data into the data warehouse.
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Implemented data integrity and data quality checks in Hadoop using Hive and Linux scripts.
  • Involved in migrating data from Oracle to the Hadoop data lake using Sqoop import.
  • Implemented schema extraction for Parquet and Avro file formats in Hive.
  • Created Apache Oozie workflows and coordinators to schedule and monitor various jobs, including Sqoop, Hive, and shell script actions.
  • Managed the metadata associated with the ETL processes used to populate the data warehouse.
  • Conducted unit testing on data model changes, ETL, and data warehouse loads to ensure accurate results within the context of defined requirements.
  • Created data pipelines per business requirements and scheduled them using Oozie coordinators.
  • Maintained technical documentation for each step of the development environment, including HLD and LLD.
  • Involved in developing, building, testing, and deploying to the Hadoop cluster in distributed mode.
  • Gathered business requirements from business partners and subject matter experts.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Extensively used ESP Workstation to schedule Oozie jobs.
  • Understood the security requirements for Hadoop and integrated with Kerberos authentication and authorization infrastructure.
  • Built the automated build and deployment framework using GitHub, Maven, etc.
  • Worked with BI tools such as Tableau to create weekly, monthly, and daily report dashboards in Tableau Desktop and published them to the HDFS cluster.
  • Created reports using Tableau for business data visualization.
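
A minimal sketch of a Hive UDF of the kind referenced in the list above, using the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name, function logic, and example values are illustrative assumptions rather than details from the original project.

```java
// Hypothetical Hive UDF: trims and upper-cases a string column.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class NormalizeState extends UDF {
    // e.g. "  rhode island " -> "RHODE ISLAND"; null input stays null
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

Once packaged into a jar, a UDF like this would typically be registered in a Hive session with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries against the partitioned tables.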

Environment: Hadoop, HDFS, Hive, Oozie, Sqoop, ESP Workstation, Shell Scripting, HBase, GitHub, Tableau, Oracle, MySQL

Confidential, Columbus, Ohio

Hadoop Developer

Responsibilities:

  • Worked on Hadoop cluster scaling, from 4 nodes in the development environment to 8 nodes in pre-production and up to 24 nodes in production.
  • Involved in the complete implementation lifecycle.
  • Extensively used Hive/HQL queries to query and search for strings in Hive tables in HDFS.
  • Created Hive managed and external tables and loaded the transformed data into them; used Avro, JSON, and XML file formats.
  • Extracted data from flat files and other RDBMS databases into the staging area and populated the data warehouse.
  • Created sessions and configured workflows to extract data from various sources, transform it, and load it into the data warehouse.
  • Good Linux and Hadoop system administration skills, networking, shell scripting, and familiarity with open-source configuration management and deployment tools such as Chef.
  • Managed and scheduled jobs to remove duplicate log data files in HDFS using Oozie.
  • Utilized the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, and Sqoop, as well as system-specific jobs.
  • Used Apache Oozie for scheduling and managing Hadoop jobs; knowledgeable on HCatalog.
  • Designed and created data ingest pipelines using technologies such as Spring Integration and Apache Storm with Kafka.
  • Implemented test scripts to support test driven development and continuous integration.
  • Moved data from HDFS to the Oracle database and vice versa using Sqoop.
  • Documented the procedures performed during project development.
  • Exported data from the HDFS environment into RDBMS using Sqoop for report generation and visualization purposes.
  • Responsible for writing Hive queries to analyze data in the Hive warehouse using Hive Query Language (HQL); a JDBC sketch follows this list.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which better suited the current requirements.
  • Involved in moving all log files generated from various sources into HDFS for further processing.
  • Extracted data from Teradata into HDFS using Sqoop.
  • Supported data analysts in running MapReduce programs.
  • Developed Hive queries to process the data and generate data cubes for visualization.
  • Developed UNIX shell scripts for creating reports from Hive data.
  • Documented and transferred knowledge about the various objects and my changes to the production support team.
  • Extensively used Sqoop to pull data from RDBMS sources such as Teradata and Oracle.
  • Collected metrics for all ingested data on a weekly basis and provided reports to the business.
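
A minimal sketch of running an HQL report query of the kind described above through the HiveServer2 JDBC driver; the connection URL, user, table, and column names are hypothetical assumptions, not details from the original engagement.

```java
// Hypothetical HQL aggregation over a warehouse table via HiveServer2 JDBC.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveReportQuery {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        String url = "jdbc:hive2://hiveserver:10000/default"; // hypothetical host
        try (Connection conn = DriverManager.getConnection(url, "etl_user", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT event_date, COUNT(*) AS cnt "
                   + "FROM web_logs GROUP BY event_date ORDER BY event_date")) {
            while (rs.next()) {
                System.out.println(rs.getString("event_date") + "\t" + rs.getLong("cnt"));
            }
        }
    }
}
```

The same query could be run from the Hive CLI or Hue; the JDBC route is shown here because it is a common way to wire report generation into downstream Java tooling.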

Environment: Linux (Red Hat), UNIX shell, Oracle, Hive, MapReduce, Core Java, JDK 1.7, Oozie workflows, Cloudera, HBase, Sqoop, Cloudera Manager.

Confidential

SQL Developer

Responsibilities:

  • Interacted with users to understand and gather business requirements.
  • Designed a complex SSIS package to transfer data from three different firm sources to a single SQL Server 2005 destination.
  • Developed and optimized database design for new applications.
  • Migrated data residing in the source tables into staging tables and then into final tables.
  • Implemented data views and control tools to guarantee data transformation using SSIS.
  • Successfully deployed SSIS packages with defined security.
  • Developed the logical database design and converted it into a physical database using Erwin.
  • Wrote complex T-SQL queries and stored procedures for generating reports (see the sketch after this list).
  • Successfully worked with Report Server and configured it on SQL Server 2005.
  • Responsible for monitoring performance and optimizing SQL queries for maximum efficiency.
  • Scheduled subscription reports with the Subscription Report Wizard.
  • Involved in the analysis, design, development, testing, deployment, and use of analytical and transactional reporting systems.
  • Used stored procedures, wrote new stored procedures and triggers, modified existing ones, and tuned them so that they performed well.
  • Tuned SQL queries using execution plans for better performance.
  • Optimized catalog and query performance by assigning relative weights to the tables.
  • Analyzed reports and fixed bugs in stored procedures.
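
To keep the examples in one language, a minimal Java JDBC sketch of invoking one of these reporting stored procedures; the SQL Server host, database, credentials, and procedure name (dbo.usp_MonthlyReport) are hypothetical, and the T-SQL body of the procedure itself is not shown.

```java
// Hypothetical call to a T-SQL reporting stored procedure via the SQL Server JDBC driver.
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class RunMonthlyReport {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:sqlserver://dbhost:1433;databaseName=Reports"; // hypothetical
        try (Connection conn = DriverManager.getConnection(url, "report_user", "<password>");
             CallableStatement stmt = conn.prepareCall("{call dbo.usp_MonthlyReport(?)}")) {
            stmt.setInt(1, 2015); // hypothetical @ReportYear parameter
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getString(2));
                }
            }
        }
    }
}
```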

Environment: MS SQL Server 2005/2008, SSDT, T-SQL, SQL Profiler, Execution Plan, WinMerge, Notepad++

Confidential

Associate Client Analyst

Responsibilities:

  • Perform financial analyses and rent roll reviews for assigned portfolios in accordance with CMSA guidelines, Agency requirements and internal policies and procedures
  • Research and comment on period-to-period variances, contact borrowers for additional information, and interact with other areas of servicing to ensure complete and accurate analyses are reported
  • Ensure trigger events and other loan covenants are addressed upon completion of financial analysis
  • Perform quality control reviews of financial analyses and trigger analyses
  • Work in conjunction with the Client Relations group to represent the Company to investors, trustees, rating agencies, borrowers, etc. with respect to property financial statement matters
  • Ensure all systems are updated with the results of the financial statement analysis; these systems include, but are not limited to, Asset Surveillance, Investor Query, CAG Workbench, and the Freddie Mac PRS system
  • Handle client requests relating to assigned portfolio(s) in an accurate and expedient manner
  • Monitor compliance for Financial Statement collection, analysis, and distribution and follow up with external parties
  • Manage third party vendor & client relationships
  • Domestic and international travel may be required

Environment: Microsoft Office (advanced), including Outlook, Word, PowerPoint, and Excel.
