Big Data/BI Developer Resume

San Roman, CA

SUMMARY:

  • Good experience in Apache Hadoop Framework, HDFS, Map/Reduce, Pig, Hive, Sqoop and Cloudera's Hadoop distribution.
  • Good experience with the Oozie framework and with automating daily import jobs.
  • Extensive experience with Informatica ETL and with big data query languages such as Pig Latin and HiveQL.
  • Proficient in big data ingestion and streaming tools like Flume, S3Cmd, Sqoop, Kafka and Storm.
  • Hands-on experience with SequenceFile, Avro and HAR file formats and with compression.
  • Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
  • Hands-on experience in installing, configuring and using ecosystem components like Hadoop MapReduce, Apache Camel, HDFS, HBase, Oozie, Sqoop, Flume, Pig and Hive.
  • Excellent hands-on experience in analyzing data using Pig Latin, HQL, HBase and MapReduce programs in Java.
  • Expertise in Amazon AWS services like EMR, EC2, S3, Data Pipeline, IAM, VPC and Redshift, which provide fast and efficient processing of big data.
  • Extended Hive and Pig core functionality by writing custom UDFs, including UDAFs and UDTFs.
  • Good experience with Sqoop and Apache Flume for collecting, aggregating and moving large amounts of relational and streaming data from application servers.
  • Worked with ZooKeeper, Oozie, AppWorx and Data Pipeline operational services for coordinating the cluster and scheduling workflows.
  • In-depth understanding of designing and coding with SQL and Linux/UNIX technologies.

TECHNICAL SKILLS:

Big Data (Hadoop Framework): Hadoop, MapReduce, HDFS, HBase, Cassandra, ZooKeeper, Ambari, Kafka, Hive, Pig, Sqoop, Oozie, Flume and Impala

Databases: MySQL, HBase, MongoDB, Cassandra

Languages: SQL, HQL, Pig Latin, MapReduce

Development Tools: Eclipse, Toad, MySQL

Web Technologies: JSP, JDBC, AWT, Swing, JSF, XML, VMware

Office Tools: Microsoft Office suite

Scripting Languages: Linux shell scripts

Operating Systems: Windows 8, Windows 7, UNIX, Linux, CentOS, Ubuntu

PROFESSIONAL EXPERIENCE:

Big Data/BI Developer

Confidential, San Roman, CA

Responsibilities:

  • Responsible for importing log files from various sources into HDFS using Flume. 
  • Created a customized BI tool for the management team that performs query analytics using HiveQL.
  • Used Hive, a Spark SQL connection and Pig to generate Tableau BI reports.
  • Used Sqoop to import data from MySQL into HDFS on a regular basis.
  • Created partitions and buckets based on state for further processing with bucket-based Hive joins.
  • Created Hive generic UDFs to process business logic that varies by policy.
  • Moved relational database data with Sqoop into Hive dynamic-partition tables by way of staging tables.
  • Developed various data connections from data sources to Tableau Server for report and dashboard development.
  • Worked with clients to better understand their reporting and dashboarding needs and presented solutions using structured Waterfall and Agile project methodologies.
  • Developed metrics, attributes, filters, reports, dashboards and also created advanced chart types, visualizations and complex calculations to manipulate the data.
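
The partition and bucket layout mentioned above can be illustrated with a minimal Python sketch. It is not the original Hive DDL: the table directory `policies`, the `policy_id` join key, the bucket count and the CRC32 hash are all assumptions made for illustration, and Hive uses its own hash function. The point is only that rows are split into `state=<value>` directories and that equal keys always land in the same bucket number, which is what lets a bucket-based join compare matching buckets only.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count


def partition_path(table_dir, state):
    """Return a Hive-style partition directory such as 'policies/state=CA'."""
    return f"{table_dir}/state={state}"


def bucket_for(key, num_buckets=NUM_BUCKETS):
    """Pick a bucket with a stable hash; CRC32 is a stand-in, not Hive's own hash."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def layout(rows, table_dir="policies"):
    """Group rows by (partition directory, bucket number)."""
    files = defaultdict(list)
    for row in rows:
        key = (partition_path(table_dir, row["state"]), bucket_for(row["policy_id"]))
        files[key].append(row)
    return dict(files)
```

Calling `layout` on a list of dictionaries with `state` and `policy_id` fields returns one group per partition and bucket pair, mirroring how bucketed files sit under each partition directory.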

Hadoop Developer

Confidential, San Jose, CA

Responsibilities:

  • Collaborated with designers and analysts to implement enhancements or new applications.
  • Developed code to meet story acceptance criteria.
  • Conducted design and code reviews to ensure compliance with standards.
  • Developed and maintained large-scale distributed data platforms, with experience in data warehouses, data marts and data lakes.
  • Experienced with software techniques, data manipulation techniques, ETL and SQL tuning.
  • Developed solutions using the Hadoop ecosystem, such as Hadoop, Spark, Hive, HBase, Pig, Sqoop, Oozie, Ambari and ZooKeeper.
  • Loaded data from RDBMS systems into HDFS using Sqoop and Flume.
  • Set up Hadoop monitoring tools like Grafana, Graphite and Ambari for different dashboards.

Hadoop Developer

Confidential, Bethpage, NY

Responsibilities:

  • Managed highly unstructured and semi-structured data of 350 TB in size.
  • Designed and developed Pig ETL scripts that process data in a nightly batch to perform trend analysis of Internet WiFi usage.
  • Developed Hive scripts for end-user and analyst requirements for ad hoc analysis.
  • Applied partitioning and bucketing in Hive and designed both managed and external tables for optimized performance.
  • Solved performance issues in Hive and Pig scripts by understanding how joins, grouping and aggregation translate into MapReduce jobs.
  • Tuned Hive and Pig scripts to improve performance.
  • Coordinated offshore and onshore teams and arranged weekly meetings to discuss and track development progress.
  • Conducted and documented proofs of concept (POCs) for different use cases on the AWS platform.
  • Troubleshot performance issues and tuned the Hadoop cluster.
  • Used Sqoop to perform lastmodified and append incremental imports from Oracle into HDFS.
  • Created files in SequenceFile, Avro and HAR formats.
  • Developed an Oozie workflow for scheduling and orchestrating the Talend ETL process.
  • Performed data analysis on HBase data through Hive external tables mapped to HBase.
  • Troubleshot single point of failure (SPOF) issues in Hadoop daemons and their recovery procedures.
  • Installed and configured Cloudera CDH4 nodes on Amazon EC2.
  • Worked with the infrastructure and admin teams in designing, modeling, sizing and configuring a 15-node Hadoop cluster on AWS EC2.
  • Gathered business requirements from the business partners and subject matter experts.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet the business requirements.

Environment: Java 6 (JDK 1.6), Eclipse, Oracle 11g, Subversion, Hadoop (Cloudera distribution), Hive, HBase, Linux, MapReduce, HDFS, Teradata, Informatica, Talend Studio, DataStax, IBM DataStage 8.1, Toad 9.6, UNIX shell scripting and PuTTY.
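
The Sqoop lastmodified and append imports listed above can be sketched as a small Python helper that assembles the command line. This is an illustration under assumptions, not the scripts used on the project: the helper name, the JDBC URL, the table and the column names are hypothetical placeholders, and only the standard `sqoop import` options `--incremental`, `--check-column` and `--last-value` are relied on.

```python
def sqoop_incremental_args(jdbc_url, table, target_dir, mode, check_column, last_value):
    """Build the argument list for a Sqoop incremental import.

    mode is 'append' (pick up rows with a larger key than last_value) or
    'lastmodified' (pick up rows whose timestamp is newer than last_value).
    """
    if mode not in ("append", "lastmodified"):
        raise ValueError(f"unsupported incremental mode: {mode}")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--incremental", mode,
        "--check-column", check_column,
        "--last-value", str(last_value),
    ]


# Hypothetical values, for illustration only.
args = sqoop_incremental_args(
    "jdbc:oracle:thin:@//dbhost:1521/ORCL",
    "USAGE_EVENTS",
    "/data/usage_events",
    "lastmodified",
    "UPDATED_AT",
    "2014-01-01 00:00:00",
)
```

The list can be handed to a process runner on a machine where Sqoop is installed; keeping the last imported value between runs is what makes each import incremental.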

Sr. Hadoop Developer/Administrator

Confidential, San Francisco, CA

Responsibilities:

  • Installed, configured and deployed a 50-node Cloudera Hadoop cluster for development and production.
  • Performed trend analysis of electricity and gas consumption.
  • Set up high availability for a major production cluster and designed automatic failover.
  • Configured Hive Metastore, which stores the metadata for Hive tables and partitions in a relational database.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
  • Worked on configuring security for Hadoop cluster (Kerberos, Active Directory).
  • Managed data coming from different sources.
  • Installed and configured ZooKeeper for the Hadoop cluster.
  • Tuned MapReduce programs running on the Hadoop cluster.
  • Took part in HDFS maintenance and in upgrading the cluster to the latest versions of CDH.
  • Wrote MapReduce jobs using the Java API.
  • Imported and exported data between RDBMS and HDFS using Sqoop.
  • Wrote Hive and Pig queries for data analysis to meet the business requirements.
  • Created Hive tables and worked with them using HiveQL.
  • Created Hive tables, loaded them with data and wrote Hive queries, which run internally as MapReduce jobs.
  • Implemented partitioning, dynamic partitions and buckets in Hive.

Environment: Hadoop, HDFS, Talend, MapReduce, Sqoop, Hive, Pig, Oozie, NDM, Cassandra, SVN, CDH4, Cloudera Manager, MySQL and Eclipse.

Hadoop Big Data Consultant

Confidential, Los Angeles, CA

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Worked with highly unstructured and semi-structured data of 25 TB in size.
  • Took part in benchmarking the Hadoop cluster.
  • Implemented Flume (multiplexing) to stream data from upstream pipes into HDFS.
  • Used Sqoop to import data from a DB2 system into HDFS.
  • Good understanding of and related experience with the Hadoop stack: internals, Hive, Pig and MapReduce.
  • Wrote MapReduce jobs to discover trends in data usage by users.
  • Involved in defining job flows.
  • Involved in managing and reviewing Hadoop log files.
  • Managed Hadoop streaming jobs to process terabytes of text data.
  • Loaded and transformed large sets of structured, semi structured and unstructured data.
  • Supported MapReduce programs running on the cluster.
  • Loaded data from the UNIX file system into HDFS.
  • Installed and configured Hive and wrote HiveQL scripts.
  • Managed data coming from different sources.
  • Assisted the team in their Tableau development and deployment activities.
  • Wrote HQL queries, Criteria queries and SQL queries for the data access layer.
  • Developed SQL Server stored procedures and SSIS DTSX packages to automate regular, mundane tasks as per business needs.
  • Involved in coordinating for Unit Testing, Quality Assurance, User Acceptance Testing and Bug Fixing.

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Java (JDK 1.6), Cloudera Distribution, Tableau, HTML, AWS S3, XML, XSLT, jQuery, AJAX, Web Services, JNDI, SQL Server, Struts 2.0, Hibernate.

Big Data/Hadoop Engineer

Confidential, Springfield, Missouri

Responsibilities:

  • Explored and used Hadoop ecosystem features and architectures.
  • Configured the Hadoop cluster in local (standalone), pseudo-distributed and fully distributed modes.
  • Worked closely with business team to gather their requirements and new support features.
  • Developed MapReduce jobs for log analysis and analytics.
  • Wrote a MapReduce job to generate reports on the number of activities created on a particular day or during a time interval for the Analytics module.
  • The job read data from HDFS, where it had been dumped from multiple sources, and wrote its output back to HDFS.
  • Configured Sqoop and developed scripts to extract data from MySQL into HDFS.
  • Used Hive for analysis of web site traffic.
  • Wrote programs using scripting languages like Pig to manipulate data.
  • Implemented the workflows using the Apache Oozie framework to automate tasks.
  • Wrote Hadoop Job Client utilities and integrated them into the monitoring system.
  • Reviewed the HDFS usage and system design for future scalability and fault tolerance.
  • Prepared extensive shell scripts to extract the required information from logs.
  • Performed white-box testing and monitored all logs in the Dev and Prod environments.

Environment: Apache Hadoop, HDFS, MapReduce (Java), Sqoop, Pig, Hive, Oozie, Flume, Core Java, DbVisualizer, Nexus, Apache Derby, MySQL and Linux.
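
The per-day activity report described above follows the usual map and reduce shape, which can be sketched in a few lines of Python. This is an in-memory illustration rather than the original Java job: the tab-separated log layout with an ISO-style timestamp in the first field is an assumption, and the function names are hypothetical.

```python
from collections import defaultdict


def map_activity(line):
    """Map step: emit (day, 1) for one log line; the timestamp is the first field."""
    timestamp = line.split("\t", 1)[0]
    day = timestamp[:10]  # 'YYYY-MM-DD' prefix of an ISO-style timestamp
    yield day, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each day."""
    totals = defaultdict(int)
    for day, count in pairs:
        totals[day] += count
    return dict(totals)


def activities_per_day(lines):
    """Run the map step over every line and reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_activity(line))
    return reduce_counts(pairs)
```

In a real cluster the framework shuffles the emitted pairs by key between the two steps; here a single dictionary plays that role.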

BI Developer

Confidential, Seattle, Washington

Responsibilities:

  • Worked with several clients on day-to-day requests and responsibilities.
  • Analyzed system failures, identified root causes and recommended courses of action.
  • Used Hive to expose data for further analysis and to transform files from different analytical formats into text files.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
  • Used Java and MySQL day to day to debug and fix issues with client processes.
  • Developed, tested and implemented a financial-services application to bring multiple clients into a standard database format.
  • Assisted in designing, building and maintaining a database to analyze the life cycle of checking and debit transactions.
  • Excellent Java and J2EE application development skills with strong experience in object-oriented analysis; extensively involved throughout the Software Development Life Cycle (SDLC).
  • Strong experience in software and system development using JSP, Servlets, JavaServer Faces, EJB, JDBC, JNDI, Struts, Maven, Subversion, JUnit and SQL.
  • Rich experience in database design and hands-on experience with large database systems: Oracle 8i and Oracle 9i, DB2 and PL/SQL.
  • Hands-on experience with Sun ONE Application Server, WebLogic Application Server, WebSphere Application Server, WebSphere Portal Server and J2EE application deployment technology.

Environment: Java, JDBC, JNDI, Struts, Maven, Subversion, JUnit, SQL, Spring, Hibernate, Oracle, XML, PuTTY and Eclipse.
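
The daemon health check mentioned above was written as shell scripts; the same idea is sketched here in Python purely for illustration. It assumes the check works from `jps` output, which lists one `<pid> <ClassName>` pair per line, and the expected daemon list is a hypothetical example for a single-node setup.

```python
EXPECTED_DAEMONS = ("NameNode", "DataNode", "JobTracker", "TaskTracker")


def running_daemons(jps_output):
    """Parse jps output ('<pid> <ClassName>' per line) into a set of class names."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            names.add(parts[1])
    return names


def missing_daemons(jps_output, expected=EXPECTED_DAEMONS):
    """Return the expected daemons absent from the jps output, in listed order."""
    running = running_daemons(jps_output)
    return [name for name in expected if name not in running]
```

A monitoring wrapper could call `missing_daemons` on the captured output and raise a warning or restart a service whenever the returned list is not empty.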

BI Cognos Developer

Confidential, San Francisco, California

Responsibilities:

  • Involved in development and deployment of this project.
  • Developed listener classes to listen for JMS messages on JMS queues and created Spring beans binding the listener classes to those queues.
  • Developed and enhanced existing Normalizer components using JAXP to convert the various formats of incoming XML documents and messages sent by upstream systems into a generic format for the Alert system process.
  • Used the JAXP DocumentBuilderFactory to create DOM instances of existing XML documents, append new elements and save the result.
  • Used the Spring JmsTemplate and MessageCreator to post messages into JMS queues.
  • Architected the Cognos 8 BI environment to support high concurrency; expert in designing Cognos business intelligence environments.
  • Worked with business analysts, client users and internal systems resources on Cognos installations, upgrades, troubleshooting and bug resolution.
  • Handled the complete range of BI capabilities: reporting, analysis, scorecarding, dashboards and business event management.
  • Responsible for all technical design deliverables.
  • Established Cognos security policies, including authentication and authorization processes and roles.
  • Collaborated with peer business intelligence and data warehouse architects on reporting solution designs.
  • Provided technical support for Cognos 8 applications including report analysis, development, maintenance and report migration.
  • Developed and maintained Framework Manager models, applied proper security and published packages.
  • Provided training, support and best-practice guidance for ad hoc query users and report authors.
  • Designed Functional Requirement Specifications for report development, modified relationships between different database tables and built models using Framework Manager.

Environment: JDK 1.7, JAXP, Spring, Spring JDBC, JMS, MQ server, Tomcat 7, Oracle 10g, Eclipse, MQ Surfer, WinSCP, SharePoint, SVN, Maven, Report Studio, PL/SQL, WebLogic Application Server 8.1, XML.
