
Hadoop Database & Analysis Engineer, Research & Development Resume


San Francisco, CA

SUMMARY

  • 2+ years of experience spanning big data, data analysis, and cloud computing
  • 3+ years of experience in Java development on the Unix platform
  • Experienced in Agile environments; familiar with all phases of the software development life cycle
  • Experienced in both Linux and Windows environments
  • Quick to grasp new technologies
  • Strong team spirit and communication skills
  • Hands-on experience building and configuring Hadoop environments
  • Regularly install, test, and certify the interoperability of multiple partner products
  • Strong Hadoop and ETL development experience
  • Data-driven professional with hands-on experience in Hadoop development and administration
  • Experience in HDFS, MapReduce, HBase, Hive, Pig, Sqoop, and Cloudera Hue
  • Expertise in writing Hive queries and Pig scripts
  • Hands-on experience writing MapReduce jobs and Hive and Pig UDFs
  • Experience loading/extracting data from HBase and configuring HBase clusters
  • Experience in performance tuning, storage capacity management, and high availability of Hadoop and HBase clusters
  • Manage and execute complex, multi-job test plans and procedures
  • Strong in mathematics, specifically statistics
  • Deep understanding of computing and storage systems on the hardware side

TECHNICAL SKILLS

Programming: Java, C++, C, VB, Unix shell scripting, PowerShell script, SQL, Matlab

Database: MySQL, SQL Server

NoSQL Databases: HBase, MongoDB, Cassandra

Hadoop Ecosystem: HDFS, HBase, MongoDB, Hive, Pig, Flume, Sqoop, ZooKeeper, Avro, Oozie

Cloud Computing: Amazon Web Services (EC2, S3), Microsoft Azure

Source Control & Build: TFS, SVN, Maven

PROFESSIONAL EXPERIENCE

Confidential, San Francisco, CA

Hadoop Database & Analysis Engineer, Research & Development

Responsibilities:

  • Involved in functional requirement reviews; worked closely with the product manager and business analyst
  • Worked in an Agile environment using TeamPulse project management software
  • Designed and replaced multiple dashboards using Web Forms and the MVC framework, extracting data from SQL Server; built proofs of concept using the Telerik API, Power BI, and the Birst BI platform
  • Set up and configured Cloudera Manager and Navigator on Microsoft Azure VMs from scratch; set up Kerberos (MIT) authentication using Windows Server 2012; wrote PowerShell scripts to automate Cloudera Manager installation on the Azure cloud
  • Worked with the DevOps team to push data packages into different environments, from Visual Studio Team Foundation Server (TFS) to Octopus Deploy (a deployment automation tool), by writing PowerShell scripts
  • Took part in database design, including member record matching, change-log tracking, and interface table design
  • Wrote complex stored procedures and queries to implement the ETL process from raw data (XML or JSON files) into different databases; used Dell Boomi to convert data formats and wrote stored procedures to validate data (checking data types, transaction types, nulls, and duplicates) and move it through a staging database to the target databases using SQL Server Management Studio 2012 Enterprise
  • Designed, deployed, and monitored SSIS packages and jobs for the ETL stored procedures based on business requirements
  • Gained solid experience working at a start-up company

Environment: SQL Server, Visual Studio, TFS, TeamPulse, Octopus Deploy, PowerShell, C#, VB, JavaScript, Cloudera CDH 5.3

Confidential, Woburn, MA

Hadoop Developer (Intern)

Responsibilities:

  • Involved in functional requirement reviews; worked closely with the Risk & Compliance team and business analysts
  • Set up and configured a 20-node Hadoop cluster using Cloudera Manager
  • Designed and configured Flume agents and Kafka to collect data from a variety of sources and store it in HDFS
  • Set up data ETL pipelines to transfer data between RDBMS and HDFS in both directions using Sqoop
  • Created custom Hadoop MapReduce jobs for cleaning and pre-processing click-stream data to produce training data for machine learning (see the sketch after this list)
  • Worked actively with the Hadoop administration team to debug slow-running jobs and apply the necessary optimizations (RPC, network bandwidth, HBase filters, Hive joins)
  • Created Hive tables and worked with them using HiveQL, including table schema design, query optimization, and data import/export between HDFS and MongoDB
  • Applied various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins when writing MapReduce jobs
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs that trigger independently based on time and data availability
  • Worked with cloud services such as Amazon Web Services (AWS) and Microsoft Azure
  • Used different file formats such as text files, ORC/RC, XML, and JSON
  • Provided cluster coordination services through ZooKeeper
  • Worked with serialization frameworks including Avro, Thrift, and Protocol Buffers
  • Good knowledge of Hadoop YARN, Spark, and Storm architectures
  • Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts
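An illustrative sketch of the kind of click-stream cleaning mapper described above. This is a minimal example, not code from the actual project: the ClickStreamCleanMapper name and the tab-delimited field layout are assumptions.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Hypothetical mapper that drops malformed click-stream records and
    // keeps only the fields needed to build a machine-learning training set.
    public class ClickStreamCleanMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {

        private static final int EXPECTED_FIELDS = 5;
        private final Text cleaned = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed input layout: userId \t timestamp \t url \t referrer \t userAgent
            String[] fields = value.toString().split("\t", -1);
            if (fields.length != EXPECTED_FIELDS) {
                context.getCounter("CLEANING", "MALFORMED").increment(1);
                return; // skip malformed records
            }
            // Keep only the columns the downstream model needs.
            cleaned.set(fields[0] + "\t" + fields[1] + "\t" + fields[2]);
            context.write(cleaned, NullWritable.get());
        }
    }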

Environment: Hadoop 0.20.2 MR1, CDH 4.7, HDFS, HBase 0.90.x, MongoDB 2.2, Hive 0.12.0, Impala, Pig 0.12.0, Flume 1.5.0, Sqoop, ZooKeeper, Avro, Oozie 3.3.0

Confidential

Hadoop Developer (Intern)

Responsibilities:

  • Installed and configured a Hadoop YARN environment
  • Imported and exported data into HDFS and Hive using Sqoop on a daily basis
  • Used HiveQL extensively to perform table joins and record matching
  • Extensively created, loaded, and queried Hive tables, and routinely debugged unhealthy queries
  • Hands-on experience writing Hive UDFs/UDAFs (see the UDF sketch after this list)
  • Experienced in defining job flows using Oozie
  • Experienced in managing and reviewing Hadoop log files and processing log files with the Storm stream processor
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data using HBase (see the client sketch after this list)
  • Responsible for managing data coming from different sources; hands-on experience writing JDBC and ODBC data access code in core Java
  • Involved in periodically loading data from the UNIX file system into HDFS through an FTP server using Linux commands; set up and configured authentication for different users and groups
  • Gained very good business knowledge of health insurance, claim processing, fraud suspect identification, the appeals process, etc.
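A minimal sketch of a Hive UDF of the kind mentioned above, written against the classic org.apache.hadoop.hive.ql.exec.UDF API that matches Hive of this era. The normalize_zip function name and its logic are hypothetical examples, not taken from the actual project.

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF that normalizes ZIP codes to their five-digit form.
    @Description(name = "normalize_zip",
                 value = "_FUNC_(str) - returns the first five digits of a ZIP code")
    public class NormalizeZip extends UDF {

        private final Text result = new Text();

        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String digits = input.toString().replaceAll("[^0-9]", "");
            if (digits.length() < 5) {
                return null; // not a usable ZIP code
            }
            result.set(digits.substring(0, 5));
            return result;
        }
    }

After packaging the class into a JAR, it would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL.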
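And a minimal sketch of loading and reading back a row through the HBase Java client, as in the HBase bullet above. It uses the newer Connection/Table client API for clarity; the "events" table, "d" column family, and row-key format are hypothetical, not taken from the actual project.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Hypothetical example: write one event row, then read one column back.
    public class HBaseLoadExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("events"))) {

                // Load: one row keyed by userId#timestamp, column family "d".
                Put put = new Put(Bytes.toBytes("user42#1400000000"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("url"),
                              Bytes.toBytes("/checkout"));
                table.put(put);

                // Extract: read the same column back.
                Get get = new Get(Bytes.toBytes("user42#1400000000"));
                Result result = table.get(get);
                byte[] url = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("url"));
                System.out.println(Bytes.toString(url));
            }
        }
    }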

Environment: Hadoop YARN, HDFS, Hive, HBase, Oozie, AWS, CDH 4.7, Java, Scala
