Hadoop Developer Resume San Jose - Hire IT People

SUMMARY:

Hadoop Developer with 8+ years of IT experience in the development, enhancement, and maintenance of Banking and Financial applications as well as Retail DW applications
Expert in designing data pipelines spanning ingestion, processing, and reporting
4 years of hands-on experience in implementing data platforms in Hadoop
Experience in working with Agile and Waterfall models
Hands-on experience in SQL, HiveQL, Spark, DataStage (ETL), and performance tuning
Expert in process automation using scripting languages like Python and Shell
Adept at maintaining focus on achieving bottom-line results while formulating and implementing advanced technology and business solutions to meet a diversity of needs
An active team player with effective communication and interpersonal skills
Enjoy brainstorming the merits of new ideas, establishing a direction, and proceeding with resolve. A quiet achiever, encouraging mentor, and decisive problem-solver
Consistently deliver business-critical projects on schedule and within budget, despite intense pressure and tight timelines

TECHNICAL SKILLS:

Big Data Platform: Hadoop, Hive, Sqoop, SparkSQL, PySpark, MapReduce (Python), YARN, HDFS, HBase

Scripting: Shell Scripting, Python

DBMS: Teradata, Oracle, DB2, MySQL

ETL: DataStage, Talend for Big Data

Version Control: Git, SVN, TFS

Cloud: AWS S3, EC2, Redshift

PROFESSIONAL EXPERIENCE:

Confidential, San Jose

Hadoop Developer

Technology: Hortonworks HDP 2.5.3, Hadoop, Hive, SparkSQL, PySpark, Teradata, Shell, Sqoop, Apache Livy, Git, Rally, HBase

Responsibilities:

Design data ingestion processes using Sqoop, Flume, and Kafka
Design ETL data pipelines using a combination of tools such as Hive, SparkSQL, HBase, and PySpark
Migrate existing production Spark jobs to run via the "Spark Compute as a Service using Apache Livy" framework, which enabled SparkSession sharing and improved performance
Design and develop a homogeneous layer on Hive so that various data sources adhere to the same data model
Design logic to implement ACID semantics in Hive by integrating Hive with HBase
Process and load real-time data onto HDFS every 30 minutes using HiveQL
Develop reporting queries using OLAP functions on top of the financial data in Hive and publish results to business users at regular intervals
Automate manually run reports using shell scripting and Teradata, scheduled via crontab
Migrate Teradata tables to Hadoop using Hive, with orchestration via an internal Python framework

Confidential, Austin, TX

Hadoop Developer

Technology: MapR Hadoop v2, Hive, Pig, Spark, Shell, JIRA, Oozie

Responsibilities:

Design ETL pipelines using Hive and Spark (PySpark)
Develop job orchestration using Shell scripting
Refactor data ingestion into the Hadoop data lake from disparate third-party vendors for better performance, using SFTP, gsutil, and the Teradata connector for Sqoop
Refactor HDFS schema design according to best practices
Design scalable data layout in Hive by choosing the right file formats (Parquet, SequenceFile, ORC) and compression codecs (Snappy, LZO, etc.)
Develop SparkSQL code to replace traditional Hive MapReduce jobs
Work in Scrum, participating in sprint retrospectives and planning sessions
Develop automated testing scripts to perform QA
Schedule jobs using Oozie

Confidential, Bentonville, AR

Hadoop Developer

Technology: Talend, Hadoop, HiveQL, HDFS, Pig, Sqoop, Spark, UNIX, Oracle, Teradata, Azure

Responsibilities:

Provide BI consulting solutions
Use Big Data technologies such as Hadoop and Cassandra in BI data delivery
Participate in the deployment of an HDP cluster on Microsoft Azure
Migrate data from existing Teradata systems to a Hortonworks HDInsight cluster on Azure
Leverage core expertise in solution design and in managing enterprise-wide BI (Data Warehousing/Data Integration) implementations
Design and implement ETL solutions with tools such as DataStage and Talend Open Studio for Big Data
Perform data analysis over large datasets using Apache Pig, Apache Hive, and Spark
Design and build data staging and summary (aggregated) areas in the Hive DW
Build data visualizations such as dashboards and scorecards using Tableau

Confidential

Hadoop Developer

Technology: DataStage, Unix Scripting, Teradata, Oracle, DB2, TFS

Responsibilities:

Confidential

Hadoop Developer

Technology: DataStage, Oracle, DB2, MySQL, Toad, IBM SORSA, SVN

Responsibilities:

Create Technical Design and ETL mapping documents
Perform impact analysis pertaining to DML and DDL changes to the Banking Data Warehouse
Prepare DDL and DML scripts
Design DataStage ETL jobs, DataStage sequences, and shell scripts
Unit test DataStage ETL jobs
Design the flow of execution using DataStage
Tune the performance of SQL queries and DataStage jobs
Handle version control activities using TortoiseSVN