Sr Big Data Architect Resume
Irvine, CA
SUMMARY
- Over 12 years of experience in the software development lifecycle: software analysis, design, development, testing, deployment, and maintenance.
- Working as a Big Data Architect for the last 4 years with a strong background in the big data stack, including Spark, Scala, Kafka, Hadoop, HDFS, MapReduce, Hive, Cassandra, Python, Sqoop, and Pig.
- Hands-on experience with Apache Spark and its components (Spark Core and Spark SQL)
- Experienced in converting HiveQL queries into Spark transformations using Spark RDDs and Scala (a brief sketch follows this summary)
- Hands-on experience in in-memory data processing with Apache Spark
- Developed Spark scripts using the Scala shell as per requirements
- Experience in designing and developing POCs in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle
- Broad understanding of and experience with real-time analytics and batch processing using Apache Spark.
- Hands-on experience in AWS (Amazon Web Services), Cassandra, Kafka, Python, and cloud computing.
- Experience with agile development methodologies such as Scrum, Test-Driven Development, and Continuous Integration
- Ability to translate business requirements into system design
- Experience in importing and exporting data between HDFS and RDBMS/non-RDBMS systems using Sqoop
- Analyzed large data sets by writing Pig scripts and Hive queries.
- Hands-on experience writing Pig Latin scripts and Pig commands
- Experience with front-end technologies like HTML, CSS, and JavaScript
- Experienced in using tools like Eclipse, NetBeans, Git, Tortoise SVN, and TOAD.
- Experience in database development using SQL and PL/SQL on databases like Oracle 9i/11g, MySQL, and SQL Server.
- Effective team player with excellent communication skills and the insight to determine priorities, schedule work, and meet critical timelines.
- Certified in FINRA (Financial Industry Regulatory Authority, Inc)
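A minimal illustrative sketch of the HiveQL-to-Spark conversion noted above; the sales table and its region/amount columns are hypothetical placeholders, not taken from any engagement.

```scala
import org.apache.spark.sql.SparkSession

// Sketch: a HiveQL aggregation rewritten as Spark RDD transformations in Scala.
// Hypothetical HiveQL equivalent: SELECT region, SUM(amount) FROM sales GROUP BY region
object HiveQLToSparkSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HiveQLToSparkSketch")
      .enableHiveSupport()          // read Hive tables through the metastore
      .getOrCreate()

    val totalsByRegion = spark.table("sales").rdd
      .map(row => (row.getAs[String]("region"), row.getAs[Double]("amount")))
      .reduceByKey(_ + _)           // equivalent of the GROUP BY / SUM

    totalsByRegion.collect().foreach { case (region, total) =>
      println(s"$region -> $total")
    }

    spark.stop()
  }
}
```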
TECHNICAL SKILLS
Big Data: Apache Spark, Scala, MapReduce, HDFS, HBase, Hive, Pig, Sqoop, PostgreSQL
Databases: Oracle 9i/11g, MySQL, SQL Server 2000/2005
Hadoop Distributions: Cloudera, Hortonworks, AWS
DWH (Reporting): OBIEE 10.1.3.2.0/11g
DWH (ETL): Informatica PowerCenter 9.6.x
Languages: SQL, PL/SQL, Python, Java
UI: HTML, CSS, JavaScript
Defect Tracking Tools: Quality Center, JIRA
Tools: SQL Tools, TOAD
Version Control: Tortoise SVN, GitHub
Operating Systems: Windows ..., Linux/Unix
PROFESSIONAL EXPERIENCE
Confidential, Irvine CA
Sr Big Data Architect
Responsibilities:
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Hands-on experience in Spark, Cassandra, Kafka, Python, and Spark Streaming, creating RDDs and applying transformations and actions.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Handled importing of data from various data sources, performed transformations using Hive and Spark, and loaded the data into HDFS.
- Developed Spark code and Spark SQL queries for faster testing and processing of data (see the sketch below).
- Snapped the cleansed data to the analytics cluster for business reporting.
- Hands-on experience on the AWS platform with S3 and EMR.
- Experience working with different data formats such as flat files, ORC, Avro, and JSON.
- Automated business reports on the data lake using Bash scripts on UNIX and delivered them to business owners.
- Provided design recommendations and thought leadership to sponsors/stakeholders, improving review processes, resolving technical problems, and suggesting solutions.
Environment: Apache Spark, Scala, Spark-Core, Spark-SQL, Python, Hadoop, MapReduce, HDFS, Hive, Pig, MongoDB, Sqoop, Oozie, Kafka, MySQL, Java (JDK 1.7), AWS
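A minimal sketch of the Spark SQL metric computation described in this role; the events table, event_date partition column, and output path are hypothetical, assuming a date-partitioned Hive table.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Sketch: compute reporting metrics over a partitioned Hive table and land them on HDFS.
object ReportingMetricsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ReportingMetricsSketch")
      .enableHiveSupport()
      .getOrCreate()

    val dailyCounts = spark.table("events")
      .where(col("event_date") === "2024-01-01")   // predicate on the partition column prunes partitions
      .groupBy("status")
      .agg(count("*").as("event_count"))

    // Write the aggregated metrics back to HDFS for downstream reporting.
    dailyCounts.write.mode("overwrite").parquet("hdfs:///reports/daily_status_counts")

    spark.stop()
  }
}
```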
Confidential, Sunnyvale CA
Senior Big Data Architect
Responsibilities:
- Built patterns according to business requirements to detect market violations and generate alerts using big data technologies (Hive, Tez, and Spark) on AWS
- Worked as a Scrum Master, facilitating team productivity and monitoring project progress by applying Scrum and Kanban on a JIRA board to ensure the quality of deliverables
- Optimized long-running patterns by writing shell scripts and applying Hive optimization settings (e.g., reduced a 20-hour daily pattern to a 7-hour run by resolving data skew in a TB-scale table; the change was adopted company-wide and saved around 50,000 USD per year)
- Migrated on-prem RDBMS (Oracle, Greenplum) code into HiveQL and Spark SQL running on AWS EMR (see the sketch below)
- Participated in a machine learning project, including decision tree modeling and feature engineering
- Responsible for ETL and data warehouse processes to transfer and register data into AWS S3
- Developed Hive UDFs with Java and modified framework code with Python
Environment: Apache Spark, Scala, Spark-Core, Spark-Streaming, Python, Spark-SQL, Hadoop, MapReduce, HDFS, Hive, Kafka, Pig, MongoDB, Sqoop, Oozie, MySQL, Java (JDK 1.7), AWS
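A minimal sketch of running a migrated RDBMS aggregation as Spark SQL on EMR with S3 as the storage layer; the bucket name, paths, and orders schema are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Sketch: source data landed in S3 is registered as a view and queried with Spark SQL on EMR.
object MigratedRdbmsQuerySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("MigratedRdbmsQuerySketch")
      .getOrCreate()

    spark.read.parquet("s3://example-bucket/staging/orders/")
      .createOrReplaceTempView("orders")

    // The original Oracle/Greenplum aggregation, expressed in Spark SQL.
    val customerTotals = spark.sql(
      """SELECT customer_id, SUM(order_total) AS lifetime_value
        |FROM orders
        |GROUP BY customer_id""".stripMargin)

    customerTotals.write.mode("overwrite").parquet("s3://example-bucket/curated/customer_ltv/")

    spark.stop()
  }
}
```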
Confidential
Senior Apache Spark Consultant
Responsibilities:
- Gathered business requirements for the project by coordinating with business users and data warehousing (front-end) team members.
- Involved in product data ingestion into HDFS using Spark
- Created partitioned tables and bucketed data in Hive to improve performance
- Used Amazon Web Services (AWS): EC2 for compute and S3 for storage.
- Loaded data into MongoDB using Hive-Mongo connector JARs for report generation.
- Developed Spark scripts using the Scala shell as per requirements.
- Loaded data into Spark RDDs and performed in-memory computation to generate output responses.
- Optimized existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs.
- Performed advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Spark with Scala.
- Worked on migrating MapReduce programs into Spark transformations using Spark and Scala (see the sketch below).
- Handled importing data from Oracle into HDFS and vice versa using Sqoop.
- Involved in converting MapReduce programs into Spark transformations using Spark RDDs in Scala.
- Migrated various Hive UDFs and queries into Spark SQL for faster responses.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
Environment: Apache Spark, Scala, Spark-Core, Spark-Streaming, Spark-SQL, Hadoop, MapReduce, HDFS, Hive, Pig, Kafka, MongoDB, Sqoop, Oozie, Python, MySQL, Java (JDK 1.7), AWS
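A minimal sketch of a MapReduce-style job (word count) rewritten as Spark RDD transformations in Scala; the HDFS input and output paths are placeholders.

```scala
import org.apache.spark.sql.SparkSession

// Sketch: the map and reduce phases become flatMap/map and reduceByKey transformations,
// with saveAsTextFile as the action that materializes the result.
object MapReduceToSparkSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("MapReduceToSparkSketch").getOrCreate()
    val sc = spark.sparkContext

    val counts = sc.textFile("hdfs:///data/input/")
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.saveAsTextFile("hdfs:///data/output/wordcount")
    spark.stop()
  }
}
```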
Confidential
Big Data Developer
Responsibilities:
- Led the AML Cards North America development and DQ team to successfully implement the compliance project.
- Involved in the project from the POC stage and worked from data staging through to the data mart and reporting; worked in an onsite-offshore environment.
- Fully responsible for creating the data model for storing and processing data and for generating and reporting alerts; this model is being implemented as the standard across all regions as a global solution.
- Involved in discussions with, and guiding, other regional teams on the SCB big data platform and the AML Cards data model and strategy.
- Responsible for technical design and review of the data dictionary (business requirements).
- Responsible for providing technical solutions and workarounds.
- Migrated the required data from the data warehouse and product processors into HDFS using Sqoop, and imported various flat-file formats into HDFS.
- Involved in discussions with source systems on data quality (DQ) issues.
- Implemented partitioning, dynamic partitions, buckets, and custom UDFs in Hive.
- Used Hive for data processing and batch data filtering.
- Supported and monitored MapReduce programs running on the cluster.
- Monitored logs and responded accordingly to any warning or failure conditions.
- Responsible for preserving code and design integrity using SVN and SharePoint.
Environment: Apache Hadoop, HDFS, Hive, MapReduce, Pig, HBase, Zookeeper, Oozie, MongoDB, Python, Java, Sqoop
Confidential
Oracle Database Developer
Responsibilities:
- Designed, developed, and maintained an internal interface application allowing one application to share data with another.
- Analyzed 90% of all changes and modifications to the interface application.
- Coordinated development work efforts that spanned multiple applications and developers.
- Developed and maintained data models for internal and external interfaces.
- Worked with other bureaus in the Department of State to implement data-sharing interfaces.
- Attended Configuration Management Process Working Group and Configuration Control Board meetings.
- Performed DDL (CREATE, ALTER, DROP, TRUNCATE and RENAME), DML (INSERT, UPDATE, DELETE and SELECT) and DCL (GRANT and REVOKE) operations where permitted.
- Designed and developed database applications.
- Designed the database structure for an application.
- Estimated storage requirements for an application.
- Specified modifications of the database structures for an application.
- Kept the database administrator informed of required changes.
- Tuned the application during development.
- Established an application's security requirements during development.
- Created Functions, Procedures and Packages as part of the development.
- Assisted the Configuration Management group to design new procedures and processes.
- Led the Interfaces Team with responsibility for maintaining and supporting both internal and external interfaces.
- Responsible for following all processes and procedures in place for the entire Software Development Life Cycle.
- Wrote documents in support of the SDLC phases. Documents include requirements and analysis reports, design documents, and technical documentation.
- Created MS Project schedules for large work efforts.
Environment: Oracle 9i, Informatica 7.1.x, Control-M, TOAD, Linux/Unix