Big Data Hadoop Architect / Lead Developer Resume
Houston, TX
SUMMARY:
- 16+ years of experience in software development, architecture decisions, and leading projects from concept through release
- 4+ years of experience in Hadoop Big Data solutions (architecture, data modeling, ingestion data pipelines, development)
- 2+ years of experience in Scala / Spark
- Hands-on experience in Big Data technologies (Hive, Sqoop, Scala, Spark, Spark SQL, Pig, Oozie, HDFS, MapReduce)
- Cloudera Certified Developer for Apache Hadoop
- Proficient in Scala/Spark development and performance tuning
- Hands-on with Hive, Sqoop, and ORC, including performance tuning via vectorization, cost-based optimization (CBO), bucketing, and partitioning (see the sketch following this summary)
- Good exposure to the Hadoop Lambda Architecture
- Proficient in all phases of the SDLC (analysis, design, development, testing, and deployment), gathering user requirements and converting them into software requirement specifications
- Work closely with business customers and serve as liaison between the customer and offshore teams
- Excellent Analytical, Programming and Logical skills
- Good exposure to OLAP
- Capable of handling multiple projects & teams at the same time
- Good Experience as a Tech / Project Lead
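Below is a minimal sketch of the Hive tuning techniques listed above (vectorization, CBO, partitioning, and bucketing), driven from Scala over Hive JDBC; the HiveServer2 endpoint, credentials, schema, and table layout are all hypothetical.

```scala
import java.sql.DriverManager

// Hypothetical tuning session: enable vectorization and the cost-based
// optimizer, create a partitioned + bucketed ORC table, and gather the
// column statistics that the CBO depends on.
object HiveTuningSketch {
  def main(args: Array[String]): Unit = {
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection(
      "jdbc:hive2://hs2-host:10000/default", "etl_user", "") // hypothetical endpoint
    val stmt = conn.createStatement()
    try {
      // Vectorized execution processes rows in batches rather than one at a time.
      stmt.execute("SET hive.vectorized.execution.enabled=true")
      // The cost-based optimizer is only as good as the statistics it can read.
      stmt.execute("SET hive.cbo.enable=true")
      stmt.execute("SET hive.compute.query.using.stats=true")
      stmt.execute("SET hive.stats.fetch.column.stats=true")

      // Partitioning prunes whole directories at read time; bucketing bounds
      // the work of joins and sampling on the clustered column.
      stmt.execute(
        """CREATE TABLE IF NOT EXISTS edl.trades_orc (
          |  trade_id BIGINT,
          |  symbol   STRING,
          |  amount   DOUBLE
          |)
          |PARTITIONED BY (trade_date STRING)
          |CLUSTERED BY (symbol) INTO 32 BUCKETS
          |STORED AS ORC""".stripMargin)

      // Column-level statistics feed the CBO's cardinality estimates.
      stmt.execute(
        "ANALYZE TABLE edl.trades_orc PARTITION (trade_date) COMPUTE STATISTICS FOR COLUMNS")
    } finally {
      stmt.close()
      conn.close()
    }
  }
}
```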
TECHNICAL SKILLS:
Big Data Ecosystem: Cloudera Distribution for Hadoop (CDH), MapReduce, HDFS, YARN, Hive, Pig, Sqoop, Storm, Impala, Spark, Parquet, Flume, AWS, Snappy, Avro, HBase
Programming Languages: Core Java, Scala
Scripting Languages: Shell Scripting
Operating Systems: LINUX, Windows
Database: Oracle, MySQL, Teradata, SQL DW
Tools: IntelliJ, Eclipse, Toad, ER Studio, Apache Ranger
Other Technologies: MS Azure, SSAS, SSRS, PowerBI, Blob, ADF, AWS S3, KMS
Methodologies: Waterfall, Agile
PROFESSIONAL EXPERIENCE:
Confidential, Houston, TX
Big Data Hadoop Architect / Lead Developer
Responsibilities:
- Tune large Hive reporting queries (vectorization, CBO, partitioning & bucketing)
- Optimize the load process for partitioned and bucketed tables
- Develop Scala/Spark jobs using the RDD, DataFrame, and Dataset APIs
- Write Spark SQL queries using analytical and aggregate functions (see the sketch after this role)
- Create data pipelines to ingest different data sources into the Hadoop data lake using Spark/Scala
- Perform map-side joins in Scala/Spark programs using broadcast variables
- Ingest structured and semi-structured data sources using Spark/Scala
- Create Hive and Scala/Spark queries using analytical functions
- Perform historical and incremental loading of data into Hive partitioned tables using Sqoop
- Interact with different stakeholders to gather requirements for bringing data into the Enterprise Data Lake (EDL), on HDP 2.4
- Translate the requirements into architecture
- Architecture & Data Governance processes
- Interact with the Risk assessment team for the Cyber Security approval for the Fed LLC data
- Prepare application architecture diagrams, application blueprints, roadmaps, etc.
- Define the Big Data/Hadoop guidelines and roll them out to the project team
- Manage the offshore team to deliver the requirements in Hadoop and MS Azure
- Design and create Data Model
- Review, interpret and respond to detailed business requirements specifications (BRS) to ensure alignment between customer expectations and current or future ICT capability
- Develop, test and implement technology solutions and report on delivery commitments to ensure solutions are implemented as expected and to agreed timeframes
Technologies: Hortonworks, HDFS, Hive, Pig, Hue, Sqoop, Scala, Spark, Spark SQL, Apache Ranger, Shell script, UNIX, Oracle, Toad, Talend, Amazon AWS, S3, KMS, Bucket Policies, MS Azure, DMG, Blob, ADF, SQL DW, SSAS, SSRS, Power BI, ER Studio, Load Balancer.
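As referenced above, a short sketch of Spark SQL analytical and aggregate functions through the DataFrame API; the table and column names (edl.usage_events, account_id, event_ts, amount) are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number, sum}

object WindowedQuerySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("windowed-query-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical Hive table with one row per account event.
    val events = spark.table("edl.usage_events")

    // Analytical function: keep only the latest event per account.
    val latestFirst = Window.partitionBy("account_id").orderBy(col("event_ts").desc)
    val latest = events
      .withColumn("rn", row_number().over(latestFirst))
      .filter(col("rn") === 1)
      .drop("rn")

    // Aggregate function over a window: running total per account.
    val running = Window.partitionBy("account_id")
      .orderBy("event_ts")
      .rowsBetween(Window.unboundedPreceding, Window.currentRow)
    val withRunningTotal = events.withColumn("running_amount", sum("amount").over(running))

    latest.show(5)
    withRunningTotal.show(5)
    spark.stop()
  }
}
```

The same logic can be written directly in Spark SQL with ROW_NUMBER() OVER (PARTITION BY account_id ORDER BY event_ts DESC) and SUM(amount) OVER the corresponding window.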
Confidential, Bellevue, WA
Big Data Architect / Lead Developer
Responsibilities:
- Tune large Hive reporting queries (vectorization, CBO, partitioning & bucketing)
- Optimize the load process for partitioned and bucketed tables
- Develop Scala/Spark jobs using the RDD, DataFrame, and Dataset APIs
- Write Spark SQL queries using analytical and aggregate functions
- Create data pipelines to ingest different data sources into the Hadoop data lake using Spark/Scala
- Perform map-side joins in Scala/Spark programs using broadcast variables (see the sketch after this role)
- Ingest structured and semi-structured data sources using Spark/Scala
- Create Hive and Scala/Spark queries using analytical functions
- Perform historical and incremental loading of data into Hive partitioned tables using Sqoop
- Provided design recommendations and thought leadership to sponsors/stakeholders, improving review processes and resolving technical problems
- Coordinate between the business and the offshore team
- Gather requirements and prepare the design
- Work with different business stakeholders for each track
- Export and import data between HDFS, HBase, and Hive; create Hive tables, load them with data, and write Hive queries
- Load data into Hive partitioned tables
Technologies: Hortonworks, HDFS, MapReduce, Hive, Pig, Apache Ranger, Flume, Storm, Hue, Sqoop, Shell script, UNIX, Oracle, Toad, ActiveMQ, Scala, Spark, Spark SQL.
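A minimal sketch of the broadcast-variable map-side join mentioned in this role: the small reference table is collected to the driver and broadcast once to each executor, so the large side joins without a shuffle. The table names and column positions are hypothetical, and the reference set is assumed small enough to fit in memory.

```scala
import org.apache.spark.sql.SparkSession

object BroadcastJoinSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("broadcast-join-sketch")
      .enableHiveSupport()
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical small dimension table, collected to the driver as a map.
    val refMap: Map[String, String] = spark.table("lake.ref_plan_codes")
      .rdd
      .map(r => (r.getString(0), r.getString(1)))
      .collectAsMap()
      .toMap

    // Broadcast ships the map once per executor instead of once per task.
    val refBc = sc.broadcast(refMap)

    // Hypothetical large fact table, enriched by a local lookup on each executor.
    val enriched = spark.table("lake.usage_events").rdd.map { row =>
      val planCode = row.getString(0)
      (planCode, refBc.value.getOrElse(planCode, "UNKNOWN"))
    }

    enriched.take(10).foreach(println)
    spark.stop()
  }
}
```

In the DataFrame API, the equivalent shuffle-free plan can be requested declaratively with the broadcast(df) join hint from org.apache.spark.sql.functions.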
Confidential, Greenville, SC
Big Data Hadoop Architect / Lead Developer
Responsibilities:
- Provided design recommendations and thought leadership to sponsors/stakeholders, improving review processes and resolving technical problems
- Coordinate between the business and the offshore team
- Gather requirements and prepare the design
- Export and Import data into HDFS, HBase and Hive using Sqoop.
- Involved in creating Hive tables, loading them with data, and writing Hive queries
- Bulk loading of HBase using Pig
- Implemented solutions using Hadoop, HBase, Hive, Sqoop, the Java API, etc. (see the HBase client sketch after this role)
- Work closely with the business and analytics teams in gathering the system requirements
- Load and transform large sets of structured and semi-structured data.
- Loading data into HBase tables using Java MapReduce
- Loading data into Hive partitioned tables
Technologies: CDH, HDFS, Core Java, MapReduce, Hive, Pig, Flume, Storm, Elasticsearch, Shell scripting, UNIX.
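A small sketch of writing to HBase through the Java client API (driven from Scala here, to keep one language across these examples); the table name, row key, and column family are hypothetical.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes

// Writes a couple of cells into an HBase table via the client API.
object HBasePutSketch {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create() // picks up hbase-site.xml from the classpath
    val conn = ConnectionFactory.createConnection(conf)
    val table = conn.getTable(TableName.valueOf("customer_profile")) // hypothetical table

    try {
      val put = new Put(Bytes.toBytes("cust-0001")) // row key
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("Acme Corp"))
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("region"), Bytes.toBytes("SC"))
      table.put(put)
    } finally {
      table.close()
      conn.close()
    }
  }
}
```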
Confidential, Greenville, SC
Big Data Hadoop Architect / Lead Developer
Responsibilities:
- Worked on a Hadoop cluster with a current size of 56 nodes and 896 terabytes of capacity.
- Wrote MapReduce jobs, HiveQL, and Pig scripts.
- Imported data using Sqoop into Hive and HBase from an existing SQL Server database.
- Supported code/design analysis, strategy development, and project planning.
- Created reports for the BI team using Sqoop to export data into HDFS and Hive.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch below).
- Involved in Requirement Analysis, Design, and Development.
- Export and Import data into HDFS, HBase and Hive using Sqoop.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Work closely with the business and analytics team in gathering the system requirements
- Load and transform large sets of structured and semi-structured data.
- Loading data into HBase tables using Java MapReduce
- Loading data into Hive partitioned tables
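A sketch of the data-cleaning MapReduce pattern mentioned above: a map-only job that keeps well-formed records and drops the rest. The original jobs were written in Java; this version uses the same Hadoop API from Scala for consistency with the other examples, and the 12-field pipe-delimited layout is hypothetical.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, NullWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

// Map-only cleaning job: trims each field of well-formed rows, drops the rest.
class CleanMapper extends Mapper[LongWritable, Text, Text, NullWritable] {
  override def map(key: LongWritable, value: Text,
                   context: Mapper[LongWritable, Text, Text, NullWritable]#Context): Unit = {
    val fields = value.toString.split('|')
    // Hypothetical feed: 12 pipe-delimited fields, none of them blank.
    if (fields.length == 12 && fields.forall(_.trim.nonEmpty)) {
      context.write(new Text(fields.map(_.trim).mkString("|")), NullWritable.get())
    }
  }
}

object CleanJob {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "clean-feed")
    job.setJarByClass(classOf[CleanMapper])
    job.setMapperClass(classOf[CleanMapper])
    job.setNumReduceTasks(0) // map-only: mapper output is the job output
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[NullWritable])
    FileInputFormat.addInputPath(job, new Path(args(0)))
    FileOutputFormat.setOutputPath(job, new Path(args(1)))
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```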
Big Data Hadoop Lead Developer
Responsibilities:
- Imported data using Sqoop into Hive and HBase from an existing SQL Server database.
- Supported code/design analysis, strategy development and project planning.
- Created reports for the BI team using Sqoop to export data into HDFS and Hive.
- Involved in Requirement Analysis, Design, and Development.
- Export and Import data into HDFS, HBase and Hive using Sqoop.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Load and transform large sets of structured and semi-structured data.
- Loading data into Hive partitioned tables (see the sketch below)
Technologies: HDFS, Core Java, MapReduce, Hive, Pig, Sqoop, Shell scripting, UNIX.
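A sketch of the Hive partitioned-table load referenced above, using a dynamic-partition INSERT over Hive JDBC; the HiveServer2 endpoint and the staging/target table names are hypothetical.

```scala
import java.sql.DriverManager

// Dynamic-partition load: a single INSERT fans rows out to the partitions
// named by the trailing SELECT column (order_date).
object HivePartitionLoadSketch {
  def main(args: Array[String]): Unit = {
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection(
      "jdbc:hive2://hs2-host:10000/default", "etl_user", "") // hypothetical endpoint
    val stmt = conn.createStatement()
    try {
      stmt.execute("SET hive.exec.dynamic.partition=true")
      stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict")
      stmt.execute(
        """CREATE TABLE IF NOT EXISTS mart.orders_part (
          |  order_id BIGINT,
          |  amount   DOUBLE
          |)
          |PARTITIONED BY (order_date STRING)
          |STORED AS ORC""".stripMargin)
      // Hypothetical staging table with a matching order_date column.
      stmt.execute(
        """INSERT OVERWRITE TABLE mart.orders_part PARTITION (order_date)
          |SELECT order_id, amount, order_date FROM staging.orders""".stripMargin)
    } finally {
      stmt.close()
      conn.close()
    }
  }
}
```

With dynamic partitioning enabled, Hive routes each row to the partition named by the trailing SELECT column, so one statement can load many dates at once.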
Confidential, Greenville, SC
Big Data Hadoop Lead Developer
Responsibilities:
- Supported code/design analysis, strategy development, and project planning.
- Created reports for the BI team using Sqoop to export data into HDFS and Hive.
- Involved in Requirement Analysis, Design, and Development.
- Export and Import data into HDFS, HBase and Hive using Sqoop.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Work closely with the business and analytics team in gathering the system requirements
- Load and transform large sets of structured and semi-structured data.
- Loading data into Hive partitioned tables
Technologies: CDH, HDFS, Core Java, MapReduce, Hive, Pig, HBase, Sqoop, Shell scripting, UNIX.
Confidential, Greenville, SC
Architect
Responsibilities:
- Understand the ETL specification documents for mapping requirements.
- Extract data from multiple sources (flat files, Oracle, FTP sites) into the staging database.
- Extensively worked on Informatica tools such as Source Analyzer, Data Warehouse Designer, Transformation Designer, Mapplet Designer, and Mapping Designer to design, develop, and test complex mappings and mapplets that load data from external flat files and RDBMS sources.
- Created mappings using transformations such as Source Qualifier, Aggregator, Expression, Lookup, Router, Filter, Joiner, Union, Sequence Generator, Rank, Normalizer, Transaction Control, Stored Procedure, and Update Strategy.
- Involved in performance tuning of mappings, identifying source and target bottlenecks and worked with the sessions and workflow properties to improve the performance.
Technologies: Informatica 9.1.0, Flat Files, Oracle, MySQL, Shell scripting, UNIX
Confidential, Greenville, SC
Architect
Technologies: COBOL, VSAM, DB2, ENDEVOR, INSYNC, TELON, EASYTRIEVE, CONTROL-D, CONTROL-M, DUMP MASTER and TRACE MASTER
Responsibilities:
- Strong interpersonal skills; team-oriented and highly motivated; counseling/motivating the team and addressing their concerns.
- Track record of customer satisfaction.
- Exposure to Software Development methodologies, Quality Assurance and Security & Control procedures.
- Led development, testing, and production support teams with a maximum team size of 35
- Worked with customers to gather business requirements and develop unit & system test plans.
- Creating FRD documents based on requests from the business
- Knowledge of configuration management tools
- Lead a project team assigned modules, activities, tasks, and deliverables.
- Work on estimates and allocate tasks to resources
- Prepare schedules for tracking tasks; conduct regular/periodic team meetings to provide project-specific plans and status updates.
- Review team members' work to ensure it meets the specifications laid down by the project/client and the desired quality standards.
- Provide technical know-how and support, mentoring and coaching team members on technology, business, and other project-specific aspects.
- Provide project specific performance feedback to the team members.
- Work on training needs identification for the team members.
- Maintain records such as resource-allocation sheets and testing/coding checklists, and prepare monthly metrics.
- Responsible for project review, escalations, quality assurance, tasks completion, delivery of the tasks, customer interface & project status reporting for the project.
- Consult and review with the delivery manager to determine project deliverables, project plan, staffing, scheduling, and time frame, along with identification of risks, contingency plans, resource availability, and the quality process for accomplishing project milestones across project phases, when the project involves development or a major enhancement; for other support projects, determine project deliverables, identification of risks, contingency plans, resource availability, and quality process.
- Coach and mentor the team on technical and business-specific aspects, and motivate team members to continuously deliver quality work.
- Conduct regular/periodic team meetings to review project status and project-specific plans.
- Provide continuous feedback to team members and plan training programs for their growth in role/technology/domain expertise, in line with project/client/organization requirements.
- Ensure basic project-specific infrastructure is arranged for team members working on the project.
- Minimize exposure and risk across multiple projects.
- Ensure project documents are complete, current and stored appropriately.
- Coordinating with other departments for requirements.
- Good understanding of the domain; executed several medium-to-large projects and handled multiple projects/teams in an onsite/offshore execution model, including contracts and project-specific service-level agreements (SLAs).
- Liaison with Onsite Engagement Managers for Customer Relationship Management.
- Build strong client relationships and appropriately manage client expectations
- Responsible for meeting the current business plans as well as utilizing resources for meeting the challenges of the future.
- Share knowledge and best practices with the team; facilitate project review sessions at the close of each project.
- Responsible for interviewing and approving candidates to be staffed on team.
- Escalation point of contact for customers across various projects, ensuring immediate attention and resolution.
- Handle client meetings, employee retention, HR issues, performance management, and customer satisfaction reviews.
- Handling production support activities
- Experience in Incident/Problem and Knowledge Management process areas.
- Working with global distributed teams
- Excellent oral and written communication skills, flexibility to accommodate working across different time zones and sensitivity to cultural and geographical differences.
- Experience in providing IT support in an onshore/offshore model
Senior Technical Lead
Responsibilities:
- Requirements Analysis
- Estimation
- Scheduling
- Preparing Business Requirement Document
- Preparing Business Design Document
- Preparing Work Breakdown Structure
- Preparing Technical Design Document
- Provide weekend support
- Preparing abend reports
- Review the deliverables
- Communicating with the clients
- Testing (Unit & Integration)
- UAT Support
- Implementation Support.
- Production support activities.
Technologies: VS COBOL II, VSAM, DB2, ENDEVOR, FILE MANAGER, CONTROL-D, CONTROL-M, DUMP MASTER and TRACE MASTER
Confidential
Module Lead
Responsibilities:
- Involved in enhancement, maintenance, and production support of the Wholesaler system applications
- Responsible for assigning support calls to team members based on the priority and complexity of the issues, reviewing the tasks, and ensuring the quality of the deliverables is high
- Responsible for arranging KT (knowledge transfer) sessions on the Wholesaler system applications and the production support process for new team members
Technologies: VS COBOL II, VSAM, DB2, ENDEVOR, FILE MANAGER, BETA92 and QUIKJOB
Confidential
Software Engineer
Responsibilities:
- Aid in preparing the data model and in client interaction
- Design of File Layout and Creation of Copy Books
- Preparation of Program Specifications and review of Unit Test Plans and Unit Test Results
- Test Data
- Unit Testing
- Offshore support for System Integration Testing and User Acceptance Testing
Technologies: VS COBOL II, VSAM, DB2, ENDEVOR, FILE MANAGER, BETA92 and QUIKJOB