Big Data Developer Resume
San Antonio
SUMMARY:
- Data enthusiast with over 8 years of professional experience in the analysis, design, and development of software applications using distributed technologies including Spark, Spark SQL, Hive, Sqoop, HDFS, Scala, and DataStage.
- Cloudera Certified Spark and Hadoop Developer (CCA 175).
- Good working knowledge of Flume, Kafka, Spark Streaming, and MongoDB.
- Good experience with the DataStage components Designer, Director, and Administrator for designing, developing, and debugging jobs.
- Experience developing parallel jobs, stages, and sequencers in IBM DataStage.
- Good knowledge of UNIX commands.
- In-depth knowledge of data warehousing concepts, with emphasis on ETL, Slowly Changing Dimensions (Type 1, Type 2), and Change Data Capture (a conceptual sketch follows this list).
- Extensively involved in developing OLTP and OLAP schemas on relational database systems (RDBMS) such as Oracle.
- Experience working with diverse software methodologies, including Agile/Scrum.
- Excellent communication and interpersonal skills; quick to learn new technologies.
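The following is a minimal Scala/Spark sketch of the Slowly Changing Dimension Type 2 / Change Data Capture pattern referenced above. The table names (dw.customer_dim, staging.customer_updates) and the tracked columns are illustrative assumptions, not objects from any project listed below.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object ScdType2Sketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("scd-type2-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Current (active) dimension rows and today's incoming extract.
    // Table and column names are placeholders.
    val dim      = spark.table("dw.customer_dim").filter(col("is_current") === true)
    val incoming = spark.table("staging.customer_updates")

    // Detect changed records by comparing the tracked attributes.
    val changed = incoming.alias("inc")
      .join(dim.alias("dim"), col("inc.customer_id") === col("dim.customer_id"))
      .filter(col("inc.address") =!= col("dim.address") ||
              col("inc.segment") =!= col("dim.segment"))

    // Expire the old versions and create new versions with fresh effective dates.
    val expired = changed.select(col("dim.*"))
      .withColumn("is_current", lit(false))
      .withColumn("end_date", current_date())

    val newVersions = changed.select(col("inc.*"))
      .withColumn("is_current", lit(true))
      .withColumn("effective_date", current_date())
      .withColumn("end_date", lit(null).cast("date"))

    // Brand-new keys (not shown) would simply be appended as current rows.
    // A real job would merge these back into dw.customer_dim; a staging table keeps the sketch simple.
    expired.unionByName(newVersions, allowMissingColumns = true)
      .write.mode("append").saveAsTable("dw.customer_dim_changes")
  }
}
```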
TECHNICAL SKILLS:
Programming: Scala, SQL, Shell/Bash scripting
Big Data Technologies: Spark, Hive, Spark SQL, Sqoop, HDFS, Flume
ETL and BI Tools: IBM DataStage (v8.7, v9.1, v11.5), Tableau
Databases: Oracle, DB2, SQL Server, MongoDB
Tools: Control-M, RTC, WinSCP, Putty, SQL Developer, Squirrel, UCD
Methodologies: Agile, Waterfall
PROFESSIONAL EXPERIENCE:
Confidential - San Antonio
Big Data Developer
Responsibilities:
- Built the FIH (Financial Information Hub) data lake by extracting data from different investment sources.
- Created Sqoop jobs for importing data from DB2 & Oracle to HDFS.
- Created Hive schemas using performance techniques like partitioning and bucketing.
- Wrote extensive Hive queries to transform data for use by downstream models.
- Developed Spark jobs using Scala and Spark SQL to evaluate alert risk scores (see the sketch after this list).
- Created Control-M workflow for invoking Spark and Sqoop jobs.
- Developed around 50 IBM DataStage jobs to load the data warehouse and data store.
- Developed complete end-to-end big data processing in the Hadoop ecosystem.
- Automated Enterprise Case and Alert Management System alert and case reports using Tableau.
- Involved in the complete end-to-end code deployment process in production.
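Below is a hedged Scala sketch of the kind of partitioned, bucketed Hive schema and Spark SQL scoring logic described in the bullets above. The database, table, and column names (fih.alerts, alert_risk_score) and the scoring rule itself are illustrative placeholders, not the actual FIH objects or risk model.

```scala
import org.apache.spark.sql.SparkSession

object AlertRiskScoreSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("alert-risk-score-sketch")
      .enableHiveSupport()
      .getOrCreate()

    spark.sql("CREATE DATABASE IF NOT EXISTS fih")

    // Illustrative schema: partitioned by load date and bucketed by account
    // for join/scan performance (Spark-native DDL shown here as an example).
    spark.sql("""
      CREATE TABLE IF NOT EXISTS fih.alerts (
        alert_id   STRING,
        account_id STRING,
        alert_type STRING,
        txn_amount DOUBLE,
        load_dt    STRING
      )
      USING PARQUET
      PARTITIONED BY (load_dt)
      CLUSTERED BY (account_id) INTO 32 BUCKETS
    """)

    // Illustrative transformation: a simple weighted score per alert.
    val scored = spark.sql("""
      SELECT alert_id,
             account_id,
             alert_type,
             CASE WHEN alert_type = 'HIGH_VALUE' THEN 0.6 ELSE 0.3 END
               + LEAST(txn_amount / 100000, 0.4) AS alert_risk_score
      FROM fih.alerts
      WHERE load_dt = '2024-01-01'   -- placeholder partition
    """)

    scored.write.mode("overwrite").saveAsTable("fih.alert_risk_scores")
  }
}
```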
Confidential
ETL Developer
Responsibilities:
- Developed a reusable load framework using IBM DataStage and an Oracle database to reduce future development work (see the sketch after this list).
- Integrated 12 USAA financial data sources into Data Hub.
- Developed DataStage jobs to extract financial data from the enterprise data warehouse.
- Designed and developed jobs for extracting, cleansing, transforming, integrating, and loading data using DataStage Designer.
- Created Hive tables and queries to extract data and feed the forecast analytics engine.
- Created an end-to-end ETL process to feed brokerage data to 12 different downstream systems.
- Enhanced and fine-tuned existing jobs to improve performance.
- Developed reusable UNIX shell scripts for FTP, Email Notifications and to trigger DataStage jobs.
- Created 500 new Control-M jobs in each environment (Dev, Test, and Prod), along with process flow diagrams.
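The sketch below illustrates, in Scala/Spark SQL, the reusable-extract idea behind the load framework and the Hive feeds mentioned above. The actual framework was built with IBM DataStage and Oracle; the helper, query, and table names here (runExtract, datahub.financial_positions, analytics.forecast_input) are invented for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

/** Illustrative reusable loader: run a named extract query and land the result in a
  * target table. Mirrors the idea of a shared load framework; queries and table
  * names are placeholders. */
object ReusableLoadSketch {

  def runExtract(spark: SparkSession, extractSql: String, targetTable: String): Unit = {
    val df: DataFrame = spark.sql(extractSql)
    df.write.mode("overwrite").saveAsTable(targetTable)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("reusable-load-sketch")
      .enableHiveSupport()
      .getOrCreate()

    spark.sql("CREATE DATABASE IF NOT EXISTS analytics")

    // One of several extracts feeding the forecast analytics engine (names invented).
    runExtract(
      spark,
      """SELECT product_line,
                date_format(txn_date, 'yyyy-MM') AS txn_month,
                SUM(balance_amt)                 AS total_balance
         FROM datahub.financial_positions
         GROUP BY product_line, date_format(txn_date, 'yyyy-MM')""",
      targetTable = "analytics.forecast_input"
    )
  }
}
```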
Confidential
ETL Developer
Responsibilities:
- Developed DataStage jobs to extract credit card, customer, account, and transaction details from upstream sources on a daily and monthly basis.
- Developed data quality jobs in DataStage to apply quality checks and generate clean output files (see the sketch after this list).
- Developed CDC logic in DataStage to process the data before loading it into target tables.
- Developed DataStage jobs using processing stages such as Transformer, Lookup, Join, Pivot, and CDC.
- Developed the E3 framework for reusability.
- Migrated all existing jobs to the E3 framework.
- Developed reusable UNIX shell scripts for FTP, Email Notifications and to trigger DataStage jobs.
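A small Scala/Spark sketch of the data quality split described above: rows that pass basic checks go to a clean file, the rest to a reject file for review. The production checks were built as DataStage jobs; the rules, column names, and paths below are assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DataQualitySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("dq-sketch").getOrCreate()

    // Placeholder landing file with daily card transactions.
    val txns = spark.read.option("header", "true").csv("/landing/card_transactions.csv")

    // Basic quality rules: mandatory keys present, amount numeric and non-negative.
    val passesChecks = col("card_number").isNotNull &&
      col("txn_date").isNotNull &&
      col("txn_amount").cast("double").isNotNull &&
      col("txn_amount").cast("double") >= 0

    val clean   = txns.filter(passesChecks)
    val rejects = txns.filter(!passesChecks)

    // Clean file feeds downstream loads; rejects are kept for review.
    clean.write.mode("overwrite").option("header", "true").csv("/clean/card_transactions")
    rejects.write.mode("overwrite").option("header", "true").csv("/rejects/card_transactions")
  }
}
```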