Senior Software Engineer Resume
Durham, NC
SUMMARY
- 7+ years of total IT experience.
- Experience in data architecture, data lakes, business intelligence, Big Data (Hadoop), data warehousing, backend distributed design and development, ETL data integration, and cloud migration.
- Expertise in Snowflake cloud data warehousing on AWS using SnowSQL, Python, and Apache Airflow (an illustrative Airflow sketch follows this summary).
- Experience in building data lakes and ingesting data into them with a custom ingestion framework.
- Experience in building ETL/ELT pipelines and continuous data loading into a cloud data warehouse.
- Extensive experience with Big Data technologies: Hadoop, Hive, Pig, HBase, Spark, Kafka, Java, and Python.
- Experience with cloud platforms (Amazon S3 and Snowflake), as well as other AWS and GCP cloud products.
- Around 2 years of Big Data/Hadoop experience.
- 3+ years of experience with ETL (DataStage), ELT, automation techniques, Informatica, Netezza, Oracle, Teradata, Postgres, MySQL, SQL Server, etc.
- 2+ years of experience in Apache Spark, Scala, Java/J2EE, Hive, Impala, Oozie, Kafka, and MongoDB.
- 2+ years of experience with Amazon Web Services: EC2, Elastic MapReduce (EMR), S3, Lambda, Redshift, and RDS.
- Experience with AWS core services: CloudFormation, EC2, ECS/Docker, ELB, CodePipeline, CodeDeploy, CodeBuild, CodeCommit/Git, RDS, S3, CloudWatch, Lambda, IAM, and Jenkins.
- Supported a Technical Program Manager, a Research Scientist, and a growing virtual team in analyzing usage data to derive new insights and fuel customer success.
- Built a high-quality BI and data warehousing team and designed the team to scale.
- Created ETL processes to take data from operational systems such as Adobe Target, Adobe Connect, and Salesforce and build a unified dimensional (star schema) data model for analytics and reporting.
- Led a team of data engineers responsible for developing data engineering solutions (data model design and development, ETL development, and reporting and analytical solutions).
- Experience in Core Java with a strong understanding and working knowledge of object-oriented concepts, collections, multithreading, exception handling, and polymorphism.
- Experience in architecture and technology leadership across batch and streaming data processing platforms built on Big Data and cloud data technologies.
- Experience with Big Data tools: Hadoop HDFS, Apache Spark, Hive, Pig, Sqoop, Flume, Cloudera, and Hue.
- Experience in the design, development, and implementation of Big Data applications.
- Expert in writing SQL in Hive/Impala environments.
- Experience in creating audit control systems for ETL processes in Big Data and data warehouse applications.
- Strong skills in ETL (DataStage/Informatica) architecture, design, and development, and in performance tuning of data warehouse and Big Data workloads.
- Proficient in data analysis, data validation, data lineage, data cleansing, data verification, and identifying data mismatches.
- Experience in version control, DataStage version upgrades, and code migration using GitHub.
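The continuous-loading work summarized above (SnowSQL, Python, and Apache Airflow) typically takes the shape of a scheduled DAG. The sketch below is a minimal, hypothetical Airflow example of that pattern; the DAG name, task names, and callables are placeholders rather than the actual production pipeline.

```python
# Minimal sketch of orchestrating a daily Snowflake load with Apache Airflow.
# DAG and task names are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_to_s3(**context):
    # Placeholder: pull the day's extract from the source system and land it in S3.
    print("extracting source data to the S3 landing zone")


def copy_into_snowflake(**context):
    # Placeholder: run a SnowSQL/COPY INTO command against the staged files.
    print("running COPY INTO against the Snowflake stage")


with DAG(
    dag_id="daily_snowflake_load",   # hypothetical DAG name
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_to_s3", python_callable=extract_to_s3)
    load = PythonOperator(task_id="copy_into_snowflake", python_callable=copy_into_snowflake)

    extract >> load  # the load task runs only after extraction succeeds
```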
TECHNICAL SKILLS
Cloud Platform: AWS
Big Data / Hadoop: Cloudera 4.x/5.x Manager/Hue, Apache Hadoop, Hortonworks HDP 2.3, Hive, Pig, Sqoop, Kafka 0.9.0.x/0.10.2.0, Ambari, NiFi
ETL Tools: Informatica 6.2/7/9.5, DataStage 7.5/8.7/11.3, BTEQ, FastLoad, MultiLoad, TPT
Languages: SQL, PL/SQL, Python, Java, Shell Script
BI Analytics / Visualization: Tableau, MicroStrategy, Impala, Beeline, Hive, Microsoft Azure, Elasticsearch, real-time analytics using Kibana 5.4 (Timelion), Zoomdata
Database/File System: MongoDB, Teradata 14/V2R6/V2R5, Oracle 10g/9i, Netezza, SQL Server 2000/2005/2008, DB2, Hadoop HDFS
Operating Systems: Linux, IBM AIX, Ubuntu
IDE Tools: IntelliJ IDEA, Eclipse, NetBeans
Design Tools: ERwin 9.5.2/7.3/4.1, MS Visio 2007, PowerDesigner 15.2/16.5, IBM InfoSphere Data Architect 8.x/9.x
Agile Tools: Rally, JIRA
PROFESSIONAL EXPERIENCE
Confidential, Durham, NC
Senior Software Engineer
Responsibilities:
- Created and implemented microservices using Spring Boot.
- Designed and developed the application using the Agile Scrum methodology.
- Used Amazon Web Services (AWS) such as EC2, S3, RDS, CloudWatch, and CloudFront for promoting code across environments.
- Designed the various tables required for the project in the Oracle database.
- Prepared JUnit and integration test cases and integrated them with Jenkins.
- Followed the Agile methodology during the data design process.
- Migrated existing DataStage jobs to equivalent Spark-Scala implementations with reusable artifacts.
- Ingested Netezza data into Hadoop, and vice versa, using Sqoop scripts.
- Designed and customized Kafka clients such as producers and consumers.
- Developed new, complex Spark-Java applications to transform source input files into target output files.
- Developed Spark-Java applications using map-side and reduce-side joins based on JIRA stories (see the join sketch after this list).
- Created Linux shell scripts to invoke Hadoop jobs and deploy artifacts.
- Scheduled Oozie workflows to execute the Spark applications, data ingestion components, and Hadoop jobs as a directed acyclic graph (DAG) of actions with control flows.
- Optimized the Spark applications by using the Spark APIs efficiently and effectively to handle large-volume datasets.
- Designed data models in Hive to load complex source data into HDFS.
- Designed data parsing, data recovery, fault tolerance, and exception-handling modules using Spark/Java.
- Designed ETL data pipeline flows to ingest data from RDBMS sources into Hadoop using shell scripts, Sqoop, SSIS packages, and MySQL.
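The map-side versus reduce-side join work above was done in Spark-Java; the short PySpark sketch below illustrates the same pattern under the assumption of a large fact dataset and a small dimension dataset. A broadcast join avoids the shuffle a reduce-side join requires. Dataset paths and column names are hypothetical.

```python
# Illustrative PySpark sketch of a map-side (broadcast) join versus a regular
# shuffle (reduce-side) join. Paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("join_sketch").getOrCreate()

orders = spark.read.parquet("hdfs:///data/orders")        # large fact dataset
customers = spark.read.parquet("hdfs:///data/customers")  # small dimension dataset

# Map-side join: broadcast the small dimension so the large side is not shuffled.
map_side = orders.join(broadcast(customers), on="customer_id", how="left")

# Reduce-side join: both sides are shuffled on the join key (default behavior).
reduce_side = orders.join(customers, on="customer_id", how="left")

map_side.write.mode("overwrite").parquet("hdfs:///data/output/orders_enriched")
```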
Confidential, Fairfax, VA
Software Engineer
Responsibilities:
- Migrated Oracle Exadata to a cloud platform (Snowflake), using AWS as the data lake.
- Responsible for cloud application architecture design and deployment.
- Defined the architecture for the data migration to the AWS cloud and the Snowflake database.
- Defined a Spark-based application for tokenizing sensitive data before migrating it to the cloud.
- Developed CDC utilities for ETL with the ability to identify inserted and updated records, load Type-2 dimensions, and load snapshot, fact, and transaction tables.
- Responsible for design and development activities, which included leading, participating in, or supporting the concept, planning, design, and execution stages, including:
- Building ETL pipelines for continuous data loading into the Snowflake cloud data warehouse.
- Developing scripts in Python and SnowSQL to migrate data into the cloud (see the loader sketch after this list).
- Working on existing data integration projects to load data into Netezza as well as into Hive as a foundation layer.
- Developing ETL jobs to integrate data from multiple sources and load it into the Netezza warehouse for BI and reporting analytics.
- Working on data sharing and data security.
- Collaborated with Product Management to define the product vision and guide teams in planning, designing, and building software.
- Defined the strategy for data ingestion into the S3 data lake and used Hive to define how the data is consumed.
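The Python/SnowSQL migration scripts mentioned above generally stage extracted files and then load them with COPY INTO. The sketch below is a minimal, hypothetical example of that loader using the snowflake-connector-python library; the account, credentials, stage, file path, and table names are placeholders.

```python
# Minimal sketch of a Python/SnowSQL-style loader into Snowflake.
# Connection parameters, stage, and table names are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",   # hypothetical account identifier
    user="etl_user",
    password="***",
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="STAGING",
)

try:
    cur = conn.cursor()
    # Upload the extracted file to an internal stage.
    cur.execute("PUT file:///data/extracts/orders.csv @orders_stage AUTO_COMPRESS=TRUE")
    # Load the staged file into the target table.
    cur.execute(
        "COPY INTO orders_raw FROM @orders_stage "
        "FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '\"' SKIP_HEADER = 1)"
    )
finally:
    conn.close()
```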
Confidential, Raleigh, NC
ETL Developer
Responsibilities:
- Defined projects, environment variables, user groups, and privileges, and set up different environments (Dev/QA/Prod). Created master controlling sequencer jobs using the DataStage Job Sequencer.
- Extensively used the CDC (Change Data Capture) stage to implement slowly changing dimension and fact tables (see the Type-2 sketch after this list).
- Designed the complete 3NF and end-state ETL architecture and developed around 150 jobs, routines, and sequences on the DataStage 8.1 platform for the end-to-end extract, transform, and load process, meeting aggressive deadlines.
- Extensively used DataStage components such as the Data Set, Sort, Transformer, Modify, Lookup, Join, Merge, Oracle Enterprise, Change Data Capture, and Slowly Changing Dimension stages.
- Designed, developed, and debugged DataStage jobs to populate the data warehouse used for ad-hoc reporting.
- Prepared interface designs and mappings.
- Performed unit and system integration testing.
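The slowly-changing-dimension work above was implemented with the DataStage CDC stage; as a language-neutral illustration, the hypothetical Python (pandas) sketch below shows the core Type-2 logic: a changed key expires its current row and receives a new current version. All column names and data are illustrative.

```python
# Illustrative sketch of Type-2 slowly changing dimension handling in pandas.
# Column names and data are hypothetical.
import pandas as pd

TODAY = pd.Timestamp("2021-01-15")
OPEN_END = pd.Timestamp("9999-12-31")

# Current state of the dimension table.
current = pd.DataFrame({
    "customer_id": [1, 2],
    "city": ["Durham", "Raleigh"],
    "effective_date": [pd.Timestamp("2020-01-01")] * 2,
    "end_date": [OPEN_END] * 2,
    "is_current": [True, True],
})

# Incoming change-data-capture feed (one changed row, one new row).
incoming = pd.DataFrame({"customer_id": [2, 3], "city": ["Cary", "Apex"]})

# Compare incoming rows against the current dimension rows.
merged = incoming.merge(
    current[current["is_current"]], on="customer_id", how="left", suffixes=("", "_old")
)
changed_ids = merged.loc[merged["city"] != merged["city_old"], "customer_id"]

# Expire the current versions of changed keys.
expire_mask = current["customer_id"].isin(changed_ids) & current["is_current"]
current.loc[expire_mask, "end_date"] = TODAY
current.loc[expire_mask, "is_current"] = False

# Append new current versions (covers both changed and brand-new keys).
new_rows = incoming[incoming["customer_id"].isin(changed_ids)].assign(
    effective_date=TODAY, end_date=OPEN_END, is_current=True
)
dimension = pd.concat([current, new_rows], ignore_index=True)
print(dimension)
```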