Sr. Software Engineer Resume
GA
SUMMARY
- Seeking an analytical role in a data-driven organization to apply my data engineering and business intelligence skills to the following:
- Create actionable insights that help business users make better business decisions.
- Construct full spectrum data pipelines from raw data to consumption and analysis.
- 12+ years of IT experience in data warehousing and data lakes, with emphasis on project planning and management, business requirement analysis, application design, development, testing, implementation, and maintenance of data warehouses using a wide range of technologies such as Spark, Scala, Cassandra, Google BigQuery, the Hadoop big data stack, Teradata, Informatica, Unix, Java, and Erwin.
- 7+ years of experience as an ETL Developer and Data Architect building enterprise data warehouse applications using Teradata, Informatica, and Erwin data modeling, and 5+ years of relevant experience developing enterprise big data applications and building scalable, high-performance big data analytical systems, specializing in the Hadoop platform and distributed computing with Spark, Scala, and Java.
- Expertise in tuning Spark jobs through configuration changes, analysis of cluster and pool resources, and repartitioning of Spark RDDs with the repartition and coalesce functions.
- Hands-on experience developing Spark ETL processes for different systems (Oracle, SQL Server, Teradata, Hive, HBase, Cassandra).
- Expertise in working with large collections of data sets using Spark SQL, in-memory DataFrames, and Datasets.
- Hands-on experience loading CSV, Avro, XML, and JSON file formats with Spark and Scala.
- Created a POC to load data from flat files into Splice Machine using Splice import procedures.
- Good expertise in implementing ETL processes with Splice Machine.
- Switched from Hive to Splice Machine to meet the needs of business-critical applications.
- Dynamic data analytics leader with a successful track record of building data warehouses, business intelligence, and analytic solutions that empower companies to harness and monetize their data assets.
- Broad knowledge of and perspective on data pipelines, data collection, data management, data engineering, reporting, analytics, and product/application development.
- Solid understanding of analytical and transactional databases, including Teradata EDW, Google Cloud, and Hadoop systems.
- Excellent leadership skills, with the ability to establish rapport, build teams, foster relationships, and communicate clearly across different organizations.
- Played a pivotal role in building a centralized Enterprise Data Hub on the Hadoop platform that caters to all the data analytics needs of an enterprise.
- Good experience in architecting real-time streaming applications and batch-style, large-scale distributed computing applications using tools such as Spark, Sqoop, MapReduce, and Hive.
- Good experience in creating complex data ingestion pipelines, data transformations, data management and data governance processes, and real-time streaming engines at an enterprise level.
- Good experience in developing and implementing big data solutions and data mining applications on Hadoop using Hive, Pig, HBase, Cassandra, Hue, and Oozie workflows, and in designing and implementing Java MapReduce programs.
- Extensive hands-on experience in writing complex MapReduce jobs, Pig scripts, and Hive data models.
- Hands-on experience in writing MapReduce jobs using Java and Maven.
- Well versed in installing, configuring, supporting, managing, and fine-tuning petabyte-scale Hadoop clusters.
- Experience in using Teradata Administrator, Teradata Manager, Teradata PMON, and Teradata SQL Assistant, and in writing Teradata load/export scripts with BTEQ, FastLoad, MultiLoad, TPump, TPT, and FastExport in UNIX/Windows environments.
- Supported ad hoc analysis requests from various business units under deadline. Conducted ETL using Talend on various RDBMS platforms, and performed quantitative and hypothesis analysis to deliver solutions for ad and promotion decisions.
- Expert in NoSQL databases: MongoDB, Cassandra, and HBase.
- Expert in Spark/Scala programming, Spark SQL, Unix scripting, Hive/Beeline, Pig, Sqoop, Flume, Splunk, and Hue.
- Good experience in developing Java Spring Boot applications and REST APIs.
TECHNICAL SKILLS:
Big Data & Hadoop Tools: Hadoop, Spark 1.6, MapReduce, Sqoop, Tez, Hue, Hive, Oozie, Flume, and Pig
Databases: Teradata V2R5/V2R6/12/13, Oracle, Microsoft SQL Server, Cassandra, HBase, Google Cloud, and BigQuery
Programming Languages: Scala 2.10.5, C#, Java (Ant/Maven), UNIX Shell Scripting, SQL, and PL/SQL
ETL Tools: Informatica 8.6/9.1, Talend
Teradata Tools & Utilities: BTEQ, FastLoad, MultiLoad, FastExport, TPump, and TPT
Data Modeling: Erwin
Reporting Tool: Tableau
Other Tools: Tivoli Workload Scheduler, SVN, GitHub, Eclipse/Spring STS, Jenkins
Domain Knowledge: Retail, Banking, Finance and Telecom
PROFESSIONAL EXPERIENCE:
Confidential, GA
Sr. Software Engineer
Responsibilities:
- Worked closely with the business to focus on value-added features, and delivered added functionality at regular, frequent intervals to encourage usage and reduce development costs.
- Built strong relationships within the team, worked closely with the business on multiple feeds, and delivered products on time.
- Created end-to-end Spark applications in Scala to perform data cleansing, validation, transformation, and summarization according to requirements.
- Developed Spark Scala jobs to load Citi feeds in text and JSON formats into Hive Parquet tables in the production environment, which reduced disk usage and cost in the Hadoop ecosystem.
- Developed Spark applications in Scala to pull data from Oracle, SQL Server, Teradata, and MySQL databases over JDBC connections.
- Involved in the design and development work to decommission legacy systems and applications such as Epiphany and to migrate their feeds to the Hadoop ecosystem, which saved cost for the company and improved scalability and reliability.
- As part of the retirement of the Epiphany feeds (legacy system), migrated and delivered the Foresee, Acxiom Preferences, and Email IDs Stamping feeds from SQL Server to the Hadoop ecosystem, saving cost for the company.
- Created a POC to load data from flat files into Splice Machine using Splice import procedures.
- Involved in the development of the Home Depot applications ECC, CCA, CGR, and SVOC using Spark/Scala, Hive, Sqoop, Unix scripting, Java, Maven, and Jenkins.
- Upgraded and repackaged CRM 14-Hadoop applications from Java6/Tomcat6 to Java7/Tomcat7 and deployed successfully onto Grid Stats & Hadoop Production Bastion Servers.
- Implemented CGR, SVOC, ECC, and CCA changes to process operational and marketing datasets in Hadoop.
- Delivered defect-free project deliverables to the Hadoop production environment on time using TDD techniques.
- Ensured that all business functionality was thoroughly tested so that each product or application went live in production with no defects.
Confidential, GA
Sr. Bigdata Engineer
Responsibilities:
- As part of the Big Data Center of Excellence (COE), responsible for creating technical guidance, road maps, and strategies for delivering various big data solutions throughout the organization.
- Worked on data migration from existing data sources to Hadoop file system.
- Developed the MES ingestion framework using Spark Scala programs for pushing daily source files into HDFS and Hive tables.
- Analyzed customer business use cases and translated them into analytical data applications and models to implement solutions.
- Created custom database encryption and decryption UDFs that could be plugged in while ingesting data into external Hive tables to maintain security at the table or column level.
- Worked on different applications such as MSP, CAPM, TELEGENCE, OPUS, COLUMBUS & Click Stream in Confidential &T.
- Developed Spark programs for different patterns of data on Hadoop cluster.
- Created data ingestion plans for loading data from external sources using Sqoop, Teradata FastExport, and Data Router.
- Implemented dynamic partitions, bucketing, and compression techniques in Hive external tables and optimized the worst-performing Hive queries.
- Developed ETL processes for data acquisition and transformation using Spark.
- Troubleshot issues during the integration, testing, and production readiness phases.
- Wrote technical design documents, deployment documents, supporting documents, and release notes.
- Involved in the entire software development cycle, spanning requirements gathering, analysis, design, development, building, testing, and deployment.
- Developed real-time APIs and business rules using Java and Unix scripts.
Confidential
Bigdata Hadoop Developer
Responsibilities:
- Gathered requirements, built logical models, and provided quality documentation of detailed user requirements for the design and development of the project.
- Good experience in designing and implementing end-to-end data security and governance within the Hadoop platform using LDAP, Kerberos, etc.
- Configured Hadoop on Linux and deployed applications.
- Developed scripts as required, including Java MapReduce programs, Pig and Hive scripts, and Sqoop jobs.
- Extracted data from Oracle to HDFS using Sqoop. Built rules for different product lines as Hive UDFs for processing covered and uncovered product lines.
- Developed an ingestion framework by writing Unix automation shell scripts for pushing daily source files into HDFS.
- Developed load and extract ETLs, implemented history load and incremental load scripts in Teradata, and wrote shell scripts for the extract and load ETLs.
- Developed ETL logic per Cisco standards for each stage (source to flat file, flat file to stage, stage to work, work to work interim tables, and work interim tables to target tables) using big data tools.
- Fixed issues from the source systems through to the downstream data mart during development until the code went live in production, and provided post-production support.
Confidential
Sr. Teradata & ETL Developer
Responsibilities:
- Communicated with business users and analysts on business requirements. Gathered and documented technical and business metadata.
- Prepared ETL scripts for data acquisition and transformation. Developed various mappings in Informatica using transformations such as Source Qualifier, Joiner, Filter, Router, Expression, and Lookup.
- Created conceptual, logical, and physical database models for different metadata tables, views, and related database structures using Erwin.
- Developed database architectural designs and models and implemented business requirements.
- Coded in Teradata BTEQ SQL, implemented ETL logic using Informatica, and transferred files using an SSH client.
- Populated or refreshed Teradata tables using FastLoad, MultiLoad, and FastExport utilities and scripts for user acceptance testing and for loading history data into Teradata.
- Created and wrote Unix shell scripts (Korn shell, KSH).
- Prepared test cases and performed unit and integration testing.
- Performance-tuned long-running queries. Worked on complex queries to map the data as per the requirements.
- Reduced Teradata space usage by optimizing tables: added compression where appropriate and ensured optimal column definitions.
- Handled production implementation and post-production support.
