Software Engineer Resume
Phoenix, AZ
SUMMARY
- Almost 6 years of IT experience in the analysis, design, development, and implementation of applications running on various platforms.
- 3+ years of experience with the Hadoop ecosystem and good knowledge of MapReduce, YARN, HDFS, Hive, and Spark.
- Hands-on experience developing big data projects using open-source tools/technologies such as Hadoop, Hive, Oozie, Spark, Kafka, MapReduce, HDFS, Flume, Sqoop, and Impala.
- Hands-on experience writing MapReduce jobs.
- Hands-on experience with HiveQL.
- Developed shell and Python scripts for automating and monitoring jobs.
- Good understanding of machine learning libraries.
- Built Spark Streaming applications to receive real-time data from Kafka and store it in HDFS (a minimal sketch follows this summary).
- Managed scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance across Hadoop distributions (Cloudera CDH, Hortonworks HDP).
- Excellent knowledge of Java and J2EE development for n-tier applications.
- Working knowledge of object-oriented principles (OOP), design, and development, with a good understanding of programming concepts such as data abstraction, concurrency, synchronization, multithreading and thread communication, networking, and security.
- Extensive experience applying best practices throughout the application development process, such as the Model-View-Controller (MVC) approach for better control over application components.
- Developed Hive queries and automated them to run for hourly, daily, and weekly analysis.
- Imported and exported data into HDFS and Hive using Sqoop.
- Deep understanding of data warehouse approaches, industry standards, and best practices. Created Hive tables, loaded data, and wrote Hive queries that execute internally as MapReduce jobs.
- Strong development experience in Apache Spark.
- Experience using Spark with Scala for large-scale streaming data processing.
- Skilled in creating Oozie workflows for scheduled (cron-style) jobs.
- Experienced in writing custom UDFs and UDAFs to extend Hive and Pig core functionality.
- Able to develop Pig UDFs to pre-process data for analysis.
- Experience importing and exporting data with Sqoop between HDFS and relational database systems (RDBMS) such as Teradata.
- Extensive knowledge of relational and dimensional data modeling, star and snowflake schemas, fact and dimension tables, and process mapping using top-down and bottom-up approaches.
- Experience using Informatica client tools: Designer, Source Analyzer, Target Designer, Transformation Developer, Mapplet Designer, Mapping Designer, Workflow Manager, Workflow Monitor, Analyst, and Developer.
- Experience creating high-level and detailed designs during the design phase.
- Experience integrating various data sources such as Oracle, DB2, MS SQL Server, and flat files.
- Experience identifying bottlenecks in ETL processes and performance-tuning production applications using database tuning, partitioning, index usage, aggregate tables, session partitioning, load strategies, commit intervals, and transformation tuning.
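The Spark Streaming experience noted above follows the pattern in the sketch below: a minimal, hypothetical Spark 2.x job (Java, spark-streaming-kafka-0-10 integration) that consumes Kafka messages and writes each micro-batch to HDFS. The broker, topic, consumer group, and output path are placeholders, not details of any actual project.

```java
// Minimal, hypothetical sketch: consume Kafka messages with Spark Streaming
// and persist each micro-batch to HDFS. All names and paths are placeholders.
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfsJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-to-hdfs");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(60));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");        // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "ingest-group");                 // hypothetical consumer group
        kafkaParams.put("auto.offset.reset", "latest");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Arrays.asList("events"), kafkaParams)); // hypothetical topic

        // Persist each non-empty micro-batch under a time-stamped HDFS directory.
        stream.map(ConsumerRecord::value).foreachRDD((rdd, time) -> {
            if (!rdd.isEmpty()) {
                rdd.saveAsTextFile("hdfs:///data/events/batch_" + time.milliseconds());
            }
        });

        jssc.start();
        jssc.awaitTermination();
    }
}
```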
TECHNICAL SKILLS
Programming Technologies: C, C++, Java 8, Python
ETL Tools: Magellan (Amex Internal Tool), Informatica, SSIS
Big Data Technologies: Apache Spark 1.6/2.2, HDFS, Amazon S3, YARN, Apache Oozie, Apache Hive, Cloudera Impala, Apache Cassandra
Web Application Development: J2EE, Servlets, Java Server Pages (JSP), Spring, Hibernate
Markups: HTML, CSS, XML, XSL
Application Build Tools: Apache Maven
Automation/Scripting: Unix Shell Scripts
Application Servers: Apache Tomcat, BEA WebLogic, IBM WebSphere
Storage Technologies: SQL, PL/SQL, Stored Procedures, Triggers, CQL, Hive QL, Parquet
Databases: (RDBMS) Oracle 10g/11g, SQL Server, Amazon RDS/Aurora, MySQL, Amazon Redshift; (NoSQL) Cassandra
XML Technologies: XML, XML Schema/XSD, XSLT
Transport Mechanisms: HTTP, SOAP, REST (JSON)
SDLC: Agile, Scrum, Iterative Waterfall
IDE/Development Software Suite: Eclipse, NetBeans, Adobe Flex/Flash Builder, Microsoft Visual Studio, IntelliJ IDEA
Message Queues: ActiveMQ, WebLogic JMS, Amazon SNS/SQS, Apache Kafka
Version Control Systems: SVN, CVS, Git (Stash/Bitbucket)
Content Management Systems (CMS): Liferay, CQ 5/AEM 5.6
Continuous Integration Software: Hudson, Jenkins
Operating Systems: Microsoft Windows 2000/XP/Vista/7, Unix, Linux, OS X
PROFESSIONAL EXPERIENCE
Confidential, Phoenix, AZ
Software Engineer
Responsibilities:
- Involved in designing the data flow architecture and building reusable components, services, and applications that gather data from different systems into the platform and transform data from the Cornerstone platform.
- Working on the development of multiple variables to support Confidential's complex data analytics.
- Monitored jobs in the Event Engine scheduler.
- Worked on project design and development focused on automating jobs for different business analytics use cases using Magellan, Hive, Event Engine, and scripts.
- Used Maven for dependency management.
- Applied design patterns and coding best practices to meet high code quality standards.
- Built unit (JUnit) tests for all components to improve software quality and coordinated integration and user acceptance testing.
- Implemented Jersey integration with Spring to expose RESTful web services (see the sketch after this list).
- Developed Java service classes for commercial and personal clients to invoke web services that retrieve information from the external system.
- Developed app-tier using Java, J2EE, Eclipse and Tomcat.
- Used Spring Framework’s Dependency Injection (IoC) framework to configure application components and manage their lifecycle.
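The Jersey/Spring bullet above refers to the kind of wiring sketched below: a minimal, hypothetical JAX-RS resource managed as a Spring bean (via Jersey's Spring integration). The class, endpoint, and service names are illustrative and are not the actual Confidential services.

```java
// Minimal, hypothetical sketch of a Jersey (JAX-RS) resource wired to a
// Spring-managed service bean; names and endpoints are illustrative only.
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component
@Path("/clients")
public class ClientResource {

    // Injected by Spring; Jersey's Spring integration bridges the two containers.
    @Autowired
    private ClientService clientService;

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public String getClient(@PathParam("id") String id) {
        // Delegates to the service layer, which calls the external system.
        return clientService.lookupAsJson(id);
    }
}

// Hypothetical service contract implemented elsewhere as a Spring bean.
interface ClientService {
    String lookupAsJson(String id);
}
```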
Environment: HDFS, Magellan, Java 8, Spring 4.x, XML, JSON, Apache Spark 1.6.0/2.2.0, Apache Hive, REST, MySQL 5.6, JUnit, Linux, Cloudera 5.x, Tomcat, Jenkins, Apache Kafka 0.9.x/0.10.x, Swagger, Parquet, Git, Maven, IntelliJ IDEA, Apache Oozie, Agile/Scrum, Beeline
Confidential, Phoenix, AZ
Data Engineer
Responsibilities:
- Developed applications to support Confidential's complex data analytics.
- Developed, analyzed, designed, and upgraded complex data applications using Hive, Event Engine, Cornerstone, Magellan, web services, JSON, SQL, Jenkins, Git, and shell scripts.
- Built data pipelines from various sources and ingested data into the big data platform (Cornerstone) using built-in tools such as CDM, CLOAK, CMD, and Event Engine.
- Used Event Engine to schedule workflows that run Hive and Spark jobs to transform data on a recurring schedule.
- Worked with Apache Spark SQL and DataFrame functions to perform transformations and aggregations on complex semi-structured data (see the DataFrame sketch after this list).
- Experienced in developing Spark applications using the Spark RDD, Spark SQL, and DataFrame APIs.
- Responsible for building solutions for large data sets using SQL methodologies and data integration tools across databases.
- Experienced in developing and deploying shell and Python scripts for automation, notification, and monitoring.
- Used Bitbucket/Git for maintaining components and for release and version management.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data from relational databases into HDFS using Sqoop imports.
- Developed Sqoop scripts to import and export data from relational sources and handled incremental loading of customer and transaction data by date.
- Migrated an existing Java application to microservices using Spring Boot and Spring Cloud.
- Working knowledge of IDEs such as Eclipse and Spring Tool Suite.
- Working knowledge of Git and Ant/Maven for project dependency management, builds, and deployment.
- Developed Spark and Spark SQL/Streaming code for faster testing and processing of data.
- Imported data from different sources such as HDFS and HBase into Spark RDDs.
- Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
- Experienced in implementing Spark RDD transformations and actions to support business analysis.
- Worked on migrating MapReduce programs into Spark transformations using Spark and Scala.
- Used Apache Oozie to schedule workflows that run Spark jobs to transform data on a recurring schedule.
- Migrated HiveQL queries on structured data to Spark SQL to improve performance.
- Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
- Worked on partitioning Hive tables and running scripts in parallel to reduce their run time.
- Worked with data serialization formats (Avro, Parquet, JSON, CSV) for converting complex objects into byte sequences.
- Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on data.
- Involved in administering, installing, upgrading, and managing distributions of Hadoop, Hive, and HBase.
- Involved in troubleshooting and performance tuning of Hadoop clusters.
- Created Hive tables, loaded data, and wrote Hive queries that execute internally as MapReduce jobs.
- Implemented business logic by writing Hive UDFs in Java (a minimal UDF sketch follows this list).
- Wrote XML workflow definitions to build Oozie functionality.
- Used Oozie operational services for batch processing and dynamically scheduling workflows.
- Worked on creating end-to-end data pipeline orchestration using Oozie.
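The Spark SQL/DataFrame bullets above are illustrated by the hedged sketch below: a minimal, hypothetical Spark 2.x job that reads semi-structured JSON, aggregates it with DataFrame functions, and writes Parquet for downstream Hive/Impala queries. The paths and column names are placeholders, not fields from any actual data set.

```java
// Minimal, hypothetical Spark 2.x DataFrame job: read semi-structured JSON,
// aggregate per customer, and write the result as Parquet. Paths and column
// names are placeholders.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.count;
import static org.apache.spark.sql.functions.sum;

public class TransactionRollup {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("transaction-rollup")
                .getOrCreate();

        // Semi-structured input: one JSON record per line.
        Dataset<Row> txns = spark.read().json("hdfs:///data/raw/transactions");

        // Filter, group, and aggregate with DataFrame functions.
        Dataset<Row> rollup = txns
                .filter(col("amount").gt(0))
                .groupBy(col("customer_id"))
                .agg(sum("amount").alias("total_amount"),
                     count(col("customer_id")).alias("txn_count"));

        // Columnar output for downstream Hive/Impala queries.
        rollup.write().mode("overwrite").parquet("hdfs:///data/curated/txn_rollup");

        spark.stop();
    }
}
```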
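The Hive UDF bullet above refers to the kind of extension sketched next: a minimal, hypothetical Hive UDF in Java using the classic org.apache.hadoop.hive.ql.exec.UDF API. The class name, package, and masking rule are illustrative, not the actual business logic.

```java
// Minimal, hypothetical Hive UDF: masks all but the last four characters of
// an input string. Register in Hive with, for example:
//   CREATE TEMPORARY FUNCTION mask_tail AS 'com.example.hive.MaskTail';
package com.example.hive;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

@Description(name = "mask_tail",
             value = "_FUNC_(str) - masks all but the last four characters of str")
public class MaskTail extends UDF {

    public Text evaluate(Text input) {
        if (input == null) {
            return null;                      // Hive convention: null in, null out
        }
        String value = input.toString();
        if (value.length() <= 4) {
            return new Text(value);
        }
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < value.length() - 4; i++) {
            masked.append('*');
        }
        masked.append(value.substring(value.length() - 4));
        return new Text(masked.toString());
    }
}
```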
Environment: HDFS, Magellan, Hive, Event Engine, Shell scripting, Lucy.