Senior Software Engineer Resume
Oakbrook, Illinois
SUMMARY:
- A collaborative big data engineering professional with 5+ years of experience across the software development lifecycle, developing and delivering software solutions for complex business problems in data warehousing and streaming analytics.
- Uses the Hadoop ecosystem's components and architecture to store, analyze, and process large volumes of structured and unstructured data and turn them into competitive advantages.
- Strong willingness and passion for solving challenging problems and finding solutions that turn complexity into efficient simplicity.
- A fast learner, passionate about learning new things, who delivers high-quality work with dedication and commitment.
- Strong understanding and experience in data management and storage using HDFS.
- Thorough conceptual understanding of the Hadoop framework, HDFS, YARN, Object-Oriented Design (OOD), Kafka, Scala, and Apache Spark.
- Hands-on experience developing Hive tables to store structured and semi-structured data in varying formats, in addition to performing HiveQL operations to query and analyze datasets.
- Hands-on experience working with the Cloudera Distribution of Hadoop (CDH).
- Experience working with Oracle 11g RDBMS databases and performing SQL CRUD operations in SQL Developer.
- Experience working in Oracle VM VirtualBox Linux environments such as CDH and Ubuntu.
- Good understanding of Hadoop processing frameworks such as Spark and Spark SQL, and of streaming technologies such as Apache Kafka.
- Hands-on experience developing project-specific APIs using Scala.
- Imported/exported data using the Hadoop data-movement tools Sqoop and Kafka (see the Kafka producer sketch after this list).
- Partitioned and bucketed Hive tables and loaded data into them.
- Loaded data into HBase tables from Hive using the HBase shell and client API (see the HBase sketch after this list).
- Worked with Hive storage formats (SequenceFile, Avro, and RCFile), compression techniques (Snappy), and SerDes for insert/load operations.
- Developed Spark Streaming jobs for real-time data analysis.
- Used Apache Kafka to feed streaming data to Spark Streaming applications.
- Developed Spark applications using Scala, writing RDD-based transformations.
- Wrote Spark SQL scripts for complex data analysis problems.
- Built projects using Eclipse and IntelliJ.
- Deployed Spark jobs in local and YARN modes.
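As a concrete illustration of the Kafka-based data movement above, here is a minimal Scala producer sketch. It assumes the kafka-clients library is on the classpath; the broker address, topic name, and sample log line are hypothetical.

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object LogProducer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        // Hypothetical broker address; serializers are the stock string ones.
        props.put("bootstrap.servers", "localhost:9092")
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

        val producer = new KafkaProducer[String, String](props)
        // Each log line becomes one record on the hypothetical "weblogs" topic.
        producer.send(new ProducerRecord[String, String]("weblogs",
          "203.0.113.42 GET /products/123 200 2016-04-01T10:00:00"))
        producer.close()
      }
    }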
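And a minimal sketch of the HBase load referenced above, using the HBase client API. It assumes hbase-client is on the classpath and an HBase cluster is reachable; the table name, column family, and values are hypothetical.

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
    import org.apache.hadoop.hbase.util.Bytes

    object HBaseLoad {
      def main(args: Array[String]): Unit = {
        val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = connection.getTable(TableName.valueOf("user_events"))

        // One Put per row key; the "cf" column family must already exist.
        val put = new Put(Bytes.toBytes("row-001"))
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("country"), Bytes.toBytes("US"))
        table.put(put)

        table.close()
        connection.close()
      }
    }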
SKILLS & ABILITIES:
Big Data Technologies: Hadoop, MapReduce, Pig Latin, Hive, HBase, Cloudera Distribution of Hadoop, Apache Spark, Scala, Apache Kafka
Data Loading Techniques: Sqoop, Flume, Kafka
Languages & APIs: Scala, Java, Pig, Hive, HBase, MapReduce, Spark, Spark SQL, PL/SQL, Unix Shell
Frameworks: Apache Hadoop, Apache Spark, MapReduce
Operating Systems: Linux (Ubuntu), Windows 10
RDBMS: Oracle, MySQL
NoSQL: HBase, Cassandra
Application Development Tools: IntelliJ, Eclipse, TortoiseSVN, WinSCP, PuTTY, SQL Developer, HP ALM
Build Tools: Maven, sbt
Methodologies: Agile, Scrum
EXPERIENCE:
Senior Software Engineer
Confidential, Oakbrook, Illinois
Responsibilities:
- Participated in information gathering sessions with client subject-matter experts to help improve accuracy in data understanding and to assist with development activities.
- Developed partitioned Hive tables to store structured and semi-structured data in varying formats, and performed HiveQL operations to query and analyze datasets.
- Worked with datasets in varied formats, including Avro, JSON, and Parquet.
- Worked with the Cloudera Distribution of Hadoop, using Hue to navigate HDFS and execute Hive queries, and Cloudera Manager to monitor running jobs.
- Worked with Oracle 11g RDBMS databases and performed SQL CRUD operations in SQL Developer.
- Developed project-specific APIs in Scala using the Spark SQL APIs (see the sketch after this list).
- Used HP ALM for defect tracking and reporting.
- Followed an agile approach to project execution with daily Scrum meetings.
- Prepared module-specific technical design documentation.
- Understanding of data extraction, transformation, and loading (ETL); familiar with Oracle Data Integrator and Oracle GoldenGate, which is certified to capture and deliver data to Oracle Exadata Storage Server for real-time data warehousing and data consolidation solutions.
- Executed client delivery tasks involving large teams across multiple engagements.
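Below is a minimal sketch of the kind of Scala/Spark SQL work described above, reading the Avro, JSON, and Parquet formats mentioned. Paths, column names, and the view name are hypothetical; the Avro reader assumes the spark-avro module (built into Spark 2.4+, an external package in earlier releases).

    import org.apache.spark.sql.SparkSession

    object FormatDemo {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("format-demo").getOrCreate()

        // The same DataFrame reader handles each of the formats above.
        val avroDf    = spark.read.format("avro").load("/data/events.avro")
        val jsonDf    = spark.read.json("/data/events.json")
        val parquetDf = spark.read.parquet("/data/events.parquet")

        // Register a view and query it with Spark SQL.
        jsonDf.createOrReplaceTempView("events")
        spark.sql("SELECT product_id, count(*) AS views FROM events GROUP BY product_id").show()

        spark.stop()
      }
    }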
Trainee
Confidential
Responsibilities:
- Analyzed the popularity of different products in different geographic locations at different times.
- Fed these patterns into the ad engine so it could show more relevant and useful ads to each user.
- The web server logs typically contain details such as IP address, user details, product identifiers, timestamps, and browser details; the geographic location was derived from the IP address.
- Built a simulator application in Java to generate logs similar to those a web server might produce.
- Used Scala to code the producer, the consumer, the Spark Streaming application, and the analysis.
- Integrated the simulator application with a Kafka producer and sent the logs to the Kafka cluster.
- Used ZooKeeper to maintain the offsets of the Kafka cluster.
- Developed a Spark Streaming application to consume data from the Kafka cluster (see the sketch after this list).
- Used regular expressions to extract fields from the streamed data.
- Created a Spark broadcast variable for the IP-address-to-country-code mapping, which was used for further geographic analysis.
- Extracted page views, visits by country, and view status.
- Saved the analyzed data in Cassandra.
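Below is a minimal sketch of how such a pipeline can fit together, assuming Spark 2.x with the spark-streaming-kafka-0-10 module on the classpath. The topic, broker address, log format, and IP-to-country table are hypothetical stand-ins for the project's actual ones.

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010._

    object LogStreamAnalysis {
      // Hypothetical log line: "203.0.113.42 GET /products/123 200 2016-04-01T10:00:00"
      val LogPattern = """(\S+) (\S+) (\S+) (\d{3}) (\S+)""".r

      def main(args: Array[String]): Unit = {
        val ssc = new StreamingContext(new SparkConf().setAppName("web-log-analysis"), Seconds(10))

        // IP-prefix-to-country lookup, shipped to every executor once as a broadcast.
        val ipToCountry = ssc.sparkContext.broadcast(Map("203.0." -> "US"))

        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "localhost:9092",
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "log-analysis")

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc, LocationStrategies.PreferConsistent,
          ConsumerStrategies.Subscribe[String, String](Seq("weblogs"), kafkaParams))

        // Parse each log line with the regex, resolve the country, count visits per batch.
        val visitsByCountry = stream.map(_.value).flatMap {
          case LogPattern(ip, _, _, _, _) =>
            ipToCountry.value.collectFirst { case (prefix, c) if ip.startsWith(prefix) => (c, 1L) }
          case _ => None
        }.reduceByKey(_ + _)

        // The real job persisted results to Cassandra (e.g., via the
        // spark-cassandra-connector); print() stands in here.
        visitsByCountry.print()

        ssc.start()
        ssc.awaitTermination()
      }
    }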
Confidential
Responsibilities:
- The data analysis was performed using Hive.
- Wrote Hive scripts to perform the analysis (see the sketch after this list).
- Used a Regex SerDe to load only the required columns into the Hive table.
- Enabled block compression on the Hive table columns.
- Wrote custom UDFs and UDAFs (e.g., to calculate the mean of scores) and used them in Hive scripts during data analysis.
- Partitioned the Hive table by year and clustered (bucketed) it by grade for efficient querying.
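Below is a minimal sketch of the shape of this analysis, expressed through Spark with Hive support for brevity (the original scripts ran in Hive itself, and the CLUSTERED BY bucketing DDL is omitted here for that reason). The table name, regex, location, and columns are hypothetical; note that the Regex SerDe requires all columns to be declared STRING, and the built-in avg() stands in for the custom mean UDAF.

    import org.apache.spark.sql.SparkSession

    object ScoreAnalysis {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("score-analysis")
          .enableHiveSupport() // assumes a reachable Hive metastore
          .getOrCreate()

        // RegexSerDe keeps only the captured groups; the table is partitioned by year.
        spark.sql("""
          CREATE EXTERNAL TABLE IF NOT EXISTS scores_raw (
            student_id STRING, grade STRING, score STRING)
          PARTITIONED BY (year INT)
          ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
          WITH SERDEPROPERTIES ("input.regex" = "(\\S+)\\s+(\\S+)\\s+(\\S+).*")
          STORED AS TEXTFILE
          LOCATION '/data/scores'
        """)
        spark.sql("ALTER TABLE scores_raw ADD IF NOT EXISTS PARTITION (year = 2015)")

        // Mean score per grade for one year's partition.
        spark.sql("""
          SELECT grade, avg(cast(score AS DOUBLE)) AS mean_score
          FROM scores_raw
          WHERE year = 2015
          GROUP BY grade
        """).show()
      }
    }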
Assistant Software Engineer
Confidential
Responsibilities:
- Performed requirement analysis for enhancements.
- Developed call-routing applications using Nortel PeriProducer.
- Unit tested the code.
- Prepared and maintained system documentation.
- Planned and coordinated QA activities.
- Analyzed defects and fixed production issues.
- Analyzed Java logs to identify the actual errors reported and correlate them with reported user issues.
- Tracked project status for internal use and provided weekly status reports to management.
- Recognized several times for the quality and speed of work delivered.
- Active member of the Risk Team, responsible for backing up critical data, and of the disaster recovery team.