Java/Big Data Developer Resume

TX

SUMMARY

  • Professional Big Data Engineer with 6 years of industry experience in the analysis, design, development, documentation, implementation, deployment, testing, and maintenance of software systems in server-side and distributed programming.
  • Initiative-taking, dependable, and analytical troubleshooter with strong attention to detail.
  • Looking forward to solving the real-time problems faced by your esteemed organization.
  • In-depth experience in using Hadoop ecosystem tools like MapReduce, HDFS, Hive, YARN, Sqoop, Spark, Oozie, and Zookeeper.
  • Excellent understanding of Hadoop architecture and various ecosystem components such as HDFS and MapReduce programming paradigm.
  • Hands-on use of Apache Hadoop along with the Hortonworks enterprise distribution; good knowledge of the MapR distribution and Amazon EMR.
  • Good knowledge of data modeling, use-case design, and object-oriented concepts.
  • Well versed in the installation, configuration, support, and management of Big Data workloads and the underlying Hadoop cluster infrastructure.
  • Extensively worked on Spark Streaming to fetch live stream data.
  • Experience in converting Hive/SQL queries into RDD transformations using Apache Spark and Scala (a minimal sketch follows this summary).
  • Implemented Dynamic Partitions and Buckets in HIVE for efficient data access.
  • Integrated Hive queries into Spark environments using Spark SQL.
  • Good working experience using Sqoop to import data into HDFS from RDBMS and vice-versa.
  • Good knowledge in developing data pipelines using Flume, Sqoop, and Pig to extract the data from weblogs and store in HDFS.
  • Ingested data using Sqoop from various RDBMSs such as Oracle, MySQL, and Microsoft SQL Server into HDFS.
  • Expertise in data modeling, administration, and development using SQL, T-SQL, and PL/SQL in Oracle (8i, 9i, and 10g), MySQL, DB2, and SQL Server environments.
  • Proficient with Hibernate and JDBC for connecting to databases such as Oracle, MySQL, and DB2 to store, manipulate, and retrieve data across many applications.
  • Experience in manipulating the streaming data to clusters through Kafka and Apache Spark-Streaming.
  • Track record of increasing responsibility in business software design, microservices, systems analysis/development, and full-lifecycle project management.
  • Hands-on experience in using message brokers such as RabbitMQ.
  • Strong in source control tools such as Subversion (SVN), CVS, IBM ClearCase, Perforce, and Git.
  • Experience with AWS services such as EC2, VPC, CloudFront, Elastic Beanstalk, Route 53, RDS, and S3.
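
A minimal illustrative sketch of the Hive-to-Spark conversion mentioned above, written against Spark's Java DataFrame API rather than the Scala RDD API used on the job; the table and column names (sales, region, amount) are hypothetical.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import static org.apache.spark.sql.functions.*;

    public class HiveToSpark {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("hive-to-spark")
                    .enableHiveSupport()   // read tables registered in the Hive metastore
                    .getOrCreate();

            // Original HiveQL: SELECT region, SUM(amount) FROM sales GROUP BY region
            Dataset<Row> totals = spark.table("sales")          // hypothetical table
                    .groupBy(col("region"))
                    .agg(sum(col("amount")).alias("total_amount"));

            totals.write().mode("overwrite").saveAsTable("sales_totals");
            spark.stop();
        }
    }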

TECHNICAL SKILLS

Big Data Technologies: HDFS, MapReduce, Hive, Sqoop, Oozie, Scala, Spark

Programming Languages: Java, C#, Python, SQL

Web Technologies: HTML5, CSS3, JSON

Cloud Platform: AWS

Web/Application Servers: Tomcat, JBoss, WebSphere, WebLogic

Databases: Oracle, MySQL, MS SQL Server 2012, Snowflake, HBase

IDE and development tools: Eclipse, NetBeans, IntelliJ, PyCharm, SQL Workbench

Build tools: ANT, MAVEN

Repositories: CVS, GitHub, SVN, GitLab

PROFESSIONAL EXPERIENCE

Data Engineer

Confidential, TX

Responsibilities:

  • Experienced in developing data pipelines for data ingestion and transformation using Java, Scala, and Python.
  • Developed data pipelines using Apache Spark and other ETL solutions to move data from various sources, such as Workday and EFG, to the central warehouse.
  • Developed generic code modules and defined strategy to extract data from different vendors and build ETL logic on top of them to feed the data to the central data warehouse.
  • Performed data pre-processing, profiling, and modeling, and created a framework for validating and automating regular checks.
  • Handled daily operational activities to troubleshoot ad hoc production and data issues, and enhanced Big Data and AWS cloud infrastructure to resolve existing issues.
  • Built end-to-end automation using shell scripting, AWS SNS, AROW, and PagerDuty.
  • Used SQL window functions to build a presentation layer on base tables that implements complex logic for business users (see the sketch after this list).
  • Developed and modified data pipelines to run across environments for testing.
  • Experience with Big Data file formats such as Parquet, Avro, and ORC.
  • Developed applications with monitoring, build tools, version control, unit testing, TDD, and change management to support DevOps.
  • Experience in SQL and Shell Scripting.
  • Experience with software design and with understanding cross-system usage and impact.
  • Expertise in Spark, Kafka, AWS, SQL, Python, PySpark
  • Worked on migrating Apache dagger scripts to Tableau reports.
  • Designed and built data processing pipelines using tools and frameworks in the Hadoop ecosystem.
  • Designed and built ETL pipelines to automate ingestion of structured and unstructured data.
  • Designed and built pipelines to facilitate data analysis.
  • Implemented and configured big data technologies and tuned processes for performance at scale.
  • Managed, mentored, and grew a team of big data engineers.
  • Proficient in programming languages including Python, Java, and Scala.
  • Proficient in Hadoop (YARN, HDFS, MapReduce) and knowledgeable of its best practices.
  • Worked on scan remediation for unmasked NPI data, checked for false positives, and remediated unmasked NPI data by masking the vulnerability in both Snowflake and the Parquet files in the S3 bucket location.
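
The window-function presentation layer referenced above, sketched here with Spark's DataFrame window API in Java instead of plain SQL; the table and column names (transactions, account_id, txn_date, amount) are hypothetical.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.expressions.Window;
    import org.apache.spark.sql.expressions.WindowSpec;
    import static org.apache.spark.sql.functions.*;

    public class PresentationLayer {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("presentation-layer")
                    .enableHiveSupport()
                    .getOrCreate();

            // Hypothetical base table with one row per transaction.
            Dataset<Row> base = spark.table("transactions");

            // Window over each account, ordered by transaction date.
            WindowSpec byAccount = Window.partitionBy("account_id").orderBy(col("txn_date"));

            // Presentation view: running total and row number per account.
            Dataset<Row> presentation = base
                    .withColumn("running_total", sum(col("amount")).over(byAccount))
                    .withColumn("txn_seq", row_number().over(byAccount));

            presentation.createOrReplaceTempView("transactions_presentation");
            spark.stop();
        }
    }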

Java/Big Data Developer

Confidential, FL

Responsibilities:

  • Involved in gathering requirements and converting them into technical specification documents.
  • Work directly with Product Owners and customers to deliver data products in a collaborative and agile environment.
  • Migrated an entire Oracle database to BigQuery and used Power BI for reporting.
  • Collaborate in an agile scrum team with product owners and fellow software engineers to deliver upon most important business and technical priorities.
  • Build data pipelines in Airflow on GCP for ETL-related jobs using different Airflow operators.
  • Develop generic data frameworks and data products using Apache Spark and Scala to maintain high availability and performance while striving for simplicity.
  • Experience in GCP Dataproc, GCS, Cloud Functions, and BigQuery.
  • Used the Cloud Shell SDK in GCP to configure services such as Dataproc, Cloud Storage, and BigQuery.
  • Migrated existing on-premises Teradata and MS SQL Server tables into the Snowflake data warehouse.
  • Coordinated with the team and developed a framework to generate daily ad hoc reports and extracts of enterprise data from BigQuery.
  • Involved in performance tuning of Spark applications by setting the right batch interval, the correct level of parallelism, and appropriate memory settings.
  • Designed an application for carts that handles large orders from the system and built features such as network-loss handling, data management and recovery, and efficient control of multiple carts. Updated modules based on client functionality.
  • Created BigQuery authorized views for row-level security and for exposing data to other teams.
  • Good knowledge of using Cloud Shell for various tasks and deploying services.
  • Coordinated with the Data Science team in designing and implementing advanced analytical models over large datasets in the Hadoop cluster.
  • Wrote Hive SQL scripts to create complex tables with performance features such as partitioning, clustering, and skewing.
  • Expertise in designing and deploying Hadoop clusters and various Big Data analytics tools, including Pig, Hive, Sqoop, and Apache Spark, with the Cloudera distribution.
  • Experience in database design, data modeling and developing stored procedures, functions and triggers using SQL in Oracle and MySQL.
  • Downloaded BigQuery data into pandas or Spark data frames for advanced ETL capabilities (see the sketch after this list).
  • Worked with Google Data Catalog and other Google Cloud APIs for monitoring, query, and billing analysis of BigQuery usage.
  • Worked on creating a POC utilizing ML models and Cloud ML for table quality analysis for the batch process.
  • Used Spring Web Flow to navigate between pages.
  • Developed SQL queries, Stored Procedures, and functions for incorporating business logic.
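
A minimal sketch of pulling a BigQuery table into a Spark DataFrame, as mentioned in the BigQuery bullet above; it assumes the open-source spark-bigquery connector is on the classpath and uses a public sample table purely for illustration.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class BigQueryToSpark {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("bigquery-to-spark")
                    .getOrCreate();

            // Read a BigQuery table into a DataFrame via the spark-bigquery connector.
            Dataset<Row> df = spark.read()
                    .format("bigquery")
                    .option("table", "bigquery-public-data.samples.shakespeare")
                    .load();

            // Downstream ETL then uses ordinary DataFrame transformations.
            df.groupBy("corpus").count().show();

            spark.stop();
        }
    }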

Hadoop/Java Developer

Confidential, MO

Responsibilities:

  • Involved in gathering requirements and converting them into technical specification documents.
  • Work directly with Product Owners and customers to deliver data products in a collaborative and agile environment.
  • Collaborate in an agile scrum team with product owners and fellow software engineers to deliver upon most important business and technical priorities.
  • Develop generic data frameworks and data products using Apache Spark and Scala to maintain high availability and performance while striving for simplicity.
  • Worked with Amazon Elastic MapReduce (EMR) and set up environments on AWS EC2 Linux/Windows instances.
  • Optimized EMR workloads for different types of data loads by choosing the right compression, cluster type, instance type, storage type, and EMRFS to analyze data at low cost and with high scalability.
  • Created jobs to schedule batch processing and used AWS Lambda as a scheduler.
  • Migrated existing on-premises Teradata and MS SQL Server tables into the Snowflake data warehouse.
  • Imported data from AWS S3 and performed transformations using DataFrames and the Apache Spark SQL API for faster testing and processing of data.
  • Created Apache Spark programs for absolute data quality checks before loading data into Snowflake.
  • Involved in performance tuning of Spark applications by setting the right batch interval, the correct level of parallelism, and appropriate memory settings.
  • Designed an application for carts that handles large orders from the system and built features such as network-loss handling, data management and recovery, and efficient control of multiple carts. Updated modules based on client functionality.
  • Developed a messaging system using Kafka to push data to downstream teams (see the producer sketch after this list). Encrypted PII data using Voltage encryption to send the data in a secured format. The application was hosted in a cloud environment.
  • Designed an asynchronous socket communication application using C# for the Warehouse Management Control System and implemented an interface for parsing data messages using multithreading and delegates.
  • Experience in database design, data modeling and developing stored procedures, functions and triggers using SQL in Oracle and MySQL.
  • Developed and supported RESTful web services supporting JSON using Spring Web Services, JAX-RS, and the Spring MVC module.
  • Created Spring MVC components such as dispatcher servlets, handler mappings, request-mapping annotated controllers, and view resolvers.
  • Used Spring Web Flow to navigate between pages.
  • Developed SQL queries, Stored Procedures, and functions for incorporating business logic.
  • Developed the application using JSP and used JDBC for database connections.
  • Wrote EJBs, including session beans for database access, deployed on WebLogic Server.
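
A minimal sketch of the Kafka publishing step described above; the broker address and topic name are placeholders, and the Voltage encryption of PII is omitted.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class DownstreamPublisher {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");                // placeholder broker
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            // Push one record to a downstream topic; the real pipeline would encrypt PII first.
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("downstream-topic", "order-123", "{\"status\":\"shipped\"}"));
                producer.flush();
            }
        }
    }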

Java Developer

Confidential

Responsibilities:

  • Worked as a full-stack developer building web applications using Spring and REST-based web services providing OAuth authentication (see the controller sketch after this list).
  • Responsible for creating front-end applications and interactive UI web pages using web technologies such as HTML5, CSS3, JavaScript, JSON, and Bootstrap.
  • Worked with web APIs to serve HTTP requests.
  • Developed websites with cross-browser compatibility using HTML, CSS and jQuery.
  • Implemented the project structure based on the Spring MVC pattern using Spring Boot.
  • Created and maintained mapping files in Hibernate.
  • Configured web.xml and managed beans.
  • Integrated the JSF, Spring, and Hibernate frameworks.
  • Used Maven as a build tool.
  • Developed JUnit tests for the modules.
  • Debugged the application.
  • Used multithreading for faster and parallel processing of the files.
  • Collaborated with testers and developers and prepared test plans for producing high quality applications.
  • Deployed a .war file through which Spring Boot handles various requests from clients.
  • Implemented the complete Maven build lifecycle to achieve an organized application structure and conflict-free dependencies in the pom.xml file.
  • Tested the DAOs and services with JUnit test cases.
  • Wrote automation test cases using TestNG to test UI behavior.
  • Deployed applications into continuous integration environments such as Jenkins to integrate and deploy code for testing.
  • Used SVN to track and maintain the different versions of the project.
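
A minimal Spring Boot REST controller sketch of the kind of service described above; the resource name and fields are hypothetical, and the OAuth security configuration is omitted.

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RestController;

    @SpringBootApplication
    public class DemoApplication {
        public static void main(String[] args) {
            SpringApplication.run(DemoApplication.class, args);
        }
    }

    @RestController
    @RequestMapping("/api/orders")
    class OrderController {

        // Return a simple JSON payload for the requested (hypothetical) order id.
        @GetMapping("/{id}")
        public OrderDto getOrder(@PathVariable("id") long id) {
            return new OrderDto(id, "PENDING");
        }
    }

    // Minimal DTO serialized to JSON by Spring's message converters.
    class OrderDto {
        private final long id;
        private final String status;

        OrderDto(long id, String status) {
            this.id = id;
            this.status = status;
        }

        public long getId() { return id; }
        public String getStatus() { return status; }
    }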
