Hadoop/Scala Developer Resume
PA
SUMMARY:
- Nine years of professional IT experience in software development, including 3+ recent years working with Big Data technologies.
- Hands-on experience with major components of the Hadoop ecosystem, including HDFS, YARN, Hive, HBase, Pig, Sqoop, Flume, Spark, and Kafka.
- Set up standards and processes for Hadoop-based application design and development.
- Experience analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java; extended Hive and Pig core functionality with custom UDFs (see the sketch after this list).
- Good experience loading datasets from Teradata into HDFS using Sqoop.
- Experienced in developing MapReduce programs using Apache Hadoop to work with Big Data.
- Experience designing, developing, and implementing connectivity products that allow efficient exchange of data between our core database engine and the Hadoop ecosystem.
- Worked with the NoSQL database HBase to store large volumes of web logs.
- Experience importing and exporting data between HDFS and relational database systems/mainframes using Sqoop.
- Experienced in using Apache Spark with Scala and Spark SQL.
- Extensively used Spark to process terabytes of data and build data pipelines for different sub-applications.
- Used Kafka for streaming applications and processed the same data with Spark.
- Experience with both Spark 1.6 and Spark 2.0.
- Experience in application development using Java, RDBMS, and Linux shell scripting.
- Experience migrating data between HDFS and RDBMSs using Sqoop, and importing and exporting data with the streaming platforms Flume and Kafka.
- Designed and developed Oracle database/application monitoring scripts in Bash.
- Deployed cloud instances, with knowledge of provisioning S3 buckets, security groups, and the Hadoop ecosystem.
- Experience with data mining and business intelligence tools such as Tableau and QlikView.
- Diverse experience using Java tools in business, web, and client-server environments, including the Java Platform, J2EE, EJB, JSP, Java Servlets, Struts, and Java Database Connectivity (JDBC).
- Solid background in object-oriented analysis and design (OOAD); well versed in design patterns, UML, and Enterprise Application Integration (EAI).
- Major strengths: familiarity with multiple software systems; the ability to learn new technologies quickly and adapt to new environments; a self-motivated, focused team player and quick learner with excellent interpersonal, technical, and communication skills.
- Good communication skills, a strong work ethic, and the ability to work efficiently in a team, with good leadership skills.
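As an illustration of the custom-UDF work above: a minimal Hive UDF written against the classic `org.apache.hadoop.hive.ql.exec.UDF` API might look like the sketch below, shown in Scala for consistency with the rest of this resume (the Java version is structurally identical). The class name and the normalization logic are hypothetical; Pig UDFs follow a similar pattern against `org.apache.pig.EvalFunc`.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hypothetical UDF that normalizes free-form region codes before aggregation.
// Registered in Hive with:
//   ADD JAR udfs.jar;
//   CREATE TEMPORARY FUNCTION normalize_region AS 'com.example.udf.NormalizeRegion';
class NormalizeRegion extends UDF {
  // Hive resolves `evaluate` by reflection; null stays null so outer joins are safe.
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.trim.toUpperCase)
  }
}
```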
TECHNICAL SKILLS:
Big Data/Hadoop: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, and Spark.
Java Technologies: Core Java, I18N, JFC (Java Foundation Classes, for building graphical UIs), Swing, JavaBeans, Log4j, Reflection.
J2EE Technologies: Servlets, JSP (JavaServer Pages, for dynamic web pages), JDBC (Java Database Connectivity), JNDI (Java Naming and Directory Interface), JavaBeans
Methodologies: Agile, UML, Design Patterns (Core Java and J2EE)
Monitoring & Reporting: Ganglia, Nagios, custom shell scripts
Frameworks: Struts (MVC), Hibernate, Spring
Programming Languages: C, Java, Scala, Ant scripts, Linux shell scripts
Database: Oracle 11g/10g/9i, MySQL, MS-SQL Server, Netezza, Teradata.
Web Servers: WebLogic, WebSphere, Apache Tomcat
Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL
ETL Tools: Informatica, QlikView, Talend, and Cognos
Cloud Technologies: AWS, Azure
PROFESSIONAL EXPERIENCE:
Confidential, Pittsburgh, PA
Hadoop/Scala Developer
Responsibilities:
- Worked with the Spark Core, Spark Streaming, and Spark SQL modules.
- Developed scripts to perform business transformations on the data using Hive and Pig.
- Developed UDFs in Java for Hive and Pig; worked on reading multiple data formats on HDFS using Scala.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs (Resilient Distributed Datasets) and Scala (see the first sketch after this list).
- Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
- Analyzed SQL scripts and designed solutions to implement them using Scala.
- Performed data analysis using Pig, MapReduce, and Hive.
- Designed and developed the data ingestion component.
- Used ZooKeeper for cluster coordination services.
- Imported data from Oracle into HDFS using Sqoop.
- Imported and exported data between HDFS and the relational database Teradata using Sqoop.
- Developed a POC on Apache Spark and Kafka (see the streaming sketch after this list).
- Implemented Flume, Spark, and Spark Streaming frameworks for real-time data processing.
- Hands-on experience installing, configuring, and using ecosystem components such as Hadoop MapReduce, HDFS, HBase, Pig, Flume, Hive, and Sqoop.
- Developed an analytical component using Scala, Spark, and Spark Streaming.
- Automated the entire CI/CD pipeline using scripts, Git, and Jenkins.
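As a sketch of the Hive-to-Spark conversion work above, the example below expresses one aggregation both through Spark SQL and as RDD transformations. The table and column names are made up, and it assumes a Spark 2.x SparkSession with Hive support enabled.

```scala
import org.apache.spark.sql.SparkSession

object HiveToSparkExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-to-spark")
      .enableHiveSupport()   // lets spark.sql() and spark.table() see the Hive metastore
      .getOrCreate()

    // Original HiveQL, run as-is through Spark SQL:
    //   SELECT region, SUM(amount) FROM sales GROUP BY region
    val viaSql = spark.sql("SELECT region, SUM(amount) AS total FROM sales GROUP BY region")

    // The same aggregation expressed as RDD transformations:
    import spark.implicits._
    val viaRdd = spark.table("sales")
      .select($"region", $"amount").as[(String, Double)]
      .rdd
      .reduceByKey(_ + _)   // sum amounts per region

    viaSql.show()
    viaRdd.take(10).foreach(println)
    spark.stop()
  }
}
```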
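The Spark-and-Kafka POC could take a shape like the following sketch, which uses the direct-stream API from the spark-streaming-kafka-0-10 integration to count events per micro-batch. The broker address, topic, and group id are placeholders.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaStreamPoc {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-stream-poc")
    val ssc = new StreamingContext(conf, Seconds(10)) // 10-second micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker:9092",          // placeholder broker
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "poc-consumer",                  // placeholder group id
      "auto.offset.reset" -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    // Count records per micro-batch; a real job would parse and persist them.
    stream.map(_.value).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```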
Environment: Java, Scala, Python, J2EE, Hadoop, Spark, HBase, Hive, Pig, Sqoop, MySQL, Teradata, GitHub, AWS.
Confidential, Reston, VA
Hadoop Developer
Responsibilities:
- Gathered the business requirements from the Business Partners and Subject Matter Experts.
- Involved in installing Hadoop Ecosystem components.
- Developed a data pipeline using Flume, Sqoop, Pig, Java MapReduce, and Spark to ingest customer behavioral data and purchase histories into HDFS for analysis.
- Responsible for importing data from sources such as MySQL and Teradata databases into HDFS using Sqoop, saving it in Avro, JSON, and ORC file formats.
- Involved in HDFS maintenance and loading of structured and unstructured data.
- Developed and ran MapReduce jobs on multi-petabyte YARN/Hadoop clusters that process billions of events every day, generating daily and monthly reports per user needs.
- Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
- Managed and reviewed Hadoop log files, and was responsible for managing data coming from different sources.
- Supported MapReduce programs running on the cluster.
- Involved in importing data from sources such as Teradata into HDFS using Sqoop, applying various transformations with Hive and Spark, and then loading the data into Hive tables (see the first sketch after this list).
- Wrote MapReduce jobs using the Java API.
- Delivered tuned, efficient, and error-free code for new Big Data requirements, drawing on technical knowledge of Hadoop and its ecosystem.
- Installed and configured Pig and wrote Pig Latin scripts.
- Imported data from MySQL and Oracle into HDFS on a regular basis using Sqoop.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Wrote Hive queries for data analysis to meet business requirements.
- Created Hive tables and worked with them using HiveQL.
- Utilized Agile Scrum Methodology to help manage and organize a team of 4 developers with regular code review sessions.
- Used JUnit for unit testing and Continuum for integration testing.
- Developed a Spark pipeline to transfer data from the data lake to Cassandra in the cloud, making the data available for the decision engine to publish customized offers in real time (see the second sketch after this list).
- Automated the entire CI/CD using Scripts, Git and Jenkins.
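A sketch of the Sqoop-then-Spark flow described above: assuming Sqoop has already landed a Teradata extract on HDFS as ORC (the path is a placeholder), Spark applies the transformations and loads the result into a Hive table. Column and table names are hypothetical.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.functions._

object SqoopLandingToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("landing-to-hive")
      .enableHiveSupport()
      .getOrCreate()

    // ORC files written by a prior Sqoop import (placeholder path).
    val raw = spark.read.orc("/data/landing/teradata/orders")

    // Example transformations: drop bad rows, derive a partition column.
    val cleaned = raw
      .filter(col("order_id").isNotNull)
      .withColumn("order_date", to_date(col("order_ts")))

    // Load into a partitioned Hive table.
    cleaned.write
      .mode(SaveMode.Overwrite)
      .partitionBy("order_date")
      .saveAsTable("analytics.orders")

    spark.stop()
  }
}
```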
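The lake-to-Cassandra pipeline might be sketched as below, assuming the DataStax spark-cassandra-connector is on the classpath; the host, keyspace, table, and input path are all placeholders.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object LakeToCassandra {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("lake-to-cassandra")
      .config("spark.cassandra.connection.host", "cassandra-host") // placeholder
      .getOrCreate()

    // Customized offers computed upstream and stored in the lake (placeholder path).
    val offers = spark.read.parquet("/data/lake/offers")

    // Write to Cassandra through the connector's DataFrame source; the decision
    // engine reads this table to publish offers in (near) real time.
    offers.write
      .format("org.apache.spark.sql.cassandra")
      .option("keyspace", "decisions")
      .option("table", "customer_offers")
      .mode(SaveMode.Append)
      .save()

    spark.stop()
  }
}
```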
Environment: Cloudera, Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Java, Red Hat Linux, XML, MySQL, Eclipse, JUnit
Confidential
Java Developer
Responsibilities:
- Actively participated in different phases of the Software Development Life Cycle (Agile) and analyzed use case and class diagrams based on requirements.
- Created the presentation layer using JSP, HTML, and Struts tag libraries.
- Configured the front end to the server side using the Struts configuration file.
- Validated user data using Struts Action Forms and processed user requests using Action classes.
- Wrote a Hibernate mapping file for each Java object and configured it with its respective table in the Hibernate configuration.
- Designed and developed GemFire Spring repositories with well-designed domain objects.
- Created test cases using JUnit and Mockito.
- Designed and implemented Spring Security for the application, authenticating against LDAP, J2EE pre-authentication, and a database.
- Wrote HQL queries to communicate with the Oracle database.
- Developed DAOs (Data Access Objects) and performed O/R mapping using Hibernate to access the database.
- Implemented Log4j for application logging.
- Performed unit and integration testing; resolved issues during production and application support; handled project maintenance and deployments with patches; and documented the project.
- Used Rational ClearCase for version control, versioning and synchronizing the project code.
Environment: Java, J2EE, Struts 1.2, Hibernate, Oracle 9i, WebSphere 5.0, JavaScript, RAD 6.0, Rational ClearCase.
Confidential
Software Developer
Responsibilities:
- Worked as a developer on the product client team across a variety of platforms, including Windows, UNIX, and Linux distributions, using C.
- Performed unit testing for existing interfaces.
- Worked with a team of developers to analyze project requirements and add functionality to existing C applications and SQL databases, such as reading in claims in different formats from different sources, checking for errors, and converting them into a standard format.
- Read existing C source code to determine current programming logic.
- Worked with the SQL database, including making enhancements to stored procedures.
- Responsible for fixing problems wherever they arose: product functionality; reliability and performance of product installation and updates; network protocols.
- Worked with the QA team on new feature testing and bug fixes.
Environment: C, SQL, Windows, Unix, Linux.
