
Sr. Java/Big Data Developer Resume

NYC, NY


  • 7+ years of experience in Information Technology, including Big Data, the Hadoop ecosystem, and Core Java/J2EE, with strengths in design, software processes, requirement gathering, analysis, and development of software applications.
  • Excellent hands-on experience developing Hadoop architecture on Windows and Linux platforms.
  • Experience building big data solutions with Lambda Architecture using the Cloudera distribution of Hadoop, MapReduce, Cascading, Hive, Pig, and Sqoop.
  • Strong development experience in Java/JDK 7, JEE 6, Maven, Jenkins, Jersey, Servlets, JSP, Struts, Spring, Hibernate, JDBC, JavaBeans, JMS, JNDI, XML, XML Schema, Web Services, SOAP, JUnit, Ant, and Log4j.
  • Experienced in J2EE design patterns such as MVC, Business Delegate, Service Locator, Singleton, Transfer Object, Session Façade, and Data Access Object.
  • Worked on Hadoop, Hive, Java, Python, Scala, and the Struts web framework.
  • Excellent working experience in Big Data integration and analytics based on Hadoop, Solr, Spark, Kafka, Storm, and webMethods technologies.
  • Developed Python code to gather data from HBase and designed the solution for implementation in PySpark.
  • Experienced in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
  • Worked on Google Cloud Platform (GCP) services such as Vision API and Instances.
  • Hands-on experience with NoSQL databases including HBase, MongoDB, and Cassandra, and their integration with Hadoop clusters.
  • Strong knowledge of and experience implementing Big Data on Amazon Elastic MapReduce (Amazon EMR) for processing and managing the Hadoop framework on dynamically scalable Amazon EC2 instances.
  • Hands-on experience writing ad hoc queries for moving data from HDFS to Hive and analyzing the data using HiveQL.
  • Good understanding of cloud-based technologies such as GCP and AWS.
  • Hands-on experience with Snowflake and GCP.
  • Good knowledge of RDBMS concepts (Oracle 11g, MS SQL Server 2000) and strong SQL and PL/SQL query writing skills (using TOAD and SQL Developer tools), including stored procedures and triggers.
  • Expertise in Amazon Web Services including Elastic Compute Cloud (EC2) and DynamoDB.
  • Expertise in automating deployment of large Cassandra clusters on EC2 using EC2 APIs.
  • Experienced in development and utilization of Apache Solr with data computations and transformation for use by downstream online applications.
  • Excellent knowledge of databases such as Oracle 8i/9i/10g/11g/12c, Microsoft SQL Server, DB2, and Netezza.
  • Good understanding of and experience with software development methodologies such as Agile and Waterfall.
  • Experienced in importing and exporting data using Sqoop from HDFS (Hive & HBase) to relational database systems (Oracle & Teradata) and vice versa.
  • Experienced in designing and developing web services (SOAP and RESTful web services).
  • Expertise in various Java/J2EE technologies such as JSP, Servlets, Hibernate, Struts, and Spring.


Confidential, NYC, NY

Sr. Java/Big Data Developer


  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
  • Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • As a Big Data Developer, implemented solutions for ingesting data from various sources and processing data at rest using Big Data technologies such as Hadoop, MapReduce frameworks, and MongoDB.
  • Developed a job server (REST API, Spring Boot, Oracle DB) and job shell for job submission, job profile storage, and job data (HDFS) query/monitoring.
  • Developed PySpark and Spark SQL code to process data in Apache Spark on Amazon EMR, performing the necessary transformations based on the STMs developed.
  • Created custom UDFs in Java to overcome Hive limitations on Cloudera CDH5.
  • Explored Spark for improving the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Deployed the application to AWS and monitored load balancing across different EC2 instances.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from SQL into HDFS using Sqoop.
  • Installed Hadoop, MapReduce, and HDFS and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Developed a POC for project migration from an on-prem Hadoop MapR system to GCP/Snowflake.
  • Worked on implementing Spark Framework, a Java-based web framework.
  • Worked on Big Data integration and analytics based on Hadoop, Solr, Spark, Kafka, Storm, and webMethods.
  • Worked extensively in Python, built the custom ingest framework, and worked on a REST API using Python.
  • Developed Kafka producers and consumers, and Spark and Hadoop MapReduce jobs.
  • Imported data from different sources such as HDFS/HBase into Spark RDDs.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Strongly recommended adopting Elasticsearch and was responsible for its installation, configuration, and administration.
  • Created Elastic MapReduce (EMR) clusters, configured the data pipeline with EMR clusters for scheduling the task runner, and provisioned EC2 instances on both Windows and Linux.
  • Worked on AWS Relational Database Service and AWS Security Groups and their rules, and implemented reporting and notification services using the AWS API.
  • Analyzed the SQL scripts and designed the solution for implementation in PySpark.
  • Implemented AWS EC2, Key Pairs, Security Groups, Auto Scaling, ELB, SQS, and SNS using the AWS API and exposed them as RESTful web services.
  • Involved in converting MapReduce programs into Spark transformations using Spark RDDs in Scala.
  • Developed Spark scripts using Scala shell commands as per requirements.
  • Implemented solutions using Scala and SQL for faster testing and processing of data, and streamed data in real time using Kafka.
  • Developed and designed an automation framework using Python and shell scripting.
  • Involved in writing a Java API for AWS Lambda to manage some of the AWS services.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Developed Hive scripts, Pig scripts, and UNIX shell scripts for all ETL loading processes and for converting files to Parquet in the Hadoop file system.
  • Developed Apache Pig scripts and Hive scripts to process HDFS data.
  • Used Hive to find correlations between customers' browser logs across different sites and analyzed them to build risk profiles for those sites.
  • Utilized Agile Scrum methodology to help manage and organize a team of 4 developers with regular code review sessions.

Confidential, Nashville, TN

Java/Hadoop Developer


  • Worked on analyzing Hadoop clusters using different big data analytic tools including Kafka, Pig, Hive, and MapReduce.
  • Proactively monitored systems and services, and worked on architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala.
  • Installed and configured Hadoop, MapReduce, and HDFS (Hadoop Distributed File System), and developed multiple MapReduce jobs in Java for data cleaning and processing.
  • Designed and configured Flume servers to collect data from the network proxy servers and store it in HDFS and HBase.
  • Worked on implementing Spark using Scala and Spark SQL for faster analysis and processing of data.
  • Used Java and MySQL day to day to debug and fix issues with client processes.
  • Used Java/J2EE application development skills with object-oriented analysis and was extensively involved throughout the Software Development Life Cycle (SDLC).
  • Implemented AWS EC2, Key Pairs, Security Groups, Auto Scaling, ELB, SQS, and SNS using the AWS API and exposed them as RESTful web services.
  • Monitored Azkaban jobs on-prem (Hortonworks distribution) and on GCP (Google Cloud Platform).
  • Involved in launching and setting up Hadoop/HBase clusters, including configuring the different components of the Hadoop and HBase clusters.
  • Hands-on experience with WebLogic Application Server, WebSphere Application Server, WebSphere Portal Server, and J2EE application deployment technology.
  • Handled importing and exporting data into HDFS and Hive using Sqoop and Kafka.
  • Involved in creating Hive tables, loading the data, and writing Hive queries, which run internally as MapReduce jobs.
  • Applied MapReduce framework jobs in Java for data processing by installing and configuring Hadoop and HDFS.
  • Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and existing data in HDFS.
  • Developed Spark applications in Python (PySpark) in a distributed environment to load a large number of CSV files with different schemas into Hive ORC tables.
  • Worked on reading and writing multiple data formats such as JSON, ORC, and Parquet on HDFS using PySpark.
  • Involved in HDFS maintenance and accessing it through the web UI and the Hadoop Java API.
  • Implemented reporting and notification services using the AWS API and used AWS (Amazon Web Services) compute servers extensively.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
  • Worked on designing and developing ETL workflows in Java for processing data in HDFS/HBase using Oozie.
  • Wrote complex Hive queries and UDFs.
  • Created snapshots of EBS volumes, monitored AWS EC2 instances using CloudWatch, and worked on AWS Security Groups and their rules.
  • Involved in developing shell scripts to ease execution of all other scripts (Pig, Hive, and MapReduce) and move data files within and outside of HDFS.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala.
  • Worked with NoSQL databases such as HBase, creating tables to load large sets of semi-structured data.
  • Built Java APIs for retrieval and analysis on NoSQL databases such as HBase.
  • Created ETL jobs to generate and distribute reports from a MySQL database using Pentaho Data Integration.
  • Worked on loading data from the UNIX file system to HDFS.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.

Confidential, New Jersey

Big Data Analyst


  • Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, and HBase.
  • Involved in writing client-side scripts using JavaScript and server-side scripts using JavaBeans, and used servlets for handling the business logic.
  • Created Elastic MapReduce (EMR) clusters and configured the data pipeline with EMR clusters for scheduling the task runner.
  • Developed Scala programs with Spark for data in the Hadoop ecosystem.
  • Extensively involved in installation and configuration of the Cloudera distribution of Hadoop 2, 3, NameNode, Secondary NameNode, JobTracker, TaskTrackers, and DataNodes.
  • Developed additional user-based web services (SOAP) through WSDL using WebLogic application server and JAXB as the binding framework to interact with other components.
  • Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes; communicated and escalated issues appropriately.
  • Provisioned EC2 instances on both Windows and Linux and worked on AWS Relational Database Service and AWS Security Groups and their rules.
  • Implemented reporting and notification services using the AWS API.
  • Developed MapReduce jobs using Apache Commons components.
  • Used Service-Oriented Architecture (SOA) based SOAP and REST web services (JAX-RS) for integration with other systems.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
  • Involved in designing and developing the application using JSTL, JSP, JavaScript, AJAX, HTML, CSS, and collections.
  • Implemented AWS EC2, Key Pairs, Security Groups, Auto Scaling, ELB, SQS, and SNS using the AWS API and exposed them as RESTful web services.
  • Created HBase tables to load large sets of structured, semi-structured, and unstructured data coming from UNIX, NoSQL, and a variety of portfolios.
  • Solved performance issues in Hive and Pig scripts through an understanding of joins, grouping, and aggregation and how they translate to MapReduce jobs.
  • Developed UDFs in Java as and when necessary for use in Pig and Hive queries.
  • Coordinated with various stakeholders such as the End Client, DBA Teams, Testing Team and Business Analysts.
  • Developed Java web applications using JSP, Servlets, Struts, Hibernate, Spring, REST web services, and SOAP.
  • Involved in gathering requirements and developing a project plan.
  • Involved in understanding requirements, functional specifications, design documentation, and testing strategies.
  • Involved in UI designing, Coding, Database Handling.
  • Involved in unit testing and bug fixing.
  • Worked across the entire Software Development Life Cycle (SDLC), both as part of a team and independently.
  • Wrote SQL queries to query the database and provided data extracts to users on request.


Java/Scala Developer


  • Developed the web tier using the Spring MVC framework.
  • Performed database operations on the consumer portal using Spring JdbcTemplate.
  • Implemented design patterns in Scala for the application.
  • Set up infrastructure, including implementing, configuring, and externalizing HTTPD, mod_jk, mod_rewrite, mod_proxy, JNDI, SSL, etc.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
  • Implemented RESTful services in Spring.
  • Serialized and deserialized objects using the Play JSON library.
  • Developed traits, case classes, etc. in Scala.
  • Developed quality code adhering to Scala coding standards and best practices.
  • Wrote complex SQL queries.
  • Developed the GUI using jQuery, JSON, and JavaScript.
  • Performed unit testing, integration testing, and bug fixing.


Data Analyst


  • Conducted thorough study to establish the relationship between various functionalities and worked on the change booking scenario.
  • Developed entity-relationship diagrams for the entire change booking functionality using MS Visio.
  • Executed several complex SQL queries to access and update data from different databases holding huge amounts of data.
  • Visualized the data using Tableau to present data according to customer requirements.
  • Communicated the findings of the analysis to the client and the team.
  • Helped development team in understanding the requirements of the client.
  • Participated in all scrum meetings and immediately addressed the issues and concerns raised by the client.



  • Developed a New Distribution Capability migration project to enhance usability.
  • The migration project was implemented for all the functionalities in Java language using Spring framework.
  • Ensured all the scenarios passed by writing JUnit tests for the code.
  • Deployed in the local environment and tested in SoapUI by writing XML code.
  • Committed the code to Jenkins.
  • Increased annual revenue by 10% in 6 months.
  • Delivered three major change requirements within 2 months through proper planning and execution of tasks as module lead for the team.
  • Promoted the project to production with zero defects by coordinating with 7 different teams.

Test Analyst


  • Understood and analyzed client requirements to prepare Traceability Matrix, Test Plans, Test Cases and Test Report that impacted the project deliverables.
  • Performed intensive testing with different test cases for a particular scenario to assure quality of deliverables.
  • Identified different bugs and provided a detailed analysis for each bug, which helped the development team resolve bugs faster.
  • Performed various analyses to gain insights for data-driven decision-making on numerous automation projects to identify feasibility and to optimize business processes.
  • Completed various levels of functional (using XML service requests and responses) and non-functional testing, and assisted the development team in fixing bugs.
