Big Data/Spark/Scala Developer Resume
Corvallis, OR
SUMMARY:
- Currently working in a Big Data capacity with the Hadoop ecosystem across internal and cloud-based platforms.
- Over 8 years of experience as a Big Data/Hadoop and Java developer, with skills in analyzing, designing, developing, testing and deploying software applications.
- Experience importing and exporting data between HDFS and relational database systems using Sqoop.
- Good knowledge of using Hibernate to map Java classes to database tables and of Hibernate Query Language (HQL).
- Hands-on experience configuring Flume to load data from multiple sources directly into HDFS.
- Good experience developing MapReduce jobs in Java/J2EE for data cleansing, transformation, pre-processing and analysis.
- Good knowledge of Amazon Web Services (AWS) concepts such as EMR and EC2, which provide fast and efficient processing for Teradata Big Data analytics.
- Experience collecting log data and JSON data into HDFS using Flume and processing it with Hive/Pig.
- Extensive experience developing Spark Streaming jobs built on RDDs (Resilient Distributed Datasets), using Spark SQL as required.
- Experience developing Java MapReduce jobs for data cleaning and manipulation as required by the business.
- Strong knowledge of the Hadoop ecosystem, including HDFS, Hive, Oozie, HBase, Pig, Sqoop and Zookeeper.
- Extensive experience with advanced J2EE frameworks such as Spring, Struts, JSF and Hibernate.
- Expertise in JavaScript, JavaScript MVC patterns, object-oriented JavaScript design patterns and AJAX calls.
- Installation, configuration and administration experience with Big Data platforms: Cloudera Manager (Cloudera) and MCS (MapR).
- Experience working with Hortonworks and Cloudera environments.
- Good knowledge of implementing data processing techniques with Apache HBase to handle and format data as required.
- Excellent experience installing and running Oozie workflows and automating parallel job execution.
- Experience with Spark, Spark SQL, Spark Streaming, Spark GraphX and Spark MLlib.
- Extensive development experience in IDEs such as Eclipse, NetBeans, IntelliJ and STS.
- Strong experience with core SQL and RESTful web services.
- Strong knowledge of NoSQL column-oriented databases such as HBase and their integration with a Hadoop cluster.
- Good experience with Tableau for data visualization and analysis of large datasets, drawing various conclusions.
- Experience using Python and R for statistical analysis.
- Good knowledge of coding with SQL, SQL*Plus, T-SQL, PL/SQL and stored procedures/functions.
- Worked with Bootstrap, AngularJS, NodeJS, Knockout, Ember and the Java Persistence API (JPA).
- Well versed in relational database management systems such as Oracle 12c, MS SQL Server and MySQL.
- Experience with all stages of the SDLC and the Agile development model, from requirement gathering through deployment and production support.
- Experience using PL/SQL to write stored procedures, functions and triggers.
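The Spark Streaming and Spark SQL experience above can be illustrated with a minimal Scala sketch; the socket source, port and table name are illustrative assumptions, not taken from any actual project:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: a Spark Streaming job that reads text records,
// drops empty lines (basic cleansing), and queries each micro-batch
// with Spark SQL via a temporary view.
object StreamingSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StreamingSketch").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(10))
    val lines = ssc.socketTextStream("localhost", 9999)  // assumed source

    lines.foreachRDD { rdd =>
      val spark = SparkSession.builder
        .config(rdd.sparkContext.getConf)
        .getOrCreate()
      import spark.implicits._
      val df = rdd.filter(_.nonEmpty).toDF("raw")        // cleanse, then convert RDD -> DataFrame
      df.createOrReplaceTempView("events")
      spark.sql("SELECT count(*) AS n FROM events").show()
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```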
TECHNICAL SKILLS:
Big Data Ecosystems: Hadoop, Pig, Hive, Sqoop, Oozie, Spark, Spark SQL, MapReduce, HDFS
Languages and Frameworks: Python (preferred), Java, HiveQL, SQL, C, Scala
Database and Web Technologies: MySQL, Oracle 11g, PL/SQL, ESQL, Hadoop MapReduce
Operating Systems: Linux, Windows
Machine Learning Libraries: scikit-learn, NumPy, matplotlib, NetworkX, LibSVM
Tools: MATLAB, Weka, PyCharm, Confidential Integration Bus, WebSphere MQ, Confidential Data Studio, Eclipse, Maven, NetBeans
PROFESSIONAL EXPERIENCE:
Confidential, Corvallis, OR
Big Data/Spark/Scala Developer
Responsibilities:
- Communicated with clients during the development phase.
- Imported and exported data between HDFS and Hive using Sqoop.
- Defined multiple data validation rules and created the corresponding Hive queries.
- Loaded data into Hive managed tables using partitions and buckets.
- Built data pipelines using Pig and Java/Scala MapReduce to store data on HDFS.
- Validated test cases using Spark SQL.
- Developed Scala and Spark applications that execute Hive queries through HiveContext in Spark, processing data faster than standard MapReduce programs.
- Developed Oozie workflows for Spark jobs.
- Migrated MapReduce programs to Spark transformations using Spark and Scala.
- Supported data analysis by running Pig and Hive queries.
- Good knowledge of writing Hive UDFs and of partitioning and bucketing.
- Imported and exported data between MySQL/Oracle and Hive using Sqoop.
- Developed Spark applications in Scala, using DataFrames and the Spark SQL API for faster data processing.
- Defined the data flow within the Hadoop ecosystem and directed the team in implementing it.
- Managed and reviewed Hadoop log files.
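Running Hive queries from Spark, as described above, can be sketched as follows; the database, table and column names are illustrative assumptions (modern Spark exposes Hive through a Hive-enabled SparkSession, which replaced the older HiveContext entry point):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: executing a Hive query through Spark SQL and
// writing the result back as a Hive managed table.
val spark = SparkSession.builder
  .appName("HiveOnSpark")
  .enableHiveSupport()   // gives Spark access to the Hive metastore
  .getOrCreate()

val sales = spark.sql(
  """SELECT region, SUM(amount) AS total
    |FROM managed_db.sales
    |WHERE dt = '2024-01-01'
    |GROUP BY region""".stripMargin)

sales.write.mode("overwrite").saveAsTable("managed_db.sales_by_region")
```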
Environment: Hadoop, Hive, Zookeeper, MapReduce, Sqoop, Pig, HDFS, Spark, Scala, Flume, DB2, HBase.
Confidential, Boca Raton, FL
Big Data/Hadoop Developer
Responsibilities:
- Analyzed the Hadoop cluster using Big Data analytic tools including Flume, Sqoop, Spark, Pig, Hive and MapReduce.
- Loaded data from the Linux file system into HDFS.
- Imported and exported data between HDFS and Hive using Sqoop.
- Processed unstructured data using Pig and Hive.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Implemented partitioning, dynamic partitions and buckets in Hive.
- Developed Hive queries, Pig scripts and Spark SQL queries to analyze large datasets.
- Exported result sets from Hive to MySQL using Sqoop.
- Developed Spark code in Scala and Spark SQL for faster processing and testing.
- Developed ETL processes using Spark, Scala, Hive and HBase.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Debugged and performance-tuned Hive and Pig jobs.
- Managed and reviewed Hadoop log files.
- Scheduled multiple Hive and Pig jobs with the Oozie workflow engine.
- Used HBase as a NoSQL database.
- Actively involved in code review and bug fixing to improve performance.
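The Hive partitioning and bucketing work above follows a standard DDL pattern, sketched here from Spark with Hive support; the schema and table names are illustrative assumptions:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: a partitioned, bucketed Hive managed table,
// loaded with dynamic partitioning.
val spark = SparkSession.builder
  .appName("HiveDDLSketch")
  .enableHiveSupport()
  .getOrCreate()

spark.sql("""
  CREATE TABLE IF NOT EXISTS logs.events (
    user_id BIGINT,
    action  STRING
  )
  PARTITIONED BY (dt STRING)
  CLUSTERED BY (user_id) INTO 32 BUCKETS
  STORED AS ORC
""")

// Dynamic partitioning lets Hive derive each dt partition from the data
// instead of requiring one INSERT per partition.
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
spark.sql("""
  INSERT OVERWRITE TABLE logs.events PARTITION (dt)
  SELECT user_id, action, dt FROM logs.raw_events
""")
```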
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Spark, Scala, Flume, Linux, HBase, Java
Confidential, Columbus, IN
Hadoop/Big Data Developer
Responsibilities:
Confidential
Software Developer
Responsibilities:
- Developed web services in Confidential Integration Bus (WebSphere Message Broker v9.0) and integrated them with back-end systems of record and Confidential Business Process Manager (BPM).
- Performed unit testing of the services and provided support during integration testing.
- Provided 24x7 on-call production support and troubleshot problems related to WebSphere Application Server.
- Worked closely with the application team to resolve application-related issues.
- Good team player with excellent communication skills; self-starter and self-motivated.
- Conducted knowledge-transfer and technical sessions within the team.
- Won an ‘Eminence and Excellence Orion Award’ for being a valuable team player and effectively collaborating with clients and team members.