- Big Data developer with over 8 years of professional IT experience, including 4 years in the field of Big Data.
- Extensive experience working with various distributions of Hadoop, including enterprise versions of Cloudera and Hortonworks, with good knowledge of the MapR distribution and Amazon EMR.
- In-depth experience using various Hadoop ecosystem tools such as HDFS, MapReduce, YARN, Pig, Hive, Sqoop, Spark, Storm, Kafka, Oozie, Elasticsearch, HBase, and ZooKeeper.
- Extensive knowledge of Hadoop architecture and its components.
- Good knowledge of installing, configuring, monitoring, and troubleshooting Hadoop clusters and their ecosystem components.
- Exposure to Data Lake Implementation using Apache Spark.
- Developed data pipelines and applied business logic using Spark.
- Well-versed in Spark components such as Spark SQL, MLlib, Spark Streaming, and GraphX.
- Extensively worked on Spark Streaming and Apache Kafka to fetch live stream data.
- Used Scala and Python to convert Hive/SQL queries into RDD transformations in Apache Spark.
- Experience in integrating Hive queries into Spark environment using Spark SQL.
- Expertise in performing real time analytics on big data using HBase and Cassandra.
- Handled importing data from RDBMS into HDFS using Sqoop and vice-versa.
- Extensive experience in importing and exporting streaming data into HDFS using stream processing platforms like Flume and Kafka.
- Experience in developing data pipeline using Pig, Sqoop, and Flume to extract the data from weblogs and store in HDFS.
- Created User Defined Functions (UDFs) and User Defined Aggregate Functions (UDAFs) in Pig and Hive.
- Hands-on experience with tools like Oozie and Airflow to orchestrate jobs.
- Proficient in NoSQL databases including HBase, Cassandra, and MongoDB, and their integration with Hadoop clusters.
- Expertise in Cluster management and configuring Cassandra Database.
- Great familiarity with creating Hive tables, Hive joins, and HQL for querying databases, up to writing complex Hive UDFs.
- Accomplished developing Pig Latin Scripts and using Hive Query Language for data analytics.
- Worked on different compression codecs (LZO, Snappy, Gzip) and file formats (ORC, Avro, TextFile, Parquet).
- Experience in practical implementation of cloud-specific AWS technologies including IAM and Amazon cloud services such as Elastic Compute Cloud (EC2), ElastiCache, Simple Storage Service (S3), CloudFormation, Virtual Private Cloud (VPC), Route 53, Lambda, and EBS.
- Built AWS secured solutions by creating VPC with public and private subnets.
- Worked on data warehousing and ETL tools like Informatica, Talend, and Pentaho.
- Expertise working with Java/J2EE, JDBC, ODBC, JSP, Eclipse, JavaBeans, EJB, and Servlets.
- Developed web page interfaces using JSP, Java Swings, and HTML scripting languages.
- Experience working with Spring and Hibernate frameworks for JAVA.
- Worked with various programming languages using IDEs such as Eclipse, NetBeans, and IntelliJ.
- Excelled in using version control tools like PVCS, SVN, VSS and GIT.
- Development experience in DBMS like Oracle, MS SQL Server, Teradata, and MYSQL.
- Developed stored procedures and queries using PL/SQL.
- Experience with best practices of web services development and integration (both REST and SOAP).
- Experienced in using build tools like Ant, Gradle, SBT, Maven to build and deploy applications into the server.
- Knowledge of Unified Modeling Language (UML) and expertise in Object-Oriented Analysis and Design (OOAD).
- Experience with the complete Software Development Life Cycle (SDLC) in both Waterfall and Agile methodologies.
- Knowledge of creating dashboards and data visualizations using Tableau to provide business insights.
- Excellent communication, interpersonal, and problem-solving skills; a strong team player with a can-do attitude and the ability to communicate effectively at all levels of the organization, including technical staff, management, and customers.
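The Hive/SQL-to-RDD conversion work noted above can be illustrated with a minimal sketch. The example below simulates Spark's map/reduceByKey pattern in plain Python so it runs without a cluster; the schema and query are hypothetical, not from any specific project.

```python
from collections import defaultdict

# Hive query being translated (illustrative):
#   SELECT category, SUM(amount) FROM sales GROUP BY category
def sum_by_category(rows):
    """Simulate Spark's map + reduceByKey over (category, amount) records."""
    # map: each row -> (key, value) pair
    pairs = [(row["category"], row["amount"]) for row in rows]
    # reduceByKey: combine all values that share a key
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

sales = [
    {"category": "books", "amount": 12.0},
    {"category": "games", "amount": 30.0},
    {"category": "books", "amount": 8.0},
]
print(sum_by_category(sales))  # {'books': 20.0, 'games': 30.0}
```

In actual Spark the pair list would be an RDD and the loop would be `rdd.map(...).reduceByKey(_ + _)`; the shape of the computation is the same.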
Languages/Tools: Java, C, C++, C#, Scala, VB, XML, HTML/XHTML, HDML, DHTML.
Big Data: HDFS, MapReduce, HIVE, PIG, HBase, SQOOP, Oozie, Zookeeper, Spark, Mahout, Kafka, Storm, Cassandra, Solr, Impala, Greenplum, MongoDB
J2EE Standards: JDBC, JNDI, JMS, Java Mail & XML Deployment Descriptors.
Web/Distributed Technologies: J2EE, Servlets 2.1/2.2, JSP 2.0, Struts 1.1, Hibernate 3.0, JSF, JSTL 1.1, EJB 1.1/2.0, RMI, JNI, XML, JAXP, XSL, XSLT, UML, MVC, Spring 2.0, CORBA, Java Threads.
Operating System: Windows 95/98/NT/2000/XP, MS-DOS, UNIX, multiple flavors of Linux.
Databases / NoSQL: Oracle 10g, MS SQL Server 2000, DB2, MS Access, MySQL, Teradata, Cassandra, Greenplum, and MongoDB.
Browser Languages: HTML, XHTML, CSS, XML, XSL, XSD, XSLT.
Browser Scripting: JavaScript, HTML DOM, DHTML, AJAX.
App/Web Servers: IBM WebSphere 5.1.2/5.0/4.0/3.5, BEA WebLogic 5.1/7.0, JDeveloper, Apache Tomcat, JBoss.
GUI Environment: Swing, AWT, Applets.
Messaging & Web Services Technology: SOAP, WSDL, UDDI, XML, SOA, JAX-RPC, IBM WebSphere MQ v5.3, JMS.
Networking Protocols: HTTP, HTTPS, FTP, UDP, TCP/IP, SNMP, SMTP, POP3.
Testing & Case Tools: JUnit, Log4j, Rational ClearCase, CVS, Ant, Maven, JBuilder.
Version Control Systems: Git, SVN, CVS
Confidential - Ohio
Sr. Big Data/Scala Developer
- Involved in analyzing business requirements and prepared detailed specifications that follow project guidelines required for project development.
- Used Sqoop to import data from Relational Databases like MySQL, Oracle.
- Involved in importing structured and unstructured data into HDFS.
- Responsible for fetching real time data using Kafka and processing using Spark and Scala.
- Worked on Kafka to import real time weblogs and ingested the data to Spark Streaming.
- Developed business logic using Kafka Direct Stream in Spark Streaming and implemented business transformations.
- Worked on Building and implementing real-time streaming ETL pipeline using Kafka Streams API.
- Worked on Hive to implement web interfacing and stored the data in Hive tables.
- Migrated MapReduce programs into Spark transformations using Spark and Scala.
- Experienced with Spark Context, Spark SQL, and Spark on YARN.
- Used Vertica, a columnar relational database management system, for data warehousing and Big Data analytics.
- Strengthened T-SQL coding skills.
- Created schema, table, and T-SQL scripts to archive Vertica resources usage data for trend analysis.
- Implemented Spark scripts using Scala and Spark SQL to access Hive tables in Spark for faster data processing.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Implemented data quality checks using Spark Streaming and flagged records as passable or bad.
- Implemented Hive Partitioning and Bucketing on the collected data in HDFS.
- Involved in Data Querying and Summarization using Hive and Pig and created UDF’s, UDAF’s and UDTF’s.
- Implemented Sqoop jobs for large data exchanges between RDBMS and Hive clusters.
- Extensively used ZooKeeper as a backup server and job scheduler for Spark jobs.
- Developed traits, case classes, and related constructs in Scala.
- Developed Spark scripts using Scala shell commands as per the business requirement.
- Worked on Cloudera distribution and deployed on AWS EC2 Instances.
- Experienced in loading the real-time data to NoSQL database like Cassandra.
- Well versed in using Data Manipulations, Compactions, in Cassandra.
- Experience in retrieving the data present in Cassandra cluster by running queries in CQL (Cassandra Query Language).
- Worked on connecting Cassandra database to the Amazon EMR File System for storing the database in S3.
- Implemented usage of Amazon EMR for processing Big Data across a Hadoop cluster of virtual servers on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3).
- Deployed the project on Amazon EMR with S3 connectivity for setting a backup storage.
- Well versed in using Elastic Load Balancer for auto scaling of EC2 servers.
- Configured workflows that involve Hadoop actions using Oozie.
- Used Python for pattern matching in build logs to format warnings and errors.
- Coordinated with SCRUM team in delivering agreed user stories on time for every sprint.
Environment: Hadoop YARN, Spark SQL, Spark Streaming, AWS S3, AWS EMR, GraphX, Scala, Python, Kafka, Hive, Pig, Sqoop, Cassandra, Cloudera, Oracle 10g, Linux.
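The Kafka/Spark Streaming weblog work described above can be sketched as a micro-batch transformation. The function below is plain Python standing in for the body of one streaming batch; the log format and field positions are assumed for illustration only.

```python
def process_batch(lines):
    """Parse one micro-batch of weblog lines, drop malformed records,
    and count requests per HTTP status (mirrors a reduceByKey step)."""
    counts = {}
    for line in lines:
        parts = line.split()
        # expected layout (hypothetical): <ip> <method> <path> <status>
        if len(parts) != 4 or not parts[3].isdigit():
            continue  # bad record: skip it rather than failing the batch
        status = parts[3]
        counts[status] = counts.get(status, 0) + 1
    return counts

batch = [
    "10.0.0.1 GET /index.html 200",
    "10.0.0.2 GET /missing 404",
    "garbled line",
    "10.0.0.3 POST /login 200",
]
print(process_batch(batch))  # {'200': 2, '404': 1}
```

In a real pipeline this logic would run inside a Spark Streaming batch fed by a Kafka direct stream; the skip-on-bad-record rule mirrors the passable/bad flagging described above.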
Confidential - Coraopolis, PA
Hadoop/Big Data Analyst
- Developed MapReduce programs to parse and filter the raw data and store the refined data in partitioned HBase tables.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with HBase reference tables and historical metrics.
- Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into the tables, and writing Hive queries to further analyze the logs to identify issues and behavioral patterns.
- Involved in running MapReduce jobs for processing millions of records.
- Built reusable Hive UDF libraries for business requirements which enabled users to use these UDF's in Hive Querying.
- Developed a data pipeline using Scala to store data into HDFS.
- Experienced in migrating HiveQL into Impala to minimize query response time.
- Responsible for data modeling in HBase as per our requirements.
- Shared responsibility for administration of Hadoop, Hive and Pig.
- Managed and scheduled jobs on a Hadoop cluster using NiFi.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Created UDFs to calculate the pending payment for the given Residential or Small Business customer, and used in Pig and Hive Scripts.
- Deployed and built the application using Maven.
- Maintained Hadoop, Hadoop ecosystems, third-party software, and databases with updates/upgrades, performance tuning, and monitoring using Ambari.
- Obtained good experience with the NoSQL database HBase.
- Used Cassandra CQL with Java APIs to retrieve data from Cassandra tables.
- Experience in managing and reviewing Hadoop log files.
- Experienced in moving data from Hive tables into HBase for real-time analytics on Hive tables.
- Handled importing of data from various data sources and performed transformations using Hive (external tables, partitioning).
- Involved in NoSQL (DataStax Cassandra) database design, integration, and implementation.
- Implemented CRUD operations involving lists, sets, and maps in DataStax Cassandra.
- Responsible for data modeling in HBase in order to load data arriving in both structured and unstructured form.
- Processed unstructured files such as XML and JSON using a custom-built Java API and pushed them into MongoDB.
- Participated in the development/implementation of a Cloudera Hadoop environment.
- Created tables, inserted data, and executed various Cassandra Query Language (CQL 3) commands on tables from Java code and using the cqlsh command-line client.
- Wrote test cases in MRUnit for unit testing of MapReduce programs.
- Created business logic using Servlets and Session Beans and deployed them on a WebLogic server.
- Developed the XML Schema and Web services for the data maintenance and structures
- Provided technical support for production environments, resolving issues, analyzing defects, and providing and implementing solutions for those defects.
- Built and deployed Java applications into multiple Unix based environments and produced both unit and functional test results along with release notes.
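The pending-payment UDF described above (for Residential and Small Business customers) reduces to a small piece of function logic. The sketch below expresses it in plain Python; the tier names and discount rule are hypothetical placeholders, and a real Hive or Pig UDF would wrap the same calculation in the respective UDF API.

```python
def pending_payment(total_due, payments, customer_type):
    """UDF-style calculation of the pending payment for a customer.
    Tier names and the 5% rule are illustrative assumptions."""
    paid = sum(payments)
    pending = max(total_due - paid, 0.0)  # never report a negative balance
    # hypothetical rule: small-business accounts get a 5% settlement discount
    if customer_type == "SMALL_BUSINESS":
        pending *= 0.95
    return round(pending, 2)

print(pending_payment(100.0, [30.0, 20.0], "RESIDENTIAL"))     # 50.0
print(pending_payment(100.0, [30.0, 20.0], "SMALL_BUSINESS"))  # 47.5
```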
Confidential - Peoria, IL
Hadoop/Big Data Analyst
- Exported data from HDFS to MySQL using Sqoop and an NFS mount approach.
- Moved data from HDFS to Cassandra using MapReduce and the BulkOutputFormat class.
- Developed MapReduce programs for applying business rules to the data.
- Developed and executed Hive queries for denormalizing the data.
- Worked with ETL workflows, analyzed big data, and loaded it into the Hadoop cluster.
- Installed and configured Hadoop Cluster for development and testing environment.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among the users' MapReduce jobs.
- Involved in creating shell scripts to simplify the execution of all other scripts (Pig, Hive, Sqoop, Impala, and MapReduce) and to move data into and out of HDFS.
- Automated the workflow using shell scripts.
- Performance-tuned Hive queries written by other developers.
- Mastered major Hadoop distributions (HDP, CDH) and numerous open-source projects.
- Prototyped various applications that utilize modern Big Data tools.
Environment: Linux, Java, Map Reduce, HDFS, DB2, Cassandra, Hive, Pig, Sqoop, FTP.
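The denormalizing Hive queries mentioned above boil down to join-and-widen logic, sketched here in plain Python with illustrative schemas; in practice this would be a HiveQL JOIN producing one wide record per order.

```python
def denormalize(orders, customers):
    """Join orders to customer attributes the way a Hive JOIN would,
    producing one wide record per order (field names are illustrative)."""
    by_id = {c["id"]: c for c in customers}  # build the join index once
    wide = []
    for order in orders:
        cust = by_id.get(order["customer_id"])
        if cust is None:
            continue  # inner-join semantics: drop unmatched orders
        wide.append({**order,
                     "customer_name": cust["name"],
                     "region": cust["region"]})
    return wide

customers = [{"id": 1, "name": "Acme", "region": "IL"}]
orders = [{"order_id": 10, "customer_id": 1, "amount": 99.0},
          {"order_id": 11, "customer_id": 2, "amount": 5.0}]
print(denormalize(orders, customers))
```

The second order has no matching customer and is dropped, matching the behavior of an inner join in Hive.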
Confidential - San Francisco, CA
- Worked on analyzing Hadoop cluster using different big data analytic tools including Pig, Hive, and MapReduce.
- Worked on debugging and performance tuning of Hive and Pig jobs.
- Created HBase tables to store various data formats of PII data coming from different portfolios.
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Involved in loading data from LINUX file system to HDFS.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Experience working on processing unstructured data using Pig and Hive.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Gained experience in managing and reviewing Hadoop log files.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
- Extensively used Pig for data cleansing.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Implemented SQL, PL/SQL Stored Procedures.
- Actively involved in code review and bug fixing for improving the performance.
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, LINUX, Cloudera, Big Data, Java APIs, Java collection, SQL, AJAX.
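The Pig-based data cleansing noted above can be sketched as a simple filter-and-normalize pass; the record format and cleansing rules below are illustrative assumptions (in Pig this would be FILTER and FOREACH ... GENERATE statements over the web server output).

```python
def cleanse(records):
    """Pig-style cleansing pass: trim whitespace, drop blank or comment
    lines, and normalize the request path to lowercase (rules illustrative)."""
    cleaned = []
    for raw in records:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # drop noise, as a Pig FILTER BY would
        ip, path = line.split(",", 1)  # assumed layout: <ip>,<path>
        cleaned.append((ip, path.strip().lower()))
    return cleaned

raw = ["  10.0.0.1, /Home ", "", "# comment", "10.0.0.2,/About"]
print(cleanse(raw))  # [('10.0.0.1', '/home'), ('10.0.0.2', '/about')]
```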
- Extensively involved in the design and development of JSP screens to suit specific modules.
- Converted the application’s console printing of process information to proper logging technology using log4j.
- Developed the business components (in core Java) used in the JSP screens.
- Involved in the implementation of logical and physical database design by creating suitable tables, views, and triggers.
- Developed related procedures and functions used by JDBC calls in the above components.
- Extensively involved in performance tuning of Oracle queries.
- Created components to extract application messages stored in xml files.
- Executed UNIX shell scripts for command-line administrative access to the Oracle database and for scheduling backup jobs.
- Created war files and deployed in web server.
- Performed source and version control using VSS.
- Involved in maintenance support.
Junior JAVA Developer
- Involved in the analysis, design, implementation, and testing of the project.
- Developed web components using JSP, Servlets and JDBC.
- Designed tables and indexes.
- Extensively worked on JUnit for testing the application code of server-client data transferring.
- Developed and enhanced products in design and in alignment with business objectives.
- Used SVN as a repository for managing/deploying application code.
- Involved in the system integration and user acceptance tests successfully.
- Developed front end using JSTL, JSP, HTML, and JavaScript.
- Wrote complex SQL queries and stored procedures.
- Involved in fixing bugs and unit testing with test cases using JUnit.
- Actively involved in the system testing.
- Involved in implementing service layer using Spring IOC module.
- Prepared the Installation, Customer guide and Configuration document which were delivered to the customer along with the product.