
Sr. Java/Hadoop/Python Developer Resume


SUMMARY

  • 10 years of experience in IT, including four years in the Hadoop ecosystem and four years as a Java developer with good object-oriented programming skills.
  • Expertise in the design, development, and testing of web and enterprise applications using type-safe technologies such as Scala, Akka, the Play framework, and Slick.
  • Experienced with Scala and Java development tools such as IntelliJ and Eclipse.
  • Good knowledge of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts; responsible for writing MapReduce programs and setting up standards and processes for Hadoop-based application design and implementation.
  • Expertise with different tools in the Hadoop environment, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and ZooKeeper.
  • Extensive ETL work covering data sourcing, mapping, transformation, conversion, and loading using Informatica.
  • Expertise in developing data-driven applications using Python 2.7 and Python 3.0 in the PyCharm and Anaconda Spyder IDEs.
  • Hands on experience in configuring and working with Flume to load the data from multiple sources directly into HDFS.
  • Expertise in writing Hive and Pig programs (executed as MapReduce) to validate and cleanse data in HDFS obtained from heterogeneous data sources and make it suitable for analysis.
  • Analyzed or transformed stored data by writing MapReduce jobs based on business requirements.
  • Experienced in coding web services with JAX-WS (SOAP) and JAX-RS (RESTful).
  • Experience in developing Pig scripts and Hive Query Language.
  • Managing and scheduling batch Jobs on a Hadoop Cluster using Oozie.
  • Hands on experience working with NoSQL databases including MongoDB and HBase.
  • Experience in optimizing MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  • Experience in developing Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Hands on experience in using Sqoop to import data into HDFS from RDBMS and vice-versa.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Expertise in Web technologies using Core Java, J2EE, Servlets, EJB, JSP, JDBC, Java Beans, Apache, and Design Patterns.
  • Detailed understanding of Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.
  • Ability to adapt to evolving technology, a strong sense of responsibility and accomplishment. Willing to relocate: Anywhere.

TECHNICAL SKILLS

Operating Systems: Windows 8/7/XP, Unix, Ubuntu 13.x, Mac OS X

Hadoop Ecosystem: Hadoop 1.x/2.x (YARN), HDFS, MapReduce, MongoDB, HBase, Hive, Impala, Pig, ZooKeeper, Sqoop, Oozie, Flume, Storm, HDP, AWS, Eclipse, Cloudera Desktop, and SVN.

Java Tools: Java, MapReduce, J2EE (JSP, Servlets, EJB, JDBC, JMS, JNDI, RMI), Struts, Hibernate, AJAX, Web Services, HTML, DHTML, CSS, JavaScript, jQuery, DOM, SAX, XML, XSLT, JSON, JUnit.

APIs: Servlets, EJB, Java Naming and Directory Interface (JNDI), MapReduce, RESTful.

Development Tools: Eclipse, RAD/RSA (Rational Software Architect), IBM DB2 Command Editor, SQL Developer, Microsoft Suite (Word, Excel, PowerPoint, Access), OpenOffice Suite (Editor, Calc, etc.), VMware.

Languages: Scala, Java, Java EE, JSP, Python

NoSQL Databases: HBase, Cassandra, MongoDB

Servers: WebSphere (WAS) 6.x/7.0, WebLogic 10-12c, Apache.

PROFESSIONAL EXPERIENCE

Confidential

Sr. Java/Hadoop/Python Developer

Responsibilities:

  • Responsible for developing efficient MapReduce programs on the AWS cloud to process more than 20 years' worth of claims data and to detect and separate fraudulent claims.
  • Worked with the advanced analytics team to design fraud detection algorithms, then developed MapReduce programs to run the algorithms efficiently on very large datasets.
  • Ran data-formatting scripts in Python and created terabyte-scale CSV files to be consumed by Hadoop MapReduce jobs.
  • Performed Kafka stream analysis, feature selection, and feature extraction using Apache Spark machine learning and streaming libraries in Python.
  • Developed Python code using version control tools such as GitHub and SVN on Vagrant machines.
  • Created Hive tables over data in HDFS, loaded data, and wrote Hive queries that run internally as MapReduce jobs (see the Hive sketch after this list).
  • Involved in building the ETL architecture and source-to-target mappings to load data into the data warehouse.
  • Uploaded and processed terabytes of data from various structured and unstructured sources into HDFS (AWS cloud) using Sqoop and Flume.
  • Involved in Cluster coordination services through Zookeeper.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Played a key role in installation and configuration of the various Hadoop ecosystem tools such as Solr, Pig, HBase and Cassandra.
  • Developed a data pipeline using Kafka and Storm to store data in HDFS.
  • Loaded streaming data using Kafka and Flume and processed it in real time using Spark and Storm.
  • Implemented various Hive optimization techniques such as dynamic partitions, bucketing, map joins, and parallel execution.
  • Worked with pre-session and post-session Linux scripts to automate ETL jobs and to perform operations such as gunzipping, removing, and archiving files.
  • Created Talend jobs to copy files from one server to another and utilized Talend FTP components.
  • Created Joblets and parent-child jobs in Talend.
  • Designed & Developed the ETL Jobs usingTalendIntegration Suite by using various transformations as per the business requirements and based on ETL Mapping Specifications.
  • Extracted meaningful data from dealer CSV files, text files, and mainframe files and generated Python pandas reports for data analysis (a pandas sketch appears after this list).
  • Utilized Python to run scripts and to generate tables and reports.
  • Coordinated with the Agile team to effectively meet all Confidential commitments.
  • Designed and Maintained Oozie workflows to manage the flow of jobs in the cluster.
  • Parsed JSON files through Spark Core to extract the schema for production data using Spark SQL and Scala.
  • Actively updated upper management with daily progress reports on the project, including the classification levels achieved on the data.
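
A minimal sketch of the Hive table work described above, using PySpark with Hive support; the claims table name, column names, and HDFS path are hypothetical placeholders rather than the actual production schema.

    from pyspark.sql import SparkSession

    # Spark session with Hive support so tables are registered in the Hive metastore.
    spark = (SparkSession.builder
             .appName("claims-hive-load")
             .enableHiveSupport()
             .getOrCreate())

    # Hypothetical staged CSV extract already sitting in HDFS.
    staged = (spark.read
              .option("header", "true")
              .option("inferSchema", "true")
              .csv("hdfs:///staging/claims/"))

    # Write into a partitioned Hive table; partitioning by year mirrors the
    # dynamic-partition style optimization mentioned in the bullets above.
    (staged.write
           .mode("overwrite")
           .partitionBy("claim_year")
           .format("orc")
           .saveAsTable("claims"))

    # Queries like this run on the cluster engine, not on the client.
    spark.sql("""
        SELECT claim_year, COUNT(*) AS claim_count, SUM(amount) AS total_amount
        FROM claims
        GROUP BY claim_year
    """).show()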
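
The pandas reporting mentioned above could look roughly like the following; the dealer file name and columns (dealer_id, sale_date, sale_amount) are assumptions for illustration only.

    import pandas as pd

    # Hypothetical dealer CSV extract; real inputs also included text and mainframe files.
    dealers = pd.read_csv("dealer_extract.csv", dtype={"dealer_id": str})

    # Light cleansing: drop rows missing key fields before reporting.
    dealers = dealers.dropna(subset=["dealer_id", "sale_amount"])
    dealers["sale_date"] = pd.to_datetime(dealers["sale_date"])

    # Summarise sales by dealer and month for the analysis report.
    report = (dealers
              .assign(sale_month=dealers["sale_date"].dt.to_period("M"))
              .groupby(["dealer_id", "sale_month"])["sale_amount"]
              .agg(["count", "sum", "mean"])
              .reset_index())

    # Persist the report for downstream consumers.
    report.to_csv("dealer_sales_report.csv", index=False)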

Environment: Java, J2EE, Hadoop, HDFS, Pig, NiFi, Hive, MapReduce, Sqoop, Kafka, CDH3, Cassandra, Python, Oozie, collection, Scala, AWS cloud, Storm, Ab Initio, Apache, SQL, NoSQL, Bitbucket, HBase, Flume, Spark, Solr, ZooKeeper, ETL, Talend, CentOS, Eclipse, Agile.

Confidential - Overland Park, KS

Java/Hadoop Developer

Responsibilities:

  • Involved in creating Hive tables, loading the data, and writing Hive queries that run internally as MapReduce jobs.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala (see the Spark sketch after this list).
  • Wrote Python modules to view and connect to the Apache Cassandra instance (a connection sketch appears after this list).
  • Involved in writing MapReduce jobs.
  • Developed a RESTful web services interface to the Java-based runtime engine and accounts.
  • Customized RESTful web services using the Spring REST API, sending JSON-format data packets between the front end and the middle-tier controller.
  • Streamed data in real time using Spark with Kafka.
  • Responsible for creating a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Involved in exporting processed data from Hadoop to relational databases and external systems using Sqoop and HDFS get/copyToLocal.
  • Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Experienced in managing and reviewing Hadoop log files.
  • Used Pig to perform transformations such as event joins, filtering bot traffic, and pre-aggregations before storing the data in HDFS.
  • Wrote Hive queries on the data to meet the business requirements.
  • Imported and exported data into HDFS and Hive using Sqoop and Kafka.
  • Created various parser programs to extract data from Autosys, Tibco Business Objects, XML, Informatica, Java, and database views using Scala.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
  • Implemented the workflows using Apache Oozie framework to automate tasks.
  • Developed scripts and automated end-to-end data management and synchronization between all the clusters.
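
A small sketch of converting a Hive/SQL aggregate into Spark DataFrame transformations, shown here in PySpark; the transactions table and its columns are hypothetical, and the same logic could equally be written in Scala or with RDDs as the bullet above describes.

    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("hive-to-spark")
             .enableHiveSupport()
             .getOrCreate())

    # Hive version of the query (illustrative):
    #   SELECT customer_id, SUM(amount) AS total
    #   FROM transactions
    #   WHERE txn_date >= '2016-01-01'
    #   GROUP BY customer_id;

    # Equivalent Spark transformations over the same Hive table.
    transactions = spark.table("transactions")
    totals = (transactions
              .filter(F.col("txn_date") >= "2016-01-01")
              .groupBy("customer_id")
              .agg(F.sum("amount").alias("total")))

    totals.write.mode("overwrite").saveAsTable("customer_totals")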
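
A bare-bones sketch of the kind of Python module used to connect to Cassandra, based on the DataStax cassandra-driver; the contact points, keyspace, and events table are placeholders, not the actual cluster details.

    from cassandra.cluster import Cluster

    # Hypothetical contact points and keyspace; real values are environment-specific.
    cluster = Cluster(["cassandra-node1", "cassandra-node2"], port=9042)
    session = cluster.connect("customer_ks")

    def fetch_events(customer_id):
        """Return events for one customer from a hypothetical 'events' table."""
        rows = session.execute(
            "SELECT event_time, event_type FROM events WHERE customer_id = %s",
            (customer_id,),
        )
        return list(rows)

    if __name__ == "__main__":
        for row in fetch_events("42"):
            print(row.event_time, row.event_type)
        cluster.shutdown()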

Environment: Java, Hadoop, Scala, MapReduce, MongoDB, SQL, Apache, Yarn, Hive, Pig, HBase, Oozie, Sqoop, Flume, Core Java, HDP, HDFS, Eclipse, Kafka.

Confidential

Software Engineer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Involved in installing, updating, and managing the environment.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
  • Involved in running Hadoop Streaming jobs to process terabytes of XML-format data (a streaming mapper sketch appears after this list).
  • Participated in requirement gathering and analysis phase of the project in documenting the business requirements by conducting workshops/meetings with various business users.
  • Involved in using Sqoop and HDFS put/copyFromLocal to ingest data.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig Scripts.
  • Implemented test scripts to support test-driven development and continuous integration.
  • Implemented SQL, PL/SQL Stored Procedures.
  • Involved in developing Shell scripts to orchestrate the execution of all other scripts (Pig, Hive, and MapReduce) and move the data files within and outside of HDFS.
  • Involved in developing Hive UDFs for functionality not available out of the box in Apache Hive.
  • Actively updated upper management with daily progress reports on the project, including the classification levels achieved on the data.
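
For the Hadoop Streaming work above, a cleaning mapper in Python might look like the sketch below; the tab-delimited record layout is an assumption for illustration (the actual jobs processed XML data), and a job like this would be submitted with the hadoop-streaming jar, with zero reducers when only filtering is needed.

    #!/usr/bin/env python
    # mapper.py - minimal Hadoop Streaming mapper for record cleaning.
    # Assumes one tab-delimited record per input line: record_id <TAB> payload.
    import sys

    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[0]:
            # Drop malformed records as part of the cleaning pass.
            continue
        record_id, payload = fields[0], fields[1].strip()
        # Streaming mappers emit key<TAB>value pairs on stdout.
        print("%s\t%s" % (record_id, payload))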

Environment: Core Java, J2EE, Hadoop, MapReduce, NoSQL, Hive, Pig, Sqoop, Apache, HDP, HDFS, Eclipse.
