Hadoop Developer/Admin Resume

Atlanta, GA

SUMMARY

  • 7+ years of experience with an emphasis on Big Data technologies and on the development and design of Java-based enterprise applications.
  • Three years of experience in Hadoop Development and four years of Java Application Development.
  • Hands-on experience using Hadoop technologies such as HDFS, Hive, Pig, Sqoop, HBase, Impala, Flume, Spark, Oozie, and MapReduce.
  • Set up standards and processes for Hadoop-based application design and implementation.
  • Logical implementation of, and interaction with, HBase.
  • Developed a data pipeline using Kafka and Storm to store data in HDFS.
  • Experience using Sqoop, Zookeeper, and cloud-based computing Manager.
  • Services through Zookeeper.
  • Experience with the NoSQL database HBase.
  • Responsible for managing data coming from different sources.
  • Worked with different Hadoop distributions such as Hortonworks and Cloudera.
  • Worked on tuning the performance of Pig queries.
  • Worked on creating MapReduce scripts for processing data.
  • Developed Java code that streams Packet Tracer data into Hive using RESTful services.
  • Experience in maintaining the big data platform using open source technologies such as Spark.
  • Experience in ETL development using Kafka, Flume, and Sqoop.
  • Involved in performance tuning of Spark jobs using caching and taking full advantage of the cluster environment.
  • Proficient in performance analysis, monitoring, and SQL query tuning using EXPLAIN PLAN, Collect Statistics, hints, and SQL Trace in both Teradata and Oracle.
  • Loaded data from different sources (databases and files) into Hive using the Talend tool.
  • Redesigned the existing Informatica ETL mappings and workflows using Spark with Python.
  • Worked with business teams and created Hive queries for ad hoc access.
  • Loaded data into Spark RDDs and performed in-memory data computation to generate the output response.
  • Prepared, arranged, and tested Splunk search strings and operational strings.
  • Developed Spark code and Spark-SQL/Streaming for faster testing and processing of data.
  • Evaluate and propose new tools and technologies to meet the needs of the organization.
  • An excellent team player and self-starter with good communication skills and proven abilities to finish tasks before target deadlines.
  • Outstanding communication and presentation skills; willing to learn and adapt to new technologies and third-party products.
  • Flexible, with the ability to balance multiple projects at one time in a fast-paced environment.

TECHNICAL SKILLS

Big Data: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, Zookeeper, MongoDB, Spark, Cloudera, Hortonworks, Splunk.

Programming: Java, C/C++, SAS, JavaScript, Python, R, PL/SQL, UNIX shell scripting.

Operating Systems: Windows, Ubuntu, Red Hat Linux, UNIX.

PROFESSIONAL EXPERIENCE

Hadoop Developer/Admin

Confidential, Atlanta, GA

Responsibilities:

  • Developed Hive and Python scripts for data validity checks and updates, using Oozie to manage workflows.
  • Worked on Data capacity planning and node forecasting.
  • Experience in managing Hadoop jobs and the logs of all scripts.
  • Worked on migrating data from MongoDB to Hadoop.
  • Supported the daily/weekly AWS batches in the production environment.
  • Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
  • Imported data using Sqoop from Oracle/SQL Server into HDFS on a regular basis.
  • Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
  • Worked on Big Data integration and analytics based on Hadoop, Solr, Spark, Kafka, Storm, and webMethods.
  • Processed large data sets in parallel across the Hadoop cluster for pre-processing.
  • Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
  • Used Spark SQL to process large amounts of structured data.
  • ETL processing using Pig and Sqoop, and application programming using Hive, Java, and Python.
  • Executed MapReduce programs to cleanse data in HDFS gathered from heterogeneous data sources, making it suitable for ingestion into the Hive schema for analysis.
  • Wrote Spark applications in Scala using the DataFrame and Spark SQL APIs.
  • Strong knowledge of the architecture of distributed systems and parallel processing; in-depth understanding of the MapReduce programming paradigm.
  • Installed and administered Splunk for monitoring, analyzing, and visualizing machine data.
  • Worked with the Splunk developers' team to get Hadoop dashboards created with various metrics.
  • Imported data into HDFS using Sqoop, including incremental loading.
  • Designed and developed MapReduce jobs to process logs and feed the data warehouse, load Hive tables for analytics, and store the daily data feed on HDFS for use by other teams.
  • Experience in analyzing log files for HDFS.
  • Used Apache NiFi for loading PDF documents from Microsoft SharePoint to HDFS.
  • Wrote Java scripts that execute different MongoDB queries.
  • Implemented data loading using Spark, Storm, Kafka, and Elasticsearch.
  • Managed and reviewed Hadoop log files.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Imported and exported data between environments such as MySQL and HDFS.
  • Developed code according to the task assigned for each user story.
  • Developed Spark scripts using Scala shell commands as per the requirements.
  • Developed Spark code and Spark SQL/Streaming for faster testing and processing of data.
  • Worked with Hadoop designers in troubleshooting MapReduce job failures and issues.

Environment: Apache Hadoop 2.6.0/Hadoop 1.2.0, HDFS, Spark, MapReduce, Hive, Splunk, MongoDB, HBase, Sqoop, Zookeeper, Oozie, Kerberos, MySQL, Linux, Unix scripts, Python, PuTTY, Kafka.

Hadoop Developer/Admin

Confidential, Alpharetta, GA

Responsibilities:

  • Worked on analyzing and writing Hadoop MapReduce jobs using Pig and Hive.
  • Involved in loading data from edge node to HDFS using shell scripting.
  • Configured MySQL Database to store Hive metadata.
  • Pushed data as delimited files into HDFS using Talend Big Data studio.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive for optimized performance.
  • Created Hive tables, loaded data, and wrote HiveQL queries to further analyze the data.
  • End-to-end performance tuning of Hadoop clusters and Hadoop MapReduce routines against very large data sets.
  • Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
  • Designed and developed the data ingestion component.
  • Loaded data into Spark RDDs and performed in-memory data computation to generate the output response.
  • Good knowledge of using Apache NiFi to automate data movement between different Hadoop systems.
  • Loaded and transformed large sets of structured data from Oracle/SQL Server into HDFS using Talend Big Data studio.
  • Created a self-managed Python script to deploy testing of the technologies and calculate statistics. The script was designed so that future tests on other technologies could easily be integrated.
  • Designed and developed big data systems to perform various ETL and MapReduce jobs using R, and loaded data into HBase using bulk load and non-bulk load.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in loading data from LINUX file system to HDFS.
  • Developed UDFs using Java, Pig, and Hive queries.
  • Good understanding of NoSQL databases.
  • Started using Apache NiFi to copy data from the local file system to HDFS.
  • Worked on Hive/HBase versus RDBMS; imported data into Hive on HDP and created tables, partitions, indexes, views, queries, and reports for BI data analysis.
  • Wrote MapReduce jobs in Java to parse web logs stored in HDFS, and used MRUnit to test and debug MapReduce programs.
  • Worked on tuning the performance of Pig queries.
  • Very good experience with both MapReduce 1 (Job Tracker) and MapReduce 2 (YARN).
  • Involved in loading data from UNIX file system to HDFS.
  • Wrote MapReduce code to process and parse data from various sources, storing the parsed data in HBase and Hive using HBase-Hive integration.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Experience in managing and reviewing Hadoop log files.
  • Job management using the Fair Scheduler.

Environment: Red Hat, Apache Hadoop 2.6.0/Hadoop 1.2.0, ETL, MapReduce, Hive, HBase, Pig, Sqoop, Zookeeper, Oozie, Python, Hortonworks, MySQL, Unix, Linux, WinSCP, YARN, Talend.

Hadoop Developer/Admin

Confidential, Sunnyvale, CA

Responsibilities:

  • Installed Hadoop, MapReduce, HDFS, and AWS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Involved in scheduling Teradata and UNIX objects to run jobs on a daily/weekly basis depending on business requirements.
  • Experience in ingesting data into Cassandra and consuming the ingested data from Cassandra into Hadoop.
  • Upgraded the Cloudera Hadoop ecosystems in the cluster using Cloudera distribution packages.
  • Automated the second layer of the MapR-FS backup process to Azure CIFS.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Assisted in upgrading, configuring, and maintaining various Hadoop infrastructure components such as Pig, Hive, and HBase.
  • Solved performance issues in Hive and Pig scripts by understanding joins, grouping, and aggregation and how they translate to MapReduce jobs.
  • Worked on database design, stored procedures, and Oracle.
  • Enabled concurrent access to Hive tables with shared/exclusive locks by implementing Zookeeper in the cluster.
  • Participated in knowledge transfer sessions for the production support team on business rules, Teradata objects, and job scheduling.
  • Experience with NoSQL databases such as HBase and Cassandra.
  • Involved in debugging MapReduce jobs using the MRUnit framework and in optimizing MapReduce.
  • Developed Hive scripts, Pig scripts, and Unix shell scripts for all ETL loading processes, and converted files to Parquet in the Hadoop file system.
  • Responsible for gathering requirements, process workflow, data modeling, architecture, and design; led application development using Scrum.
  • Implemented Cloudera Manager on the existing cluster.
  • Generated Java APIs for retrieval and analysis on NoSQL databases such as HBase and Cassandra.
  • Created scripts for importing data into HDFS/Hive using Sqoop from DB2.
  • Loaded data into HBase using bulk load and non-bulk load.
  • Extensively worked with Cloudera Distribution Hadoop.
  • Extensively worked on installation and configuration of the Cloudera distribution for Hadoop (CDH).
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports.

Environment: Hadoop, MapReduce, Hive, HBase, Pig, Sqoop, Zookeeper, Oozie, Linux, SQL, Cloudera, Cassandra, Teradata.

Java Developer

Confidential

Responsibilities:

  • Architected a 24x7 web application based on JSF, WebSphere, Oracle, Spring, and Hibernate.
  • Built an end-to-end vertical slice for a JEE-based billing application using popular frameworks like Spring, Hibernate, JSF, Facelets, XHTML, Maven2, and Ajax by applying OO design concepts, JEE and GoF design patterns, and best practices.
  • Integrated other sub-systems, such as the loans application, the equity markets online application system, and the documentation system, with the structured products application through JMS, WebSphere MQ, SOAP-based web services, and XML.
  • Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for Oracle 9i database.
  • Tuned SQL statements, Hibernate mappings, and the WebSphere application server to improve performance, and consequently met the SLAs.
  • Gathered business requirements and wrote functional specifications and detailed design documents.
  • Improved the build process by migrating it from Ant to Maven2.
  • Built and deployed Java applications into multiple Unix based environments and produced both unit and functional test results along with release notes.

Environment: Java 1.5, JSF Sun RI, Facelets, Ajax4JSF, RichFaces, Spring, XML, XSL, XSD, XHTML, Hibernate, Oracle 9i, PL/SQL, MINA, Spring-WS, SOAP Web service, WebSphere, JMX, Ant, Maven2, Continuum, JUnit, SVN, TDD, and XP.

Java Developer

Confidential

Responsibilities:

  • Designed and developed UI using Struts view tags (HTML, Bean, Logic and Nested), JSP, HTML, CSS and Struts Tiles.
  • Configured Struts, Spring, and Hibernate for the development environment.
  • Designed and developed parser classes using SAX parsers.
  • Involved in developing SOAP messages as part of Web Services testing.
  • Developed XSD schemas.
  • Developed XML beans using XSD schemas.
  • Used WebLogic Workshop as the development environment.
  • Configured the data source in WebLogic Server.
  • Used Subversion for source code control.
  • Developed Ant scripts for generating the XML Beans.
  • Generated DAO, POJO classes and Hibernate mapping files using Hibernate tools.
  • Modified DAO classes and Hibernate mapping files as per the application standard.
  • Used log4j and JUnit for logging and testing the application.

Environment: XML, XML Beans, SAX parsing, HTTP sessions, Eclipse, WebLogic Workshop, SOAP messages, Oracle 10g, XSD, Ant scripts.

Software Developer

Confidential

Responsibilities:

  • Designed architecture, requirements specifications, use case diagrams and sequence diagrams using UML.
  • Developed Java based GUI for storing temperature data.
  • Created code to connect VME with PC via RS-232 cable.
  • Wrote code to reboot VxWorks, start acquisition of data from IP modules and store it into text files.
  • Wrote C++ code for calculating minimum, maximum, and average measurements from 94 sensors. Based on the temperature, data was sent to cryogenics to cool overheated magnets.
  • Wrote code to show an online graph using Java Swing and Graphics.
  • Wrote code to show an offline graph based on user requirements.
  • Wrote code for getting channel data through LabVIEW using a socket connection.
  • Wrote documentation for the project.

Environment: VME (Versa Module Europa), IP modules, VxWorks/Tornado RTOS, J2EE, C++, Socket programming, Core Java, Multithreading, Swing, UML, HTML, JDBC, TCP/IP.
