Hadoop Developer/Admin Resume Atlanta, GA - Hire IT People

SUMMARY

7+ years of experience with emphasis on Big Data technologies and the development and design of Java-based enterprise applications.
Three years of experience in Hadoop Development and four years of Java Application Development.
Hands-on experience using Hadoop technologies such as HDFS, Hive, Pig, Sqoop, HBase, Impala, Flume, Spark, Oozie, and MapReduce.
Set up standards and processes for Hadoop-based application design and implementation.
Logical implementation of and interaction with HBase.
Developed a data pipeline using Kafka and Storm to store data in HDFS.
Experience in using Sqoop, Zookeeper, and a cloud-based computing manager, with services coordinated through Zookeeper.
Experience with the NoSQL database HBase.
Responsible for managing data coming from different sources.
Worked with different Hadoop distributions, such as Hortonworks and Cloudera.
Worked on tuning the performance of Pig queries.
Worked on creating MapReduce scripts for processing the data.
Developed Java code that streams Packet Tracer data into Hive using RESTful services.
Experience in maintaining the big data platform using open source technologies such as Spark.
Experience in ETL development using Kafka, Flume, and Sqoop.
Involved in performance tuning of Spark jobs by using caching and taking full advantage of the cluster environment.
Proficient in performance analysis, monitoring, and SQL query tuning using EXPLAIN PLAN, Collect Statistics, Hints, and SQL Trace in both Teradata and Oracle.
Loaded data from different sources (databases and files) into Hive using the Talend tool.
Redesigned the existing Informatica ETL mappings and workflows using Spark with Python.
Worked with business teams and created Hive queries for ad hoc access.
Loaded data into Spark RDDs and performed in-memory data computation to generate the output response.
Prepared, arranged, and tested Splunk search strings and operational strings.
Developed Spark code and Spark-SQL/Streaming for faster testing and processing of data.
Evaluate and propose new tools and technologies to meet the needs of the organization.
An excellent team player and self-starter with good communication skills and proven abilities to finish tasks before target deadlines.
Outstanding communication and presentation skills; willing to learn and adapt to new technologies and third-party products.
Flexible, with the ability to balance multiple projects at one time in a fast-paced environment.

TECHNICAL SKILLS

Big Data: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, Zookeeper, MongoDB, Spark, Cloudera, Hortonworks, Splunk.

Programming: Java, C/C++, SAS, JavaScript, Python, R, PL/SQL, UNIX Shell scripting.

Operating Systems: Windows, Ubuntu, Red Hat Linux, UNIX.

PROFESSIONAL EXPERIENCE

Hadoop Developer/Admin

Confidential, Atlanta, GA

Responsibilities:

Developed Hive and Python scripts for data validity checks and updates, using Oozie to manage workflows.
Worked on data capacity planning and node forecasting.
Experience in managing Hadoop jobs and the logs of all scripts.
Worked on migrating data from MongoDB to Hadoop.
Supported the daily/weekly AWS batches in the production environment.
Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
Imported data from Oracle/SQL Server into HDFS using Sqoop on a regular basis.
Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
Worked on Big Data integration and analytics based on Hadoop, Solr, Spark, Kafka, Storm, and webMethods.
Processed large data sets in parallel across the Hadoop cluster for pre-processing.
Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
Used Spark SQL to process large amounts of structured data.
Performed ETL processing using Pig and Sqoop, and application programming using Hive, Java, and Python.
Executed MapReduce programs to cleanse data in HDFS gathered from heterogeneous data sources, making it suitable for ingestion into the Hive schema for analysis.
Wrote Spark applications in Scala using the DataFrame and Spark SQL APIs.
Strong knowledge of distributed systems architecture and parallel processing; in-depth understanding of the MapReduce programming paradigm.
Installed and administered Splunk for monitoring, analyzing, and visualizing machine data.
Worked with the Splunk development team to create Hadoop dashboards with various metrics.
Imported data into HDFS using Sqoop, including incremental loads.
Designed and developed MapReduce jobs to process logs and feed the data warehouse, load Hive tables for analytics, and store the daily data feed on HDFS for other teams' use.
Experience in analyzing HDFS log files.
Used Apache NiFi to load PDF documents from Microsoft SharePoint into HDFS.
Wrote Java scripts that execute different MongoDB queries.
Implemented data loading using Spark, Storm, Kafka, and Elasticsearch.
Managed and reviewed Hadoop log files.
Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
Imported and exported data between environments such as MySQL and HDFS.
Developed code according to the tasks assigned for each user story.
Developed Spark scripts using Scala shell commands as per requirements.
Developed Spark code and Spark-SQL/Streaming for faster testing and processing of data.
Worked with Hadoop designers to troubleshoot MapReduce job failures and issues.

Environment: Apache Hadoop 2.6.0/Hadoop 1.2.0, HDFS, Spark, MapReduce, Hive, Splunk, MongoDB, HBase, Sqoop, Zookeeper, Oozie, Kerberos, MySQL, Linux, Unix scripts, Python, PuTTY, Kafka.

Hadoop Developer/Admin

Confidential, Alpharetta, GA

Responsibilities:

Worked on analyzing and writing Hadoop MapReduce jobs using Pig and Hive.
Involved in loading data from edge node to HDFS using shell scripting.
Configured MySQL Database to store Hive metadata.
Pushed data as delimited files into HDFS using Talend Big Data Studio.
Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive for optimized performance.
Created Hive tables, loaded data, and wrote HiveQL queries to further analyze the data.
End-to-end performance tuning of Hadoop clusters and Hadoop MapReduce routines against very large data sets.
Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
Designed and developed the data ingestion component.
Loaded data into Spark RDDs and performed in-memory data computation to generate the output response.
Good knowledge of using Apache NiFi to automate data movement between different Hadoop systems.
Loaded and transformed large sets of structured data from Oracle and SQL Server into HDFS using Talend Big Data Studio.
Created a self-managed Python script to deploy tests of the technologies and calculate statistics. The script was designed so that future tests of other technologies could be integrated easily.
Designed and developed big data systems to perform various ETL and MapReduce jobs using R, and loaded data into HBase using bulk and non-bulk loads.
Responsible for building scalable distributed data solutions using Hadoop.
Involved in loading data from the Linux file system to HDFS.
Developed UDFs using Java, Pig, and Hive queries.
Good understanding of NoSQL databases.
Started using Apache NiFi to copy data from the local file system to HDFS.
Worked on Hive/HBase versus RDBMS; imported data into Hive on HDP and created tables, partitions, indexes, views, queries, and reports for BI data analysis.
Wrote MapReduce jobs in Java to parse web logs stored in HDFS, and used MRUnit to test and debug the MapReduce programs.
Worked on tuning the performance of Pig queries.
Very good experience with both MapReduce 1 (Job Tracker) and MapReduce 2 (YARN).
Involved in loading data from UNIX file system to HDFS.
Wrote MapReduce code to process and parse data from various sources and store the parsed data in HBase and Hive using HBase-Hive integration.
Loaded and transformed large sets of structured, semi-structured, and unstructured data.
Experience in managing and reviewing Hadoop log files.
Managed jobs using the Fair Scheduler.

Environment: Red Hat, Apache Hadoop 2.6.0/Hadoop 1.2.0, ETL, MapReduce, Hive, HBase, Pig, Sqoop, Zookeeper, Oozie, Python, Hortonworks, MySQL, Unix, Linux, WinSCP, YARN, Talend.

Hadoop Developer/Admin

Confidential, Sunnyvale, CA

Responsibilities:

Installed Hadoop, MapReduce, HDFS, and AWS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
Involved in scheduling Teradata and UNIX objects to run jobs on a daily/weekly basis depending on business requirements.
Experience in ingesting data into Cassandra and consuming the ingested data from Cassandra into Hadoop.
Upgraded the Cloudera Hadoop ecosystems in the cluster using Cloudera distribution packages.
Automated the second layer of the MapR-FS backup process to Azure CIFS.
Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
Assisted in upgrading, configuring, and maintaining various Hadoop infrastructure components such as Pig, Hive, and HBase.
Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate to MapReduce jobs.
Worked on database design, stored procedures, and Oracle.
Enabled concurrent access to Hive tables with shared/exclusive locks by implementing Zookeeper in the cluster.
Participated in knowledge transfer sessions with the production support team on business rules, Teradata objects, and job scheduling.
Experience in NoSQL databases such as HBase and Cassandra.
Involved in debugging MapReduce jobs using the MRUnit framework and optimizing MapReduce.
Developed Hive, Pig, and Unix shell scripts for all ETL loading processes and for converting files to Parquet in the Hadoop file system.
Responsible for gathering requirements, process workflow, data modelling, architecture and design and led application development using Scrum.
Implemented Cloudera Manager on an existing cluster.
Generated Java APIs for retrieval and analysis on NoSQL databases such as HBase and Cassandra.
Created scripts for importing data into HDFS/Hive using Sqoop from DB2.
Loaded data into HBase using bulk and non-bulk loads.
Extensively worked with the Cloudera Distribution of Hadoop.
Extensively worked on installation and configuration of the Cloudera Distribution for Hadoop (CDH).
Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports.

Environment: Hadoop, MapReduce, Hive, HBase, Pig, Sqoop, Zookeeper, Oozie, Linux, SQL, Cloudera, Cassandra, Teradata.

Java Developer

Confidential

Responsibilities:

Architected a 24x7 web application based on JSF, WebSphere, Oracle, Spring, and Hibernate.
Built an end-to-end vertical slice for a JEE-based billing application using popular frameworks like Spring, Hibernate, JSF, Facelets, XHTML, Maven2, and Ajax by applying OO design concepts, JEE and GoF design patterns, and best practices.
Integrated other sub-systems, such as the loans application, the equity markets online application system, and the documentation system, with the structured products application through JMS, WebSphere MQ, SOAP-based web services, and XML.
Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for the Oracle 9i database.
Tuned SQL statements, Hibernate mappings, and the WebSphere application server to improve performance, and consequently met the SLAs.
Gathered business requirements and wrote functional specifications and detailed design documents.
Improved the build process by migrating it from Ant to Maven2.
Built and deployed Java applications into multiple Unix-based environments and produced both unit and functional test results along with release notes.

Environment: Java 1.5, JSF Sun RI, Facelets, Ajax4JSF, RichFaces, Spring, XML, XSL, XSD, XHTML, Hibernate, Oracle 9i, PL/SQL, MINA, Spring-WS, SOAP web services, WebSphere, JMX, Ant, Maven2, Continuum, JUnit, SVN, TDD, and XP.

Java Developer

Confidential

Responsibilities:

Designed and developed UI using Struts view tags (HTML, Bean, Logic and Nested), JSP, HTML, CSS and Struts Tiles.
Configured Struts, Spring, and Hibernate for the development environment.
Designed and developed parser classes using SAX parsers.
Involved in developing SOAP messages as part of web services testing.
Developed XSD schemas.
Developed XML beans using XSD schemas.
Used WebLogic Workshop as the development environment.
Configured the data source in WebLogic Server.
Used Subversion for source code control.
Developed Ant scripts for generating the XML Beans.
Generated DAO, POJO classes and Hibernate mapping files using Hibernate tools.
Modified DAO classes and Hibernate mapping files as per the application standard.
Used log4j for logging and JUnit for testing the application.

Environment: XML, XMLBeans, SAX parsing, HTTP sessions, Eclipse, WebLogic Workshop, SOAP messages, Oracle 10g, XSD, Ant scripts.

Software Developer

Confidential

Responsibilities:

Designed architecture, requirements specifications, use case diagrams and sequence diagrams using UML.
Developed a Java-based GUI for storing temperature data.
Created code to connect the VME to a PC via an RS-232 cable.
Wrote code to reboot VxWorks, start acquisition of data from IP modules, and store it in text files.
Wrote C++ code to calculate minimum, maximum, and average measurements from 94 sensors; based on the temperature, data was sent to cryogenics to cool overheated magnets.
Wrote code to display online graphs using Java Swing and Graphics.
Wrote code to display offline graphs based on user requirements.
Wrote code to get channel data through LabVIEW over a socket connection.
Wrote documentation for the project.

Environment: VME (Versa Module Europa), IP modules, VxWorks/Tornado RTOS, J2EE, C++, Socket programming, Core Java, Multithreading, Swing, UML, HTML, JDBC, TCP/IP.
