Programmer Analyst Resume
Dayton, OH
SUMMARY
- Overall 8+ years of professional IT experience in data analysis, data modeling, and implementation of enterprise-class technologies, including 3 years of experience in Big Data implementations using Hadoop and the Hadoop ecosystem.
- Technical expertise in the Big Data/Hadoop stack (HDFS, MapReduce, Apache Hive, Apache Pig, Sqoop, HBase, Flume, Storm, Kafka, Spark, Oozie, ZooKeeper) and NoSQL databases (HBase, Cassandra, MongoDB).
- Skilled at developing big data solutions using Hadoop and Spark, including information retrieval.
- Experienced in implementing real-time streaming and analytics using Spark Streaming, Storm, and Kafka.
- Worked on NoSQL databases (Cassandra and HBase) to support enterprise production systems.
- Extensive experience handling semi-structured and unstructured data using MapReduce programs.
- Experience handling different file formats, such as text files, SequenceFiles, and Avro data files, using MapReduce programs.
- Extensive experience working with Oracle, DB2, SQL Server, and MySQL databases.
- Experienced in performing analytics on structured data using Hive queries and operations.
- Experience handling different file formats, such as text files, SequenceFiles, and Avro data files, using various SerDes in Hive.
- Knowledge of data warehousing and analytics, data migration, data profiling, and ETL processes.
- Experienced in writing custom Hive UDFs to incorporate business logic into Hive queries.
- Extensive experience in Informatica Big Data Edition.
- Experience developing MapReduce programs on Apache Hadoop to analyze big data according to requirements.
- Extensive experience developing Pig Latin scripts and using Hive Query Language (HiveQL) for data analytics.
- Worked on creating and maintaining HBase tables.
- Experience in monitoring workload, job performance, and node health using Cloudera Manager.
- Worked on loading production log file data into HDFS using Flume.
- Hands-on experience creating Hive scripts, Hive tables, UDFs, UDAFs, UDTFs, and partitioning/bucketing schemes.
- Hands-on experience creating Apache Spark RDD transformations on datasets in a Hadoop data lake (a brief sketch follows this summary).
- Experience with reporting and BI tools such as Tableau.
- Extensive experience working with Teradata, Oracle, SQL Server, and MySQL databases.
- Strong experience across the complete project life cycle (design, development, testing, and implementation).
- Experience with the Oozie scheduler, setting up workflows that chain MapReduce and Pig jobs.
- Good working experience using Sqoop to import data from RDBMS sources into HDFS.
- Strong experience in database design and in writing complex SQL queries and stored procedures using PL/SQL.
- Experienced in implementing SOAP-based and REST web services.
- Determined, committed, and hardworking, with strong communication, interpersonal, and organizational skills.
- Experienced in handling the different phases of big data projects.
- Excellent verbal and written communication skills for client interaction and onsite-offshore coordination.
- Ability to adapt to evolving technology, with a strong sense of responsibility and accomplishment.
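Below is a minimal, illustrative sketch of the kind of Spark RDD transformation work described in this summary; the HDFS path, record delimiter, and field count are hypothetical assumptions rather than details from an actual project.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class LakeRecordFilter {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("LakeRecordFilter");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Hypothetical data-lake path; records are assumed to be pipe-delimited text.
        JavaRDD<String> lines = sc.textFile("hdfs:///data/lake/events/*.txt");

        // Transformation: drop empty or malformed records (fewer than three fields).
        JavaRDD<String> valid = lines.filter(line ->
                !line.trim().isEmpty() && line.split("\\|").length >= 3);

        // Action: materialize a count of the cleaned records.
        System.out.println("Valid records: " + valid.count());
        sc.stop();
    }
}
```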
PROFESSIONAL EXPERIENCE
Confidential, Dayton, OH
Programmer Analyst
Responsibilities:
- Worked with business users to gather and define business requirements and analyze possible technical solutions.
- Worked on a custom-built Data Migration Framework tool for ingesting data from external and internal sources into Hadoop using Sqoop, shell scripts, Hive, and Pig.
- Developed a reconciliation script (audit process) to validate the imported data.
- Collected data from different sources and created a data warehouse.
- Used Sqoop to import data into and export data out of HDFS.
- Extracted HIE data in XML format and parsed it using Spark.
- Used Spark Streaming to pull real-time ADT and LAB data from a TIBCO JMS queue.
- Used Spark Streaming to consume data from Kafka topics (a brief sketch follows this list).
- Ported the existing Hive job to Spark using Spark SQL in Scala.
- Worked on Hive with the Tez execution engine.
- Worked on Apache NiFi for processing real-time data.
- Implemented Hive generic UDFs to encapsulate business logic.
- Used sed, awk, and Pig scripts to clean and scrub the data before loading it into the data lake.
- Created Hive tables with dynamic partitions and buckets for sampling, and worked on them using HiveQL.
- Used Hive tables and Hive SerDes to store data in tabular format.
- Applied optimization techniques to get better performance from Hive queries.
- Developed MapReduce programs to process Avro files.
- Developed MapReduce code to handle data without primary keys.
- Designed an ETL process using Spark and Hive to load data from CVS Pharmacy flat files into target tables.
- Implemented helper classes that access HBase and Hive directly from Java using their Java APIs.
- Managed and scheduled jobs on the Hadoop cluster using the Tidal job scheduler.
- Monitored and managed the Hadoop cluster using Ambari.
- Extensively wrote shell scripts wrapping command-line calls to Hive, Pig, and Sqoop.
- Used Spark stream processing to bring data into memory and implemented RDD transformations and actions to process it in units.
- Worked with Avro, Parquet, and ORC file formats.
- Experienced with various compression codecs, such as LZO, Snappy, and Deflate, to reduce storage and optimize data transfer over the network.
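A minimal sketch of consuming Kafka topics with Spark Streaming, as referenced above; the broker list, topic names, and batch interval are hypothetical, and the Spark 1.x direct-stream Kafka integration (spark-streaming-kafka) is assumed.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class FeedStreamJob {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("FeedStreamJob");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        // Hypothetical broker list and topic names.
        Map<String, String> kafkaParams = new HashMap<String, String>();
        kafkaParams.put("metadata.broker.list", "broker1:9092,broker2:9092");
        Set<String> topics = new HashSet<String>(Arrays.asList("adt-feed", "lab-feed"));

        // Direct (receiver-less) stream of (key, message) pairs from Kafka.
        JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                jssc, String.class, String.class,
                StringDecoder.class, StringDecoder.class, kafkaParams, topics);

        // Count the messages arriving in each 30-second micro-batch.
        stream.count().print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```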
Confidential, Houston, TX
Sr. Hadoop Developer
Responsibilities:
- Worked with business users to gather and define business requirements and analyze possible technical solutions.
- Collected data from different sources and created a data warehouse.
- Collected raw Dynatrace data, parsed it using Pig, and created Hive external tables.
- Extracted feeds from social media sites such as Twitter using REST API calls.
- Extracted Omniture data, created Hive external tables, and built Tableau reports.
- Used Sqoop to import data into and export data out of HDFS.
- Extracted webchat data in XML format, parsed it using Informatica BDE, and exported it to Oracle using Sqoop.
- Ported the existing Hive job to Spark using Spark SQL with Scala/Python.
- Developed custom aggregate functions using Spark SQL and performed interactive querying.
- Worked on Hive with the Tez execution engine.
- Created Hive tables with dynamic partitions and buckets for sampling, and worked on them using HiveQL.
- Used Hive tables and Hive SerDes to store data in tabular format.
- Applied optimization techniques to get better performance from Hive queries.
- Developed MapReduce programs to process Avro files, using map-side joins and skew join operations.
- Collected and aggregated large amounts of log data using Apache Flume and staged it in HDFS for further analysis.
- Performed ETL on data from heterogeneous sources into a data warehouse on which UI reports were built.
- Worked on Informatica ETL to parse the data, then loaded the parsed data into HDFS.
- Designed ETL mappings and sessions to load data from source flat files into target tables.
- Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data from various sources.
- Debugged and troubleshot issues with Hive UDFs.
- Implemented helper classes that access HBase and Hive directly from Java using their Java APIs (a brief sketch follows this list).
- Worked on proofs of concept with Neo4j and MongoDB.
- Managed and scheduled jobs on the Hadoop cluster using Oozie workflows.
- Created Hive tables over JSON-format data.
- Worked on Apache Ranger for Hadoop security.
- Imported data from various sources, performed transformations using Pig, loaded the data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Worked on Oozie to automate data loading into HDFS and Pig to pre-process the data.
- Developed MapReduce programs to process Avro files, perform calculations on the data, and execute map-side joins.
- Worked on loading data into the database using Linux utilities.
- Extensively wrote shell scripts wrapping command-line calls to Hive, Pig, and Sqoop.
- Integrated with Apache Solr to search for keywords in the raw social media data.
- Created Hive tables using the LWStorageHandler SerDe for Solr search.
- Integrated Apache Storm with infrastructure that includes HBase and HDFS to create a highly scalable data platform.
- Used Spark stream processing to bring data into memory and implemented RDD transformations and actions to process it in units.
- Worked on Kafka to rebuild the user activity tracking pipeline as publish-subscribe feeds.
- Implemented Storm spouts and bolts to consume data from Kafka sources and process the logic using Storm topologies.
- Worked with Avro, Parquet, and ORC file formats.
- Experienced with various compression codecs, such as LZO, Snappy, and Deflate, to reduce storage and optimize data transfer over the network.
- Worked on debugging and troubleshooting ZooKeeper issues.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode high availability, capacity planning, and slot configuration.
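A small, illustrative helper along the lines of the HBase/Java API work noted above; the table name, column family, and qualifiers are hypothetical, and the classic HTable client API of that HBase generation is assumed.

```java
import java.io.Closeable;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

/** Thin wrapper for row-level reads and writes against a single HBase table. */
public class HBaseHelper implements Closeable {
    private static final byte[] FAMILY = Bytes.toBytes("d"); // hypothetical column family
    private final HTable table;

    public HBaseHelper(String tableName) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        this.table = new HTable(conf, tableName);
    }

    /** Writes a single cell value for the given row key and qualifier. */
    public void putValue(String rowKey, String qualifier, String value) throws IOException {
        Put put = new Put(Bytes.toBytes(rowKey));
        put.add(FAMILY, Bytes.toBytes(qualifier), Bytes.toBytes(value));
        table.put(put);
    }

    /** Reads a single cell value, or null if the cell is absent. */
    public String getValue(String rowKey, String qualifier) throws IOException {
        Result result = table.get(new Get(Bytes.toBytes(rowKey)));
        byte[] raw = result.getValue(FAMILY, Bytes.toBytes(qualifier));
        return raw == null ? null : Bytes.toString(raw);
    }

    @Override
    public void close() throws IOException {
        table.close();
    }
}
```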
Environment: Hadoop, HDFS, Hive, Tez, AWS, Pig, Flume, Sqoop, Spark, MapReduce, Solr, Kafka, Cloudera, Avro, Snappy, ZooKeeper, CDH, NoSQL, HBase, Cassandra, Java (JDK 1.6), Linux, Eclipse, MySQL, Ubuntu, Informatica BDE, PowerCenter.
Confidential, Gaithersburg, MD
Hadoop Developer
Responsibilities:
- Managed and scheduled jobs on the Hadoop cluster using Oozie workflows.
- Handled importing data from various data sources, performed transformations using Pig, loaded the data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Continuously monitored and managed the Hadoop cluster using Hortonworks Ambari.
- Developed MapReduce programs to process Avro files and derive results by performing calculations on the data, including map-side joins and other operations.
- Developed a MapReduce program to search production log files for application issues and download performance (a brief sketch follows this list).
- Experienced in implementing different kinds of joins, such as map-side and reduce-side joins, to integrate data from different datasets.
- Implemented MapReduce programs to handle semi-structured and unstructured data such as XML, JSON, Avro data files, and SequenceFiles for log data.
- Wrote MapReduce jobs, HiveQL, and Pig Latin to process source data into structured data and store it in relational or NoSQL databases (HBase, Cassandra).
- Experienced with various compression codecs, such as LZO and Snappy, to reduce storage and optimize data transfer over the network.
- Responsible for performing extensive data validation using Hive.
- Involved in loading data from the UNIX file system into HDFS.
- Performed analysis using RStudio.
- Created Hive tables with dynamic partitions and buckets for sampling, and worked on them using HiveQL.
- Stored data in tabular format using Hive tables and Hive SerDes.
- Implemented various business requirements by writing Hive UDFs.
- Applied optimization techniques to get better performance from Hive queries.
- Experienced in loading and transforming large sets of semi-structured and unstructured data using Pig Latin operations.
- Created ETL mappings with Talend Integration Suite to pull data from sources and load it into target databases.
- Developed mappings/transformations and designed ETL jobs/packages using Talend Integration Suite (TIS).
- Experience handling streaming web server log data with Flume to analyze user actions.
- Responsible for managing data coming from different sources.
- Exported the analyzed data to relational databases using Sqoop for visualization and report generation by the BI team.
- Worked on Oozie to automate data loading into HDFS and Pig to pre-process the data.
- Experienced in debugging jobs in local and distributed modes using the web UI and Ambari.
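An illustrative MapReduce job in the spirit of the log-search program mentioned above; the "ERROR" marker, the key scheme, and the input/output paths are assumptions for the sketch.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Counts log lines containing an error marker, keyed by each matching line's first token. */
public class LogErrorCount {

    public static class ErrorMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text outKey = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString().trim();
            if (line.contains("ERROR")) {          // hypothetical marker for application issues
                outKey.set(line.split("\\s+")[0]); // e.g. a date or component token
                context.write(outKey, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log-error-count");
        job.setJarByClass(LogErrorCount.class);
        job.setMapperClass(ErrorMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```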
Environment: Hortonworks, Hadoop, HDFS, MapReduce, Hive, Flume, Sqoop, Pig, Java (JDK 1.6), Eclipse, MySQL, Ubuntu, ZooKeeper, Talend, CDH, SQL Server, Shell Scripting.
Confidential, Kansas, MO
Hadoop Developer
Responsibilities:
- Involved in review of functional and non-functional requirements.
- Launched and set up the Hadoop/HBase cluster, including configuring the different cluster components.
- Wrote JUnit test cases to test and debug MapReduce programs on a local machine.
- Installed and configured Pig and wrote Pig Latin scripts.
- Involved in creating Hive tables, loading data, and running Hive queries against that data.
- Involved in ETL, data integration, and migration.
- Used Sqoop to import data from Oracle into HDFS on a regular basis.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Wrote Hive queries for data analysis to meet business requirements.
- Implemented partitioning, dynamic partitions, and buckets in Hive.
- Developed Pig UDFs to pre-process data for analysis (a brief sketch follows this list).
- Developed a custom file system plug-in for Hadoop so it can access files on the data platform.
- The custom file system plug-in allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
- Implemented helper classes that access HBase directly from Java using the Java API to perform CRUD operations.
- Integrated MapReduce with HBase to bulk-import large amounts of data into HBase using MapReduce programs.
- Handled time-series data in HBase, storing it and performing time-based analytics to improve query retrieval times.
- Experienced in converting ETL operations to the Hadoop system using Pig Latin operations, transformations, and functions.
- Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
- Successfully loaded files into Hive and HDFS from HBase.
- Performed performance tuning of Hive queries.
- Used Flume to stream log data from various sources.
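A minimal Java sketch of the kind of Pig UDF referenced above; the function name and the normalization rule (trim and upper-case a single field) are illustrative assumptions. In a Pig Latin script, the jar containing this class would be registered with REGISTER and the function applied inside a FOREACH ... GENERATE statement.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

/** Pig EvalFunc that trims and upper-cases a single chararray field before analysis. */
public class NormalizeField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // pass nulls through unchanged
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}
```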
Environment: Hadoop, MapReduce, HDFS, Hive, Java, Eclipse, Hortonworks, Ambari, Pig, HBase, Linux, PL/SQL, Toad.
Confidential, Dallas, TX
Lead Java/J2EE Software Developer
Responsibilities:
- Involved in complete requirement analysis, design, coding and testing phases of the project.
- Developed business logic/back-end code using Java and ADF Business Components/BC4J.
- Participated in JAD meetings to gather requirements and understand the end users' system.
- Served as technical team lead.
- Developed user interfaces using JSP, HTML, XML and JavaScript.
- Generated XML Schemas and used XML Beans to parse XML files.
- Created stored procedures and functions; used JDBC to process database calls against DB2/AS400 and SQL Server databases (a brief sketch follows this list).
- Developed code to create XML files and flat files from data retrieved from databases and XML sources.
- Created data sources and helper classes used by all the interfaces to access and manipulate data.
- Developed a web application called iHUB (integration hub) to initiate all the interface processes using the Struts framework, JSP, and HTML.
- Developed the interfaces using Eclipse 3.1.1 and JBoss 4.1; involved in integration testing, bug fixing, and production support.
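An illustrative JDBC call in the style of the stored-procedure work above; the JDBC URL, credentials, and procedure name are hypothetical.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

/** Calls a stored procedure over JDBC and returns its output parameter. */
public class OrderStatusDao {
    // Hypothetical connection details and procedure name.
    private static final String URL = "jdbc:db2://dbhost:50000/ORDERS";

    public String fetchStatus(String orderId, String user, String password) throws Exception {
        Connection conn = DriverManager.getConnection(URL, user, password);
        try {
            CallableStatement call = conn.prepareCall("{call GET_ORDER_STATUS(?, ?)}");
            try {
                call.setString(1, orderId);
                call.registerOutParameter(2, Types.VARCHAR);
                call.execute();
                return call.getString(2);
            } finally {
                call.close();
            }
        } finally {
            conn.close();
        }
    }
}
```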
Environment: Java, Servlets, JSPs, JavaScript, HTML, MySQL 2.1, Swing, Java Web Server 2.0, JBoss 2.0, RMI, Rational Rose, Red Hat Linux 7.1.
Confidential
Java/J2EE developer
Responsibilities:
- Involved in designing class and sequence diagrams with UML, as well as data flow diagrams.
- Implemented MVC architecture using the Struts framework for the Free Quote feature.
- Designed and developed the front end using JSP, Struts (Tiles), XML, JavaScript, and HTML.
- Used Struts tag libraries to create JSPs.
- Implemented Spring MVC, dependency injection (DI), and aspect-oriented programming (AOP) features along with Hibernate.
- Implemented navigation using Spring MVC (a brief sketch follows this list).
- Used Hibernate for object-relational mapping and persistence.
- Implemented message-driven beans to pull messages from queues and forward them to the support team using MSend commands.
- Experienced with Hibernate core interfaces such as Configuration, SessionFactory, Transaction, and Criteria.
- Reviewed requirements and was involved in database design for new requirements.
- Wrote complex SQL queries to perform various database operations using TOAD.
- Used the JavaMail API to notify agents about the free quote and to send customers an email with a promotion code for validation.
- Involved in testing using JUnit.
- Performed application development using Eclipse, with WebSphere Application Server for deployment.
- Used SVN for version control.
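A minimal Spring MVC controller sketch corresponding to the navigation work mentioned above; the request path, parameter, and view name are hypothetical.

```java
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;

/** Handles a hypothetical free-quote page request and forwards to a JSP view. */
@Controller
public class FreeQuoteController {

    @RequestMapping("/freeQuote")
    public String showQuote(@RequestParam("zip") String zip, Model model) {
        model.addAttribute("zip", zip);
        return "freeQuote"; // resolved by the configured ViewResolver to a JSP
    }
}
```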
Environment: Java, Spring, Hibernate, JMS, Web Services, EJB, SQL, PL/SQL, HTML, CSS, JSP, JavaScript, Ant, JUnit, WebSphere.
Education