Sr. Bigdata Developer/Engineer Resume
New York, NY
EXPERIENCE SUMMARY:
- Nearly 8 years of professional IT experience with ETL processes, databases, and Hadoop/Bigdata, processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
- Cloudera certified Hadoop professional.
- Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
- Hands-on experience with Big Data ecosystem components such as Hive, Pig, Impala, HBase, ZooKeeper, Cascading, Sqoop, Cassandra, Teradata Connector, Avro, and Splunk.
- Worked on integrating HBase and SolrCloud using the Lily HBase Indexer.
- Hands-on experience with data management tools such as Zaloni Bedrock.
- Experience working with Spark SQL.
- Hands-on experience with shell scripting.
- Experience working with the Kerberos authentication system.
- Experience working with multiple distributions: Cloudera, Hortonworks, MapR, Pivotal HD, and BigInsights.
- Expertise in reviewing client requirements, prioritizing requirements, and creating project proposals.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop (see the command sketch after this list).
- Strong command of Hive and Pig core functionality, extended through custom UDFs.
- Experience in building and maintaining multiple Hadoop clusters (production, development, etc.) of different sizes and configurations, and in setting up rack topology for large clusters.
- Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform and CDH clusters.
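A minimal sketch of the Sqoop import/export pattern referenced above; the JDBC URL, credentials, table names, and HDFS paths are illustrative placeholders, not actual project values.

```sh
# Import a relational table into HDFS with 4 parallel mappers (placeholders throughout).
sqoop import \
  --connect jdbc:oracle:thin:@//db-host:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMERS \
  --target-dir /data/staging/customers \
  --num-mappers 4

# Export aggregated HDFS output back to a relational table.
sqoop export \
  --connect jdbc:oracle:thin:@//db-host:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMER_AGG \
  --export-dir /data/published/customer_agg \
  --input-fields-terminated-by '\t'
```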
TECHNICAL EXPERTISE:
Languages: Java, JSON, C, T-SQL, PL/SQL
Bigdata: Hadoop, MapReduce, Hive, Pig, YARN, Spark SQL, Cascading, Splunk, Impala, Lily Indexer, Avro, Hector API
NoSQL: HBase, Cassandra
SQL: Oracle 9i/10g, PostgreSQL
Scripting: UNIX Shell Script
IDE Tools: Eclipse, IntelliJ IDEA
Search Engine: Solr
Operating Systems: MS Windows, UNIX, Sun Solaris, Linux, Ubuntu
Data Management Tools: Zaloni Bedrock
WORK EXPERIENCE:
Confidential, New York, NY
Sr. Bigdata Developer/Engineer
Responsibilities:
- Responsible for pulling data from Teradata and other RDBMS systems into the Hadoop cluster.
- Responsible for creating domain and staging data models.
- Created Hive tables, loaded the data, and analyzed it using Hive queries.
- Responsible for writing Pig and Spark SQL queries.
- Responsible for writing complex Impala queries.
- Wrote custom MapReduce programs.
- Worked hands-on with the ETL process.
- Responsible for creating HBase tables and loading aggregated data into them using Pig.
- Developed Pig UDFs to customize various functions and make them reusable.
- Created Solr collections and implemented various optimizations.
- Configured HBase-to-Solr indexing with the Lily HBase Indexer using Morphlines.
- Replicated SCD (Slowly Changing Dimensions) concepts using Pig queries.
- Responsible for optimizing Solr queries and compressing data sent over the network.
- Responsible for scheduling Tivoli workflows for daily delta loads.
- Developed shell scripts to integrate components such as Hive queries, MapReduce jobs, and Pig scripts (see the sketch after this list).
- Guided the team in their day-to-day activities and helped them meet deadlines.
- Collaborated with infrastructure and security architects to integrate enterprise information architecture into the overall enterprise architecture.
- Used Git as the version control tool to maintain the code repository.
- Provided documentation, trained teams, and built effective cross-team communication to ensure accuracy, consistency, problem solving, conflict resolution, and on-time project completion.
- Communicated with senior management to provide status, discuss strategic plans, develop road maps, and identify critical success factors.
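A minimal sketch of the kind of shell wrapper used to chain the daily delta-load steps mentioned above; the JDBC URL, script names, table names, and HDFS paths are hypothetical.

```sh
#!/bin/bash
# Daily delta-load driver (illustrative placeholders throughout).
set -e
RUN_DATE=${1:-$(date +%Y-%m-%d)}

# 1. Pull the day's delta from the source RDBMS into HDFS.
sqoop import --connect "$SOURCE_JDBC_URL" --username etl_user -P \
  --table ORDERS --where "updated_dt = '$RUN_DATE'" \
  --target-dir /data/landing/orders/dt=$RUN_DATE -m 4

# 2. Aggregate the delta with a parameterized Pig script.
pig -param RUN_DATE="$RUN_DATE" -f aggregate_orders.pig

# 3. Register the new partition so downstream Hive/Impala queries see it.
hive -e "ALTER TABLE orders_agg ADD IF NOT EXISTS PARTITION (dt='$RUN_DATE')"
```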
Environment: CDH5, Hadoop ecosystem, Pig, Hive, Sqoop, SolrCloud, Impala, Oozie, Teradata Connector, Spark SQL, HBase, Lily Indexer, Morphlines, Tivoli.
Confidential, Hopkins, MN
Hadoop Developer/Engineer
Responsibilities:
- Supported code/design analysis, strategy development and project planning.
- Gathered business requirements from subject matter experts and analyzed the current system.
- Created file patterns for input files to ingest them into the Hadoop environment using Bedrock.
- Built DAO objects for the files/tables residing in HDFS.
- Developed multiple MapReduce jobs in Java for data cleaning, schema validation, record count validation, watermark tagging, etc.
- Developed complex MapReduce jobs for joining tables, using both map-side and reduce-side joins depending on the requirement.
- Used schema evolution with Hive and Avro to join tables with dynamic schemas (see the sketch after this list).
- Performed unit testing and prepared test case documents.
- Implemented Cascading programs for tighter control over job dependencies.
- Evaluated and documented the performance of the IBM CDC tool in the project.
- Assisted with data capacity planning and node forecasting.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
- Monitored workload, job performance, and capacity using Cloudera Manager.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Worked closely with business partners to understand business requirements and design solutions based on those requirements.
- Followed software development life cycle specifications and ensured all deliverables met the Hadoop specifications.
- Used SVN as the version control tool to maintain the code repository.
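A minimal sketch of an Avro-backed Hive table supporting the schema evolution mentioned above: the table reads its layout from an external Avro schema file, so adding a field to the .avsc evolves the table without rewriting data. The table name, HDFS location, and schema path are illustrative assumptions.

```sh
# Create an external Hive table whose columns are driven by the Avro schema file.
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS claims_avro
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
  INPUTFORMAT  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/data/domain/claims'
TBLPROPERTIES ('avro.schema.url'='hdfs:///schemas/claims_v2.avsc');"
```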
Environment: Pivotal HD, MapReduce, Hive, Cascading, Avro, IBM CDC tool, Bedrock, Cognos, Hawk, SQL.
Confidential, Phoenix, AZ
Hadoop Developer/System Engineer
Responsibilities:
- Gave extensive presentations on the Hadoop ecosystem, best practices, and data architecture in Hadoop.
- Designed and developed data warehouse and business intelligence architecture. Designed the ETL process from various sources into Hadoop/HDFS for analysis and further processing.
- Provided review and feedback on the existing physical architecture, data architecture, analysis, designs, and code. Designed the next-generation architecture for unstructured data.
- Debugged and resolved issues as the subject matter expert, focusing on data science and processing.
- Responsible for writing Pig Latin scripts and Pig UDFs and optimizing the code (see the sketch after this list).
- Worked on a data archival model on the Hadoop framework.
- Responsible for writing Hector API code for Cassandra.
- Developed an information strategy, aligned with agency-wide strategy, for master data management, data integration, data virtualization, metadata management, data quality and profiling, data modeling, and data governance.
- Created Hive tables, loaded the data, and analyzed it using Hive queries.
- Worked on a Hive ranking algorithm to classify patterns.
- Defined business and technical requirements, designed a proof of concept for evaluating AFMS agencies' data evaluation criteria and scoring, and selected data integration and information management tooling.
- Captured and documented the volumetric analysis of the CDC module with Informatica.
- Generated large volumes of records for volumetric testing.
- Collaborated with infrastructure and security architects to integrate enterprise information architecture into the overall enterprise architecture.
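A minimal sketch of how a custom Pig UDF might be registered and applied in a shell-driven Pig run, in the spirit of the Pig Latin/UDF work above; the jar name, UDF class, and input/output paths are hypothetical.

```sh
pig <<'EOF'
-- Register the (hypothetical) UDF jar and give the function a short alias.
REGISTER /apps/udfs/custom-udfs.jar;
DEFINE NORMALIZE_PHONE com.example.pig.NormalizePhone();

raw     = LOAD '/data/raw/contacts' USING PigStorage('\t')
            AS (id:chararray, phone:chararray);
cleaned = FOREACH raw GENERATE id, NORMALIZE_PHONE(phone) AS phone;
STORE cleaned INTO '/data/curated/contacts' USING PigStorage('\t');
EOF
```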
Environment: Hortonworks, Cassandra, Hector API, HDFS, MapReduce, Pig, Hive, Informatica, shell scripting
Confidential, Phoenix, AZ
Programmer Analyst
Responsibilities:
- UI development using the Struts 2.0 framework.
- Applied CSS styles to Struts 2.0 components.
- Proactively involved in the development of Java Server Pages (JSP) and servlets.
- Wrote Data Access Object (DAO) classes to access the database in the data tier.
- Involved in developing form beans.
- Involved in developing Action classes.
- Developed the internal code that carries out the business logic.
- Developed presentation components using JSP.
- Involved in developing the reporting module, which consists of forms and reports for a customer.
Environment: Java, Eclipse, Struts, JDBC, SDL Tridion
