Spring Batch – CSV to XML

In this chapter, we will create a simple Spring Batch application that uses a CSV reader and an XML writer.

Reader − The reader used in this application is FlatFileItemReader, which reads data from CSV files.

Following is the input CSV file used in this application. Each record holds a tutorial id, tutorial author, tutorial title, and submission date.

    1001, "Sanjay", "Learn Java", 06/05/2007
    1002, "Abdul S", "Learn MySQL", 19/04/2007
    1003, "Krishna Kasyap", "Learn JavaFX", 06/07/2017

Writer − The writer used in this application is StaxEventItemWriter, which writes the data to an XML file.

Processor − The processor used in this application is a custom processor that simply prints each record read from the CSV file.

jobConfig.xml

Following is the configuration file of our sample Spring Batch application. In this file, we define the job and its steps. We also define the beans for the ItemReader, ItemProcessor, and ItemWriter, associating each with its class and passing values for the properties required to configure it.

    <beans xmlns = "http://www.springframework.org/schema/beans"
       xmlns:batch = "http://www.springframework.org/schema/batch"
       xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation = "http://www.springframework.org/schema/batch
          http://www.springframework.org/schema/batch/spring-batch-2.2.xsd
          http://www.springframework.org/schema/beans
          http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">

       <import resource = "../jobs/context.xml" />

       <bean id = "tutorial" class = "Tutorial" scope = "prototype" />
       <bean id = "itemProcessor" class = "CustomItemProcessor" />

       <batch:job id = "helloWorldJob">
          <batch:step id = "step1">
             <batch:tasklet>
                <batch:chunk reader = "cvsFileItemReader" writer = "xmlItemWriter"
                   processor = "itemProcessor" commit-interval = "10">
                </batch:chunk>
             </batch:tasklet>
          </batch:step>
       </batch:job>

       <bean id = "cvsFileItemReader"
          class = "org.springframework.batch.item.file.FlatFileItemReader">
          <property name = "resource" value = "classpath:resources/report.csv" />
          <property name = "lineMapper">
             <bean class = "org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <property name = "lineTokenizer">
                   <bean class = "org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                      <property name = "names"
                         value = "tutorial_id, tutorial_author, tutorial_title, submission_date" />
                   </bean>
                </property>
                <property name = "fieldSetMapper">
                   <bean class = "TutorialFieldSetMapper" />
                </property>
             </bean>
          </property>
       </bean>

       <bean id = "xmlItemWriter"
          class = "org.springframework.batch.item.xml.StaxEventItemWriter">
          <property name = "resource" value = "file:xml/outputs/tutorials.xml" />
          <property name = "marshaller" ref = "reportMarshaller" />
          <property name = "rootTagName" value = "tutorials" />
       </bean>

       <bean id = "reportMarshaller"
          class = "org.springframework.oxm.jaxb.Jaxb2Marshaller">
          <property name = "classesToBeBound">
             <list>
                <value>Tutorial</value>
             </list>
          </property>
       </bean>
    </beans>

context.xml

Following is the context.xml of our
Spring Batch application. In this file, we define beans such as the job repository, job launcher, and transaction manager.

    <beans xmlns = "http://www.springframework.org/schema/beans"
       xmlns:jdbc = "http://www.springframework.org/schema/jdbc"
       xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation = "http://www.springframework.org/schema/beans
          http://www.springframework.org/schema/beans/spring-beans-3.2.xsd
          http://www.springframework.org/schema/jdbc
          http://www.springframework.org/schema/jdbc/spring-jdbc-3.2.xsd">

       <!-- stored job-meta in database -->
       <bean id = "jobRepository"
          class = "org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
          <property name = "dataSource" ref = "dataSource" />
          <property name = "transactionManager" ref = "transactionManager" />
          <property name = "databaseType" value = "mysql" />
       </bean>

       <bean id = "transactionManager"
          class = "org.springframework.batch.support.transaction.ResourcelessTransactionManager" />

       <bean id = "jobLauncher"
          class = "org.springframework.batch.core.launch.support.SimpleJobLauncher">
          <property name = "jobRepository" ref = "jobRepository" />
       </bean>

       <bean id = "dataSource"
          class = "org.springframework.jdbc.datasource.DriverManagerDataSource">
          <property name = "driverClassName" value = "com.mysql.jdbc.Driver" />
          <property name = "url" value = "jdbc:mysql://localhost:3306/details" />
          <property name = "username" value = "myuser" />
          <property name = "password" value = "password" />
       </bean>

       <!-- create job-meta tables automatically -->
       <jdbc:initialize-database data-source = "dataSource">
          <jdbc:script location = "org/springframework/batch/core/schema-drop-mysql.sql" />
          <jdbc:script location = "org/springframework/batch/core/schema-mysql.sql" />
       </jdbc:initialize-database>
    </beans>

CustomItemProcessor.java

Following is the processor class. In this class, we write the processing code of the application. Here, we print the contents of each record.

    import org.springframework.batch.item.ItemProcessor;

    public class CustomItemProcessor implements ItemProcessor<Tutorial, Tutorial> {

       @Override
       public Tutorial process(Tutorial item) throws Exception {
          System.out.println("Processing..." + item);
          return item;
       }
    }

TutorialFieldSetMapper.java

Following is the TutorialFieldSetMapper class, which copies the data of each record into a Tutorial object.

    import org.springframework.batch.item.file.mapping.FieldSetMapper;
    import org.springframework.batch.item.file.transform.FieldSet;
    import org.springframework.validation.BindException;

    public class TutorialFieldSetMapper implements FieldSetMapper<Tutorial> {

       @Override
       public Tutorial mapFieldSet(FieldSet fieldSet) throws BindException {
          // Instantiating the Tutorial object
          Tutorial tutorial = new Tutorial();

          // Setting the fields, read by column index
          tutorial.setTutorial_id(fieldSet.readInt(0));
          tutorial.setTutorial_author(fieldSet.readString(1));
          tutorial.setTutorial_title(fieldSet.readString(2));
          tutorial.setSubmission_date(fieldSet.readString(3));

          return tutorial;
       }
    }

Tutorial.java

Following is the Tutorial class. It is a simple Java class with setter and getter methods. In this class, we use annotations to associate the methods of this class with the tags of the XML file.

    import javax.xml.bind.annotation.XmlAttribute;
    import javax.xml.bind.annotation.XmlElement;
    import javax.xml.bind.annotation.XmlRootElement;

    @XmlRootElement(name = "tutorial")
    public class Tutorial {
       private int tutorial_id;
       private String tutorial_author;
       private String tutorial_title;
       private String submission_date;

       @XmlAttribute(name = "tutorial_id")
       public int getTutorial_id() {
          return tutorial_id;
       }

       public void setTutorial_id(int tutorial_id) {
          this.tutorial_id = tutorial_id;
       }

       @XmlElement(name = "tutorial_author")
       public String getTutorial_author() {
          return tutorial_author;
       }

       public void setTutorial_author(String tutorial_author) {
          this.tutorial_author = tutorial_author;
       }

       @XmlElement(name = "tutorial_title")
       public String getTutorial_title() {
          return tutorial_title;
       }

       public void setTutorial_title(String tutorial_title) {
          this.tutorial_title = tutorial_title;
       }

       @XmlElement(name = "submission_date")
       public String getSubmission_date() {
          return submission_date;
       }

       public void setSubmission_date(String submission_date) {
          this.submission_date = submission_date;
       }

       @Override
       public String toString() {
          return " [Tutorial id=" + tutorial_id
             + ", Tutorial Author=" + tutorial_author
             + ", Tutorial Title=" + tutorial_title
             + ", Submission Date=" + submission_date + "]";
       }
    }

App.java

Following is the code which launches the batch process. In this class, we launch the batch application by running the JobLauncher.

    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobExecution;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.launch.JobLauncher;
    import org.springframework.context.ApplicationContext;
    import org.springframework.context.support.ClassPathXmlApplicationContext;

    public class App {
       public static void main(String[] args) throws Exception {
          // Classpath location of the job configuration file (shown above as
          // jobConfig.xml); adjust the path to match your project
          String[] springConfig = { "jobs/job_hello_world.xml" };

          // Creating the application context object
          ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);

          // Retrieving the job launcher bean
          JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");

          // Retrieving the job bean
          Job job = (Job) context.getBean("helloWorldJob");

          // Executing the job
          JobExecution execution = jobLauncher.run(job, new JobParameters());
          System.out.println("Exit Status : " + execution.getStatus());
       }
    }

On executing this application, it will produce the following output.

    May 08, 2017 10:10:12 AM org.springframework.context.support.ClassPathXmlApplicationContext prepareRefresh
    INFO: Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@3d646c37: startup date [Mon May 08 10:10:12 IST 2017]; root of context hierarchy
    May 08, 2017 10:10:12 AM org.springframework.beans.factory.xml.XmlBeanDefinitionReader loadBeanDefinitions
    May 08, 2017 10:10:15 AM org.springframework.jdbc.datasource.init.ScriptUtils executeSqlScript
    INFO: Executing step: [step1]
    Processing... [Tutorial id=1001, Tutorial Author=Sanjay, Tutorial Title=Learn Java, Submission Date=06/05/2007]
    Processing... [Tutorial id=1002, Tutorial Author=Abdul S, Tutorial Title=Learn MySQL, Submission Date=19/04/2007]
    Processing... [Tutorial id=1003, Tutorial Author=Krishna Kasyap, Tutorial Title=Learn JavaFX, Submission

Spring Batch – Application

Almost all the examples in this tutorial contain the following files −

- Configuration file (XML file)
- Tasklet/processor (Java class)
- Java class with setters and getters (Java bean)
- Mapper class (Java class)
- Launcher class (Java class)

Configuration File

The configuration file (XML) contains the following −

- The job and step definitions.
- Beans defining readers and writers.
- Definitions of components such as the JobLauncher, JobRepository, transaction manager, and data source.

In our examples, for better understanding, we have divided this into two files: the job.xml file (which defines the job, step, reader, and writer) and the context.xml file (which defines the job launcher, job repository, transaction manager, and data source).

Mapper Class

The mapper class, depending upon the reader, implements an interface such as RowMapper or FieldSetMapper. It contains the code that takes the data from the reader and sets it on a Java class with setter and getter methods (Java bean).

Java Bean Class

A Java class with setters and getters (Java bean) represents one record with multiple values. It acts as a helper class. We pass the data from one component (reader, writer, processor) to another in the form of an object of this class.

Tasklet/processor

The Tasklet/processor class contains the processing code of the Spring Batch application. A processor is a class which accepts an object that contains the data read, processes it, and returns the processed data (in the form of an object).

Launcher class

This class (App.java) contains the code to launch the Spring Batch application.
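
To make the roles of the mapper class and the Java bean concrete, here is a minimal sketch that uses no Spring classes at all. It is only an illustration under stated assumptions: CsvLineMapperSketch and TutorialRecord are hypothetical names that do not appear elsewhere in this tutorial, and the naive comma split stands in for what a real tokenizer does.

```java
// Dependency-free sketch, not Spring Batch code. CsvLineMapperSketch and
// TutorialRecord are hypothetical names used only for this illustration.
class CsvLineMapperSketch {

    // Plays the role of the mapper class: takes the tokens of one record
    // and copies them into a bean through its setters.
    static TutorialRecord mapLine(String line) {
        // Naive split: assumes that no field contains a comma.
        String[] tokens = line.split(",");
        if (tokens.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields but got " + tokens.length);
        }
        TutorialRecord record = new TutorialRecord();
        record.setId(Integer.parseInt(tokens[0].trim()));
        record.setAuthor(unquote(tokens[1]));
        record.setTitle(unquote(tokens[2]));
        record.setSubmissionDate(tokens[3].trim());
        return record;
    }

    // Trims whitespace and removes one surrounding pair of double quotes, if present.
    static String unquote(String token) {
        String trimmed = token.trim();
        if (trimmed.length() >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
            return trimmed.substring(1, trimmed.length() - 1);
        }
        return trimmed;
    }

    public static void main(String[] args) {
        TutorialRecord record = mapLine("1001, \"Sanjay\", \"Learn Java\", 06/05/2007");
        System.out.println(record.getId() + " | " + record.getAuthor()
            + " | " + record.getTitle() + " | " + record.getSubmissionDate());
    }
}

// Plays the role of the Java bean: one record, carried between components.
class TutorialRecord {
    private int id;
    private String author;
    private String title;
    private String submissionDate;

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }
    public String getAuthor() { return author; }
    public void setAuthor(String author) { this.author = author; }
    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public String getSubmissionDate() { return submissionDate; }
    public void setSubmissionDate(String submissionDate) { this.submissionDate = submissionDate; }
}
```

In a real application the splitting is done by the reader's tokenizer and the mapper only receives the resulting fields; the sketch merges both steps to stay short.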

Readers, Writers & Processors

An Item Reader reads data into the Spring Batch application from a particular source, whereas an Item Writer writes data from the Spring Batch application to a particular destination.

An Item Processor is a class which contains the processing code that is applied to the data read into Spring Batch. If the application reads n records, the code in the processor is executed on each record.

A chunk is a child element of the tasklet. It is used to perform read, write, and processing operations. We can configure the reader, writer, and processor using this element, within a step, as shown below.

    <batch:job id = "helloWorldJob">
       <batch:step id = "step1">
          <batch:tasklet>
             <batch:chunk reader = "cvsFileItemReader" writer = "xmlItemWriter"
                processor = "itemProcessor" commit-interval = "10">
             </batch:chunk>
          </batch:tasklet>
       </batch:step>
    </batch:job>

Spring Batch provides readers and writers to read and write data from various file formats and databases such as MongoDB, Neo4j, MySQL, XML, flat files, CSV, etc.

To include a reader in your application, you need to define a bean for that reader, provide values for all the required properties within the bean, and pass the id of that bean as the value of the reader attribute of the chunk element (and likewise for the writer).

ItemReader

It is the entity of a step (of a batch process) which reads data. An ItemReader reads one item at a time. Spring Batch provides an interface ItemReader. All the readers implement this interface.

Following are some of the predefined ItemReader classes provided by Spring Batch to read from various sources.

    Reader                      Purpose
    FlatFileItemReader          To read data from flat files.
    StaxEventItemReader         To read data from XML files.
    StoredProcedureItemReader   To read data from the stored procedures of a database.
    JdbcPagingItemReader        To read data from relational databases.
    MongoItemReader             To read data from MongoDB.
    Neo4jItemReader             To read data from Neo4j.
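
The "one item at a time" contract can be sketched without any Spring dependency. The interface below is a hypothetical stand-in, not the Spring Batch ItemReader itself; it only mirrors the convention that each call returns the next item and that null signals the end of the input. The names ReaderContractSketch, SimpleItemReader, and ListBackedReader are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Dependency-free sketch of the reader contract; all names are hypothetical.
class ReaderContractSketch {

    // Stand-in for a reader: returns the next item, or null when exhausted.
    interface SimpleItemReader<T> {
        T read();
    }

    // A reader backed by an in-memory list, used here in place of a file or table.
    static class ListBackedReader<T> implements SimpleItemReader<T> {
        private final Iterator<T> iterator;

        ListBackedReader(List<T> items) {
            this.iterator = items.iterator();
        }

        @Override
        public T read() {
            return iterator.hasNext() ? iterator.next() : null;
        }
    }

    public static void main(String[] args) {
        SimpleItemReader<String> reader =
            new ListBackedReader<>(Arrays.asList("Learn Java", "Learn MySQL"));

        // The caller keeps asking for items until the reader returns null.
        String item;
        while ((item = reader.read()) != null) {
            System.out.println("Read: " + item);
        }
    }
}
```

Because the end of input is signalled with null, a reader written this way cannot hand null items to the step as data.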
We need to configure the ItemReaders by creating the beans. Following is an example of JdbcCursorItemReader, which reads data from a MySQL database.

    <bean id = "dbItemReader"
       class = "org.springframework.batch.item.database.JdbcCursorItemReader" scope = "step">
       <property name = "dataSource" ref = "dataSource" />
       <property name = "sql" value = "select * from tutorialsdata" />
       <property name = "rowMapper">
          <bean class = "TutorialRowMapper" />
       </property>
    </bean>

As observed, while configuring, we need to specify the class name of the required reader and provide values for all the required properties.

ItemWriter

It is the element of a step (of a batch process) which writes data. An ItemWriter writes a chunk of items at a time. Spring Batch provides an interface ItemWriter. All the writers implement this interface.

Following are some of the predefined ItemWriter classes provided by Spring Batch to write to various destinations.

    Writer                 Purpose
    FlatFileItemWriter     To write data into flat files.
    StaxEventItemWriter    To write data into XML files.
    JdbcBatchItemWriter    To write data into relational databases.
    MongoItemWriter        To write data into MongoDB.
    Neo4jItemWriter        To write data into Neo4j.

In the same way, we need to configure the ItemWriters by creating the beans. Following is an example of StaxEventItemWriter, which writes data to an XML file.

    <bean id = "xmlItemWriter"
       class = "org.springframework.batch.item.xml.StaxEventItemWriter">
       <property name = "resource" value = "file:xml/outputs/userss.xml" />
       <property name = "marshaller" ref = "reportMarshaller" />
       <property name = "rootTagName" value = "Tutorial" />
    </bean>

    <bean id = "reportMarshaller"
       class = "org.springframework.oxm.jaxb.Jaxb2Marshaller">
       <property name = "classesToBeBound">
          <list>
             <value>Tutorial</value>
          </list>
       </property>
    </bean>

ItemProcessor

An ItemProcessor is used to process the data.
When the given item is not valid, it returns null; otherwise, it processes the given item and returns the processed result. The interface ItemProcessor<I,O> represents the processor.

Tasklet class − When no reader and writer are given, a Tasklet acts as a processor for Spring Batch. It processes only a single task.

We can define a custom item processor by implementing the ItemProcessor interface of the package org.springframework.batch.item. An ItemProcessor accepts an object, processes the data, and returns the processed data as another object.

In a batch process, if "n" records or data elements are read, then each record is read, processed, and written by the writer. To process the data, the step relies on the processor passed to it.

For example, suppose you have written a processor that loads a particular PDF document, creates a new page, and writes the data item onto the PDF in a tabular format. If you use this processor in an application that reads from an XML document and writes to a MySQL database, the application reads all the data items from the XML document, stores them in the MySQL database, and prints each of them on an individual page of the given PDF document.

Example

Following is a sample ItemProcessor class.

    import org.springframework.batch.item.ItemProcessor;

    public class CustomItemProcessor implements ItemProcessor<Tutorial, Tutorial> {

       @Override
       public Tutorial process(Tutorial item) throws Exception {
          System.out.println("Processing..." + item);
          return item;
       }
    }
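
The flow described in this chapter (read items, pass each through the processor, drop the ones for which the processor returns null, and write the rest in chunks) can be sketched in plain Java. This is a simplified model under stated assumptions, not the Spring Batch implementation: ChunkLoopSketch and its run method are hypothetical, a java.util.function.Function stands in for the processor, a list of lists stands in for the writer, and transactions, restarts, and error handling are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Dependency-free model of a chunk-oriented step; all names are hypothetical.
class ChunkLoopSketch {

    // Takes up to commitInterval items per chunk, applies the processor to each,
    // skips items for which the processor returns null, and records every
    // non-empty chunk that would be handed to the writer.
    static <I, O> List<List<O>> run(List<I> input, Function<I, O> processor, int commitInterval) {
        List<List<O>> writtenChunks = new ArrayList<>();
        for (int start = 0; start < input.size(); start += commitInterval) {
            int end = Math.min(start + commitInterval, input.size());
            List<O> toWrite = new ArrayList<>();
            for (I item : input.subList(start, end)) {
                O processed = processor.apply(item);
                if (processed != null) {   // null means "filter this item out"
                    toWrite.add(processed);
                }
            }
            // Choice made for this sketch: a chunk with nothing left is not written.
            if (!toWrite.isEmpty()) {
                writtenChunks.add(toWrite);
            }
        }
        return writtenChunks;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("Learn Java", "", "Learn MySQL", "Learn JavaFX");

        // Processor: rejects blank titles, upper-cases the others.
        Function<String, String> processor =
            title -> title.isEmpty() ? null : title.toUpperCase();

        List<List<String>> chunks = run(titles, processor, 2);
        for (List<String> chunk : chunks) {
            System.out.println("Writing chunk: " + chunk);
        }
    }
}
```

With a commit interval of 2, the four input titles form two chunks; the blank title is filtered out of the first one, so the writer receives one item and then two.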