Apache POI – Home

Apache POI Tutorial Job Search PDF Version Quick Guide Resources Discussion This tutorial provides a basic understanding of Apache POI library and its features. Audience This tutorial is designed for all enthusiastic readers working on Java and especially those who want to create, read, write, and modify Excel files using Java. Prerequisites A general awareness of Java programming with JDK1.5 or later versions and IO concepts in Java are the only prerequisites to understand this tutorial. Print Page Previous Next Advertisements ”;

Apache POI – Quick Guide

Apache POI – Quick Guide ”; Previous Next Apache POI – Overview Many a time, a software application is required to generate reports in Microsoft Excel file format. Sometimes, an application is even expected to receive Excel files as input data. For example, an application developed for the Finance department of a company will be required to generate all their outputs in Excel. Any Java programmer who wants to produce MS Office files as output must use a predefined and read-only API to do so. What is Apache POI? Apache POI is a popular API that allows programmers to create, modify, and display MS Office files using Java programs. It is an open source library developed and distributed by Apache Software Foundation to design or modify Microsoft Office files using Java program. It contains classes and methods to decode the user input data or a file into MS Office documents. Components of Apache POI Apache POI contains classes and methods to work on all OLE2 Compound documents of MS Office. The list of components of this API is given below. POIFS (Poor Obfuscation Implementation File System) − This component is the basic factor of all other POI elements. It is used to read different files explicitly. HSSF (Horrible Spreadsheet Format) − It is used to read and write xls format of MS-Excel files. XSSF (XML Spreadsheet Format) − It is used for xlsx file format of MS-Excel. HPSF (Horrible Property Set Format) − It is used to extract property sets of the MS-Office files. HWPF (Horrible Word Processor Format) − It is used to read and write doc extension files of MS-Word. XWPF (XML Word Processor Format) − It is used to read and write docx extension files of MS-Word. HSLF (Horrible Slide Layout Format) − It is used for read, create, and edit PowerPoint presentations. HDGF (Horrible DiaGram Format) − It contains classes and methods for MS-Visio binary files. HPBF (Horrible PuBlisher Format) − It is used to read and write MS-Publisher files. This tutorial guides you through the process of working on Excel files using Java. Therefore the discussion is confined to HSSF and XSSF components. Note − Older versions of POI support binary file formats such as doc, xls, ppt, etc. Version 3.5 onwards, POI supports OOXML file formats of MS-Office such as docx, xlsx, pptx, etc. Like Apache POI, there are other libraries provided by various vendors for Excel file generation. These include Aspose cells for Java by Aspose, JXL by Commons Libraries, and JExcel by Team Dev. Apache POI – Java Excel APIs This chapter takes you through some of the flavors of Java Excel API and their features. There are many vendors who provide Java Excel related APIs; some of them are considered in this chapter. Aspose Cells for Java Aspose Cells for Java is a purely licensed Java Excel API developed and distributed by the vendor Aspose. The latest version of this API is 8.1.2, released in July 2014. It is a rich and heavy API (combination of plain Java classes and AWT classes) for designing the Excel component that can read, write, and manipulate spreadsheets. The common uses of this API are as follows − Excel reporting, build dynamic Excel reports High-fidelity Excel rendering and printing Import and export data from Excel spreadsheets Generate, edit, and convert spreadsheets JXL JXL is a third-party framework designed for Selenium that supports data driven automation on web browsers (auto-update of data on web browsers). However it is also used as a common support library for JExcel API because it has basic features to create, read, and write spreadsheets. The basic features are as follows − Generate Excel files Import data from workbooks and spreadsheets Obtain the total number of rows and columns Note − JXL supports only .xls file format and it cannot handle large data volume. JExcel JExcel is a purely licensed API provided by Team Dev. Using this, programmers can easily read, write, display, and modify Excel workbooks in both .xls and .xlsx formats. This API can be easily embedded with Java Swing and AWT. The latest version of this API is Jexcel-2.6.12, released in 2009. The main features are as follows − Automate Excel application, workbooks, spreadsheets, etc Embed workbooks in a Java Swing application as ordinary Swing component Add event listeners to workbooks and spreadsheets Add event handlers to handle the behavior of workbook and spreadsheet events Add native peers to develop custom functionality Apache POI – Environment This chapter takes you through the process of setting up Apache POI on Windows and Linux based systems. Apache POI can be easily installed and integrated with your current Java environment following a few simple steps without any complex setup procedures. User administration is required while installation. System Requirements JDK Java SE 2 JDK 1.5 or above Memory 1 GB RAM (recommended) Disk Space No minimum requirement Operating System Version Windows XP or above, Linux Let us now proceed with the steps to install Apache POI. Step 1 – Verify your Java Installation First of all, you need to have Java Software Development Kit (SDK) installed on your system. To verify this, execute any of the two commands depending on the platform you are working on. If the Java installation has been done properly, then it will display the current version and specification of your Java installation. A sample output is given in the following table. Platform Command Sample Output Windows Open

Apache POI – Overview

Apache POI – Overview ”; Previous Next Many a time, a software application is required to generate reports in Microsoft Excel file format. Sometimes, an application is even expected to receive Excel files as input data. For example, an application developed for the Finance department of a company will be required to generate all their outputs in Excel. Any Java programmer who wants to produce MS Office files as output must use a predefined and read-only API to do so. What is Apache POI? Apache POI is a popular API that allows programmers to create, modify, and display MS Office files using Java programs. It is an open source library developed and distributed by Apache Software Foundation to design or modify Microsoft Office files using Java program. It contains classes and methods to decode the user input data or a file into MS Office documents. Components of Apache POI Apache POI contains classes and methods to work on all OLE2 Compound documents of MS Office. The list of components of this API is given below. POIFS (Poor Obfuscation Implementation File System) − This component is the basic factor of all other POI elements. It is used to read different files explicitly. HSSF (Horrible Spreadsheet Format) − It is used to read and write xls format of MS-Excel files. XSSF (XML Spreadsheet Format) − It is used for xlsx file format of MS-Excel. HPSF (Horrible Property Set Format) − It is used to extract property sets of the MS-Office files. HWPF (Horrible Word Processor Format) − It is used to read and write doc extension files of MS-Word. XWPF (XML Word Processor Format) − It is used to read and write docx extension files of MS-Word. HSLF (Horrible Slide Layout Format) − It is used for read, create, and edit PowerPoint presentations. HDGF (Horrible DiaGram Format) − It contains classes and methods for MS-Visio binary files. HPBF (Horrible PuBlisher Format) − It is used to read and write MS-Publisher files. This tutorial guides you through the process of working on Excel files using Java. Therefore the discussion is confined to HSSF and XSSF components. Note − Older versions of POI support binary file formats such as doc, xls, ppt, etc. Version 3.5 onwards, POI supports OOXML file formats of MS-Office such as docx, xlsx, pptx, etc. Like Apache POI, there are other libraries provided by various vendors for Excel file generation. These include Aspose cells for Java by Aspose, JXL by Commons Libraries, and JExcel by Team Dev. Print Page Previous Next Advertisements ”;

Apache POI – Core Classes

Apache POI – Core Classes ”; Previous Next This chapter explains a few classes and methods under the Apache POI API that are critical to work on Excel files using Java programs. Workbook This is the super-interface of all classes that create or maintain Excel workbooks. It belongs to the org.apache.poi.ss.usermodel package. The two classes that implement this interface are as follows − HSSFWorkbook − This class has methods to read and write Microsoft Excel files in .xls format. It is compatible with MS-Office versions 97-2003. XSSFWorkbook − This class has methods to read and write Microsoft Excel and OpenOffice xml files in .xls or .xlsx format. It is compatible with MS-Office versions 2007 or later. HSSFWorkbook It is a high-level class under the org.apache.poi.hssf.usermodel package. It implements the Workbook interface and is used for Excel files in .xls format. Listed below are some of the methods and constructors under this class. Class Constructors Sr.No. Constructor & Description 1 HSSFWorkbook() Creates a new HSSFWorkbook object from scratch. 2 HSSFWorkbook(DirectoryNode directory, boolean preserveNodes) Creates a new HSSFWworkbook objectinside a specific directory. 3 HSSFWorkbook(DirectoryNode directory, POIFSFileSystem fs, boolean preserveNodes) Given a POIFSFileSystem object and a specific directory within it, it creates an SSFWorkbook object to read a specified workbook. 4 HSSFWorkbook(java.io.InputStream s) Creates a new HSSFWorkbook object using an input stream. 5 HSSFWorkbook(java.io.InputStream s, boolean preserveNodes) Constructs a POI file system around your input stream. 6 HSSFWorkbook(POIFSFileSystem fs) Constructs a new HSSFWorkbook object using a POIFSFileSystem object. 7 HSSFWorkbook(POIFSFileSystem fs, boolean preserveNodes) Given a POIFSFileSystem object, it creates a new HSSFWorkbook object to read a specified workbook. The frequently used parameters inside these constructors are − directory − It is the POI filesystem directory to process from. fs − It is the POI filesystem that contains the workbook stream. preservenodes − This is an optional parameter that decides whether to preserve other nodes like macros. It consumes a lot of memory as it stores all the POIFileSystem in memory (if set). Note − The HSSFWorkbook class contains a number of methods; however they are compatible with xls format only. In this tutorial, the focus is on the latest version of Excel file formats. Hence, the class methods of HSSFWorkbook are not listed here. If you require these class methods, then refer POI-HSSFWorkbook class API at https://poi.apache.org/apidocs/org/apache/poi/hssf/usermodel/HSSFWorkbook.html. XSSFWorkbook It is a class that is used to represent both high and low level Excel file formats. It belongs to the org.apache.xssf.usemodel package and implements the Workbook interface. Listed below are the methods and constructors under this class. Class Constructors Sr.No. Constructor & Description 1 XSSFWorkbook() Creates a new XSSFworkbook object from scratch. 2 XSSFWorkbook(java.io.File file) Constructs an XSSFWorkbook object from a given file. 3 XSSFWorkbook(java.io.InputStream is) Constructs an XSSFWorkbook object, by buffering the whole input stream into memory and then opening an OPCPackage object for it. 4 XSSFWorkbook(java.lang.String path) Constructs an XSSFWorkbook object given the full path of a file. Class Methods Sr.No. Method & Description 1 createSheet() Creates an XSSFSheet for this workbook, adds it to the sheets, and returns the high level representation. 2 createSheet(java.lang.String sheetname) Creates a new sheet for this Workbook and returns the high level representation. 3 createFont() Creates a new font and adds it to the workbook”s font table. 4 createCellStyle() Creates a new XSSFCellStyle and adds it to the workbook”s style table. 5 createFont() Creates a new font and adds it to the workbook”s font table. 6 setPrintArea(int sheetIndex, int startColumn, int endColumn, int startRow,int endRow) Sets the print area of a given sheet as per the specified parameters. For the remaining methods of this class, refer the complete API document at − https://poi.apache.org/apidocs/org/apache/poi/xssf/usermodel/XSSFWorkbook.html. for the complete list of methods. Sheet Sheet is an interface under the org.apache.poi.ss.usermodel package and it is a super-interface of all classes that create high or low level spreadsheets with specific names. The most common type of spreadsheet is worksheet, which is represented as a grid of cells. HSSFSheet This is a class under the org.apache.poi.hssf.usermodel package. It can create excel spreadsheets and it allows to format the sheet style and sheet data. Class Constructors Sr.No. Constructor & Description 1 HSSFSheet(HSSFWorkbook workbook) Creates new HSSFSheet called by HSSFWorkbook to create a sheet from scratch. 2 HSSFSheet(HSSFWorkbook workbook, InternalSheet sheet) Creates an HSSFSheet representing the given sheet object. XSSFSheet This is a class which represents high level representation of excel spreadsheet. It is under org.apache.poi.hssf.usermodel package. Class Constructors Sr.No. Constructor & Description 1 XSSFSheet() Creates new XSSFSheet − called by XSSFWorkbook to create a sheet from scratch. 2 XSSFSheet(PackagePart part, PackageRelationship rel) Creates an XSSFSheet representing the given package part and relationship. Class Methods Sr.No. Method & Description 1 addMergedRegion(CellRangeAddress region) Adds a merged region of cells (hence those cells form one). 2 autoSizeColumn(int column) Adjusts the column width to fit the contents. 3 iterator() This method is an alias for