Reading Excel files in Java

prachi sharma
Posted on 12th Apr 2024 6:19 PM | 10 min Read | 60 min Implementation

What is Apache POI and how we use it to read excel files.


Apache POI, an API developed by the Apache Foundation, offers an extensive range of Java libraries.The name "POI" originally stood for "Poor Obfuscation Implementation." This was a humorous take on the perceived complexity and lack of clarity in the file formats used by Microsoft Office applications, which the Apache POI library aims to interact with. It enables the reading, writing, and manipulation of various Microsoft file formats, including Excel spreadsheets, PowerPoint presentations, and Word documents.

With Apache POI, we can extract data from Excel files, manipulate the contents, and perform various operations like reading cell values, formatting, and handling different types of data efficiently in Java applications.Here we are trying to read the contents of an xlsx files.



In the below flow diagram,we are illustrating how Apache POI comprises of various Java interfaces and classes designed for the purpose of reading XLS and XLSX format files.both XLS and XLSX are file formats for Microsoft spreadsheet.XLS is the default file format from Excel 97 to Excel 2003 while XLSX is the default file format for Excel 2007 and later.

HSSF (Horrible SpreadSheet Format) is Apache POI's Java-based version of the Excel '97 to 2007 file format, while XSSF (XML SpreadSheet Format) is its Java-based counterpart for Excel 2007's OOXML (.xlsx) format.Both HSSF and XSSF offer tools to handle spreadsheets: creating, modifying, reading, and writing XLS spreadsheets.


Different Interfaces and classes under Pache POI:





In our code we are using POI-XSSF Java API To Access Microsoft Excel Format Files.


Below is the representation of an Excel sheet and its components:










Process Outline:


Below is the outline of how an excel sheet is read using Apache POI API.



If you are making Maven Project,below is the list of dependencies you will need:


<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.Teche</groupId>
<artifactId>Filereading</artifactId>
<version>0.0.1-SNAPSHOT</version>
<dependencies>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>5.2.5</version>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.16.1</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>5.2.5</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml-schemas</artifactId>
<version>4.1.2</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-scratchpad</artifactId>
<version>5.2.5</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>openxml4j</artifactId>
<version>1.0-beta</version>
</dependency>
</dependencies>
</project>

In addition to above dependencies I have also added one .jar file You need to add the log4j-core jar to the classpath


Below is the code snippet we write after following the above steps:


import java.io.FileInputStream;
import java.io.IOException;

import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelFileRead {
public static void main(String[] args) throws IOException {
String excelFilePath="C:\\MyProject\\ReadFile.xlsx";
FileInputStream fis=new FileInputStream(excelFilePath);
XSSFWorkbook workbook=new XSSFWorkbook(fis);
XSSFSheet sheet=workbook.getSheetAt(0);

int rows=sheet.getLastRowNum();
int cols=sheet.getRow(1).getLastCellNum();

for(int r=0;r<rows;r++) //This is the outer loop for reading rows
{
XSSFRow row=sheet.getRow(r); //returns the row objects

for(int c=0;c<cols;c++) //This will run to fetch cell objects
{
XSSFCell cell=row.getCell(c); // returns the cell object
//After entering cells we need to find out what is the type of cell data
switch(cell.getCellType())
{
case STRING:
System.out.print(cell.getStringCellValue());
break;
case NUMERIC:
System.out.print(cell.getNumericCellValue());
break;
case BOOLEAN:
System.out.print(cell.getBooleanCellValue());
break;
case BLANK:
System.out.print("BLANK\t");
break;
default:
System.out.print("DEFAULT\t");

}
System.out.print(" | ");
}
System.out.println();

}
}
}


The classes and methods we are using in the above code are as follows:


Another shortcut approach to read excel file is as follows:

In the below coding snippet we are using DataFormatter class to read and display contentsof the sheet.

import java.io.FileInputStream;
import java.io.IOException;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelReadingDemo {
public static void main(String[] args) throws IOException {
String path="C:\\MyProject\\ReadFile.xlsx";
FileInputStream fil=new FileInputStream(path);
XSSFWorkbook workbook=new XSSFWorkbook(fil);
XSSFSheet sheet=workbook.getSheetAt(0);

int rows=sheet.getLastRowNum();
int cols=sheet.getRow(0).getLastCellNum();

for(int r=0;r<rows;r++)
{
XSSFRow row=sheet.getRow(r);
for(int c=0;c<cols;c++)
{
XSSFCell cell=row.getCell(c);
//
DataFormatter format = new DataFormatter();
String fValue = format.formatCellValue(cell);
System.out.print(fValue);
//
// }
System.out.print(" | ");
}
System.out.println();
}

}
}


All Comments ()
Do You want to add Comment in this Blog? Please Login ?