Example of processing a large amount of data from a datasource

Developers usually work with the high-level abstractions that Spring exposes. It is easy to forget about the underlying technologies, which can cause problems in unusual cases. One such case is reading and processing large data sets. This project demonstrates a solution based on the ResultSet abstraction.

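Wrong ways to process a large data set

Using pagination with LIMIT and OFFSET: it performs poorly, because for every page the database still has to produce and discard all the rows that the OFFSET skips.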
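To get a feel for the cost, here is a small arithmetic sketch. It assumes the same numbers used later in this README (1_000_000 rows, pages of 100_000) and simply sums the rows skipped by each page's OFFSET. The class and method names are hypothetical and are not part of the project.

```java
public class OffsetCost {

    /**
     * Sums the OFFSET values of every page needed to read totalRows rows.
     * Each OFFSET value is the number of rows the database produces and
     * discards before it returns that page.
     */
    public static long skippedRows(long totalRows, long pageSize) {
        if (totalRows < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and pageSize must be > 0");
        }
        long skipped = 0;
        for (long offset = 0; offset < totalRows; offset += pageSize) {
            skipped += offset;
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Offsets are 0, 100_000, ..., 900_000.
        System.out.println(skippedRows(1_000_000, 100_000)); // prints 4500000
    }
}
```

With these numbers the database discards 4.5 million rows just to return 1 million, and the overhead grows quadratically with the number of pages.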

Run project

Running the project takes two steps.

First, start a local PostgreSQL instance:

cd docker
docker compose up

Then build and run the app:

./mvnw clean install
./mvnw spring-boot:run

The app is now available at localhost:8082.

Usage

Send a GET request to /data/all. It triggers a full read (select *) that loads every row into memory and causes an OutOfMemoryError.
Then send a GET request to /data/all/batch. It runs the same query but processes the data in chunks of 100_000 rows, so no OutOfMemoryError is expected.

⚠ From author ⚠

To make the ResultSet fetch data in portions, we need to:

create the Statement with the ResultSet.TYPE_FORWARD_ONLY type (the default)
set autocommit to false on the Connection
set a fetch size greater than zero on the Statement

Otherwise, the PostgreSQL JDBC driver fetches the entire result set at once.
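The conditions above can be sketched with plain JDBC as follows. This is a minimal illustration, not the code from this repository: the class CursorFetchSketch, the method readInChunks and the RowHandler callback are hypothetical names, and the caller is assumed to supply an open Connection. Only standard java.sql calls are used.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CursorFetchSketch {

    /** Hypothetical callback invoked once for every row. */
    @FunctionalInterface
    public interface RowHandler {
        void accept(ResultSet row) throws SQLException;
    }

    /**
     * Runs the query so that the driver can fetch rows in portions of
     * fetchSize instead of loading everything at once.
     * Returns the number of rows handed to the handler.
     */
    public static long readInChunks(Connection connection, String sql,
                                    int fetchSize, RowHandler handler) throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        // Cursor-based fetching needs an open transaction, so autocommit goes off.
        connection.setAutoCommit(false);
        long rows = 0;
        try (Statement statement = connection.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // A positive fetch size asks the driver to pull rows in portions.
            statement.setFetchSize(fetchSize);
            try (ResultSet resultSet = statement.executeQuery(sql)) {
                while (resultSet.next()) {
                    handler.accept(resultSet);
                    rows++;
                }
            }
            connection.commit();
        } finally {
            // Restore the caller's setting whether or not the read succeeded.
            connection.setAutoCommit(previousAutoCommit);
        }
        return rows;
    }
}
```

A call could look like readInChunks(connection, "select * from data", 100_000, row -> process(row)), where the table name and process are placeholders. Only the rows of the current portion are held in memory while the handler runs.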

To calculate how many trips the JDBC driver makes to the database, divide the total number of rows by the fetch size and round up. In my example the calculation is 1_000_000 / 100_000 = 10 trips.
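The same estimate as a small helper, using integer ceiling division. The class and method names are hypothetical, and the result is only the estimate described above; the exact number of network exchanges a driver performs may differ.

```java
public class FetchTrips {

    /** Estimated number of fetches: totalRows / fetchSize, rounded up. */
    public static long roundTrips(long totalRows, int fetchSize) {
        if (totalRows < 0 || fetchSize <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and fetchSize must be > 0");
        }
        // Adding (fetchSize - 1) before dividing rounds the quotient up.
        return (totalRows + fetchSize - 1) / fetchSize;
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(1_000_000, 100_000)); // prints 10
        System.out.println(roundTrips(1_000_001, 100_000)); // prints 11
    }
}
```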

Try different settings on the Connection and Statement; the logs will show how they change the way the data set is processed.

Ref

PostgreSQL JDBC driver documentation, issuing a query - link

kuza2010/read_large_data_example

Folders and files

Latest commit

History

Repository files navigation

Example of processing large amount of data from the datasource

Wrong ways to process large data set

Run project

Using

⚠ From author ⚠

Ref

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages