Apache Derby: a fossil or a real database option?

Apache Derby 

This blog is currently in a software, infrastructure, and solutions architecture era, in which questions such as "is this DB good for me?" or "is this OS easy enough for handling this or that?" arise every week. Today is no exception: is Derby something we should use? The question comes straight from the blog's philosophy: as software engineers we need to be aware of every technology that exists, since we never know what a legacy system might be written on. So of course I feel the need to check out Apache Derby.

Before going deeper I need to tell you: this technology is no longer the standard, but it appears in many legacy or long-lived Java applications and servers, and God knows where else (probably washing machines). So if you arrived here, I assume you are looking for alternatives or just want to vent your frustration. And yes, please use something else: this DB is no longer receiving updates, so any vulnerability found from now on will stay unpatched.

So let's start with the bad news: as of October 2025, Apache Derby has officially been retired and moved to "read-only" status by the Apache Software Foundation. While you can still use it, it is no longer receiving security updates or new features.

Now let's get into the details of what it is and how it got here.

Apache Derby was created to be a pure Java, lightweight, and embeddable relational database. Its goal was to solve a specific problem for developers: the need for a full-featured SQL database that could live inside an application without requiring a separate, complex installation.





The Original Purpose (The "Cloudscape" Era)

In 1996, a startup called Cloudscape Inc. was founded in Oakland, California. They saw that while Java was becoming the go-to language for "write once, run anywhere" portability, most databases still required heavy, platform-specific installations.

  • The Goal: Build a database engine that was 100% Java so it could run on any device (even tiny handhelds or PDAs) as a simple .jar file.

  • The Result: They launched JBMS (later renamed Cloudscape), which became the gold standard for "embedded" databases—meaning the database engine actually runs in the same memory space as the application itself.

The IBM Contribution

Informix Software acquired Cloudscape in 1999, and the technology passed to IBM in 2001 when IBM bought Informix's database business. IBM renamed it IBM Cloudscape. However, to encourage more Java development and counter other open-source databases, IBM made a strategic move:

  • The Donation: In 2004, IBM donated the entire 500,000+ lines of code to the Apache Software Foundation.

  • The Rebrand: The project was renamed Derby, and it became an open-source project.

Why it WAS popular (and why Sun/Oracle used it)

Because Derby was small (about 3.5 MB) and written in Java, it was extremely convenient for testing and lightweight apps.

  • Java DB: Sun Microsystems (and later Oracle) liked it so much that they bundled it with the Java Development Kit (JDK) under the name Java DB, from JDK 6 through JDK 8. This meant millions of developers had a database ready to go the moment they installed Java.




| Feature | Why it matters |
| --- | --- |
| Pure Java | No native code; it runs wherever Java runs (Windows, Mac, Linux, Mainframes). |
| Embeddable | No separate server process; the DB starts and stops with your app. |
| SQL Standards | Supports modern SQL and JDBC, so you can code for Derby and later move to "big" DBs like DB2 or Oracle with minimal changes. |
| Zero Admin | No DBA (Database Administrator) is needed to manage it. |
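
To make "embeddable" concrete, here is a minimal sketch of booting Derby inside the application through plain JDBC. It assumes derby.jar is on the classpath; the class name and the database name appdb are made up for illustration. Without the jar, the connection attempt fails with a "No suitable driver" style error, which the sketch simply prints.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class, not part of Derby itself.
class EmbeddedDerbySketch {

    // Embedded-mode JDBC URL; ";create=true" creates the database
    // directory on first use and is harmless afterwards.
    static String embeddedUrl(String dbName) {
        return "jdbc:derby:" + dbName + ";create=true";
    }

    public static void main(String[] args) {
        String url = embeddedUrl("appdb"); // "appdb" is an assumed name
        // The engine boots inside this JVM on the first connection.
        try (Connection conn = DriverManager.getConnection(url);
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("VALUES CURRENT_TIMESTAMP")) {
            while (rs.next()) {
                System.out.println("Derby answered: " + rs.getString(1));
            }
        } catch (SQLException e) {
            // Typically means derby.jar is missing from the classpath.
            System.out.println("Could not open " + url + ": " + e.getMessage());
        }
    }
}
```

There is no server to start and no port to open: the only "installation" is a jar on the classpath and a folder on disk.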

Derby Professional Applications and use cases 


While Apache Derby (also known as Java DB) is no longer the "hot new thing" in 2026, it carved out a massive niche in professional software due to its zero-administration nature.

Here are the primary professional applications and cases where Derby has been a staple:

1. Embedded "Shadow" Databases

Many high-end enterprise tools use Derby as an internal "worker" database that the user never even sees.

  • Application: WebSphere Application Server (IBM). It uses Derby to store internal configuration data and messaging engine states.

  • Why? IBM needed a database that would install automatically alongside their server without requiring the user to set up a separate Oracle or MySQL instance.

2. Desktop & Offline Client Software

For professional desktop applications that need to work without an internet connection, Derby acts as the local "filing cabinet."

  • Case: Financial and Medical Desktop Tools. Professionals in these fields often need to store complex, relational data locally on a laptop while traveling or in areas with poor connectivity.

  • Usage: It provides the power of a full SQL server but saves everything into a simple folder on the user's hard drive.

3. Integrated Development Environments (IDEs)

If you are a programmer, you have likely used Derby without realizing it.

  • Case: NetBeans and Eclipse. These tools often use Derby as a default database for developers to practice with or to store metadata about the code projects they are working on.

  • Value: It allows a junior developer to learn SQL immediately upon installing their coding tools, with "zero configuration."

4. Edge Computing and IoT

Before the rise of modern specialized IoT databases, Derby was a go-to for "Edge" devices (like smart factory controllers or specialized kiosks).

  • Application: Point-of-Sale (POS) Systems. Many older retail systems use Derby to manage inventory and transactions locally at the cash register, syncing with a main server only once a day.

5. Unit Testing and Prototyping

In the professional "DevOps" world, Derby is famous for being a "disposable" database.

  • Case: Automated Testing Pipelines. When developers write code, they need to test if it interacts with a database correctly. Setting up a real production database for a 10-second test is overkill.

  • Usage: They use Derby in "In-Memory" mode. It creates a database in the computer's RAM, runs the tests, and leaves nothing on disk. The data is gone as soon as the database is dropped or the JVM exits.
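
A rough sketch of that disposable-database pattern, assuming derby.jar is on the classpath. The class name, the testdb name, and the orders table are invented for the example. One quirk worth knowing: Derby reports a successful drop of an in-memory database by throwing an SQLException with SQLState 08006, so the "error" at the end is the happy path.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class for illustration only.
class InMemoryDerbySketch {

    // "memory:" keeps the whole database in RAM instead of a folder.
    static String createUrl(String name) {
        return "jdbc:derby:memory:" + name + ";create=true";
    }

    static String dropUrl(String name) {
        return "jdbc:derby:memory:" + name + ";drop=true";
    }

    // Derby signals a successful drop with SQLState 08006.
    static boolean isSuccessfulDrop(SQLException e) {
        return "08006".equals(e.getSQLState());
    }

    public static void main(String[] args) {
        String db = "testdb"; // assumed name
        try (Connection conn = DriverManager.getConnection(createUrl(db));
             Statement st = conn.createStatement()) {
            st.executeUpdate("CREATE TABLE orders (id INT PRIMARY KEY, total DECIMAL(10,2))");
            st.executeUpdate("INSERT INTO orders VALUES (1, 19.99)");
            // ... the code under test would run its queries here ...
        } catch (SQLException e) {
            System.out.println("Test database unavailable: " + e.getMessage());
            return;
        }
        try {
            // Connecting with drop=true always throws; the SQLState tells us why.
            DriverManager.getConnection(dropUrl(db));
        } catch (SQLException e) {
            System.out.println(isSuccessfulDrop(e) ? "Dropped." : "Drop failed: " + e.getMessage());
        }
    }
}
```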






The "State of the Industry" in 2026

It is important to note that since October 2025, Apache Derby has been moved to "Retirement" status. While it is still used in "legacy" professional systems (software that has been running for 10+ years), most modern professional projects are migrating to these alternatives:


| Alternative | Why professionals are switching |
| --- | --- |
| H2 Database | Faster, more features, and still pure Java. |
| SQLite | The global standard for embedded files (though not pure Java). |
| DuckDB | Used for modern "Big Data" analytics on local machines. |


Places where Derby SHOULD NOT be



Because Apache Derby is so easy to set up (literally one file or one line of code), it is frequently used in scenarios where it eventually fails. It is a "Goldilocks" database: perfect for small things, but disastrous for big ones.

Here are the most common ways Derby is misused in professional projects:

1. The "Production Trap" (High Traffic)

The most common misuse is keeping Derby as the database when a project moves from "Development" to "Production."

  • The Scenario: A startup builds a web app. During development, Derby is great because it requires zero setup. They launch, and suddenly 1,000 users are hitting the site at once.

  • Why it's a mistake: Derby struggles with high concurrency. It locks rows by default, but it escalates to table-level locks once a transaction touches many rows, and it has no multi-version concurrency control, so readers and writers end up waiting on each other. On a high-traffic website, this leads to the app freezing or timing out.

  • Better choice: PostgreSQL or MySQL.

2. Multi-User Access (Embedded Mode)

Derby is often misused in "Embedded Mode" when multiple separate applications need to talk to the same data.

  • The Scenario: You have a Java application running on a server, and you try to open a separate "Database Viewer" tool at the same time to look at the data.

  • Why it's a mistake: In embedded mode, Derby locks the database files to a single Java Virtual Machine (JVM). If a second app tries to connect, its connection fails with error XSDB6 ("Another instance of Derby may have already booted the database").

  • The Fix: You’d need to switch to "Network Server Mode," but at that point, you’ve lost the simplicity that made you choose Derby in the first place.
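
The sketch below shows both halves of this problem, with some assumptions: the class and method names are made up, the Network Server is assumed to have been started separately (for example with java -jar derbyrun.jar server start) on Derby's default port 1527, and the client application needs derbyclient.jar. The isDoubleBoot helper walks the exception chain looking for SQLState XSDB6, which is how the "second JVM" failure usually shows up.

```java
import java.sql.SQLException;

// Hypothetical helper class for illustration only.
class DerbyModesSketch {

    static final int DEFAULT_PORT = 1527; // Derby Network Server default

    // Client-mode URL: several JVMs can connect to the same server process.
    static String clientUrl(String host, int port, String dbName) {
        return "jdbc:derby://" + host + ":" + port + "/" + dbName;
    }

    // True if any exception in the chain carries SQLState XSDB6,
    // i.e. another JVM already booted this embedded database.
    static boolean isDoubleBoot(SQLException e) {
        for (SQLException cur = e; cur != null; cur = cur.getNextException()) {
            if ("XSDB6".equals(cur.getSQLState())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(clientUrl("localhost", DEFAULT_PORT, "appdb"));

        // Simulated failure chain, built by hand for the demo.
        SQLException outer = new SQLException("Failed to start database", "XJ040");
        outer.setNextException(new SQLException("Another instance of Derby may have already booted the database", "XSDB6"));
        System.out.println("Double boot? " + isDoubleBoot(outer));
    }
}
```

Catching this case explicitly lets a tool tell the user "close the other application" instead of showing a raw stack trace.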

3. Large Data Sets (The "Size Ceiling")

Derby is not designed for "Big Data" or even "Medium Data."

  • The Scenario: A company uses Derby to store logs or historical transaction data over several years.

  • Why it's a mistake: As the database grows into the tens of gigabytes, performance degrades sharply. Derby lacks advanced optimization features (like sophisticated partitioning or parallel query execution) found in enterprise databases. Additionally, its indexing on large objects (BLOBs/CLOBs) is limited.

  • Better choice: Microsoft SQL Server, Oracle, or a specialized data warehouse.

4. High-Availability (No "Failover")

Using Derby for "Mission Critical" systems that can never go down.

  • The Scenario: A hospital or emergency dispatch system uses Derby to store active records.

  • Why it's a mistake: Derby does not have built-in, easy-to-configure clustering or automatic failover. If the server hardware fails, the database is gone until you manually restore a backup. It is a "single point of failure" by design.

  • Better choice: Any database with robust replication (MongoDB, Cassandra, or clustered SQL).

5. Non-Java Environments

Sometimes teams try to force Derby into a "Polyglot" environment.

  • The Scenario: A team has a Java backend using Derby, but they want to write a new microservice in Python or Node.js to access that same data.

  • Why it's a mistake: Because Derby is "Pure Java," accessing it from other languages is a nightmare. You usually have to set up a bulky "bridge" or run it in Network Mode, which defeats the purpose of its lightweight, Java-centric design.

  • Better choice: SQLite (which has libraries for every language on earth).


When to "Eject" from Derby

If you see any of these signs in a project, Derby is being misused:

  • Concurrency: More than 5–10 people writing data at the exact same time.

  • Size: The database folder is approaching 1 GB or more.

  • Language: You need to access the data with something other than Java.

  • Uptime: The business will lose significant money if the database is offline for 30 minutes.
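
The checklist above can be turned into a tiny health check for a project review. This is only a sketch: the class name and the exact cut-offs are assumptions (the lower end of the "5–10 writers" range, and 1 GiB for size), so tune them to your own system.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checklist helper; thresholds are assumptions taken
// from the rules of thumb in this post, not from Derby documentation.
class DerbyEjectCheck {

    static final int MAX_CONCURRENT_WRITERS = 5;  // lower end of "5-10"
    static final long MAX_SIZE_BYTES = 1L << 30;  // 1 GiB

    // Returns the names of every warning sign that applies.
    static List<String> signals(int concurrentWriters,
                                long sizeBytes,
                                boolean nonJavaClients,
                                boolean downtimeIsCostly) {
        List<String> hits = new ArrayList<>();
        if (concurrentWriters > MAX_CONCURRENT_WRITERS) {
            hits.add("Concurrency");
        }
        if (sizeBytes >= MAX_SIZE_BYTES) {
            hits.add("Size");
        }
        if (nonJavaClients) {
            hits.add("Language");
        }
        if (downtimeIsCostly) {
            hits.add("Uptime");
        }
        return hits;
    }

    public static void main(String[] args) {
        // A small internal tool: 2 writers, 200 MB, Java only, downtime is fine.
        System.out.println(signals(2, 200L * 1024 * 1024, false, false));
        // A growing web app: 12 writers, 2 GiB, a Python service, costly downtime.
        System.out.println(signals(12, 2L << 30, true, true));
    }
}
```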



What should we use NOW 




Since Apache Derby is now officially retired (as of October 2025), moving to a more modern and supported system is essential. The "best" replacement depends entirely on whether you need it to stay embedded (hidden inside your app) or if you are ready for a production-grade server.

1. Best Embedded Database Options

If you want to keep the "one-file" or "zero-install" experience that Derby offered, these are the top contenders:


| Database | Best for | Why it's better than Derby |
| --- | --- | --- |
| H2 Database | Java applications | The spiritual successor to Derby. It is pure Java, significantly faster, and supports much more modern SQL syntax. It has an excellent "In-Memory" mode for testing. |
| SQLite | Mobile & desktop | The most widely deployed database engine in the world. It is incredibly stable and cross-platform. While not pure Java (it uses native libraries), it is the industry standard for local data storage. |
| DuckDB | Local analytics | If your embedded app needs to perform heavy calculations (like processing 1 million rows for a chart), DuckDB's "columnar" storage can make it far faster than Derby on that kind of query. |
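
For test suites, moving from Derby to H2 is often mostly a matter of swapping the JDBC URL. The sketch below assumes the H2 jar is on the classpath; the class name and the testdb name are invented. DB_CLOSE_DELAY=-1 keeps the in-memory database alive until the JVM exits, and MODE=Derby asks H2 to imitate some Derby behavior. That mode does not cover every dialect difference, so expect to adjust some SQL by hand.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class for illustration only.
class H2SwapSketch {

    // In-memory H2 URL, optionally in Derby compatibility mode.
    static String h2MemoryUrl(String name, boolean derbyCompat) {
        String url = "jdbc:h2:mem:" + name + ";DB_CLOSE_DELAY=-1";
        return derbyCompat ? url + ";MODE=Derby" : url;
    }

    public static void main(String[] args) {
        String url = h2MemoryUrl("testdb", true); // "testdb" is an assumed name
        try (Connection conn = DriverManager.getConnection(url);
             Statement st = conn.createStatement()) {
            st.executeUpdate("CREATE TABLE orders (id INT PRIMARY KEY, total DECIMAL(10,2))");
            st.executeUpdate("INSERT INTO orders VALUES (1, 19.99)");
            try (ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM orders")) {
                rs.next();
                System.out.println("Rows in orders: " + rs.getInt(1));
            }
        } catch (SQLException e) {
            // Typically means the H2 jar is missing from the classpath.
            System.out.println("Could not open " + url + ": " + e.getMessage());
        }
    }
}
```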



2. Best Production Transactional Databases

If your project is growing and you need to handle multiple users, high traffic, and "mission-critical" data, you should move to a Client-Server model.

The "Gold Standard": PostgreSQL

PostgreSQL is the most recommended replacement for Derby in a production environment.

  • Reliability: It is famous for "data integrity": its write-ahead log makes it very hard to corrupt your data, even if the power goes out.

  • Concurrency: Unlike Derby, whose lock-based concurrency can leave readers and writers waiting on each other, Postgres uses MVCC (Multi-Version Concurrency Control), so readers do not block writers and it can serve many simultaneous users.

  • JSON Support: It handles "NoSQL" style data beautifully, giving you the best of both worlds.

The "Web Standard": MySQL / MariaDB

If you are building a standard web application (like an e-commerce site or a CMS), MySQL is the go-to.

  • Speed: Highly optimized for "Read-Heavy" workloads (lots of people looking at data).

  • Ecosystem: Every hosting provider and cloud service (AWS, Google, Azure) has a "Managed" version of MySQL that handles backups for you automatically.

The "Enterprise Standard": Microsoft SQL Server

If your professional environment is heavily invested in the Windows/Azure ecosystem:

  • Integration: It integrates perfectly with C#, .NET, and Power BI.

  • In-Memory OLTP: It has an advanced engine that can sustain very high transaction rates by keeping the most active data in the computer's RAM.



Summary Comparison

To choose the right one, ask yourself where the database will live:

  • Inside a .jar file or Desktop app? Use H2.

  • On a Mobile Phone? Use SQLite.

  • On a Web Server with 100+ users? Use PostgreSQL.

  • Analyzing massive CSV/Parquet files locally? Use DuckDB.


