Skip to content

Fix for issue #394 - cache stampede#397

Open
pmonks wants to merge 4 commits intospdx:masterfrom
pmonks:master
Open

Fix for issue #394 - cache stampede#397
pmonks wants to merge 4 commits intospdx:masterfrom
pmonks:master

Conversation

@pmonks
Copy link
Collaborator

@pmonks pmonks commented Mar 3, 2026

This PR fixes issue #394 (a cache stampede).

While synchronizing on a method argument isn't recommended, there didn't seem to be an obvious / low code way to come up with a per-URL critical section in this code. Instead I updated the JavaDoc to clearly state that the argument will be used for synchronization and that callers should take care not to separately synchronize on the same URL objects.

Also, the synchronization / virtual thread pinning issue that existed in JVM v24 was addressed in JVM v25 (the current LTS release - v24 was a FRS), so I felt comfortable using a synchronized block here rather than java.util.concurrent lock objects. Happy to take a deeper look at that if there are concerns.

@pmonks pmonks requested review from bact and goneall March 3, 2026 19:42
@pmonks
Copy link
Collaborator Author

pmonks commented Mar 3, 2026

Stand by - I believe this only works by coincidence, since it doesn't consider different URL objects that point to the same actual URL.

I think a proper solution requires a ConcurrentHashmap of Semaphores (one per URL value).

@bact bact added the enhancement New feature or request label Mar 3, 2026
@bact bact linked an issue Mar 3, 2026 that may be closed by this pull request
@pmonks
Copy link
Collaborator Author

pmonks commented Mar 3, 2026

Ok @bact @goneall I think this approach using ReentrantLocks is more robust. I'm slightly nervous about the potential unbounded growth of the map containing the per-URL locks, but I don't see a thread-safe way to evict map entries.

@pmonks pmonks added the performance Speed, responsiveness, and memory efficiency label Mar 4, 2026
Copy link
Member

@goneall goneall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One suggestion

pmonks and others added 2 commits March 3, 2026 17:32
Only construct new `ReentrantLock` if needed.

Co-authored-by: Gary O'Neall <gary@sourceauditor.com>
Signed-off-by: Peter Monks <pmonks@gmail.com>
@bact
Copy link
Collaborator

bact commented Mar 4, 2026

Ok @bact @goneall I think this approach using ReentrantLocks is more robust. I'm slightly nervous about the potential unbounded growth of the map containing the per-URL locks, but I don't see a thread-safe way to evict map entries.

Not related to this PR but this reminds me of how "field knowledge" on estimated number of something can be useful when designing something, especially heuristic one.

During yesterday's Tech call discussion about showing subclasses of a class in spec, some people might worried about will that make the Element page exploded. It turns out that Element has "only" 16 direct subclasses and they can fit well in current page layout.

Or the knowledge about which licenses are more popular / likely to be found (maybe represented in order of magnitude) that could help matching/filtering optimization.

From the example of number of URLs above, if we can known that normally the number will be 10,000 but lower than 100,000, will it help choosing approach?

Not sure where can we document this kind of knowledge. Will be valuable for both implementors and spec designers alike.

@pmonks
Copy link
Collaborator Author

pmonks commented Mar 4, 2026

From the example of number of URLs above, if we can known that normally the number will be 10,000 but lower than 100,000, will it help choosing approach?

It'll be approximately equivalent to however many listed license and exceptions there are (plus any other ancillary files Spdx-Java-Library happens to download through the cache). As of SPDX license list v3.28.0 the library is downloading 813 URLs in total, so I'm not super concerned about the size of that map.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request performance Speed, responsiveness, and memory efficiency

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Web store: downloading all assets is suspiciously slow

3 participants