I've been thinking about how to redesign one of my Java projects to be data-oriented so it can take better advantage of the CPU's L1/L2/L3 caches. It looks like Project Valhalla might actually arrive relatively soon (although "relatively" still means years from now), and it will let us create flat, contiguous arrays of fixed-size value classes that exploit the CPU caches. However, that won't help with classes that are necessarily variable-sized, such as String, and unfortunately the domain I work in requires a lot of String allocations.
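For the fixed-size case I can already approximate a flat layout by hand today with a struct-of-arrays design. A rough sketch of what I mean (the `PointBuffer` name and fields are made up for illustration, standing in for what Valhalla's flattened value arrays would give us automatically):

```java
// Hand-rolled struct-of-arrays layout for a fixed-size "point" type.
// Each field lives in its own contiguous primitive array, so scanning one
// field streams sequentially through memory and stays cache-friendly.
// (Illustrative only; Valhalla would make this kind of workaround unnecessary.)
final class PointBuffer {
    private final double[] xs;
    private final double[] ys;

    PointBuffer(int capacity) {
        this.xs = new double[capacity];
        this.ys = new double[capacity];
    }

    void set(int i, double x, double y) {
        xs[i] = x;
        ys[i] = y;
    }

    // A linear scan touches consecutive addresses, unlike iterating an
    // Object[] of Point references scattered across the heap.
    double sumX() {
        double sum = 0;
        for (double x : xs) sum += x;
        return sum;
    }
}
```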
Are there ways to influence the Java runtime so that, when you allocate memory, you control the locality of the allocations? For example, say you have a List<String> and a function that frequently iterates over the Strings in the List. Can you influence the runtime so that the Strings are allocated at addresses close to each other, so that after reading one String the next one is more likely to already be in the CPU cache?
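To make the access pattern concrete, this is the kind of hot loop I have in mind (the names are illustrative):

```java
import java.util.List;

final class Example {
    // Each String in the list is a separate heap object; iterating chases
    // one reference per element, and the backing byte[]/char[] payloads can
    // live anywhere on the heap relative to each other.
    static int totalLength(List<String> strings) {
        int total = 0;
        for (String s : strings) {
            total += s.length(); // likely cache miss if payloads aren't co-located
        }
        return total;
    }
}
```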
I could, in theory, write my own CharSequence implementation that allocates one giant byte array for storing UTF-8 text and hands out substrings as views into that array, but that's exactly the kind of low-level work I'd like to avoid.
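For reference, here's a rough sketch of that idea. It assumes single-byte (ASCII/Latin-1) content for simplicity; a real version would have to decode multi-byte UTF-8 sequences in charAt, which is part of why I'd rather not write it. The `ByteArena`, `intern`, and `Slice` names are mine, just for illustration:

```java
import java.nio.charset.StandardCharsets;

// A minimal arena-backed CharSequence sketch. All text lives in one
// contiguous byte[], and each "string" is just an (offset, length) view,
// so iterating the strings in insertion order walks adjacent memory.
// No capacity checks or error handling; assumes single-byte characters.
final class ByteArena {
    private final byte[] data;
    private int used;

    ByteArena(int capacity) {
        this.data = new byte[capacity];
    }

    // Copies the string's bytes into the arena and returns a view over them.
    CharSequence intern(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(bytes, 0, data, used, bytes.length);
        Slice slice = new Slice(used, bytes.length);
        used += bytes.length;
        return slice;
    }

    private final class Slice implements CharSequence {
        private final int offset;
        private final int length;

        Slice(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }

        @Override public int length() { return length; }

        @Override public char charAt(int index) {
            // Only correct for single-byte encodings; real UTF-8 would
            // need variable-width decoding here.
            return (char) (data[offset + index] & 0xFF);
        }

        @Override public CharSequence subSequence(int start, int end) {
            return new Slice(offset + start, end - start);
        }

        @Override public String toString() {
            return new String(data, offset, length, StandardCharsets.US_ASCII);
        }
    }
}
```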