
I'd like to have a random millisecond value from an (inverse) linear distribution of values (if I get the term right).

In essence, I want a random point in time t (a Date in my case) between two time points, early and late, where a t near early has a much greater probability than one near late. late itself may have a probability of 0.0.

My current Java code just uses a uniform distribution, so I plan to modify it to use an (inverse) linear distribution:

public Date getRandomDate(Date early, Date late) {
  long diff = late.getTime() - early.getTime();
  final int randVal = rand.nextInt((int) diff);
  Calendar cal = Calendar.getInstance();
  cal.setTime(early);
  cal.add(Calendar.MILLISECOND, randVal);
  return cal.getTime();
}
towi
  • Does this solve your problem: http://stackoverflow.com/questions/5969447/java-random-integer-with-non-uniform-distribution ? – GhostCat Feb 24 '17 at 13:12
  • I'm not quite sure what you're asking. Is it about how to get a linear distribution of random numbers, or do you have problems applying that to a date? – Thomas Feb 24 '17 at 13:17
  • And by the way, you can just [construct `Date` from `long`](https://docs.oracle.com/javase/7/docs/api/java/sql/Date.html#Date(long)). No need to use Calendar. Like `return new Date(early.getTime()+randVal);` – rustyx Feb 24 '17 at 13:20

3 Answers

Piggy-backing off of this answer to a similar question, you could just take the minimum of two rand calls:

final int randVal = Math.min(rand.nextInt((int) diff), rand.nextInt((int) diff));

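Alternatively, here is a more complex way that uses inverse transform sampling: solve the cumulative distribution function (x^2, for a density that rises linearly on [0, 1)) for x, which gives the square root of a uniform random number: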

int randVal = (int) Math.floor(diff * (1.0 - Math.sqrt(rand.nextDouble())));
if(randVal >= diff) randVal = 0; // handle the edge case

To meet your specified requirements, the square root has been subtracted from 1.0 to invert the distribution, i.e. putting the greater density at the bottom of the range.
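A minimal sketch of both approaches, assuming the span is kept as a `long` so that ranges longer than about 24.8 days (`Integer.MAX_VALUE` milliseconds) do not overflow the `(int)` cast used above. The class and method names (`SkewedRandom`, `viaMin`, `viaInverseCdf`) are made up for illustration, and `ThreadLocalRandom` is swapped in for the question's `rand` field because it offers `nextLong(bound)`.

```java
import java.util.Date;
import java.util.concurrent.ThreadLocalRandom;

class SkewedRandom {

    // Minimum of two uniform draws from [0, diff): smaller offsets are more likely.
    static long viaMin(long diff) {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        return Math.min(rnd.nextLong(diff), rnd.nextLong(diff));
    }

    // Inverse transform sampling: 1 - sqrt(u) has density 2 * (1 - x) on (0, 1].
    static long viaInverseCdf(long diff) {
        double u = ThreadLocalRandom.current().nextDouble(); // u in [0, 1)
        long offset = (long) (diff * (1.0 - Math.sqrt(u)));
        return offset >= diff ? 0 : offset; // u == 0.0 would yield diff itself
    }

    // Hypothetical replacement for the question's method; assumes early is before late.
    static Date getRandomDate(Date early, Date late) {
        long diff = late.getTime() - early.getTime();
        return new Date(early.getTime() + viaMin(diff));
    }

    public static void main(String[] args) {
        Date early = new Date(0L);
        Date late = new Date(10L * 24 * 60 * 60 * 1000); // ten days later
        System.out.println(getRandomDate(early, late));
    }
}
```

Both helpers throw or misbehave if `diff` is zero or negative, so a caller would want to check that `early` is strictly before `late` first.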

Patrick Parker
  • `min` was better. I wanted it to be linear, not squared. – towi Feb 24 '17 at 20:38
  • @towi believe it or not, the CDF for a linear distribution `[0,1)` is `x^2`. See the wikipedia link I provided. Or check on math StackExchange if you want to double check me. ;) – Patrick Parker Feb 24 '17 at 22:05
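
For readers puzzled by the exchange above, here is a short derivation (an added sketch, not part of the original answer) of why both snippets give the same linearly decreasing density. Assume $U$, $U_1$, $U_2$ are independent and uniform on $[0,1)$, and let $x \in [0,1]$ be the offset as a fraction of `diff`.

$$
P\big(\min(U_1,U_2) > x\big) = (1-x)^2
\quad\Longrightarrow\quad
F(x) = 1-(1-x)^2, \qquad f(x) = F'(x) = 2\,(1-x)
$$

$$
P\big(1-\sqrt{U} \le x\big) = P\big(U \ge (1-x)^2\big) = 1-(1-x)^2
$$

The density $f$ is linear, falling from 2 at `early` to 0 at `late`, while its cumulative distribution function is quadratic. That is why a square (or a square root) appears even though the distribution itself is linear.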

The accepted Answer by Parker seems to be correct and well-done.

Using java.time

The Question uses outmoded troublesome date-time classes that are now legacy, supplanted by the java.time classes. Here is the same kind of code, along with Parker’s solution, rewritten in java.time.

Instant

First, if you must work with java.util.Date objects, convert to/from Instant. The Instant class represents a moment on the timeline in UTC with a resolution of nanoseconds (up to nine (9) digits of a decimal fraction). To convert, look to new methods added to the old classes.

Instant instant = myJavaUtilDate.toInstant();  // From legacy to modern class.
java.util.Date myJavaUtilDate = java.util.Date.from( instant ) ;  // From modern class to legacy.

Let's rewrite the method signature, this time passing and returning Instant objects.

public Instant getRandomDate( Instant early , Instant late) {

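Verify that the early argument is indeed earlier than the late argument. Alternatively, check that the Duration seen below is neither negative nor zero ( ! duration.isNegative() && ! duration.isZero() ). A zero-length span would hand ThreadLocalRandom::nextLong an empty range, and it throws an IllegalArgumentException in that case.

    if( ! early.isBefore( late ) ) { … }  // Assert `early` is before `late`.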

Half-Open

Calculate the delta between the earliest and latest moments. This is done in the Half-Open approach often used to define spans of time, where the beginning is inclusive and the ending is exclusive.

Duration

The Duration class represents such a span in terms of a total number of seconds plus a fractional second in nanoseconds.

    Duration duration = Duration.between( early , late ) ;

To do our random math, we want a single integer. To handle nanoseconds resolution, we need a 64-bit long rather than a 32-bit int.

ThreadLocalRandom

Tip: If generating these values across threads, use the class ThreadLocalRandom. To quote the doc:

When applicable, use of ThreadLocalRandom rather than shared Random objects in concurrent programs will typically encounter much less overhead and contention.

We can specify the range in Half-Open style, with the origin being inclusive and the bound being exclusive, by calling ThreadLocalRandom::nextLong( origin , bound ).

    long bound = duration.toNanos() ;
    long nanos1 = ThreadLocalRandom.current().nextLong( 0 , bound ); 
    long nanos2 = ThreadLocalRandom.current().nextLong( 0 , bound ); 
    long nanos = Math.min( nanos1 , nanos2 );  // Select the lesser number.
    Instant instant = early.plusNanos( nanos );
    return instant ;
}

Live example

See the code below run live at IdeOne.com.

We count the date-time values generated for each date (LocalDate) as a casual way to check that the results are skewed towards earlier dates, as desired.

The test harness shows you how to assign a time zone (ZoneId) to an Instant to get a ZonedDateTime object, and from there extract a LocalDate. Use that as a guide if you wish to view the Instant objects through the lens of some particular region’s wall-clock time rather than in UTC.

/* package whatever; // don't place package name! */

import java.util.*;
import java.lang.*;
import java.io.*;

import java.util.concurrent.ThreadLocalRandom ;
import java.util.TreeMap ;

import java.time.*;
import java.time.format.*;
import java.time.temporal.*;

/* Name of the class has to be "Main" only if the class is public. */
class Ideone
{
    public static void main (String[] args) throws java.lang.Exception
    {
        Ideone app = new Ideone();
        app.doIt();
    }

    public void doIt() {
        ZoneId z = ZoneId.of( "America/Montreal" ) ;
        int count = 10 ;
        LocalDate today = LocalDate.now( z );
        LocalDate laterDate = today.plusDays( count );
        Instant start = today.atStartOfDay( z ).toInstant();
        Instant stop = laterDate.atStartOfDay( z ).toInstant();

        // Collect the frequency of each date. We want to see bias towards earlier dates.
        List<LocalDate> dates = new ArrayList<>( count );
        Map<LocalDate , Integer > map = new TreeMap<LocalDate , Integer >();
        for( int i = 0 ; i <= count ; i ++ ) {
            LocalDate localDate = today.plusDays( i ) ; 
            dates.add( localDate );  // Increment to next date and remember.
            map.put( localDate , 0 ); // Prepopulate the map with all dates; the int autoboxes to Integer.
        }
        for( int i = 1 ; i <= 10_000 ; i ++ ) {
            Instant instant = this.getRandomInstantBetween( start , stop );
            LocalDate localDate = instant.atZone( z ).toLocalDate();
            Integer integer = map.get( localDate );
            map.put( localDate , integer +  1);  // Increment the count each time we get a hit on this date.
        }
        System.out.println( map );
    }

    public Instant getRandomInstantBetween( Instant early , Instant late) {

        Duration duration = Duration.between( early , late ) ;
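        // `nextLong( origin , bound )` requires bound > origin, so the duration must be strictly positive.
        if( duration.isZero() || duration.isNegative() ) { throw new IllegalArgumentException( "`early` must be before `late`." ); }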

        long bound = duration.toNanos() ;
        ThreadLocalRandom random = ThreadLocalRandom.current() ;
        long nanos1 = random.nextLong( 0 , bound ); // Zero means the `early` date is inclusive, while `bound` here is exclusive.
        long nanos2 = random.nextLong( 0 , bound ); 
        long nanos = Math.min( nanos1 , nanos2 );  // Select the lesser number.
        Instant instant = early.plusNanos( nanos );

        return instant;
    }
}

Here are some sample results. These look good to me, but I'm no statistician. Use at your own risk.

{2017-02-24=1853, 2017-02-25=1697, 2017-02-26=1548, 2017-02-27=1255, 2017-02-28=1130, 2017-03-01=926, 2017-03-02=706, 2017-03-03=485, 2017-03-04=299, 2017-03-05=101, 2017-03-06=0}

{2017-02-25=930, 2017-02-26=799, 2017-02-27=760, 2017-02-28=657, 2017-03-01=589, 2017-03-02=470, 2017-03-03=342, 2017-03-04=241, 2017-03-05=163, 2017-03-06=49, 2017-03-07=0}

{2017-02-25=878, 2017-02-26=875, 2017-02-27=786, 2017-02-28=676, 2017-03-01=558, 2017-03-02=440, 2017-03-03=370, 2017-03-04=236, 2017-03-05=140, 2017-03-06=41, 2017-03-07=0}
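
As a rough cross-check of these numbers, the expected count per day can be worked out by hand. Assuming all days in the span have equal length (no daylight-saving shift inside the range), the minimum of two uniform draws lands on day k (0-based) of n days with probability (2(n − k) − 1) / n². The sketch below prints that expectation; the class and method names are hypothetical.

```java
class ExpectedCounts {

    // Expected hits on day k (0-based) out of `days` equal-length days,
    // when the offset is the minimum of two uniform draws over the whole span.
    static double expected(int k, int days, int samples) {
        return samples * (2.0 * (days - k) - 1.0) / ((double) days * days);
    }

    public static void main(String[] args) {
        int days = 10;
        int samples = 10_000;
        for (int k = 0; k < days; k++) {
            System.out.printf("day %d: %.0f%n", k, expected(k, days, samples));
        }
    }
}
```

For 10 days and 10,000 samples this gives 1900, 1700, 1500, and so on down to 100, which is close to the first sample line above (1853, 1697, 1548, …, 101).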


About java.time

The java.time framework is built into Java 8 and later. These classes supplant the troublesome old legacy date-time classes such as java.util.Date, Calendar, & SimpleDateFormat.

The Joda-Time project, now in maintenance mode, advises migration to the java.time classes.

To learn more, see the Oracle Tutorial. And search Stack Overflow for many examples and explanations. Specification is JSR 310.

Where to obtain the java.time classes?

The ThreeTen-Extra project extends java.time with additional classes. This project is a proving ground for possible future additions to java.time. You may find some useful classes here such as Interval, YearWeek, YearQuarter, and more.

Basil Bourque
  • Thanks for the intro to the new java.time classes, which I am aware of but which have not been used widely enough, I agree. Maybe your post will help to correct that. I only fear the inherent Zawinski *"now we have one problem more"* here. – towi Feb 25 '17 at 12:52
  • @towi Not sure what you mean by quoting [Zawinski](https://en.wikiquote.org/wiki/Jamie_Zawinski). If you are concerned about the quality or reliability of java.time, don't be. The java.time classes are built by the same people, led by Stephen Colebourne “jodastephen”, who put many years into the development of the incredibly successful Joda-Time library. Taking what was learned there, they re-engineered to create java.time. Now java.time is a built-in part of Java, and the extensive testing that goes with a Java release. The only bugs I know of have been corner cases in parsing strings. – Basil Bourque Feb 27 '17 at 06:02
  • No, I am not concerned about the quality. I am just concerned about the multitude of date/time APIs; it's confusing. And you will always have different third-party libs that require the different date/time APIs. Thus you are always converting back and forth. That is because old APIs will likely never disappear. I work daily with `Date`, `Calendar`, `XMLGrCal`, my own `WDate` and sometimes `Jodatime`. 5*4=20 possible API conversions already. Now I will have 6*5=30. Don't misunderstand me, I see the reason behind the need for different date/time APIs. It's just not as simple as I wished it were. – towi Feb 28 '17 at 20:57
  • @towi Yes, date-time is a mess. How the entire information technology industry ignored this crucial topic for so many decades is astounding. Joda-Time & java.time are the first serious attempt I know of to tackle this topic in a comprehensive manner. Now that we have java.time built-in, Joda-Time use will be phased out. Yes, the legacy date-time classes in Java will be with us for many years. I advise you to use the new methods added to those old classes to convert into java.time. Do your business logic and data storage and data exchange all in java.time. Convert back only where necessary. – Basil Bourque Feb 28 '17 at 23:31

Perhaps you could apply the approach shown in this answer, by analogy, to Date:

Java: random integer with non-uniform distribution
