
Update:

What I am looking for is some kind of auto trigger mechanism that will do the job as I desire.

Use case:

A ranking based on the scores of posts. The total score of a post is (a) the score of the post itself plus (b) the sum of the scores of all its comments. Each post and each comment starts with a score of 10, which decreases by one point every 24 hours after creation until it reaches 0. When the total score of a post reaches 0, the post is regarded as inactive from then on: it is removed from the ranking but remains in the database for other purposes. An inactive post does not become active again, even if users comment on it later.

I've got a problem implementing a timer for each record in neo4j database.

Assume we have 1 million records, each of which is either a post or a comment on a post. Each record has 10 points when created and loses one point every 24 hours after creation until it reaches 0, at which point it is regarded as inactive (and its timer is switched off).

My naive thought is to store the score as an attribute and update it every 24 hours. But given the size of the data (which keeps growing as users create posts and comments), such intensive database operations are really slow.

Is there any way to implement such a timer, or another method that would meet these needs?

Thanks.

Community
  • You might want to take a look at the neo4j-expire module, your use case is not handled out of the box but can serve as a good foundation for building your module in Java. https://github.com/graphaware/neo4j-expire – Christophe Willemsen Mar 20 '16 at 21:50
  • I don't know your case, but why do you need a timer? Isn't the creation time sufficient? You can calculate the remaining time whenever you query a node. Sorry if this is off-topic. – Evgen Mar 21 '16 at 04:34
  • @Evgen, I just updated the use case. I hope it clarifies my needs. –  Mar 21 '16 at 19:13
  • @ChristopheWillemsen, thank you. It's a good starting point. –  Mar 21 '16 at 19:14

2 Answers

0

As @Evgen suggests, one way would be to just store creation time and calculate the score as you query.

For example:

Create a comment:

CREATE (c:Comment {creationTime: timestamp(), text: {some_text}})

The timestamp() Cypher function evaluates to the current server time, expressed in milliseconds since the Unix epoch.

Query for all active comments:

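MATCH (c:Comment) WHERE c.creationTime > (timestamp() - 86400000)
RETURN c

This finds all Comments that were created within the last 24 hours (86400000 ms). For the 10-day lifetime described in the question, widen the window to 10 * 86400000. Be sure to create an index on the creationTime property: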

CREATE INDEX ON :Comment(creationTime);

Calculate and return points for all active comments:

Let's say you want to find all active comments for a given post and return the comments, ordered by score.

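MATCH (p:Post {name: "somepost"})<-[:IS_ABOUT]-(c:Comment)
// active = created less than 10 days ago (one point is lost per 24 hours)
WHERE c.creationTime > (timestamp() - 10 * 86400000)
RETURN c, 10 - ((timestamp() - c.creationTime) / 86400000) AS points
ORDER BY points DESC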
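
The scoring rule from the question (10 points at creation, minus one per full 24 hours, never below 0) can also be evaluated in application code from the stored creation time. The following is only a minimal sketch in Python; the function names `score` and `total_score` are hypothetical and not part of Neo4j or any driver, and timestamps are assumed to be milliseconds since the epoch, as returned by timestamp().

```python
from typing import Iterable

DAY_MS = 86_400_000   # milliseconds in 24 hours
INITIAL_SCORE = 10    # starting score taken from the question


def score(creation_ms: int, now_ms: int) -> int:
    """Score of one post or comment: 10 at creation, minus 1 per full day, floored at 0."""
    # Guard against clock skew: a creation time in the future counts as "just created".
    elapsed_days = max(0, now_ms - creation_ms) // DAY_MS
    return max(0, INITIAL_SCORE - elapsed_days)


def total_score(post_creation_ms: int, comment_creation_ms: Iterable[int], now_ms: int) -> int:
    """Total score of a post: its own score plus the scores of all its comments."""
    return score(post_creation_ms, now_ms) + sum(
        score(c, now_ms) for c in comment_creation_ms
    )
```

Because nothing is written back to the database, no per-record timer is needed. Note that this sketch alone does not cover the rule that an inactive post stays inactive after later comments; that would need some persistent marker on the post.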
William Lyon
  • Sorry, I didn't make it clear enough. What I am looking for is some kind of auto trigger mechanism that will do the job as I desire. I'm not sure whether neo4j (or its plugins) provides this functionality or not. To my understanding, we have to trigger these queries but how. And do we need to bind every record with this trigger if it existed? –  Mar 22 '16 at 07:06
0

My naive thought is to store the score as an attribute and update it every 24 hours. But given the size of the data (which keeps growing as users create posts and comments), such intensive database operations are really slow.

I think the solution here is to change the label of a Comment or Post as soon as you find that it is no longer active. That way you don't have to run through all the comments and posts each time, only the relevant ones. The implementation may vary. You can still run queries like

    MATCH (c:Comment) SET c.points = c.points - 1
    WITH c WHERE c.points <= 0
    REMOVE c:Comment SET c:InactiveComment

Or you can use creation time.

updated:

If you don't want to remove the Comment label: whenever you create a new Post or Comment, give it the labels :New:Post or :New:Comment. Once a post or comment becomes inactive, remove the New label. You will still get good performance, because the number of :New:Comment nodes is much smaller than the number of :Comment nodes, so your queries only touch relevant data.
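
The label idea above can be illustrated with a small in-memory stand-in. This is only a sketch under stated assumptions: the dictionaries are a hypothetical model of nodes (not a Neo4j API), `demote_expired` is an invented helper name, and the 10-day lifetime comes from the question's scoring rule. In Neo4j the :New label would let the database skip inactive nodes without scanning them; the loop here merely mimics that filter.

```python
DAY_MS = 86_400_000          # milliseconds in 24 hours
LIFETIME_MS = 10 * DAY_MS    # 10 points, one lost per day


def demote_expired(nodes, now_ms):
    """Remove the 'New' label from nodes whose lifetime has elapsed.

    `nodes` is a list of dicts with a 'labels' set and a 'creationTime' in ms.
    Nodes without the 'New' label are skipped. Returns the demoted nodes.
    """
    demoted = []
    for node in nodes:
        if "New" not in node["labels"]:
            continue  # already inactive, nothing to do
        if now_ms - node["creationTime"] >= LIFETIME_MS:
            node["labels"].discard("New")
            demoted.append(node)
    return demoted
```

A job like this only has to run occasionally (for example once a day), and running it late does not corrupt any score, since activity is derived from the creation time rather than from a stored counter.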

marc_s
Evgen
  • Thanks. But I don't understand the first query, MATCH (c:Comment) Set c.points = c.points-1 with c. What does it do here ? –  Mar 22 '16 at 06:21
  • Sorry, I didn't make it clear enough. What I am looking for is some kind of auto trigger mechanism that will do the job as I desire. I'm not sure whether neo4j (or its plugins) provides this functionality or not. To my understanding, we have to trigger these queries but how. And do we need to bind every record with this trigger if it existed? –  Mar 22 '16 at 07:05
  • I assumed that you already have a tool to trigger a program once every 24 hours, so this query decreases the points on each comment. When the points reach 0 it changes the label, so in future you don't get irrelevant data. If you don't have such a tool, look for an example of how to run a Neo4j query in Java and then look here http://stackoverflow.com/questions/7577620/linux-start-up-script-for-java-application Also, I know that it's possible to configure a bash script to run a query, so you might want to look into that possibility – Evgen Mar 22 '16 at 08:20
  • Does it mean that we have to register for every record so each of them can be updated in time? –  Mar 22 '16 at 22:52
  • I really don't get what you mean by "Does it mean that we have to register for every record", as I assume that you will have all records in your database anyway. – Evgen Mar 23 '16 at 04:21
  • The main difference from @William Lyon's advice is that you should exclude from the search every record that is no longer active. How you determine whether a record is active is up to you. Time is a good option; storing points in the record is also an option (but you might get bad consequences if for some reason you fail to update the data every day). So, thinking about it, you should use the creation time as the criterion to mark a node as inactive and exclude it from search queries. – Evgen Mar 23 '16 at 04:43
  • For example, you can initially mark all the nodes with the label New and then run a query that removes the label from nodes older than the window: MATCH (c:New:Comment) WHERE c.creationTime < (timestamp() - 86400000) REMOVE c:New RETURN c – Evgen Mar 23 '16 at 04:43