
I'm running a Spark program that opens a CSV file and creates instances from its lines in parallel. My problem is similar to the code snippet below (from http://spark.apache.org/docs/latest/sql-programming-guide.html).

JavaRDD<Person> people = sc.textFile("examples/src/main/resources/people.txt").map(
  new Function<String, Person>() {
    public Person call(String line) throws Exception {
      String[] parts = line.split(",");

      Person person = new Person();
      person.setName(parts[0]);
      person.setAge(Integer.parseInt(parts[1].trim()));

      return person;
    }
  }
);

If I wanted to assign a unique ID to each of these persons, how would I go about it, given that the mapping is done in parallel?
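For context, one possible direction: Spark's `JavaRDD` offers `zipWithUniqueId()` and `zipWithIndex()`, which pair each element with a `Long`. According to the Spark API docs, `zipWithUniqueId()` gives the items in the k-th of n partitions the IDs k, n+k, 2n+k, and so on, so no coordination between partitions is needed. The sketch below is not Spark code; it only illustrates that numbering scheme in plain Java so it can be run without a cluster. The class `UniqueIdSketch`, the `id` field on `Person`, and the hand-made partitions are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class UniqueIdSketch {

    // Hypothetical stand-in for the Person bean in the question, with an added id field.
    static class Person {
        long id;
        String name;
        int age;
    }

    // Parse one "name, age" line, as in the snippet from the question.
    static Person parse(String line) {
        String[] parts = line.split(",");
        Person p = new Person();
        p.name = parts[0];
        p.age = Integer.parseInt(parts[1].trim());
        return p;
    }

    // The i-th record of partition k (out of n partitions) gets id i * n + k.
    // Two different partitions can never produce the same id, so each
    // partition can be processed independently.
    static List<Person> assignIds(List<String> partition, int k, int n) {
        List<Person> out = new ArrayList<>();
        for (int i = 0; i < partition.size(); i++) {
            Person p = parse(partition.get(i));
            p.id = (long) i * n + k;
            out.add(p);
        }
        return out;
    }

    // Process all partitions in parallel and collect the results.
    static List<Person> run(List<List<String>> partitions) {
        int n = partitions.size();
        return IntStream.range(0, n)
            .parallel()
            .boxed()
            .flatMap(k -> assignIds(partitions.get(k), k, n).stream())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hand-made partitions standing in for what Spark would create from the file.
        List<List<String>> partitions = Arrays.asList(
            Arrays.asList("Michael, 29", "Andy, 30"),
            Arrays.asList("Justin, 19"));
        for (Person p : run(partitions)) {
            System.out.println(p.id + " " + p.name + " " + p.age);
        }
    }
}
```

The IDs produced this way are unique but may have gaps when partitions differ in size. If consecutive IDs are required, `zipWithIndex()` is the alternative in Spark, at the cost of an extra job to count the elements per partition.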
